
optimize tagging controller workqueue handling #1091

Merged Jan 23, 2025 (1 commit)

Conversation

@kmala (Member) commented Jan 16, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

The tagging controller can add the same node to the workqueue multiple times, because the kubelet updates the node frequently and each update is enqueued even when the add event for that node has not been processed yet. The root cause is that the object we add to the workqueue is different for each event, so the workqueue can never deduplicate. When a large number of nodes is added at once, the queue can grow so large that it takes hours or days for all the events to be handled.
This PR removes the enqueue time and retry count from the workqueue item, since both are provided by default by the workqueue library, and makes the item a simple comparable struct so that multiple adds/updates of the same node result in only one item in the workqueue.
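
A minimal sketch (not the PR's exact code) of the deduplication this relies on: client-go's workqueue backs Add() with a set, so enqueueing an identical comparable value twice leaves a single item in the queue. The workItem fields below are illustrative.

package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// workItem keeps only comparable fields (no func values, no timestamps),
// so two Adds for the same node and action collapse into one queue entry.
type workItem struct {
	name       string
	providerID string
	action     string
}

func main() {
	q := workqueue.New()
	defer q.ShutDown()

	item := workItem{name: "node-1", providerID: "aws:///us-west-2a/i-0abc", action: "add-tag"}
	q.Add(item)
	q.Add(item) // identical value: deduplicated by the queue's internal set

	fmt.Println(q.Len()) // 1
}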

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

cloudprovider_aws_tagging_controller_work_item_duration_seconds is no longer published by the controller; the same details can be fetched from the default workqueue metrics (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/component-base/metrics/prometheus/workqueue/metrics.go#L40).

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 16, 2025
@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 16, 2025
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 16, 2025
@dims (Member) commented Jan 16, 2025

/hold

give me a bit please to review this!

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 16, 2025
@dims (Member) commented Jan 16, 2025

/hold cancel

Looks good to me @kmala, but some questions inline.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 16, 2025
)

var register sync.Once

var (
	workItemDuration = metrics.NewHistogramVec(

Member: are we ok dropping this metric?

Contributor: We should have an equivalent metric from the workqueue package itself via:

_ "k8s.io/component-base/metrics/prometheus/workqueue" // enable prometheus provider for workqueue metrics

but please make sure we actually scrape that one @kmala !

Member Author: Workqueue metrics are available by default (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/component-base/metrics/prometheus/workqueue/metrics.go#L40), and I verified that the metrics endpoint of the CCM provides the tagging controller metrics.

Contributor: Yes we've had those metrics for a while because of the import above (you have to import it for these to be registered: https://github.com/kubernetes/kubernetes/blob/d06398aac3e378da0c95472cd39713998206e9ff/staging/src/k8s.io/component-base/metrics/prometheus/workqueue/metrics.go#L26)

But do we actually scrape those? I'm pretty sure we've been scraping this other metric that we're removing

Member Author: Yes, these metrics are available and I verified it by querying the metrics endpoint.

What do you mean by "scrape those"?
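
For reference, a rough sketch (assumed wiring, not this repo's code) of what makes those default workqueue metrics appear: the blank import registers the Prometheus provider, and the queue's name becomes the "name" label on metrics such as workqueue_depth and workqueue_queue_duration_seconds.

package tagging

import (
	"k8s.io/client-go/util/workqueue"

	// Registers the Prometheus provider for workqueue metrics; without
	// this blank import the queues record no metrics at all.
	_ "k8s.io/component-base/metrics/prometheus/workqueue"
)

// newTaggingQueue returns a rate-limited queue whose metrics are labeled
// name="tagging" (the queue name here is illustrative).
func newTaggingQueue() workqueue.RateLimitingInterface {
	return workqueue.NewNamedRateLimitingQueue(
		workqueue.DefaultControllerRateLimiter(), "tagging")
}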

for _, resource := range tc.resources {
	switch resource {
	case opt.Instance:
		err := tc.tagEc2Instance(node)
		v1node, err := tc.nodeInformer.Lister().Get(node.name)

Member: Do we want to try to capture all necessary information in taggingControllerNode itself to avoid this lookup?

Member: (in a follow up!)

Member Author: we need the lookup if we want to avoid having multiple items in the workqueue, since the labels can be different on different updates.

Contributor: The *Node should point to the same struct though, right? the watcher/lister should only have one copy of the Node in memory
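
A sketch of the pattern under discussion, with hypothetical names: the queue item carries only the node's identity, and the full object is resolved from the informer cache at processing time. The lister returns a pointer into the shared cache rather than a copy, so the returned node must be treated as read-only.

package tagging

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	v1listers "k8s.io/client-go/listers/core/v1"
)

// Controller and workItem stand in for the PR's actual types.
type Controller struct {
	nodeLister v1listers.NodeLister
}

type workItem struct {
	name   string
	action string
}

// process looks the node up at work time, so the queue item stays small
// and comparable while the labels read here are always current.
func (tc *Controller) process(item workItem) error {
	node, err := tc.nodeLister.Get(item.name)
	if err != nil {
		if apierrors.IsNotFound(err) {
			return nil // node was deleted while the item sat in the queue
		}
		return err
	}
	_ = node // ... read labels/providerID from node without mutating it ...
	return nil
}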

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jan 16, 2025
Comment on lines +384 to +389

	// if the struct's fields are all comparable, the workqueue add will make sure
	// multiple adds of the same object result in only one item in the workqueue.
	item := workItem{
		name:       node.GetName(),
		providerID: node.Spec.ProviderID,
		action:     action,
	}

Contributor: The action func wasn't comparable, so this change allows multiple workitems for the same node, with different actions. Is that desirable? Why change action to a string?

Contributor (@ndbaker1, Jan 17, 2025): I'm thinking the same. Seems we're relying on queuing-order to determine the final state which boils down to a last-write-wins situation in the workqueue.

Member Author: "so this change allows multiple workitems for the same node, with different actions. Is that desirable?"

Before, we were adding a work item on every update if the add hadn't been processed yet, and there could be a work item for the delete as well. With this change we would have at most 2 work items in the queue, which matches the previous behavior, since one add-tag work item and multiple add-tag work items amount to the same thing in the end.

The action was changed to a string so that it is comparable, which prevents duplicate updates from adding another item to the workqueue.
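
A standalone illustration of the comparability point (hypothetical types, not the PR's): the workqueue stores items in a map-backed set, so item types must be hashable. A func field makes a struct non-comparable, while a string action hashes fine.

package main

import "fmt"

type withFunc struct {
	name   string
	action func() error // func fields make the struct non-comparable
}

type withString struct {
	name   string
	action string // comparable: safe to use as a set/map key
}

func main() {
	seen := map[withString]bool{}
	a := withString{name: "node-1", action: "add-tag"}
	seen[a] = true
	seen[a] = true         // same key, still one entry
	fmt.Println(len(seen)) // 1

	// A struct with a func field can only sneak into a map behind an
	// interface, and then it panics: "hash of unhashable type".
	defer func() { fmt.Println("recovered:", recover()) }()
	unsafeSet := map[interface{}]bool{}
	unsafeSet[withFunc{name: "node-1"}] = true // panics at runtime
}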

Comment on lines -49 to -50

	requeuingCount int
	enqueueTime    time.Time

Contributor: So these 2 fields were the crux of the issue, right? If a workItem already existed in the queue, its enqueueTime would always be different, even if its requeuingCount happened to match (0).

Member Author: yes, mostly

Correct me if I'm wrong: requeuingCount looks like it enforces a maximum number of retries. Are we good to remove it? Is there an option to keep those fields and only compare the node identity fields on enqueue?

Member Author: We can get the requeue count from the workqueue itself, and I changed the code to use that, so the functionality remains the same as before.
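
A sketch of bounded retries using the queue's own bookkeeping instead of a requeuingCount field on the item (names and the limit are illustrative, not the PR's exact code):

package tagging

import "k8s.io/client-go/util/workqueue"

const maxRetries = 5 // illustrative cap

// handle shows bounded retries: the queue tracks per-item requeue counts,
// so the item itself needs no requeuingCount field.
func handle(q workqueue.RateLimitingInterface, item interface{}, work func(interface{}) error) {
	defer q.Done(item)

	if err := work(item); err == nil {
		q.Forget(item) // success: reset this item's retry counter
	} else if q.NumRequeues(item) < maxRetries {
		q.AddRateLimited(item) // retry with backoff; increments NumRequeues
	} else {
		q.Forget(item) // give up after maxRetries failures
	}
}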

@cartermckinnon (Contributor)

@ndbaker1 can you review this one?

@cartermckinnon (Contributor)

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 23, 2025
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cartermckinnon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added a commit that referenced this pull request Jan 23, 2025
…stream-release-1.32

Automated cherry pick of #1091: optimize tagging controller workqueue handling