Skip to content

[ClusterPipeline] ExecutorService/thread is created and destroyed too frequently in ClusterPipeline #4141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
xrayw opened this issue Apr 11, 2025 · 9 comments

Comments

@xrayw
Copy link
Contributor

xrayw commented Apr 11, 2025

The code at here

ExecutorService executorService = Executors.newFixedThreadPool(MULTI_NODE_PIPELINE_SYNC_WORKERS);

It will create a ExecutorService for each pipeline sync, This will cause huge cost when there are many pipeline operations

@ggivo
Copy link
Collaborator

ggivo commented Apr 11, 2025

Thanks for reporting it. Multiple sources have raised this valid concern.
Link to PR #4070 trying to address the same issue with ongoing discussions

@xrayw
Copy link
Contributor Author

xrayw commented Apr 17, 2025

@ggivo thank you, and by the way

pipelinedResponses.putIfAbsent(nodeKey, new LinkedList<>());

is it should use computeIfAbsent? instead of putIfAbsent.

pipelinedResponses.putIfAbsent(nodeKey, new LinkedList<>());
queue = pipelinedResponses.get(nodeKey);

-> 
queue = pipelinedResponses.computeIfAbsent(nodeKey, (_) -> new LinkedList<>())

@ggivo
Copy link
Collaborator

ggivo commented Apr 17, 2025

is it should use computeIfAbsent? instead of putIfAbsent.

Do you have anything particular in mind?
Otherwise I would say, No.
We have already checked if (pipelinedResponses.containsKey(nodeKey)) and also not using the result.

@xrayw
Copy link
Contributor Author

xrayw commented Apr 17, 2025

Because each putIfAbsent will create a list object, but if the key already exists in the map, this list doesn't need to be created.

@ggivo
Copy link
Collaborator

ggivo commented Apr 18, 2025

From a performance optimization perspective, I think there is not much of a difference, since we will execute putIfAbsent code only if the pipelinedResponses does not contain the nodeKey.

The current implementation of MultiNodePipelineBase is not thread-safe. And since we already check for if (pipelinedResponses.containsKey(nodeKey)), we will run trough pipelinedResponses.putIfAbsent(nodeKey, new LinkedList<>()); only the first time when we need to initialise it.

@xrayw
Copy link
Contributor Author

xrayw commented Apr 19, 2025

@ggivo Yes, You're right. we can just even use put. because we checked it not contains.

@xrayw
Copy link
Contributor Author

xrayw commented Apr 23, 2025

And if pipelinedResponses.size() == 1, we don't need to submit it to ExecutorService. but run it on the work thread directly. to avoid the cost of thread context switch.

and we can run the last Map.Entry<HostAndPort, Queue<Response<?>>> on the work thread to save resources.

If you think so, I'd like to create a PR.
Thanks.

@ggivo
Copy link
Collaborator

ggivo commented Apr 23, 2025

@xrayw
Assuming there are pipelines with all commands ending up on the same node, it sounds reasonable. The only drawback I see is adding a bit of complexity to the code.

Note: Since it does not address the main concern described in this Issue, I suggest opening a dedicated Issue to track this optimization.

@xrayw
Copy link
Contributor Author

xrayw commented Apr 23, 2025

Thank you ,i createed a new issue to track it. #4148

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants