Skip to content

[core] Split raylet cython file into multiple files (DynamicObjectRefGenerator) #52095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

machichima
Copy link
Contributor

@machichima machichima commented Apr 8, 2025

Why are these changes needed?

As mentioned in #51080, separate DynamicObjectRefGenerator class from the large _raylet.pyx file.

Related issue number

Closes #52029

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@machichima
Copy link
Contributor Author

Hi @dentiny ,

I'm new to cython and might make some mistake here. Please have a look and let me know if anything should be modified.

Thanks!

@dentiny
Copy link
Contributor

dentiny commented Apr 8, 2025

I think you need to change all python caller side?

@hainesmichaelc hainesmichaelc added the community-contribution Contributed by the community label Apr 9, 2025
@machichima
Copy link
Contributor Author

It seems like the import error happens in _raylet.pyx. As originally the class is defined as the python style, so I assume that it is fine to keep it like so, only create .pyx file and use import in raylet.pyx?

[2025-04-08T15:10:13Z] Traceback (most recent call last):
[2025-04-08T15:10:13Z]   File "/root/.cache/bazel/_bazel_root/1df605deb6d24fc8068f6e25793ec703/execroot/com_github_ray_project_ray/bazel-out/k8-opt/bin/python/ray/autoscaler/v2/tests/test_sdk.runfiles/com_github_ray_project_ray/python/ray/autoscaler/v2/tests/test_sdk.py", line 11, in <module>
[2025-04-08T15:10:13Z]     import ray
[2025-04-08T15:10:13Z]   File "/rayci/python/ray/__init__.py", line 85, in <module>
[2025-04-08T15:10:13Z]     import ray._raylet  # noqa: E402
[2025-04-08T15:10:13Z]   File "python/ray/_raylet.pyx", line 1, in init ray._raylet
[2025-04-08T15:10:13Z] ModuleNotFoundError: No module named 'ray.cython_components.object_ref_generator'

-e
Signed-off-by: machichima <[email protected]>
@jcotant1 jcotant1 added the core Issues that should be addressed in Ray Core label Apr 10, 2025
@machichima
Copy link
Contributor Author

Hi @dentiny ,
Could you please have a look at this? I put it in a .py file as the class is simply a python class, not cython

Thanks!

@dentiny dentiny added the go add ONLY when ready to merge, run all tests label Apr 12, 2025
Copy link
Contributor

@dentiny dentiny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're two main purpose for the issue (split classes and functions out of the giant cython file as possible):

  • (major) speed up compilation speed, it's a painpoint for any dev work related to FFI
  • (minor) increase code maintainability by moving some pieces out of the file (you could tell the file is so large)

I think your change fits in the second purpose, so it looks ok to me.
CC @jjyao / @edoakes / @dayshah if you have other thoughts.

@@ -0,0 +1,17 @@
from ray.util.annotations import DeveloperAPI
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not cython code so there's no need to put it under cython_components. how about just ray/_private/dynamic_object_ref_generator.py?

Copy link
Contributor Author

@machichima machichima Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Just moved it to _private.
Thanks!

-e
Signed-off-by: machichima <[email protected]>
Comment on lines 6 to 17
def __init__(self, refs):
# TODO(swang): As an optimization, can also store the generator
# ObjectID so that we don't need to keep individual ref counts for the
# inner ObjectRefs.
self._refs = refs

def __iter__(self):
for ref in self._refs:
yield ref

def __len__(self):
return len(self._refs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you're at it, do you mind adding type hints for this code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @edoakes ,
I just tried and notice that we cannot import ObjectRef here, as this is the .py file, and from ray import ObjectRef cause the circular import. Would you mind if I use list[Any] here for refs? Or is there any way that I can use ObjectRef as type here?

Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use a string type hint instead: List["ray.ObjectRef"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edoakes I just added the type hint. PTAL
Thanks!

…cython-dynamic-obj-ref-gen

-e
Signed-off-by: machichima <[email protected]>
…cython-dynamic-obj-ref-gen

-e
Signed-off-by: machichima <[email protected]>
…cython-dynamic-obj-ref-gen

-e
Signed-off-by: machichima <[email protected]>
-e
Signed-off-by: machichima <[email protected]>
@machichima machichima force-pushed the 52029-split-cython-dynamic-obj-ref-gen branch from 98a3c5f to d894486 Compare April 17, 2025 14:01
@edoakes edoakes merged commit ca826ec into ray-project:master Apr 18, 2025
5 checks passed
@edoakes
Copy link
Collaborator

edoakes commented Apr 18, 2025

thanks @machichima !

@machichima machichima deleted the 52029-split-cython-dynamic-obj-ref-gen branch April 18, 2025 23:14
ktyxx pushed a commit to ktyxx/ray that referenced this pull request Apr 29, 2025
…Generator) (ray-project#52095)

As mentioned in ray-project#51080, separate `DynamicObjectRefGenerator` class from
the large `_raylet.pyx` file.

Closes ray-project#52029
@wingkitlee0
Copy link
Contributor

wingkitlee0 commented May 3, 2025

Sorry for leaving comment in a merged PR. This could be just a minor change..
Just curious if it's intentional that the while-loop is replaced by a for-loop:

while self._refs:
  yield self._refs.pop(0)

I am asking this because the len/size changes after each iteration/next() in the old behavior, while it keeps the full list and keeps the len in the new behavior.
Happy to open a new issue for discussion.

@dayshah
Copy link
Contributor

dayshah commented May 3, 2025

Sorry for leaving comment in a merged PR. This could be just a minor change.. Just curious if it's intentional that the while-loop is replaced by a for-loop:

while self._refs:
  yield self._refs.pop(0)

I am asking this because the len/size changes after each iteration/next() in the old behavior, while it keeps the full list and keeps the len in the new behavior. Happy to open a new issue for discussion.

cc @edoakes not that familiar with how DynamicObjectRefGenerator works, but any concern of the object sticking around longer in obj store bc the ref sticks around in the list without the pop?

vickytsang pushed a commit to ROCm/ray that referenced this pull request May 5, 2025
…Generator) (ray-project#52095)

As mentioned in ray-project#51080, separate `DynamicObjectRefGenerator` class from
the large `_raylet.pyx` file.

Closes ray-project#52029
@dayshah
Copy link
Contributor

dayshah commented May 7, 2025

Sorry for leaving comment in a merged PR. This could be just a minor change.. Just curious if it's intentional that the while-loop is replaced by a for-loop:

while self._refs:
  yield self._refs.pop(0)

I am asking this because the len/size changes after each iteration/next() in the old behavior, while it keeps the full list and keeps the len in the new behavior. Happy to open a new issue for discussion.

@wingkitlee0 I confirmed this did introduce an actual bug and opened a pr to fix it. Thanks a lot for noticing it and bringing it up and being a great member of the Ray community!

edoakes pushed a commit that referenced this pull request May 7, 2025
In #52095, the while loop in
DynamicObjectRefGenerator that pops the ref was just changed to a for
loop that iterates through the refs. This results in the refs sticking
around and using up memory when they can be destroyed.

Also changing the list to a deque for efficiency on popping from the
front.
---------

Signed-off-by: dayshah <[email protected]>
zhaoch23 pushed a commit to Bye-legumes/ray that referenced this pull request May 14, 2025
…Generator) (ray-project#52095)

As mentioned in ray-project#51080, separate `DynamicObjectRefGenerator` class from
the large `_raylet.pyx` file.

Closes ray-project#52029

Signed-off-by: zhaoch23 <[email protected]>
zhaoch23 pushed a commit to Bye-legumes/ray that referenced this pull request May 14, 2025
In ray-project#52095, the while loop in
DynamicObjectRefGenerator that pops the ref was just changed to a for
loop that iterates through the refs. This results in the refs sticking
around and using up memory when they can be destroyed.

Also changing the list to a deque for efficiency on popping from the
front.
---------

Signed-off-by: dayshah <[email protected]>
Signed-off-by: zhaoch23 <[email protected]>
vickytsang pushed a commit to ROCm/ray that referenced this pull request Jun 3, 2025
In ray-project#52095, the while loop in
DynamicObjectRefGenerator that pops the ref was just changed to a for
loop that iterates through the refs. This results in the refs sticking
around and using up memory when they can be destroyed.

Also changing the list to a deque for efficiency on popping from the
front.
---------

Signed-off-by: dayshah <[email protected]>
Signed-off-by: Vicky Tsang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-backlog community-contribution Contributed by the community core Issues that should be addressed in Ray Core go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[core] Split raylet cython file into multiple files (DynamicObjectRefGenerator)
7 participants