
[feat][kubectl-plugin] forward GCS 6379 port #2993

Open
wants to merge 1 commit into master
Conversation

davidxia (Contributor)

Forward the GCS port (6379) in the `kubectl ray session` command for RayClusters
so we can use the `ray` CLI, e.g. `ray status --address localhost:6379`.

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Signed-off-by: David Xia <[email protected]>
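
For reference, the forwarding this adds is roughly equivalent to a manual port-forward against the cluster's head service. This is a sketch only: the service name `CLUSTER_NAME-head-svc` and the other two ports (8265 for the dashboard, 10001 for the Ray client server) are assumptions about the plugin's defaults, not taken from this PR.

$ kubectl port-forward -n NAMESPACE service/CLUSTER_NAME-head-svc 8265:8265 10001:10001 6379:6379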
davidxia marked this pull request as ready for review on February 11, 2025, 04:29
kevin85421 (Member) left a comment


  1. The GCS port is not designed to be user-facing. It's better not to expose it. Currently, some observability APIs communicate directly with GCS, which is considered technical debt. The preferred approach in the newer observability API in Ray is: user → dashboard → GCS.

  2. Does ray status actually work after kubectl port-forward?
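
For context on the dashboard-routed path in point 1: CLI commands that already go through the dashboard can target the forwarded dashboard port instead of GCS. A sketch, assuming the dashboard is already forwarded to localhost:8265 (the job CLI here is only an illustration of that path):

$ ray job list --address http://localhost:8265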

davidxia (Contributor, Author) commented Feb 11, 2025

> The preferred approach in the newer observability API in Ray is: user → dashboard → GCS.

@kevin85421 I see. Do you have a link to docs with more details? In particular, what's the recommended way to run ray CLI commands from a local workstation against a remote K8s RayCluster, e.g. one running on GKE?

2: Yes, various commands that target the GCS port on a remote RayCluster work with this change, e.g. `ray status` and `ray list nodes`.

Full output:
$ kubectl ray session -n NAMESPACE CLUSTER_NAME

[in another terminal window]
$ ray status --address localhost:6379
======== Autoscaler status: 2025-02-11 14:57:44.422695 ========
Node status
---------------------------------------------------------------
Active:
 1 node_1d0e4465da4c338515d9fdf7552f656a185fae7f694c88555eaef0d5
 1 node_3aa4a5e431b77311e74deb920d973c325e762d4cd1da5daf06815334
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/4.0 CPU
 0B/8.00GiB memory
 0B/2.27GiB object_store_memory

Demands:
 (no resource demands)


$ ray list nodes --address localhost:6379

======== List: 2025-02-11 14:57:34.360799 ========
Stats:
------------------------------
Total: 2

Table:
------------------------------
    NODE_ID                                                   NODE_IP        IS_HEAD_NODE    STATE    STATE_MESSAGE    NODE_NAME      RESOURCES_TOTAL                 LABELS
 0  1d0e4465da4c338515d9fdf7552f656a185fae7f694c88555eaef0d5  10.169.24.125  False           ALIVE                     10.169.24.125  CPU: 2.0                        ray.io/node_id: 1d0e4465da4c338515d9fdf7552f656a185fae7f694c88555eaef0d5
                                                                                                                                      memory: 4.000 GiB
                                                                                                                                      node:10.169.24.125: 1.0
                                                                                                                                      object_store_memory: 1.169 GiB
 1  3aa4a5e431b77311e74deb920d973c325e762d4cd1da5daf06815334  10.169.9.38    True            ALIVE                     10.169.9.38    CPU: 2.0                        ray.io/node_id: 3aa4a5e431b77311e74deb920d973c325e762d4cd1da5daf06815334
                                                                                                                                      memory: 4.000 GiB
                                                                                                                                      node:10.169.9.38: 1.0
                                                                                                                                      node:__internal_head__: 1.0
                                                                                                                                      object_store_memory: 1.101 GiB

davidxia (Contributor, Author)

> Currently, some observability APIs communicate directly with GCS, which is considered technical debt. The preferred approach in the newer observability API in Ray is: user → dashboard → GCS.

Ah, do you mean that in a future release `ray status` will use port 8265, or that users should use the dashboard web UI to get this info instead of a CLI-based way?
