As workloads on your Ray clusters on Vertex AI increase or decrease, you can manually scale the number of replicas to match demand. For example, if you have excess capacity, you can scale down your worker pools to save costs. This page describes how to change the number of replicas for existing worker pools.
Limitations
When you scale clusters, you can change only the number of replicas in your existing worker pools. You can't, for example, add or remove worker pools from your cluster or change the machine type of your worker pools. Also, the number of replicas for your worker pools can't be lower than one.
If you use a VPC peering connection to connect to your clusters, the maximum number of nodes is limited. The maximum depends on the number of nodes the cluster had when it was created. For more information, see Max number of nodes calculation. This maximum includes your head node as well as your worker pools. If you use the default network configuration, the number of nodes can't exceed the upper limits described in the create clusters documentation.
Max number of nodes calculation
If you're using private services access (VPC peering) to connect to your nodes, use the following formulas to check that you don't exceed the maximum number of nodes (M), assuming f(x) = min(29, (32 - ceiling(log2(x)))):
f(2 * M) = f(2 * N)
f(64 * M) = f(64 * N)
f(max(32, 16 + M)) = f(max(32, 16 + N))
The maximum total number of nodes in the Ray on Vertex AI cluster you can scale up to (M) depends on the initial total number of nodes you set up (N).
After you create the Ray on Vertex AI cluster, you can scale the total number of nodes to any amount between P and M inclusive, where P is the number of pools in your cluster.
The initial total number of nodes in the cluster and the scaling up target number must be in the same color block.
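The formulas above can be checked numerically. The following is a minimal sketch, not an official tool: the function names `f`, `same_block`, `max_nodes`, and `can_scale_to` are hypothetical and chosen for illustration. It assumes N and M are total node counts that include the head node, and it finds M by increasing the node count from N until one of the three equalities stops holding.

```python
def f(x: int) -> int:
    """f(x) = min(29, 32 - ceiling(log2(x))), as defined above."""
    # For integers x >= 1, (x - 1).bit_length() equals ceiling(log2(x)).
    return min(29, 32 - (x - 1).bit_length())


def same_block(n_initial: int, m_target: int) -> bool:
    """True if all three equalities hold for initial size N and target M."""
    return (
        f(2 * m_target) == f(2 * n_initial)
        and f(64 * m_target) == f(64 * n_initial)
        and f(max(32, 16 + m_target)) == f(max(32, 16 + n_initial))
    )


def max_nodes(n_initial: int) -> int:
    """Largest M >= N that still satisfies every equality."""
    m = n_initial
    while same_block(n_initial, m + 1):
        m += 1
    return m


def can_scale_to(n_initial: int, target: int, pool_count: int) -> bool:
    """True if target lies between P (the pool count) and M inclusive."""
    return pool_count <= target <= max_nodes(n_initial)


if __name__ == "__main__":
    # A cluster created with 5 total nodes, checked against a target of 8.
    print(max_nodes(5))
    print(can_scale_to(n_initial=5, target=8, pool_count=2))
```

Under these assumptions, a cluster created with 5 total nodes can grow to 8 but not to 9, because f(2 * 9) no longer equals f(2 * 5).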
Update replica count
You can use the Google Cloud console or Vertex AI SDK for Python to update your worker pool's replica count. If your cluster includes multiple worker pools, you can individually change each of their replica counts in a single request.
Console
In the Google Cloud console, go to the Ray on Vertex AI page.
From the list of clusters, click the cluster to modify.
On the Cluster details page, click Edit cluster.
In the Edit cluster pane, select the worker pool to update and then modify the replica count.
Click Update.
Wait a few minutes for your cluster to update. When the update is complete, you can see the updated replica count on the Cluster details page.
Ray on Vertex AI SDK
import vertexai
import vertex_ray

vertexai.init()
cluster = vertex_ray.get_ray_cluster("CLUSTER_NAME")

# Get the resource name.
cluster_resource_name = cluster.cluster_resource_name

# Create the new worker pools
new_worker_node_types = []
for worker_node_type in cluster.worker_node_types:
    worker_node_type.node_count = REPLICA_COUNT  # new worker pool size
    new_worker_node_types.append(worker_node_type)

# Make update call
updated_cluster_resource_name = vertex_ray.update_ray_cluster(
    cluster_resource_name=cluster_resource_name,
    worker_node_types=new_worker_node_types,
)
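The SDK sample sets every worker pool to the same REPLICA_COUNT. To give each pool its own count in a single request, one option is a small helper like the sketch below. The helper name `apply_replica_counts` is hypothetical and isn't part of the SDK. It relies only on the `node_count` attribute used in the sample, and it enforces the one-replica minimum from the Limitations section. `SimpleNamespace` objects stand in for real worker pools so the sketch runs without a cluster.

```python
from types import SimpleNamespace


def apply_replica_counts(worker_node_types, new_counts):
    """Set node_count per pool, in order (hypothetical helper, not an SDK call)."""
    if len(new_counts) != len(worker_node_types):
        # Scaling can't add or remove worker pools, so the counts must line up.
        raise ValueError("provide exactly one replica count per existing worker pool")
    for count in new_counts:
        if count < 1:
            raise ValueError("each worker pool needs at least one replica")
    for pool, count in zip(worker_node_types, new_counts):
        pool.node_count = count
    return list(worker_node_types)


# Stand-in objects for illustration; in practice, pass cluster.worker_node_types.
pools = [SimpleNamespace(node_count=2), SimpleNamespace(node_count=4)]
updated = apply_replica_counts(pools, [1, 8])
print([pool.node_count for pool in updated])  # prints [1, 8]
```

With a real cluster, the returned list would be passed as `worker_node_types` to `vertex_ray.update_ray_cluster`, as in the sample.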