[nexus] Affinity and Anti-Affinity Groups #7076
Conversation
```rust
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub enum AffinityGroupMember {
    Instance(Uuid),
```
I know this is an "enum of one", but RFD 522 discusses having anti-affinity groups which contain "either instances or affinity groups". I figured I'd just define these as "members" to be flexible for future work
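For illustration only, here is a minimal sketch of how that member enum might grow if RFD 522's "anti-affinity groups containing affinity groups" idea lands later (the `AffinityGroup` variant is hypothetical and not part of this PR):

```rust
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

// Hypothetical future shape: anti-affinity groups could contain either
// individual instances or whole affinity groups, per RFD 522. Only the
// `Instance` variant exists in this PR.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub enum AntiAffinityGroupMember {
    /// A single instance belonging to the group.
    Instance(Uuid),
    /// An entire affinity group treated as one member (not yet implemented).
    AffinityGroup(Uuid),
}
```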
We respond with a member ID here, which isn't the most readable way for the user to see the group members. They'd then need to fetch each one to get the name – and while we could chain those queries in the console, the CLI wouldn't do that. Any thoughts? Perhaps responding with ID, name, and project?
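One possible shape, sketched here purely for discussion (the type and field names are illustrative, not the PR's actual API), would be a member view carrying both the ID and the name, so the console and CLI can render a readable list without an extra fetch per member:

```rust
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

// Illustrative response type only: returns enough information to render
// group members by name without additional lookups.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub enum AffinityGroupMemberView {
    Instance { id: Uuid, name: String },
}
```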
Hopefully a quick drive-by review wasn't too premature, but this stuff is relevant to my interests, so I wanted to take a peek. Overall, everything looks very straightforward and reasonable --- I commented on a few small things, but it's quite possible you were planning to get to all of them and just hadn't gotten around to it yet.
nexus/db-model/src/affinity.rs (outdated diff)
```rust
#[derive(Queryable, Insertable, Clone, Debug, Selectable)]
#[diesel(table_name = affinity_group_instance_membership)]
pub struct AffinityGroupInstanceMembership {
    pub group_id: DbTypedUuid<AffinityGroupKind>,
    pub instance_id: DbTypedUuid<InstanceKind>,
}

#[derive(Queryable, Insertable, Clone, Debug, Selectable)]
#[diesel(table_name = anti_affinity_group_instance_membership)]
pub struct AntiAffinityGroupInstanceMembership {
    pub group_id: DbTypedUuid<AntiAffinityGroupKind>,
    pub instance_id: DbTypedUuid<InstanceKind>,
}
```
I note that these lack created/deleted timestamps, implying that:
- we intend to hard-delete rather than soft-delete them, and,
- we don't presently record when instances were added to affinity/anti-affinity groups, so we can't present that in UIs in the future.
I'm not sure if either of these matter to us, but I figured I'd comment on it.
When I wrote this I was kinda planning on just using hard deletion for membership. I could definitely add the "time_modified" and "time_deleted" columns, and use soft-deletion here too, but as usual, we'll need to be more cautious with our indexing.
("Why hard delete" -> this was kinda arbitrary, my decision here isn't super strong, but I'm more familiar with us using soft deletion for user-facing objects that have full CRUD APIs, and hard-deletion for more internal-facing stuff, to avoid the cost of garbage collecting later, which we haven't really done at all)
That's totally fair!
The reason I brought up deletion was because if we start out with a schema that uses hard-deletion, and decide to switch to soft-deletion later in order to do something like display affinity group histories in the UI, we can't get back records from before that change, since...they've been deleted. On the other hand, if we started with soft-deletion, we could switch to hard-deletion and blow away any soft-deleted records if we decide to not use them in that way.
On the other hand, maybe the problem of displaying historical affinity group changes is better solved by other things, like audit logging! I dunno.
```rust
async fn sled_reservation_create_inner(
```
FYI @hawkw, you asked about this in the Hypervisor Sync.
The original API of the function `pub async fn sled_reservation_create` still exists above, but I made this "inner" function to get better testing of errors. But we could also report these errors through e.g. an alerting system or something.
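The wrapper-plus-inner pattern being described looks roughly like the following self-contained sketch (the types and the capacity check are simplified stand-ins, not the actual datastore code):

```rust
/// Structured error that unit tests can match on precisely.
#[derive(Debug, PartialEq)]
enum ReservationError {
    NotEnoughCapacity,
}

/// Public entry point: keeps the original API surface and flattens the
/// detailed error into the externally visible error type (a String here).
pub async fn sled_reservation_create(requested: u64) -> Result<u64, String> {
    sled_reservation_create_inner(requested)
        .await
        .map_err(|e| format!("reservation failed: {e:?}"))
}

/// Inner function returning a structured error, so tests can assert on
/// exactly which failure occurred rather than string-matching messages.
async fn sled_reservation_create_inner(requested: u64) -> Result<u64, ReservationError> {
    const CAPACITY: u64 = 128;
    if requested > CAPACITY {
        return Err(ReservationError::NotEnoughCapacity);
    }
    Ok(requested)
}
```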
```diff
@@ -312,6 +312,19 @@ pub type PaginatedByNameOrId<Selector = ()> = PaginationParams<
pub type PageSelectorByNameOrId<Selector = ()> =
    PageSelector<ScanByNameOrId<Selector>, NameOrId>;

pub fn id_pagination<'a, Selector>(
```
I'm using this in `http_entrypoints.rs` to paginate over "group members", which only have UUIDs, not names.
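Conceptually, paginating by ID is just cursor pagination over a sorted set of UUIDs; a minimal stand-alone sketch of the idea (not the actual dropshot/nexus pagination machinery) is:

```rust
use uuid::Uuid;

// Cursor-style pagination over members that only have IDs: return up to
// `limit` IDs strictly greater than the cursor, in sorted order.
fn page_of_members(all_ids: &[Uuid], cursor: Option<Uuid>, limit: usize) -> Vec<Uuid> {
    let mut ids: Vec<Uuid> = all_ids.to_vec();
    ids.sort();
    ids.into_iter()
        .filter(|id| cursor.map_or(true, |c| *id > c))
        .take(limit)
        .collect()
}
```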
```rust
    instance_id,
).get_results_async::<(AffinityPolicy, Uuid)>(&conn).await?;

// We use the following logic to calculate a desirable sled,
```
This is sorta the "meat and potatoes" of this PR -- specifically, updating the logic for instance placement based on affinity groups.
This file also contains several tests for various combinations of affinity groups / instances, and I'm always happy to add more.
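As a rough mental model of that placement logic (a simplified, self-contained sketch; the real implementation lives in `nexus/db-queries/src/db/datastore/sled.rs` and works against the database): each group's policy either hard-constrains or merely biases the candidate sled set.

```rust
use std::collections::HashSet;
use uuid::Uuid;

// Stand-in for the database enum of the same name.
enum AffinityPolicy {
    Fail,
    Allow,
}

/// Pick a sled for an instance given the sleds that already host members of
/// its affinity and anti-affinity groups. "Fail" policies become hard
/// constraints; "allow" policies only bias the choice.
fn pick_sled(
    candidates: &HashSet<Uuid>,
    affinity_sleds: &[(AffinityPolicy, Uuid)],
    anti_affinity_sleds: &[(AffinityPolicy, Uuid)],
) -> Result<Uuid, &'static str> {
    let mut banned = HashSet::new();
    let mut required = HashSet::new();
    let mut preferred = HashSet::new();
    let mut unpreferred = HashSet::new();

    // Anti-affinity: avoid sleds that already host other group members.
    for (policy, sled) in anti_affinity_sleds {
        match policy {
            AffinityPolicy::Fail => { banned.insert(*sled); }
            AffinityPolicy::Allow => { unpreferred.insert(*sled); }
        }
    }
    // Affinity: favor sleds that already host other group members.
    for (policy, sled) in affinity_sleds {
        match policy {
            AffinityPolicy::Fail => { required.insert(*sled); }
            AffinityPolicy::Allow => { preferred.insert(*sled); }
        }
    }

    // Eligible sleds must not be banned, and must be in the required set if
    // any "fail"-policy affinity constraint exists.
    let eligible: HashSet<Uuid> = if required.is_empty() {
        candidates - &banned
    } else {
        &(&required & candidates) - &banned
    };

    // Among eligible sleds, prefer "allow"-affinity sleds and avoid
    // "allow"-anti-affinity sleds where possible.
    eligible
        .iter()
        .max_by_key(|sled| {
            (preferred.contains(*sled) as u8, !unpreferred.contains(*sled) as u8)
        })
        .copied()
        .ok_or("no eligible sled satisfies the affinity constraints")
}
```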
```diff
@@ -213,7 +213,7 @@ CREATE INDEX IF NOT EXISTS lookup_sled_by_policy_and_state ON omicron.public.sle
);

CREATE TYPE IF NOT EXISTS omicron.public.sled_resource_kind AS ENUM (
-    -- omicron.public.instance
+    -- omicron.public.vmm ; this is called "instance" for historical reasons.
```
This tripped me up. This is not the UUID of an instance, it's the UUID of a propolis VMM that has been allocated. These are very subtly different objects.
```sql
-- The UUID of an instance, if this resource belongs to an instance.
instance_id UUID
```
... I do, however, want to know the instance UUID of this object, if it exists.
I don't think I can rely on a row in the `omicron.public.vmm` table existing by the time reservations are made - note that, e.g. for instance start, "allocating a server" happens before "VMM record creation":
omicron/nexus/src/app/sagas/instance_start.rs
Lines 136 to 138 in 7f05fce
```rust
builder.append(alloc_server_action());
builder.append(alloc_propolis_ip_action());
builder.append(create_vmm_record_action());
```
So - when we create a sled_resource record on behalf of an instance, we need a way for other requests to identify "what instance does this VMM belong to?". Hence the addition of this optional field.
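In other words, the optional column lets a later query answer "which sleds hold resources for instance X?" even before the VMM record exists. A tiny in-memory sketch of that lookup (the row type here is a simplified stand-in for `sled_resource`, not the actual model):

```rust
use uuid::Uuid;

// Simplified stand-in for a sled_resource row; the row's own ID is the
// VMM's UUID, and `instance_id` is optional because not every resource
// belongs to an instance.
struct SledResourceRow {
    sled_id: Uuid,
    vmm_id: Uuid,
    instance_id: Option<Uuid>,
}

/// Which sleds currently hold resources reserved on behalf of `instance`?
/// This is exactly the lookup the new optional column enables.
fn sleds_hosting_instance(rows: &[SledResourceRow], instance: Uuid) -> Vec<Uuid> {
    rows.iter()
        .filter(|row| row.instance_id == Some(instance))
        .map(|row| row.sled_id)
        .collect()
}
```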
The contents of this PR should be unchanged, but as a heads-up, I'm going to be splitting it out into a few equivalent smaller PRs to make this easier to review:
Pulled out of #7076 Updates schema to include Affinity/Anti-Affinity groups, but does not use these schemas yet.
Sean, is there a fixed limit to the number of members in either affinity or anti-affinity groups? Other than the natural limit of sufficient sleds for anti-affinity and sufficient resources for affinity?
As currently implemented, no limit. We could definitely add one though.
I ask because there are places where we soft-validate in the console, before a request is even sent to the control plane. But I guess this depends on how many sleds and how much capacity are currently available, and if we set the limit to the maximum a user could potentially add, then it's unlikely they'd ever see that validation. (Talked myself out of that 🙂)
…#7447) Pulled out of #7076

This PR is a partial implementation of RFD 522. It adds:
- Affinity and Anti-Affinity groups, contained within projects. These groups are configured with a **policy** and **failure domain**, and can currently contain zero or more **members**. Affinity groups attempt to co-locate members; anti-affinity groups attempt to avoid co-locating members.
- **Policy** describes what to do if we cannot fulfill the co-location request. Currently, the options are "fail" (reject the request) or "allow" (continue with provisioning of the group member regardless).
- **Failure Domain** describes the scope of what is considered "co-located". In this PR, the only option is "sled", but in the future this may be expanded to e.g. "rack".
- **Members** describe what can be added to affinity/anti-affinity groups. In this PR, the only option is "instance". RFD 522 describes how "anti-affinity groups may also contain affinity groups" -- which is why this "member" terminology is introduced -- but it is not yet implemented.
- (Anti-)affinity groups are exposed by the API through a CRUD interface.
- (Anti-)affinity groups are considered during "sled reservation", where instances are placed on a sled. This is most significantly implemented (and tested) within `nexus/db-queries/src/db/datastore/sled.rs`, within #7446.

Fixes #1705
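For readers skimming the description, the policy and failure-domain concepts map onto small enums along these lines (an illustrative sketch; names approximate the API described above rather than quoting it):

```rust
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

/// What to do when the (anti-)co-location request cannot be fulfilled.
#[derive(Clone, Copy, Debug, Deserialize, Serialize, JsonSchema)]
#[serde(rename_all = "snake_case")]
pub enum AffinityPolicy {
    /// Reject the provisioning request.
    Fail,
    /// Continue provisioning the group member regardless.
    Allow,
}

/// The scope within which members count as "co-located". Only "sled" exists
/// in this PR; "rack" is a possible future extension.
#[derive(Clone, Copy, Debug, Deserialize, Serialize, JsonSchema)]
#[serde(rename_all = "snake_case")]
pub enum FailureDomain {
    Sled,
}
```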
Gonna close this PR - I'm trying to track things through smaller PRs and other issues. #7626 is my tracking issue for affinity follow-up work.