Add hashing for activity ids#6366

Open
uOJackDu wants to merge 5 commits into LemmyNet:main from uOJackDu:outbox-activity-id

Conversation

@uOJackDu

For issue #6341.

Add hashing for activity ids so that the ids of the activity objects do not change on every request.


let mut ordered_items = vec![];
for post_view in post_views {
let post_ap_id = post_view.post.ap_id.clone();
Author

This is the only place where I am passing in the object_id. I think it makes the ids for Announce and Create stable. Not sure where else I could use it.

Member

Also needs to be in crates/apub/activities/src/create_or_update/post.rs so both code paths generate the same id.

@uOJackDu uOJackDu marked this pull request as ready for review February 27, 2026 09:10

  enum_delegate = "0.2.0"
  either = { workspace = true }
  lemmy_diesel_utils = { workspace = true }
+ md5 = "0.8.0"
Member

Use sha2, which we already have as a dependency.


  let create_or_update =
-   CreateOrUpdatePage::new(post.into(), &person, &community, kind, &context).await?;
+   CreateOrUpdatePage::new(post.into(), &person, &community, kind, None, &context).await?;
Member

Suggested change
- CreateOrUpdatePage::new(post.into(), &person, &community, kind, None, &context).await?;
+ CreateOrUpdatePage::new(post.into(), &person, &community, kind, Some(post.ap_id), &context).await?;

Member

Actually, this can be a problem when a post is edited: each Update activity would then have the same id. So you also need to hash the timestamp (published_at or updated_at).
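
A minimal sketch of this point, not the PR's code: the id is derived from the activity kind, the object's id, and its timestamp, so repeated requests give the same id while each edit gives a new one. The names `stable_activity_id` and `fnv1a_64` are hypothetical, and FNV-1a is only a dependency-free stand-in for a real digest such as SHA-256 from `sha2`. The URL shape follows the doc comment on `generate_activity_id`.

```rust
/// FNV-1a (64-bit). Dependency-free stand-in for a real digest such as
/// SHA-256 from the `sha2` crate; it is NOT cryptographic.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: builds a deterministic activity id from the kind,
/// the object's ap_id, and the object's published/updated timestamp.
fn stable_activity_id(host: &str, kind: &str, object_id: &str, timestamp: &str) -> String {
    // NUL separators keep field boundaries unambiguous in the hash input.
    let input = format!("{kind}\0{object_id}\0{timestamp}");
    format!(
        "https://{host}/receive/{}/{:016x}",
        kind.to_lowercase(),
        fnv1a_64(input.as_bytes())
    )
}
```

With this shape, calling the helper twice for the same post and `updated_at` value returns the same URL, while an edit that changes `updated_at` produces a different one.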

  .into();

- let id = generate_activity_id(kind.clone(), &context)?;
+ let id = generate_activity_id(kind.clone(), None, &context)?;
Member

Suggested change
- let id = generate_activity_id(kind.clone(), None, &context)?;
+ let id = generate_activity_id(kind.clone(), Some(comment.ap_id), &context)?;

- fn generate_activity_id<T>(kind: T, context: &LemmyContext) -> Result<Url, ParseError>
+ fn generate_activity_id<T>(
+   kind: T,
+   object_id: Option<&str>,
Member

Suggested change
-   object_id: Option<&str>,
+   object_id: Option<&Url>,

  /// Generate a unique ID for an activity, in the format:
  /// `http(s)://example.com/receive/create/202daf0a-1489-45df-8d2e-c8a3173fed36`
- fn generate_activity_id<T>(kind: T, context: &LemmyContext) -> Result<Url, ParseError>
+ fn generate_activity_id<T>(
Member

To avoid passing None in so many places, you could do:

fn generate_activity_id(kind, context) {
  generate_activity_id_with_object_id(kind, None, context)
}

fn generate_activity_id_with_object_id(kind, object_id, context) {
  // actual id generation, hashing object_id when it is given
}
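
A rough, self-contained sketch of that wrapper pattern, under simplifying assumptions: `kind` and the host are plain `&str` instead of a generic `T` and `&LemmyContext`, the return type is `String` rather than `Result<Url, ParseError>`, FNV-1a stands in for a real digest such as SHA-256 from `sha2`, and `random_suffix` is a hypothetical std-only stand-in for a random UUID.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// FNV-1a (64-bit); a non-cryptographic stand-in for a real digest.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Stand-in for a random UUID: `RandomState` is randomly keyed, so the
/// resulting value is expected to differ between calls.
fn random_suffix() -> String {
    format!("{:016x}", RandomState::new().build_hasher().finish())
}

/// Full version: deterministic when an object id is given, random otherwise.
fn generate_activity_id_with_object_id(
    kind: &str,
    object_id: Option<&str>,
    host: &str,
) -> String {
    let suffix = match object_id {
        Some(id) => format!("{:016x}", fnv1a_64(format!("{kind}\0{id}").as_bytes())),
        None => random_suffix(),
    };
    format!("https://{host}/receive/{}/{suffix}", kind.to_lowercase())
}

/// Convenience wrapper: existing call sites keep their two-argument form
/// and never have to pass `None` explicitly.
fn generate_activity_id(kind: &str, host: &str) -> String {
    generate_activity_id_with_object_id(kind, None, host)
}
```

Only the call sites that need a stable id would switch to the `_with_object_id` variant; everything else stays unchanged.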

@dessalines
Member

dessalines commented Mar 2, 2026

What is the minimum amount of info needed to create uniqueness for these generated ap_ids?

Seems like kind/action, numeric_object_id should be enough right?

@Nutomic
Member

Nutomic commented Mar 3, 2026

That is not enough, because if you edit a post, it is federated as Update<Page> activity each time. So the kind and object id will be identical every time, but the generated ap_id must be different (otherwise it will be ignored as duplicate on the receiving side). So the object created_at/updated_at timestamp also needs to go into the hash.
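
To illustrate the duplicate problem on the receiving side, here is a toy model. It assumes the receiver simply remembers the ids it has already processed; `Inbox` and `receive` are hypothetical names, not Lemmy's actual types.

```rust
use std::collections::HashSet;

/// Hypothetical receiver that remembers the ids of processed activities.
struct Inbox {
    seen: HashSet<String>,
}

impl Inbox {
    fn new() -> Self {
        Inbox { seen: HashSet::new() }
    }

    /// Returns true if the activity is processed, false if its id was
    /// already seen and the activity is dropped as a duplicate.
    fn receive(&mut self, activity_id: &str) -> bool {
        // `HashSet::insert` returns false when the value was already present.
        self.seen.insert(activity_id.to_string())
    }
}
```

If two Update<Page> activities for the same post carry the same id, the second call to `receive` returns false and the edit is lost. Including the timestamp in the hash gives each edit its own id.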

