Skip to content

bucket notifications - facilitate notif conf to override connection #8932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

alphaprinz
Copy link
Contributor

@alphaprinz alphaprinz commented Apr 3, 2025

Describe the Problem

We want to allow user to describe an external Kafka server in a connection file, but allow to specify to specify Kafka topic per notification conf (instead of in the connection).

Explain the Changes

TopicConfigurations' TopicArn field can be of this syntax "kafka:::topic//", where:
-if connection is not specified, the default account's connection is used.
-Kafka topic in TopicArn takes precedence over Kafka topic in connection.
-backward compatible - you can still put just the connection in TopicArn, like now

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

  1. Set "default_connection" field in NC user.

  2. Create a connection file for kafka (with or without topic field).
    {
    "notification_protocol": "kafka",
    "name": "k",
    "topic": "mytopic",
    "kafka_options_object": "{"metadata.broker.list":"localhost:9092"}"
    }

  3. Configure a notification with override object:
    {
    "TopicConfigurations": [
    {
    "Id": "first",
    "TopicArn": "kafka:::topic//override",
    "Events": []
    }
    ]
    }

  4. Create an event (ie upload object to bucket with notification).

  5. The notification will be sent with user's default connection with topic "override".

  • Doc added/updated - TBD
  • Tests added

@alphaprinz alphaprinz requested a review from guymguym April 3, 2025 03:19
@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch 2 times, most recently from 391fe76 to b32378a Compare April 3, 2025 05:39
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should follow the arn structure format, see comments posted above.

Comment on lines 191 to 193
const filename_parts = connect_filename_with_overrides.split('?');
const connect_filename_no_overrides = filename_parts[0];
const overrides_str = filename_parts[1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this ad hoc arn structure is not a good idea, we should try to map it to an arn format.

For kafka I expect the kafka arn format like in aws where ClusterName is our connection name, ClusterUuid should match the kafka cluster uuid , and TopicName would be optional:

arn:aws:kafka:::cluster/${ClusterName}/${ClusterUuid}
arn:aws:kafka:::topic/${ClusterName}/${ClusterUuid}/${TopicName}

For http webhook there is no matching arn in aws, however there is api-gateway or lambda. I would go with something in the spirit of api-gateway execute-api arn where ApiId is our connection name, Stage is less relevant as we do not manage the webhook lifecycle but I would still keep it as part of the arn structure, and finally method and path are very relevant:

arn:aws:execute-api:::${ApiId}/${Stage}/${Method}/${ApiSpecificResourcePath}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the topic for kafka to optionally be "kafka:::topic".
Still haven't found any mention of cluster UUID in the client.
Note that using '/' separator in containerized is somewhat confusing because I also need secret name in addition to connection name (currently, at least).

@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch from b32378a to 5d5ee14 Compare April 10, 2025 18:08
@pull-request-size pull-request-size bot added size/S and removed size/M labels Apr 10, 2025
@alphaprinz alphaprinz requested a review from guymguym April 10, 2025 21:28
@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch from 5d5ee14 to f1e6bd3 Compare May 6, 2025 21:00
@pull-request-size pull-request-size bot added size/M and removed size/S labels May 6, 2025
@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch 4 times, most recently from 56759b8 to 3871f68 Compare May 9, 2025 17:49
@alphaprinz alphaprinz marked this pull request as ready for review May 9, 2025 18:12
@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch from 3871f68 to 6447e8e Compare May 9, 2025 18:13
@alphaprinz alphaprinz requested a review from nadavMiz May 13, 2025 14:52
@@ -46,8 +46,8 @@ const FROM_FILE = 'from_file';
const ANONYMOUS = 'anonymous';

const VALID_OPTIONS_ACCOUNT = {
'add': new Set(['name', 'uid', 'gid', 'supplemental_groups', 'new_buckets_path', 'user', 'access_key', 'secret_key', 'fs_backend', 'allow_bucket_creation', 'force_md5_etag', 'iam_operate_on_root_account', FROM_FILE, ...CLI_MUTUAL_OPTIONS]),
'update': new Set(['name', 'uid', 'gid', 'supplemental_groups', 'new_buckets_path', 'user', 'access_key', 'secret_key', 'fs_backend', 'allow_bucket_creation', 'force_md5_etag', 'iam_operate_on_root_account', 'new_name', 'regenerate', ...CLI_MUTUAL_OPTIONS]),
'add': new Set(['name', 'uid', 'gid', 'supplemental_groups', 'new_buckets_path', 'user', 'access_key', 'secret_key', 'fs_backend', 'allow_bucket_creation', 'force_md5_etag', 'iam_operate_on_root_account', 'default_connection', FROM_FILE, ...CLI_MUTUAL_OPTIONS]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this part of the account configuration. shouldn't it be per bucket?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently in Marc's project the user and connection creation would be done by one admin user (with manage_nsfs cli), while buckets and their notif conf would be done by other (not-admin) users (with s3 rest api). So the admin would be able to do a once-per-system users and connection creation (including user's default connection), and then the other users wouldn't need to know about connections at all.

@alphaprinz alphaprinz force-pushed the notif_kafka_connect2 branch from 6447e8e to 9d199ad Compare May 14, 2025 00:32
@alphaprinz alphaprinz requested a review from nadavMiz May 14, 2025 05:45

//adapt to db shcema
if (topic_configuration) {
for (const conf of topic_configuration) {
conf.id = conf.Id;
conf.event = conf.Event;
conf.topic = conf.Topic;
//handle Kafka's topic synax, if present
if (conf.Topic && conf.Topic.length > 0 && conf.Topic[0].startsWith('kafka:::topic/')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be also good to have a validation on the cli level. that you wouldn't be able to configure default_connection of this format?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants