Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-32416] initial implementation of DynamicKafkaSource with bound… #44

Merged
merged 1 commit into from
Jan 12, 2024

Conversation

mas-chen
Copy link
Contributor

@mas-chen mas-chen commented Aug 9, 2023

…ed/unbounded support and unit/integration tests

What is the purpose of the change

FLIP-246 describes the changes and this PR follows the design illustrated in the doc.

Brief Change Log

  • New Public API: DynamicKafkaSource, DynamicKafkaSourceBuilder, KafkaMetadataService and related classes, etc.
  • Supports multi cluster read, dynamic cluster/topic read
  • Supports multi cluster, dynamic metrics
  • Supports StoppableKafkaEnumContextProxy to workaround and provide single threaded guarantees
  • Supports unbounded and bounded mode
  • Only supports configurations/deserializers defined uniformly across clusters, can be extended later on
  • 3 stream subscription methods
  • 1 KafkaMetadataService that supports single cluster, can expose more implementations in public API depending on discussion
  • Supports metadata discovery interval as well as error handling strategy
  • Add support in test utils to bring up 2 kafka clusters for testing (tried to make least changes as possible)

Verifying this change

  • Unit Tests
  • Integration Tests (including covering migration scenarios)
  • Connector Test Framework Integration

@mxm
Copy link
Contributor

mxm commented Oct 10, 2023

Hey Mason! Do you want a review for this?

@mas-chen
Copy link
Contributor Author

@mxm Yes please! I've rebased the PR and working on fixing some flaky unit tests

Copy link
Contributor

@mxm mxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mas-chen! The changes look very solid to me and they capture what has been proposed in the FLIP. The only thing we are missing here is documentation. It doesn't have to be much but we should at least outline some basic usage examples.

Copy link
Contributor

@tzulitai tzulitai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mas-chen I've been testing this with a real, simple implementation of the KafkaMetadataService over the winter break and it has worked nicely, matching what is described in the FLIP. Added test time to the build is also acceptable. The code overall looks pretty solid as well.

Since this new connector is marked as @Experimental, I think we're in a good state to merge it to be included for Kafka connector 3.1.0 release. What do you think @mas-chen?

@mas-chen
Copy link
Contributor Author

mas-chen commented Jan 9, 2024

@tzulitai Thanks for the review. I would like get some more input from at least another reviewer from merging this and I agree this should be included in the 3.1.0 release. Perhaps @mxm?

After this PR, I still need to complete the documentation for this new connector.

@mas-chen mas-chen force-pushed the mason-FLINK-32416 branch 2 times, most recently from fdc9f8a to 5c229b3 Compare January 9, 2024 22:16
Copy link
Contributor

@mxm mxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Cosmetic comment: #44 (comment)

Let me know when you want to merge.

@mas-chen
Copy link
Contributor Author

Thanks @tzulitai @mxm for your review! I've addressed all the comments and squashed the commits. Can you please merge it at your convenience?

I'll start on the docs immediately and hope to use this connector directly in the 3.1.0 release!

@mxm
Copy link
Contributor

mxm commented Jan 11, 2024

Sounds good. Can we get a bare-minimum version of the documentation into this PR?

@mas-chen
Copy link
Contributor Author

Are the Javadocs sufficient in the bare minimum case? Otherwise, I'll just write the documentation into this PR

@mxm
Copy link
Contributor

mxm commented Jan 11, 2024

A brief one-pager of the feature would be good. We can expand on it later. Just to have something in the web docs.

@mas-chen
Copy link
Contributor Author

mas-chen commented Jan 11, 2024

one-pager docs are here: docs/content/docs/connectors/datastream/dynamic-kafka.md. Marked with experimental status.

Also copied this into the chinese doc directory.

(https://issues.apache.org/jira/browse/FLINK-32417) I will still need to expand on the available config options, observability, and popular setter methods

…ed/unbounded support and unit/integration tests

add dynamic kafka source docs
@mxm
Copy link
Contributor

mxm commented Jan 12, 2024

Awesome, thanks a lot! Merging once tests pass.

@mxm mxm merged commit eaeb781 into apache:main Jan 12, 2024
8 checks passed
@MartijnVisser
Copy link
Contributor

MartijnVisser commented Jan 16, 2024

NB: after this was merged, the CI doesn't work for main anymore CC @mas-chen @mxm https://github.com/apache/flink-connector-kafka/actions/runs/7504290046

@mxm
Copy link
Contributor

mxm commented Jan 16, 2024

We also noticed that. @mas-chen is looking into fixing this.

@mxm
Copy link
Contributor

mxm commented Jan 16, 2024

Debug info:

2024-01-12T16:49:32.5164583Z [ERROR] Failures: 
2024-01-12T16:49:32.5166818Z [ERROR]   DynamicKafkaSourceITTest$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:211->SourceTestSuiteBase.checkResultWithSemantic:743 Expected to have exactly 40 records in result, but only received 30 records

https://github.com/apache/flink-connector-kafka/actions/runs/7504290046/job/20433204351

@mas-chen
Copy link
Contributor Author

Sorry about that. ^The linked PR should fix this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants