
[FLINK-34466][LINEAGE] Support dataset type facet for Avro#171

Closed
pawel-big-lebowski wants to merge 1 commit into apache:main from pawel-big-lebowski:lineage/avro-schema-support

@pawel-big-lebowski
Contributor

With the lineage interfaces implemented, the Kafka source and sink provide a TypeDatasetFacet that contains type information about the data being processed. So far, this type information was extracted from the SerializationSchema:

TypeExtractor.getParameterType(SerializationSchema.class, valueSerialization.getClass(), 0)

However, this does not work for ConfluentRegistryAvroSerializationSchema because of type erasure.
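To illustrate why reflection-based extraction fails here, below is a minimal sketch that uses only `java.lang.reflect`, not Flink's TypeExtractor. The names `Schema`, `StringSchema`, `GenericSchema`, and `schemaTypeArgument` are hypothetical stand-ins. The assumption is that the Avro schema class behaves like `GenericSchema`: it keeps its type parameter open and receives the record class only as a runtime argument.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.nio.charset.StandardCharsets;

public class ErasureDemo {

    /** Hypothetical stand-in for SerializationSchema<T>. */
    interface Schema<T> {
        byte[] serialize(T element);
    }

    /** Binds T in the class declaration, so reflection can recover String. */
    static class StringSchema implements Schema<String> {
        @Override
        public byte[] serialize(String element) {
            return element.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Leaves T open; the record class is known only as a runtime value. */
    static class GenericSchema<T> implements Schema<T> {
        private final Class<T> recordClass;

        GenericSchema(Class<T> recordClass) {
            this.recordClass = recordClass;
        }

        Class<T> getRecordClass() {
            return recordClass;
        }

        @Override
        public byte[] serialize(T element) {
            return new byte[0];
        }
    }

    /** Returns the type argument of Schema as declared on the given class. */
    static Type schemaTypeArgument(Class<?> schemaClass) {
        for (Type iface : schemaClass.getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) iface;
                if (p.getRawType() == Schema.class) {
                    return p.getActualTypeArguments()[0];
                }
            }
        }
        throw new IllegalArgumentException("Not a direct Schema implementation: " + schemaClass);
    }

    public static void main(String[] args) {
        Type bound = schemaTypeArgument(StringSchema.class);
        Type open = schemaTypeArgument(new GenericSchema<>(Integer.class).getClass());
        // A concrete Class can be turned into type information; a type variable cannot.
        System.out.println(bound + " is concrete: " + (bound instanceof Class));
        System.out.println(open + " is concrete: " + (open instanceof Class));
    }
}
```

In the second case the instance was built for `Integer`, but the class declaration only records the type variable, so nothing obtained through `getClass()` reveals the record type.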

As a workaround, TypeDatasetFacet can be extended to contain SerializationSchema when direct extraction of TypeInformation is not possible. This allows handling this case on the lineage listener side, while keeping flink-connector-kafka unaware of Avro type extraction logic.
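The proposed shape can be sketched with plain Java types. This is not the code from the PR: `TypeFacet`, `Schema`, `AvroLikeSchema`, and `describe` are hypothetical names, and `Class<?>` stands in for Flink's TypeInformation. The sketch only shows the idea of an optional schema accessor on the facet and a listener that falls back to it.

```java
import java.util.Optional;

public class FacetFallbackDemo {

    /** Hypothetical stand-in for SerializationSchema<T>. */
    interface Schema<T> {}

    /** Hypothetical schema that knows its record class only at runtime. */
    static class AvroLikeSchema<T> implements Schema<T> {
        final Class<T> recordClass;

        AvroLikeSchema(Class<T> recordClass) {
            this.recordClass = recordClass;
        }
    }

    /** Hypothetical facet: type information if available, schema as a fallback. */
    interface TypeFacet {
        Optional<Class<?>> getTypeInformation();

        // Defaults to empty so existing implementations stay unchanged.
        default Optional<Schema<?>> getSerializationSchema() {
            return Optional.empty();
        }
    }

    static TypeFacet ofType(Class<?> type) {
        return () -> Optional.of(type);
    }

    static TypeFacet ofSchema(Schema<?> schema) {
        return new TypeFacet() {
            @Override
            public Optional<Class<?>> getTypeInformation() {
                return Optional.empty();
            }

            @Override
            public Optional<Schema<?>> getSerializationSchema() {
                return Optional.of(schema);
            }
        };
    }

    /** Listener side: prefer type information, otherwise inspect the schema. */
    static String describe(TypeFacet facet) {
        Optional<Class<?>> type = facet.getTypeInformation();
        if (type.isPresent()) {
            return type.get().getName();
        }
        // Format-specific knowledge lives here, not in the connector.
        return facet.getSerializationSchema()
                .filter(s -> s instanceof AvroLikeSchema)
                .map(s -> ((AvroLikeSchema<?>) s).recordClass.getName())
                .orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe(ofType(String.class)));
        System.out.println(describe(ofSchema(new AvroLikeSchema<>(Integer.class))));
    }
}
```

The connector only hands over the schema object; any knowledge of how to read a type out of a particular schema class stays with the listener.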

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
@pawel-big-lebowski force-pushed the lineage/avro-schema-support branch from 1b75ef0 to 86f3d3c on April 16, 2025 12:27
@pawel-big-lebowski changed the title from "[LINEAGE] Support dataset type facet for Avro" to "[FLINK-34466][LINEAGE] Support dataset type facet for Avro" on Apr 16, 2025
*
* @return
*/
default Optional<SerializationSchema> getSerializationSchema() {

@davidradl davidradl May 19, 2025

Shouldn't this interface live in Flink, along with the logic to fall back to the serialization schema?

I assume a file connector using the Avro format could hit this issue as well.

Contributor

I strongly agree. The whole TypeDatasetFacet should be removed from here. As far as I can tell, this PR should be revived and merged beforehand: apache/flink#25712


return Optional.of(
new DefaultTypeDatasetFacet(TypeExtractor.createTypeInfo(type)));
} catch (InvalidTypesException e) {
Contributor

Wouldn't a type check beforehand be less expensive resource-wise than catching the exception? (That depends on how often you expect to hit this fallback.)
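One reading of this suggestion, sketched with plain reflection rather than Flink's TypeExtractor: inspect the declared type argument first and return an empty Optional when it is not a concrete class, so the fallback path never has to construct and catch an exception. The names `Schema`, `StringSchema`, `OpenSchema`, and `resolve` are hypothetical, and whether this is actually cheaper in the real code path is an assumption that has not been measured here.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Optional;

public class PrecheckDemo {

    /** Hypothetical stand-in for SerializationSchema<T>. */
    interface Schema<T> {}

    /** Type argument is bound in the declaration. */
    static class StringSchema implements Schema<String> {}

    /** Type argument stays a type variable. */
    static class OpenSchema<T> implements Schema<T> {}

    /** Resolves the element class with a type check instead of try/catch. */
    static Optional<Class<?>> resolve(Class<?> schemaClass) {
        for (Type iface : schemaClass.getGenericInterfaces()) {
            if (iface instanceof ParameterizedType
                    && ((ParameterizedType) iface).getRawType() == Schema.class) {
                Type arg = ((ParameterizedType) iface).getActualTypeArguments()[0];
                // Only a concrete Class is usable; a type variable means "fall back".
                if (arg instanceof Class) {
                    return Optional.of((Class<?>) arg);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolve(StringSchema.class).isPresent());
        System.out.println(resolve(OpenSchema.class).isPresent());
    }
}
```

An empty result then becomes the signal to attach the serialization schema to the facet instead of type information.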

SerializationSchema.class, valueSerialization.getClass(), 0);

return Optional.of(new DefaultTypeDatasetFacet(TypeExtractor.createTypeInfo(type)));
} catch (InvalidTypesException e) {
@github-actions

github-actions Bot commented Apr 5, 2026

This PR is being marked as stale since it has not had any activity in the last 90 days.
If you would like to keep this PR alive, please leave a comment asking for a review.
If the PR has merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out to the
community, contact details can be found here: https://flink.apache.org/what-is-flink/community/

If this PR is no longer valid or desired, please feel free to close it.
If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions Bot added the stale label Apr 5, 2026
@github-actions

github-actions Bot commented May 5, 2026

This PR has been closed since it has not had any activity in 120 days.
If you feel like this was a mistake, or you would like to continue working on it,
please feel free to re-open the PR and ask for a review.

4 participants