
feat: Flink 2.2 compatibility and dual-dist packaging#4307

Open
macdoor wants to merge 4 commits into apache:master from macdoor:feature/flink22-compat

@macdoor macdoor commented Mar 6, 2026

Summary

  • Add Flink 2.2 compatibility layer for runtime and connectors, including reflection-based fixes for Sink V2 two-phase commit.
  • Keep all connectors compiled against Flink 1.20, and introduce two dist JARs (1.20 and 2.2) as launchers.
  • Enable users to run a single set of connector JARs on both Flink 1.20 and Flink 2.2 clusters.

Design rationale

The key design decision is to keep connector modules compiled against Flink 1.20 APIs, and only switch the small set of version-specific compat modules and the dist JAR to Flink 2.2 when needed.

  • Connectors stay on Flink 1.20 API

    • Most connector code only uses APIs that are still present in Flink 2.2.
    • Compiling them against Flink 1.20 produces artifacts that are binary compatible with both Flink 1.20 and Flink 2.2 (for the API subset that Flink 2.2 preserves).
    • This allows one connector JAR set to be reused across Flink 1.20 and 2.2, which is friendlier for users and operators.
  • Two dist JARs as "launchers"

    • For Flink 1.20, we build flink-cdc-dist-<version>-1.20.jar bundling Flink1PipelineBridge.
    • For Flink 2.2, we build flink-cdc-dist-<version>-2.2.jar bundling Flink2PipelineBridge plus a reflection-based fix in DataSinkWriterOperator to ensure two-phase committing sinks (like Paimon) still emit committables when SupportsCommitter is not implemented.
    • At runtime, users simply choose the corresponding dist JAR for the target Flink version, while reusing the same connector JARs.
  • Why not compile everything against Flink 2.2?

    • If we switched all connectors to compile against Flink 2.2, the resulting artifacts would only run on Flink 2.2 because of binary-incompatible API changes (removed SinkFunction/SourceFunction, moved/renamed classes and methods, etc.).
    • This would lock out Flink 1.20 users and increase maintenance cost without real benefit, since most connector logic is already compatible with both versions when compiled against 1.20.

This PR therefore keeps 1.20 as the compilation baseline for connectors, and uses a small, well-isolated compat layer to bridge the behavioral differences between Flink 1.20 and 2.2.
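The binary-compatibility argument above can be checked empirically on a given cluster. The following is a minimal diagnostic sketch, not part of this PR: it probes the classpath for a few API classes that differ between Flink 1.x and 2.x. The class `ApiProbe` is hypothetical, and the fully qualified Flink class names are assumptions based on the interfaces named in this description; adjust them to the APIs your connectors actually use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which API classes are visible on the current classpath. */
public class ApiProbe {

    /** Returns true if the named class can be located, without initializing it. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Probes each class name in order and records whether it was found. */
    public static Map<String, Boolean> probe(List<String> classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isPresent(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed class names; verify them against the Flink version you run.
        List<String> names = List.of(
                "org.apache.flink.api.connector.sink2.SupportsCommitter",
                "org.apache.flink.api.connector.sink2.TwoPhaseCommittingSink",
                "org.apache.flink.streaming.api.functions.sink.SinkFunction");
        probe(names).forEach((name, present) ->
                System.out.println((present ? "present  " : "missing  ") + name));
    }
}
```

Run with the cluster's `lib/` JARs on the classpath, this gives a quick view of which side of the API split a given environment is on. Without Flink on the classpath, every entry is reported as missing.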

Build instructions

To build both Flink 1.20 and Flink 2.2 dist artifacts from this branch (assuming Java 17):

# Step 1: full build against Flink 1.20 (connectors + dist-1.20)
mvn clean install -DskipTests -Drat.skip=true

# Step 2: build Flink 2.2 compat and dist (no clean, reusing 1.20 artifacts)
mvn install -Pflink-2.2 \
  -pl flink-cdc-flink-compat/flink-cdc-flink-compat-flink2,flink-cdc-dist \
  -DskipTests

Deployment (recommended)

  • On Flink 1.20 clusters, put flink-cdc-dist-<version>-1.20.jar into $FLINK_CDC_HOME/lib/ and add the desired CDC connector JARs (MySQL, PostgreSQL, Paimon, OpenGauss, etc.) to the same $FLINK_CDC_HOME/lib/ (or your preferred plugin/lib directory).
  • On Flink 2.2 clusters, put flink-cdc-dist-<version>-2.2.jar into $FLINK_CDC_HOME/lib/ and reuse the same set of connector JARs as for Flink 1.20.
  • Do not place multiple versions of flink-cdc-dist-*.jar in the same $FLINK_CDC_HOME/lib/ to avoid class loading conflicts.
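The last rule is easy to violate during an upgrade. As a minimal sketch (the class `DistJarCheck` is hypothetical and not part of this PR), a pre-flight check could list the dist JARs in a lib directory and flag the case where more than one is present:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical pre-flight check: at most one flink-cdc-dist-*.jar per lib directory. */
public class DistJarCheck {

    /** Returns the sorted file names in libDir that match flink-cdc-dist-*.jar. */
    public static List<String> findDistJars(Path libDir) throws IOException {
        List<String> found = new ArrayList<>();
        try (DirectoryStream<Path> jars =
                Files.newDirectoryStream(libDir, "flink-cdc-dist-*.jar")) {
            for (Path jar : jars) {
                found.add(jar.getFileName().toString());
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // The default path is an assumption; pass $FLINK_CDC_HOME/lib as the first argument.
        Path libDir = Path.of(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(libDir)) {
            System.err.println("Not a directory: " + libDir);
            return;
        }
        List<String> distJars = findDistJars(libDir);
        if (distJars.size() > 1) {
            System.err.println("Multiple dist JARs found, keep only one: " + distJars);
        } else {
            System.out.println("Dist JARs: " + distJars);
        }
    }
}
```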

段晓雄 added 3 commits March 6, 2026 19:24
Add Flink 2.2 compatibility layer:
- flink-cdc-flink-compat modules (flink1/flink2) for bridging API
  differences between Flink 1.x and 2.x
- DataSinkTranslator: reflection-based two-phase commit support for
  sinks using TwoPhaseCommittingSink (Flink 1.x) or SupportsCommitter
  (Flink 2.x)
- DataSourceTranslator: compat for source function provider changes
- DataSinkWriterOperator: reflection-based SinkWriterOperator creation
  compatible with Flink 2.2 constructor changes
- Runtime serializers: TypeSerializerSchemaCompatibility compat
- dist assembly: separate Flink 1.20 and 2.2 distribution packaging
- Updated connectors for Flink 2.2 API changes

Made-with: Cursor

- Paimon connector: paimon-flink as provided, remove shade; thin jar.
  Put paimon-flink, paimon-s3, etc. in flink/lib.
- MySQL: flink-connector-mysql-cdc excludes driver from debezium,
  adds mysql-connector-j as provided. Put mysql-connector-j in flink/lib.
- OpenGauss: flink-connector-opengauss-cdc opengauss-jdbc as provided;
  pipeline-connector-opengauss removes opengauss-jdbc from shade.
  Put opengauss-jdbc in flink/lib.

Made-with: Cursor

Flink 2.2 SinkWriterOperator sets emitDownstream based on whether the
sink implements SupportsCommitter. Paimon uses the older
TwoPhaseCommittingSink interface, causing emitDownstream=false and
committables to be silently discarded. This results in data files
written to storage but no snapshot/manifest created, making data
unqueryable.

Fix: after wrapping SinkWriterOperator, force emitDownstream=true and
fill committableSerializer via reflection when the sink supports
two-phase commit but does not implement SupportsCommitter.

Made-with: Cursor
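
The fix described in the last commit can be illustrated with a self-contained sketch. This is not the PR's actual code: `StubWriterOperator` is a stand-in for the wrapped operator, and only the field names `emitDownstream` and `committableSerializer` are taken from the commit message. Whether the real fields are declared final, and what type the serializer has, are not stated here, so the stub uses plain private fields and an `Object` serializer as assumptions.

```java
import java.lang.reflect.Field;

/** Illustrative sketch of forcing private operator state via reflection. */
public class EmitDownstreamPatch {

    /** Stand-in for the wrapped operator; NOT Flink's real SinkWriterOperator. */
    public static class StubWriterOperator {
        private boolean emitDownstream;
        private Object committableSerializer;

        public StubWriterOperator(boolean emitDownstream) {
            this.emitDownstream = emitDownstream;
        }

        public boolean emitsDownstream() {
            return emitDownstream;
        }

        public Object serializer() {
            return committableSerializer;
        }
    }

    /** Looks up a declared field on the class or any superclass and makes it accessible. */
    static Field findField(Class<?> type, String name) throws NoSuchFieldException {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(name);
                field.setAccessible(true);
                return field;
            } catch (NoSuchFieldException ignored) {
                // Not declared here; continue with the superclass.
            }
        }
        throw new NoSuchFieldException(name);
    }

    /** If the operator would drop committables, turn emission on and set the serializer. */
    public static void forceEmitDownstream(Object operator, Object serializer)
            throws ReflectiveOperationException {
        Field emit = findField(operator.getClass(), "emitDownstream");
        if (!emit.getBoolean(operator)) {
            emit.setBoolean(operator, true);
            findField(operator.getClass(), "committableSerializer").set(operator, serializer);
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        StubWriterOperator operator = new StubWriterOperator(false);
        forceEmitDownstream(operator, "placeholder-serializer");
        System.out.println("emitDownstream = " + operator.emitsDownstream());
    }
}
```

Reflection of this kind depends on private field names that may change between Flink releases, so a real implementation would likely want to fail loudly when a field is missing rather than silently continue.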