[ZEPPELIN-6284] Added Flink 1.20 interpreter. #5032
base: master
Conversation
@Neuw84 Thank you for your contribution. By the way, the Flink tests are failing now. Could you please check and fix them?
# Conflicts:
#	flink/flink1.20-shims/src/main/java/org/apache/zeppelin/flink/Flink120Shims.java
#	flink/flink1.20-shims/src/main/java/org/apache/zeppelin/flink/shims120/CollectStreamTableSink.java
@jongyoul I have fixed the Copilot comments and added the corresponding test file. If all is good, let me know whether I need to clean up the commit log.
Pull Request Overview
This PR adds an Apache Flink 1.20 interpreter to Zeppelin, bringing compatibility with the latest LTS release of the Flink 1.x series.
Key changes include:
- Creation of Flink 1.20 shims implementation with necessary adapters and utilities
- Addition of Python environment configuration for Flink 1.20 testing
- Updates to build configuration and CI workflows to include Flink 1.20 support
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| testing/env_python_3_with_flink_120.yml | New conda environment configuration for Python 3 with Flink 1.20.0 |
| flink/pom.xml | Adds the flink1.20-shims module and version property |
| flink/flink1.20-shims/ | New module containing Flink 1.20-specific implementations and adapters |
| flink/flink-shims/src/main/java/org/apache/zeppelin/flink/FlinkShims.java | Updates the shim loader to recognize Flink 1.20 |
| flink/flink-scala-2.12/pom.xml | Updates the default Flink version and adds a Flink 1.20 profile |
| flink/README.md | Updates documentation to include the flink1.20-shims module |
| .github/workflows/core.yml | Updates CI to test with Flink 1.20 |
Comments suppressed due to low confidence (2)

flink/flink1.20-shims/src/main/java/org/apache/zeppelin/flink/PrintUtils.java:109-110

- Using a Unicode character literal '\uea60' (60000) for milliseconds conversion is unclear and error-prone. Replace it with the explicit constant 60000 or a named constant like MILLISECONDS_PER_MINUTE.

```java
formattedArray[i] =
    formattedTimestamp(array.get(i), elementType, sessionTimeZone);
```
```java
public static int localTimeToUnixDate(LocalTime time) {
  return time.getHour() * 3600000 + time.getMinute() * '\uea60'
      + time.getSecond() * 1000 + time.getNano() / 1000000;
}
```
Copilot AI commented on Aug 21, 2025:
Using a Unicode character literal '\uea60' (60000) for milliseconds conversion is unclear and error-prone. Replace with the explicit constant 60000 or a named constant like MILLISECONDS_PER_MINUTE.
Suggested change:

```diff
- return time.getHour() * 3600000 + time.getMinute() * '\uea60' + time.getSecond() * 1000 + time.getNano() / 1000000;
+ return time.getHour() * 3600000 + time.getMinute() * MILLISECONDS_PER_MINUTE + time.getSecond() * 1000 + time.getNano() / 1000000;
```
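For context, a minimal sketch of the refactor the suggestion points at; the class and constant names are illustrative, not part of the PR. The char literal '\uea60' has the numeric value 60000, so naming the conversion factors makes the arithmetic self-documenting:

```java
import java.time.LocalTime;

public final class TimeConversions {
  // Named constants replacing opaque literals; '\uea60' == 60000.
  private static final int MILLIS_PER_HOUR = 3_600_000;
  private static final int MILLIS_PER_MINUTE = 60_000;
  private static final int MILLIS_PER_SECOND = 1_000;
  private static final int NANOS_PER_MILLI = 1_000_000;

  /** Milliseconds since midnight for the given local time. */
  public static int localTimeToUnixDate(LocalTime time) {
    return time.getHour() * MILLIS_PER_HOUR
        + time.getMinute() * MILLIS_PER_MINUTE
        + time.getSecond() * MILLIS_PER_SECOND
        + time.getNano() / NANOS_PER_MILLI;
  }
}
```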
Hmm, looking at the docs I see this weird error. It seems it is not able to launch the cluster, but I am a little bit lost here.
Any help is appreciated, as the exception does not provide much info. Indeed, the code present here works, at least for the Scala and SQL interfaces (checked with some notebooks) with Java 11.
Hopefully this will fix the NPE with Log4j.
It seems that the error comes from an NPE in Log4j. Flink 1.20.2 seems to address the issue: https://flink.apache.org/2025/07/10/apache-flink-1.20.2-release-announcement/
The Log4j issue is solved with the upgrade to 1.20.2; now I am getting a fancy Scala error with some version mismatch.
I tried to fix it but without luck yet. Using a docker-compose setup to connect Zeppelin to a Flink cluster on 1.20.2, this code works as expected (Scala and SQL interfaces).
@zjffdu Could you please help fix the error, if you can spare some time to review it?
After some research I wasn't able to identify the offending library. I spent some time trying to fix the Scala versions via Maven, without luck. The code works well, as said in my previous comment; it should be a matter of the scala-reflect/scala-compiler versions used in the tests, but I have no clue.
We are now using a newer JDK version and should also be using the new Java tools.
Review threads (outdated, resolved):
- ...flink1.20-shims/src/main/java/org/apache/zeppelin/flink/shims120/CollectStreamTableSink.java
- flink/flink1.20-shims/src/main/java/org/apache/zeppelin/flink/Flink120Shims.java
Avoid Scala reflection issues in testing.
```java
} else {
  String flinkHome = System.getenv("FLINK_HOME");
  if (flinkHome != null && !flinkHome.isBlank()) {
    pyDir = Paths.get(flinkHome).resolve("opt").resolve("python");
```
Why do you use Paths.resolve() in various places? I would have shortened it to Paths.get(flinkHome, "opt", "python").
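For illustration, a small self-contained sketch of the two equivalent spellings (the flinkHome value here is hypothetical):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathBuildingDemo {
  public static void main(String[] args) {
    String flinkHome = "/opt/flink"; // hypothetical FLINK_HOME value
    // Chained resolve(), as written in the PR:
    Path viaResolve = Paths.get(flinkHome).resolve("opt").resolve("python");
    // Varargs form suggested by the reviewer; builds the same path:
    Path viaVarargs = Paths.get(flinkHome, "opt", "python");
    System.out.println(viaResolve.equals(viaVarargs)); // prints: true
  }
}
```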
fixed.
Refactoring of the code.
After digging and digging into the Scala libraries, I think the error is not solvable this way. I don't see any Scala library collisions, and I tried forcing many versions. It seems they changed how you need to create the environments, and that should be updated in ScalaInterpreter. We would need one code path for 1.17 and below and another for 1.20 now.
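A hedged sketch of what such a version split might look like; the class, method names, and version check below are illustrative, not Zeppelin's actual shim API:

```java
// Hypothetical dispatch between the legacy (<= 1.17) environment-creation
// path and a new one for Flink 1.20+; placeholders stand in for real logic.
public final class ScalaEnvSelector {

  public static Object createScalaEnv(int major, int minor) {
    if (major > 1 || (major == 1 && minor >= 20)) {
      return createFlink120Env(); // new way of creating the environments
    }
    return createLegacyEnv(); // existing path for Flink 1.17 and below
  }

  private static Object createFlink120Env() {
    return new Object(); // placeholder for the new creation logic
  }

  private static Object createLegacyEnv() {
    return new Object(); // placeholder for the existing logic
  }
}
```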
File "/usr/share/miniconda/envs/python_3_with_flink/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 796, in __new__ _message.Message._CheckCalledFromGeneratedFile() TypeError: Descriptors cannot not be created directly. If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0. If you cannot immediately regenerate your protos, some other possible workarounds are: 1. Downgrade the protobuf package to 3.20.x or lower. 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
This is why I abandoned the previous PR ... recompiling Flink with a higher Scala version does not work either ... the key point here is that FLINK-32560 added a …
FLINK-36327 replaces the …
That would be great!
I don't know if you have noticed, but the error isn't actually fatal. If you run the cell a second time, it surprisingly works without any problems.
I also noticed that, but I have no idea why, due to my limited knowledge of Scala internals :(
I think there are two directions now:
The problem is that Flink 2.0 deletes all the Scala APIs; that would mean we need to rewrite the Flink interpreter and see what the ideal options are there. The SQL interface is OK, but then... a Python + Java REPL? It would need a huge re-architecture. I will try to spend more time on this (Flink 1.20), but I can't promise anything; any help would be appreciated. As it is now, it works, but I would like this to be merged so we have at least LTS support before migrating to Flink 2.0.
What is this PR for?
Add Flink 1.20 support to the Apache Flink interpreter. As the latest 1.x release and the LTS one, it is important to support it; for example, it lets us use the latest Iceberg versions.
What type of PR is it?
Feature
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-6284
How should this be tested?
Screenshots (if appropriate)
Questions: