This guide demonstrates how to start a job with a Python application using PyFlink, without deploying Apache Beam.
Apache Flink does not provide official Docker images for PyFlink, so you will need to build and host your own image. A sample Dockerfile is provided
in images/flink/python.Dockerfile.
Alternatively:
- Create your own Dockerfile: please follow DockerSetup#enabling-python in the Flink docs
- Build the image from that Dockerfile and push it to any Docker registry
You can start a job with a Python file by specifying the .spec.job.pyFile property. The value of .spec.job.pyFile is passed to the flink command
as the python argument (--python).
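As a rough illustration of that mapping, the sketch below builds the argument list a `flink run` invocation would receive for a given pyFile value. The helper name `py_file_args` is hypothetical and this is not the operator's own code; it only assumes the Flink CLI's `--python` option.

```python
from typing import List


def py_file_args(py_file: str) -> List[str]:
    """Return flink CLI arguments for a single-file Python job.

    Hypothetical helper for illustration only; the operator's real
    implementation may differ.
    """
    if not py_file:
        raise ValueError("pyFile must be a non-empty path")
    # .spec.job.pyFile becomes the value of the --python option.
    return ["flink", "run", "--python", py_file]


if __name__ == "__main__":
    print(" ".join(py_file_args("examples/python/table/word_count.py")))
    # prints: flink run --python examples/python/table/word_count.py
```

The path is resolved inside the container, so the file has to exist in the image that .spec.image.name points to.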
Make sure you update .spec.image.name to point to your PyFlink Docker image in your registry.
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flinkjobcluster-sample
spec:
  ...
  image:
    name: <your_dockerfile>
  ...
  job:
    pyFile: "examples/python/table/word_count.py"

If you wrote the application with multiple Python files, specify .spec.job.pyModule and .spec.job.pyFiles.
These properties are passed to the flink command as the pyModule and pyFiles arguments (--pyModule and --pyFiles), respectively.
Refer to the PyFlink CLI docs for further information.
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flinkjobcluster-sample
spec:
  ...
  image:
    name: <your_dockerfile>
  ...
  job:
    pyModule: "word_count"
    pyFiles: "examples/python/table"
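
For the multi-file case, the translation to command-line arguments can be sketched as follows. The helper `py_module_args` is hypothetical and only illustrates how the two properties could map onto the Flink CLI's `--pyModule` and `--pyFiles` options; it is not the operator's implementation. It rejects a module without accompanying files because the Flink CLI documents `--pyModule` as an option to be used together with `--pyFiles`.

```python
from typing import List, Optional


def py_module_args(py_module: str, py_files: Optional[str]) -> List[str]:
    """Return flink CLI arguments for a multi-file Python job.

    Hypothetical helper for illustration only.
    """
    if not py_module:
        raise ValueError("pyModule must name the entry-point module")
    if not py_files:
        # The module has to be importable from somewhere, so the
        # files or directories that contain it are required as well.
        raise ValueError("pyModule requires pyFiles")
    return ["flink", "run", "--pyModule", py_module, "--pyFiles", py_files]


if __name__ == "__main__":
    print(" ".join(py_module_args("word_count", "examples/python/table")))
    # prints: flink run --pyModule word_count --pyFiles examples/python/table
```

Here pyModule names the module that contains the program entry point, and pyFiles points at the directory (or files) from which that module is loaded.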