This repo contains examples of how to configure PySpark logging in a local Apache Spark environment and on Databricks clusters.
Link to the blog post with details.
Provide your logging configuration in `conf/local/log4j.properties` and point the `SPARK_CONF_DIR` environment variable at that directory (`conf/local`) before initializing the PySpark session.
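A minimal sketch of the local setup, assuming `conf/local` contains a valid `log4j.properties` (the app and logger names below are illustrative; note that Spark 3.3+ switched to Log4j 2 and reads `log4j2.properties` instead):

```python
import os
from pyspark.sql import SparkSession

# SPARK_CONF_DIR must name the directory that holds log4j.properties,
# and it must be set before the JVM starts.
os.environ["SPARK_CONF_DIR"] = os.path.abspath("conf/local")

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("logging-demo")  # illustrative name
    .getOrCreate()
)

# Log through the JVM-side log4j logger so messages honor the custom config.
log4j = spark._jvm.org.apache.log4j
logger = log4j.LogManager.getLogger("logging-demo")  # illustrative name
logger.info("Custom log4j configuration is active")
```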
- Describe your logging configurations in `conf/databricks/driver-log4j.properties`.
- Provide your `DATABRICKS_CLI_PROFILE` environment variable in the `.env` file.
- Upload the configurations to DBFS via `make upload-log-configuration`.
- Add the init script in the cluster properties.
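Once the cluster restarts with the init script attached, one way to check that the driver picked up the configuration is to log through the JVM-side logger from a notebook. A minimal sketch (the logger name is illustrative; where the message ends up depends on the appenders defined in `driver-log4j.properties`):

```python
# In a Databricks notebook, `spark` is predefined.
log4j = spark._jvm.org.apache.log4j
logger = log4j.LogManager.getLogger("logging-demo")  # illustrative name

# If driver-log4j.properties was applied, this message is formatted and
# routed according to that file.
logger.warn("driver-log4j.properties is in effect")
```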