fix: propagate reader options to Hadoop conf in driver paths #1040

Open
josecsotomorales wants to merge 2 commits into nightscape:main from Qualytics:main



@josecsotomorales
Contributor

The V2 scan path already builds its Hadoop `Configuration` with
`newHadoopConfWithOptions(...)`, but driver-side schema inference
(`ExcelTable.inferSchema`) and the V1 read/write relations in
`DefaultSource` still used plain `newHadoopConf()`. As a result,
reader options such as bucket-scoped `fs.s3a.*` credentials passed
via `DataFrameReader.options(...)` were ignored on the driver,
causing Excel reads to fail during inference before executors got
a chance to apply the scoped options.

Use `newHadoopConfWithOptions` at all three sites so every Hadoop
`Configuration` created by spark-excel sees the option map.
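
To illustrate the difference described above, here is a minimal sketch that uses only Hadoop's `Configuration` class. It is not the code from this PR or from Spark: `confWithOptions` is a hypothetical helper that approximates the idea of overlaying the reader option map on a copy of the base configuration, and the bucket name and credential value are placeholders.

```scala
import org.apache.hadoop.conf.Configuration

object ConfWithOptionsSketch {
  // Hypothetical helper (an assumption for illustration, not spark-excel's
  // or Spark's implementation): copy the base configuration and overlay
  // every reader option on the copy.
  def confWithOptions(base: Configuration, options: Map[String, String]): Configuration = {
    val conf = new Configuration(base) // copy constructor; base is left untouched
    options.foreach { case (key, value) =>
      // Configuration.set rejects null values, so skip them.
      if (value != null) conf.set(key, value)
    }
    conf
  }

  def main(args: Array[String]): Unit = {
    // `false` skips loading default resources, keeping the sketch hermetic.
    val base = new Configuration(false)

    // Placeholder bucket-scoped option, as it might arrive via reader options.
    val key = "fs.s3a.bucket.example-bucket.access.key"
    val options = Map(key -> "PLACEHOLDER")

    // Analogue of the old driver paths: the option map is never consulted.
    val plain = new Configuration(base)
    // Analogue of the fixed paths: the option map is applied to the copy.
    val scoped = confWithOptions(base, options)

    assert(plain.get(key) == null)
    assert(scoped.get(key) == "PLACEHOLDER")
  }
}
```

With the plain copy, any filesystem call made on the driver would not see the scoped credential, which matches the inference failure described above. With the overlaid copy the key is visible wherever that `Configuration` is used.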

sshpuntoff and others added 2 commits May 18, 2026 10:44
fix: propagate reader options to Hadoop conf in driver paths

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>