fix: Prevent truncated Parquet files in S3 after failed CreateMultipartUpload (#2993)
During a call to s3.to_parquet(), a multipart upload is initiated if the size of the data
exceeds 5 MB.
Previously, if the S3 call to CreateMultipartUpload failed (for example, with a 503 SlowDown error),
the incomplete Parquet file data was still written to S3 using 'put_object' during close().
This resulted in broken Parquet files in S3, causing errors when queried by services like Athena.
Now, the data buffer is cleared at the end of the call to flush() -- even when an exception occurs.
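
The failure mode and the fix can be illustrated with a minimal sketch. This is not the library's actual code: `SketchS3Writer`, `FailingClient`, and their method signatures are hypothetical stand-ins, and the part-size threshold is lowered so the example runs without S3. The only point it demonstrates is clearing the buffer in a `finally` clause so that close() has nothing left to send through 'put_object' after a failed CreateMultipartUpload.

```python
import io


class SketchS3Writer:
    """Hypothetical buffered writer; a stand-in, not the library's real class."""

    def __init__(self, client, min_part_size=5 * 1024 * 1024):
        self._client = client
        self._buffer = io.BytesIO()
        self._min_part_size = min_part_size
        self._upload_id = None
        self._parts = []

    def write(self, data):
        self._buffer.write(data)
        # Once enough data is buffered, switch to (or continue) a multipart upload.
        if self._buffer.tell() >= self._min_part_size:
            self.flush()

    def flush(self):
        data = self._buffer.getvalue()
        if not data:
            return
        try:
            if self._upload_id is None:
                # This call may raise, e.g. on a 503 SlowDown response.
                self._upload_id = self._client.create_multipart_upload()
            part_number = len(self._parts) + 1
            self._parts.append(
                self._client.upload_part(self._upload_id, part_number, data)
            )
        finally:
            # Clear the buffer even when an exception occurs, so close()
            # cannot later write partial data as a standalone object.
            self._buffer = io.BytesIO()

    def close(self):
        if self._upload_id is not None:
            self.flush()
            self._client.complete_multipart_upload(self._upload_id, self._parts)
        elif self._buffer.tell() > 0:
            # Small payloads that never reached the threshold go out in one request.
            self._client.put_object(self._buffer.getvalue())


class FailingClient:
    """Hypothetical client whose CreateMultipartUpload always fails."""

    def __init__(self):
        self.put_calls = []

    def create_multipart_upload(self):
        raise RuntimeError("503 SlowDown (simulated)")

    def upload_part(self, upload_id, part_number, data):
        return {"PartNumber": part_number}

    def complete_multipart_upload(self, upload_id, parts):
        pass

    def put_object(self, body):
        self.put_calls.append(body)


client = FailingClient()
writer = SketchS3Writer(client, min_part_size=8)  # tiny threshold for the demo
try:
    writer.write(b"0123456789")  # exceeds the threshold, so flush() is triggered
except RuntimeError:
    pass
writer.close()  # buffer was cleared in flush(), so no partial object is written
```

Without the `finally` clause, the ten buffered bytes would still be present when close() runs and would be sent through the 'put_object' path as a truncated object.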