[SPARK-56411][SQL] Register Decimal in KryoSerializer so cached-batch spill works #55287

Closed
LuciferYang wants to merge 1 commit into apache:master from LuciferYang:SPARK-56411

Conversation

@LuciferYang
Contributor

What changes were proposed in this pull request?

Register Decimal, Decimal[], java.math.BigDecimal, and java.math.BigInteger in KryoSerializer.loadableSparkClasses so that Kryo strict registration mode (spark.kryo.registrationRequired=true) can serialize these types without throwing "Class is not registered".
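
For readers unfamiliar with strict registration, the following is a minimal toy model of the behavior, not Kryo's or Spark's actual implementation. The class `StrictRegistrationSketch`, its methods, and the sample row are all hypothetical names invented for illustration; the sketch only mimics the idea that every class encountered while walking a row must have been registered beforehand.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.HashSet;
import java.util.Set;

// Toy model of strict class registration. This is NOT Kryo's implementation;
// it only mimics the "fail on any unregistered class" behavior.
final class StrictRegistrationSketch {
    private final Set<Class<?>> registered = new HashSet<>();

    void register(Class<?> type) {
        registered.add(type);
    }

    // Walk every non-null field of a row and reject unregistered classes.
    void checkRow(Object[] row) {
        for (Object value : row) {
            if (value != null && !registered.contains(value.getClass())) {
                throw new IllegalStateException(
                    "Class is not registered: " + value.getClass().getName());
            }
        }
    }

    // Returns true when the row passes the strict check.
    boolean accepts(Object[] row) {
        try {
            checkRow(row);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        StrictRegistrationSketch registry = new StrictRegistrationSketch();
        registry.register(Long.class);

        // Hypothetical stats row: a long value plus a wide decimal value.
        Object[] statsRow = { 1L, new BigDecimal("12345678901234567890.12") };

        // Checked before BigDecimal is registered.
        System.out.println(registry.accepts(statsRow));

        registry.register(BigDecimal.class);
        registry.register(BigInteger.class);

        // Checked again after the missing classes are registered.
        System.out.println(registry.accepts(statsRow));
    }
}
```

In this model the first check fails and the second succeeds, which is the same shape as the fix described above: nothing about the data changes, only the set of registered classes.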

Why are the changes needed?

DefaultCachedBatchSerializer writes cached batch stats via kryo.writeClassAndObject(output, batch.stats). The stats row (GenericInternalRow) can contain Decimal values (e.g. min/max stats for a DecimalType column). When Kryo walks the row it encounters:

  • org.apache.spark.sql.types.Decimal -- the value class itself.
  • java.math.BigDecimal / java.math.BigInteger -- used internally when the value overflows Long precision (> 18 digits).

None of these were registered. Under strict mode, any cache spill or eviction of a batch with Decimal stats crashes:

com.esotericsoftware.kryo.KryoException: Class is not registered: org.apache.spark.sql.types.Decimal
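
The 18-digit boundary mentioned in the list above can be checked with plain `java.math` classes. This is a rough sketch: `DecimalWidthSketch` and `fitsInLong` are hypothetical names, and the constant simply restates the 18-digit limit quoted in this description rather than reading it from Spark.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class DecimalWidthSketch {
    // Assumption for illustration: mirrors the 18-digit limit stated above.
    static final int MAX_LONG_DIGITS = 18;

    // True when the value's total digit count is within the assumed limit.
    static boolean fitsInLong(BigDecimal value) {
        return value.precision() <= MAX_LONG_DIGITS;
    }

    public static void main(String[] args) {
        BigDecimal narrow = new BigDecimal("123456789012345.678");    // 18 digits
        BigDecimal wide = new BigDecimal("12345678901234567890.12"); // 22 digits

        System.out.println(narrow.precision() + " " + fitsInLong(narrow));
        System.out.println(wide.precision() + " " + fitsInLong(wide));

        // The unscaled value of the wide decimal needs more than 63 bits,
        // so it cannot be held in a signed 64-bit long.
        BigInteger unscaled = wide.unscaledValue();
        System.out.println(unscaled.bitLength() > 63);
    }
}
```

Values like `wide` are the ones that bring `java.math.BigDecimal` and `java.math.BigInteger` into the object graph, which is why those two classes need registering alongside `Decimal`.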

This can be reproduced by caching a table with DecimalType columns (e.g. CACHE TABLE store_sales) and triggering a memory-pressure spill under Kryo strict mode.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Added a new test suite, DefaultCachedBatchKryoSerializerSuite.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

… spill works

Register `Decimal`, `Decimal[]`, `java.math.BigDecimal`, and
`java.math.BigInteger` in `KryoSerializer.loadableSparkClasses` so that
Kryo strict registration mode can serialize these types without throwing
`Class is not registered`.

`DefaultCachedBatchSerializer` writes cached batch stats via
`kryo.writeClassAndObject(output, batch.stats)`. The stats row can
contain `Decimal` values (e.g. min/max for a DecimalType column).
None of these classes were registered, causing any cache spill or
eviction under strict mode to crash with KryoException.
@dongjoon-hyun
Member

Could you revise the PR title, @LuciferYang ?

@LuciferYang changed the title from "[SPARK-56411][SQL] SPARK-56411: Register Decimal in KryoSerializer so cached-batch spill works" to "[SPARK-56411][SQL] Register Decimal in KryoSerializer so cached-batch spill works" on Apr 10, 2026
@LuciferYang changed the title from "[SPARK-56411][SQL] Register Decimal in KryoSerializer so cached-batch spill works" to "[SPARK-56411][CORE] Register Decimal in KryoSerializer so cached-batch spill works" on Apr 10, 2026
@LuciferYang changed the title from "[SPARK-56411][CORE] Register Decimal in KryoSerializer so cached-batch spill works" to "[SPARK-56411][SQL] Register Decimal in KryoSerializer so cached-batch spill works" on Apr 10, 2026
@LuciferYang
Contributor Author

> Could you revise the PR title, @LuciferYang ?

Done.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @LuciferYang .

@LuciferYang
Contributor Author

Thank you @dongjoon-hyun

@LuciferYang
Contributor Author

Merged into master. Thanks @dongjoon-hyun
