Skip to content

Conversation

@AmitKulkarni23
Copy link

Motivation and Context

Fixes #312

Smithy defines bigInteger and bigDecimal types for arbitrary-precision numbers, but smithy-rs had TODO placeholders instead of implementations. This prevented users from working with services that use these types.

Description

  • Added BigInteger and BigDecimal runtime types in aws-smithy-types as string wrappers
  • Implemented JSON serialization/deserialization in codegen
  • String-based approach allows users to choose their preferred big number library (e.g., num-bigint, rust_decimal)
  • Added unit tests and integration tests with protocol test coverage

Testing

  • Added unit tests in SymbolVisitorTest.kt
  • Created integration test model big-numbers.smithy with protocol tests
  • All codegen-core tests pass
  • All codegen-client-test integration tests pass

Checklist

  • [ x] For changes to the smithy-rs codegen or runtime crates, I have created a changelog entry Markdown file in the .changelog directory, specifying "client," "server," or both in the applies_to key.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@AmitKulkarni23 AmitKulkarni23 requested review from a team as code owners November 21, 2025 16:59
Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! we need to clean up those representation in aws-smithy-types so that are forward compatible with eventual improvements

Comment on lines 24 to 27
/// Returns the string representation.
pub fn as_str(&self) -> &str {
&self.0
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't expose methods like this since they will probably be impossible to implement if we eventually switch to using a real internal representation that's not a string

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Remove as_str() method - it's redundant and limiting
  2. Keep AsRef trait - works with any internal representation
  3. Update codegen to use .as_ref() instead of .as_str()

^^ Does this work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest revision of PR


impl BigInteger {
/// Creates a new `BigInteger` from a string.
pub fn new(value: impl Into<String>) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should implement FromStr instead

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Will use FromStr instead of new in next revision.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest revision of PR

when (val target = model.expectShape(memberShape.target)) {
is StringShape -> deserializeString(target)
is BooleanShape -> rustTemplate("#{expect_bool_or_null}(tokens.next())?", *codegenScope)
is BigIntegerShape -> deserializeBigInteger()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably need to support this for more than just json protocols. also need protocol tests. does smithy have any protocol tests for these yet?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will add serialization and deserialization code for XML, CBOR protocols in the next revision.

Copy link
Author

@AmitKulkarni23 AmitKulkarni23 Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does smithy have any protocol tests for these yet?

Protocol tests exist in misc.smithy but BigInteger/BigDecimal are commented out - https://github.com/smithy-lang/smithy-rs/blob/main/codegen-core/common-test-models/misc.smithy#L100. I will uncomment them now that the implementation is complete. However, it seems like misc.smithy only tests JSON. I will look at references and add protocol tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest revision of PR


private fun RustWriter.deserializeBigInteger() {
rustTemplate(
"#{expect_string_or_null}(tokens.next())?.map(|s| s.to_unescaped().map(|u| #{BigInteger}::new(u.into_owned()))).transpose()?",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also many (all?) of the supported JSON protocols represent these as regular JSON numbers: https://smithy.io/2.0/aws/protocols/aws-json-1_0-protocol.html#shape-serialization

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference. I will use expect_number_or_null instead of expect_string_or_null

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest revision of PR

Comment on lines 29 to 32
/// Consumes the `BigInteger` and returns the inner string.
pub fn into_inner(self) -> String {
self.0
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason as

if we eventually switch to using a real internal representation that's not a string

We could consider delaying adding this to leave an option for the future, unless this conversion is required right now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Will remove this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest revision of PR

@AmitKulkarni23 AmitKulkarni23 force-pushed the add-biginteger-bigdecimal-support branch from 343e932 to 019c154 Compare November 25, 2025 16:48

is TimestampShape -> rust("decoder.timestamp()")

is BigIntegerShape ->
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers:

I have added serialization and parsing logic for the following protocols:

  • JSON
  • CBOR
  • XML
  • AWS Query
  • AWS EC2

Let me know if there are any other protocols.


impl Default for BigInteger {
fn default() -> Self {
Self("0".to_string())
Copy link
Author

@AmitKulkarni23 AmitKulkarni23 Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for reviewrs: Default Values for BigInteger/BigDecimal

I've implemented Default trait for both types to support error correction in client codegen:

Context:
ErrorCorrection.kt line 67 generates Some(Default::default()) for all NumberShape types, including BigInteger/BigDecimal, when required fields are missing during deserialization.

Are "0" and "0.0" appropriate defaults for error correction scenarios?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should match whatever we do for normal integers

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought process:

BigInteger/BigDecimal Default implementations match primitive number behavior:

i32::default()  = 0
i64::default()  = 0
f32::default()  = 0
f64::default()  = 0
u32::default()  = 0
u64::default()  = 0
i8::default()   = 0
i16::default()  = 0
  • BigInteger::default() returns BigInteger("0") (string "0" representing zero)
  • BigDecimal::default() returns BigDecimal("0.0") (string "0.0" representing zero)

All number types default to their zero representation. BigInteger/BigDecimal use string storage for arbitrary precision, but semantically represent the same zero value as primitive numbers.

Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks pretty good. We can decide how we want to handle large numbers in JSON -- currently as you have implemented there will be a loss of precision (but there is no inherent need for that since we control aws-smithy-json and can have it parse a number as a string directly.

bodyMediaType: "application/xml",
headers: {"Content-Type": "application/xml"},
params: {
bigInt: 987654321,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we actually use numbers that don't fit into int/decimals?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test values are limited by Smithy's Java-based model parser, which converts numeric literals to Java Number types. When these are serialized back to strings, Java uses scientific notation for large values.

Example of the parser limitation:

params: {
   bigDec: 123456789012345.123456789
}

After Smithy parses this, the codegen sees: 1.2345678901234512E14 (scientific notation with precision loss)

However, this is only a test limitation - the actual runtime code handles arbitrary precision correctly:

  • XML/JSON input is parsed as strings from the wire
  • BigDecimal/BigInteger use FromStr to parse directly from those strings
  • Serialization writes the string back via .as_ref()
  • No precision loss occurs in production code

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha. then we should write an integration test for this against the generated code manually. You can do this by writing a test in kotlin that generates the service, then utilizes the serializers

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done this as part of https://github.com/smithy-lang/smithy-rs/pull/4418/files#diff-ae00c010398af1c58c9129aef4393407a833d72c73b2063ad28e7754d3ae4eedR385

let input = crate::test_input::BigNumberOpInput::builder().payload(
    crate::test_model::BigNumberData::builder()
        .big_int("12345678901234567890".parse().unwrap())
        .big_dec("3.141592653589793238".parse().unwrap())
        .build()
).build().unwrap();
let serialized = ${format(operationSerializer)}(&input.payload.unwrap()).unwrap();
let output = std::str::from_utf8(&serialized).unwrap();
assert!(output.contains("<bigInt>12345678901234567890</bigInt>"));
assert!(output.contains("<bigDec>3.141592653589793238</bigDec>"));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to remove this unit test which has numbers that don't fit into int/decimals in next revison of the PR


is BigIntegerShape ->
rustTemplate(
"<#{BigInteger} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is what the spec said for CBOR?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation uses decoder.str() / encoder.str() (CBOR text strings, Major Type 3) as a temporary approach.

According to the Smithy RPC v2 CBOR spec, BigInteger/BigDecimal should use:

  • BigInteger: Major Type 6, tags 2 (unsigned bignum) or 3 (negative bignum)
  • BigDecimal: Major Type 6, tag 4 (decimal fraction)

However, aws-smithy-cbor doesn't currently expose methods for these CBOR tags. The underlying minicbor library supports tags (used internally for timestamps), but we'd need to add public methods like encoder.bignum() and encoder.decimal() to properly implement the spec.

Any suggestions on how to address this? How do you recommend I proceed here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify aws-smithy-cbor — you can find the source of it in this repo

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some direction:

The CBOR spec requires binary encoding (tags 2/3/4), but BigInteger/BigDecimal are string wrappers to avoid choosing a bignum library.

Should we:

  1. Keep current text string encoding (non-compliant but simple)
  2. Document that BigInteger/BigDecimal don't work with CBOR
  3. Add num-bigint dependency to aws-smithy-cbor for spec compliance

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a dependency to aws-smithy-cbor seems OK.

Option 1 is a non-starter — we can't have non-compliant code in smithy-rs.

I would prefer 3, but for simplicity, we could have 2 (but it must FAIL to codegen at runtime, it can't be only a documented feature).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going ahead with Option 2 - failing codegen at runtime

)
is BigDecimalShape ->
rustTemplate(
"<#{BigDecimal} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this error is infallible is it? wouldn't this happen if the string wasn't a valid big decimal? this error seems worth preserving?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate the string is a valid number format?

Options:

  1. Keep infallible - Accept any string, let users validate when they parse it
  2. Add validation - Check the string is a valid number format, return error if not

What do you recommend?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see...a bit of a can of worms. I forgot we were basically doing nothing with the numeric values. We can punt this for now.


rustTemplate(
"""
#{expect_number_or_null}(tokens.next())?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think do actually do this properly you need to add some additional code to aws-smithy-json to parse a number as a string? not sure how hard that would be.

As it is, this isn't terrible, but its not ideal since it defeats the point


impl Default for BigInteger {
fn default() -> Self {
Self("0".to_string())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should match whatever we do for normal integers

@AmitKulkarni23 AmitKulkarni23 requested a review from rcoh December 1, 2025 22:47
…pers in aws-smithy-types, allowing users to parse with their preferred big number library.
@AmitKulkarni23 AmitKulkarni23 force-pushed the add-biginteger-bigdecimal-support branch from aacd195 to 0f0fecf Compare December 3, 2025 20:29
Comment on lines +419 to +426
let s = format!("{f}");
// f64 formatting drops ".0" for whole numbers (0.0 -> "0")
// Restore it to preserve that the original JSON had decimal notation
if !s.contains('.') && !s.contains('e') && !s.contains('E') {
format!("{s}.0")
} else {
s
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this correct? I definitely want to see some tests for this.

Comment on lines +256 to +264
let xml = br##"<BigNumberData>
<bigInt>12345678901234567890</bigInt>
<bigDec>3.141592653589793238</bigDec>
</BigNumberData>
"##;
let output = ${format(operationParser)}(xml, test_output::BigNumberOpOutput::builder()).unwrap().build();
assert_eq!(output.big_int.as_ref().map(|v| v.as_ref()), Some("12345678901234567890"));
assert_eq!(output.big_dec.as_ref().map(|v| v.as_ref()), Some("3.141592653589793238"));
""",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice test! please add something similar for JSON

operations: [ProcessBigNumbers]
}

@http(uri: "/process", method: "POST")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there appears to be some code that handles E / scientific notation but I don't see any tests of that here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Support for BigInteger / BigDecimal

4 participants