-
Notifications
You must be signed in to change notification settings - Fork 224
Add support for Smithy bigInteger and bigDecimal types as string wrappers in aws-smithy-types, allowing users to parse with their preferred big number library. #4418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
rcoh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! we need to clean up those representation in aws-smithy-types so that are forward compatible with eventual improvements
| /// Returns the string representation. | ||
| pub fn as_str(&self) -> &str { | ||
| &self.0 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't expose methods like this since they will probably be impossible to implement if we eventually switch to using a real internal representation that's not a string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Remove as_str() method - it's redundant and limiting
- Keep AsRef trait - works with any internal representation
- Update codegen to use .as_ref() instead of .as_str()
^^ Does this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the latest revision of PR
|
|
||
| impl BigInteger { | ||
| /// Creates a new `BigInteger` from a string. | ||
| pub fn new(value: impl Into<String>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should implement FromStr instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will use FromStr instead of new in next revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the latest revision of PR
| when (val target = model.expectShape(memberShape.target)) { | ||
| is StringShape -> deserializeString(target) | ||
| is BooleanShape -> rustTemplate("#{expect_bool_or_null}(tokens.next())?", *codegenScope) | ||
| is BigIntegerShape -> deserializeBigInteger() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably need to support this for more than just json protocols. also need protocol tests. does smithy have any protocol tests for these yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will add serialization and deserialization code for XML, CBOR protocols in the next revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does smithy have any protocol tests for these yet?
Protocol tests exist in misc.smithy but BigInteger/BigDecimal are commented out - https://github.com/smithy-lang/smithy-rs/blob/main/codegen-core/common-test-models/misc.smithy#L100. I will uncomment them now that the implementation is complete. However, it seems like misc.smithy only tests JSON. I will look at references and add protocol tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the latest revision of PR
|
|
||
| private fun RustWriter.deserializeBigInteger() { | ||
| rustTemplate( | ||
| "#{expect_string_or_null}(tokens.next())?.map(|s| s.to_unescaped().map(|u| #{BigInteger}::new(u.into_owned()))).transpose()?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also many (all?) of the supported JSON protocols represent these as regular JSON numbers: https://smithy.io/2.0/aws/protocols/aws-json-1_0-protocol.html#shape-serialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reference. I will use expect_number_or_null instead of expect_string_or_null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the latest revision of PR
| /// Consumes the `BigInteger` and returns the inner string. | ||
| pub fn into_inner(self) -> String { | ||
| self.0 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the same reason as
if we eventually switch to using a real internal representation that's not a string
We could consider delaying adding this to leave an option for the future, unless this conversion is required right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the latest revision of PR
343e932 to
019c154
Compare
|
|
||
| is TimestampShape -> rust("decoder.timestamp()") | ||
|
|
||
| is BigIntegerShape -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers:
I have added serialization and parsing logic for the following protocols:
- JSON
- CBOR
- XML
- AWS Query
- AWS EC2
Let me know if there are any other protocols.
|
|
||
| impl Default for BigInteger { | ||
| fn default() -> Self { | ||
| Self("0".to_string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewrs: Default Values for BigInteger/BigDecimal
I've implemented Default trait for both types to support error correction in client codegen:
Context:
ErrorCorrection.kt line 67 generates Some(Default::default()) for all NumberShape types, including BigInteger/BigDecimal, when required fields are missing during deserialization.
Are "0" and "0.0" appropriate defaults for error correction scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should match whatever we do for normal integers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought process:
BigInteger/BigDecimal Default implementations match primitive number behavior:
i32::default() = 0
i64::default() = 0
f32::default() = 0
f64::default() = 0
u32::default() = 0
u64::default() = 0
i8::default() = 0
i16::default() = 0
BigInteger::default()returnsBigInteger("0")(string "0" representing zero)BigDecimal::default()returnsBigDecimal("0.0")(string "0.0" representing zero)
All number types default to their zero representation. BigInteger/BigDecimal use string storage for arbitrary precision, but semantically represent the same zero value as primitive numbers.
rcoh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks pretty good. We can decide how we want to handle large numbers in JSON -- currently as you have implemented there will be a loss of precision (but there is no inherent need for that since we control aws-smithy-json and can have it parse a number as a string directly.
| bodyMediaType: "application/xml", | ||
| headers: {"Content-Type": "application/xml"}, | ||
| params: { | ||
| bigInt: 987654321, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we actually use numbers that don't fit into int/decimals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test values are limited by Smithy's Java-based model parser, which converts numeric literals to Java Number types. When these are serialized back to strings, Java uses scientific notation for large values.
Example of the parser limitation:
params: {
bigDec: 123456789012345.123456789
}
After Smithy parses this, the codegen sees: 1.2345678901234512E14 (scientific notation with precision loss)
However, this is only a test limitation - the actual runtime code handles arbitrary precision correctly:
- XML/JSON input is parsed as strings from the wire
- BigDecimal/BigInteger use
FromStrto parse directly from those strings - Serialization writes the string back via
.as_ref() - No precision loss occurs in production code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha. then we should write an integration test for this against the generated code manually. You can do this by writing a test in kotlin that generates the service, then utilizes the serializers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done this as part of https://github.com/smithy-lang/smithy-rs/pull/4418/files#diff-ae00c010398af1c58c9129aef4393407a833d72c73b2063ad28e7754d3ae4eedR385
let input = crate::test_input::BigNumberOpInput::builder().payload(
crate::test_model::BigNumberData::builder()
.big_int("12345678901234567890".parse().unwrap())
.big_dec("3.141592653589793238".parse().unwrap())
.build()
).build().unwrap();
let serialized = ${format(operationSerializer)}(&input.payload.unwrap()).unwrap();
let output = std::str::from_utf8(&serialized).unwrap();
assert!(output.contains("<bigInt>12345678901234567890</bigInt>"));
assert!(output.contains("<bigDec>3.141592653589793238</bigDec>"));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to remove this unit test which has numbers that don't fit into int/decimals in next revison of the PR
|
|
||
| is BigIntegerShape -> | ||
| rustTemplate( | ||
| "<#{BigInteger} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is what the spec said for CBOR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation uses decoder.str() / encoder.str() (CBOR text strings, Major Type 3) as a temporary approach.
According to the Smithy RPC v2 CBOR spec, BigInteger/BigDecimal should use:
- BigInteger: Major Type 6, tags 2 (unsigned bignum) or 3 (negative bignum)
- BigDecimal: Major Type 6, tag 4 (decimal fraction)
However, aws-smithy-cbor doesn't currently expose methods for these CBOR tags. The underlying minicbor library supports tags (used internally for timestamps), but we'd need to add public methods like encoder.bignum() and encoder.decimal() to properly implement the spec.
Any suggestions on how to address this? How do you recommend I proceed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modify aws-smithy-cbor — you can find the source of it in this repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some direction:
The CBOR spec requires binary encoding (tags 2/3/4), but BigInteger/BigDecimal are string wrappers to avoid choosing a bignum library.
Should we:
- Keep current text string encoding (non-compliant but simple)
- Document that BigInteger/BigDecimal don't work with CBOR
- Add
num-bigintdependency toaws-smithy-cborfor spec compliance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a dependency to aws-smithy-cbor seems OK.
Option 1 is a non-starter — we can't have non-compliant code in smithy-rs.
I would prefer 3, but for simplicity, we could have 2 (but it must FAIL to codegen at runtime, it can't be only a documented feature).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going ahead with Option 2 - failing codegen at runtime
| ) | ||
| is BigDecimalShape -> | ||
| rustTemplate( | ||
| "<#{BigDecimal} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this error is infallible is it? wouldn't this happen if the string wasn't a valid big decimal? this error seems worth preserving?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we validate the string is a valid number format?
Options:
- Keep infallible - Accept any string, let users validate when they parse it
- Add validation - Check the string is a valid number format, return error if not
What do you recommend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I see...a bit of a can of worms. I forgot we were basically doing nothing with the numeric values. We can punt this for now.
|
|
||
| rustTemplate( | ||
| """ | ||
| #{expect_number_or_null}(tokens.next())? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think do actually do this properly you need to add some additional code to aws-smithy-json to parse a number as a string? not sure how hard that would be.
As it is, this isn't terrible, but its not ideal since it defeats the point
|
|
||
| impl Default for BigInteger { | ||
| fn default() -> Self { | ||
| Self("0".to_string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should match whatever we do for normal integers
…pers in aws-smithy-types, allowing users to parse with their preferred big number library.
aacd195 to
0f0fecf
Compare
| let s = format!("{f}"); | ||
| // f64 formatting drops ".0" for whole numbers (0.0 -> "0") | ||
| // Restore it to preserve that the original JSON had decimal notation | ||
| if !s.contains('.') && !s.contains('e') && !s.contains('E') { | ||
| format!("{s}.0") | ||
| } else { | ||
| s | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this correct? I definitely want to see some tests for this.
| let xml = br##"<BigNumberData> | ||
| <bigInt>12345678901234567890</bigInt> | ||
| <bigDec>3.141592653589793238</bigDec> | ||
| </BigNumberData> | ||
| "##; | ||
| let output = ${format(operationParser)}(xml, test_output::BigNumberOpOutput::builder()).unwrap().build(); | ||
| assert_eq!(output.big_int.as_ref().map(|v| v.as_ref()), Some("12345678901234567890")); | ||
| assert_eq!(output.big_dec.as_ref().map(|v| v.as_ref()), Some("3.141592653589793238")); | ||
| """, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice test! please add something similar for JSON
| operations: [ProcessBigNumbers] | ||
| } | ||
|
|
||
| @http(uri: "/process", method: "POST") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there appears to be some code that handles E / scientific notation but I don't see any tests of that here
Motivation and Context
Fixes #312
Smithy defines
bigIntegerandbigDecimaltypes for arbitrary-precision numbers, but smithy-rs had TODO placeholders instead of implementations. This prevented users from working with services that use these types.Description
BigIntegerandBigDecimalruntime types inaws-smithy-typesas string wrappersnum-bigint,rust_decimal)Testing
SymbolVisitorTest.ktbig-numbers.smithywith protocol testsChecklist
.changelogdirectory, specifying "client," "server," or both in theapplies_tokey.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.