Fix UBSAN errors when converting types using from_json#5207
Fix UBSAN errors when converting types using from_json#5207ChrisCoxArt wants to merge 13 commits into
Conversation
and flush NaN and Inf to zero before trying to convert
and types that support Inf support NaN, until someone creates a new FP standard type
and add compile time checks for ranges to reduce compares needed at runtime
long double testing is inconsistent elsewhere - but add it and make sure it gets tested
|
There was a previous discussion of this about 10 years ago in #288. I don't know if his position has changed since then, but at the time it was that if you are extracting to a smaller data type than what is being used internally to hold the data value, then it is your responsibility to perform the range checking, not the library's, and thus people that extract it to the proper type don't pay the cost of range checking. |
|
Again: |
Yes, that is exactly how one does range checking when one is going to fit an unknown value from a larger datatype into a smaller datatype.
The library could limit the extraction function to just the internal types, but it does more as a convenience for the user. If the user cares about fitting larger values into smaller types, then it is the user's responsibility to do the checking to make sure it fits. In general, the library follows the "don't pay for what you don't use" philosophy. If you know that you don't need to do range checking, then you can call the function that doesn't do range checking. If you do need to do range checking because you absolutely need a smaller datatype than what is used internally, do the range checking yourself before putting the data in the smaller datatype.
It is not causing them. If you are 100% certain that the value will fit in the smaller datatype, then you can use the function that allows you to extract it into the smaller datatype. If you are not 100% certain, extract it as the native datatype and do the range checking yourself. I am not the maintainer, just a long time user, and in general that has been the philosophy of the library. As the maintainer, @nlohmann is definitely able to make changes, but if those changes can cause behavior differences, then it is less likely that they will be accepted. |
|
Wow. Your (almost Lewis Carroll like) "logic" belongs in a museum somewhere. Yes, the nlohmann json code is 100% causing the undefined behavior errors because of type conversions without range testing/clamping. Fixing that requires changes to the library code, or re-implementing chunks of the library code by every user of the library. Yes, fixing the errors in the library will cause behavior changes - because it will no longer cause errors, or return random unexpected values. This is JSON we're talking about - you can never be 100% certain of the data range unless you just wrote that data and only immediately consume your own internal data. As soon as storage is involved -- data can be changed, and ranges are undefined. As soon as third party data is involved -- ranges are undefined. As soon as end-users are involved -- ranges are undefined. If you assume that the data is always in the range you created, then you have created attack vectors and vulnerabilities in your code -- which is why tools like UBSan exist to expose those errors and vulnerabilities. Pushing responsibility for the errors in the library onto the users of the library is irresponsible, at best. I certainly hope that the maintainer has a better grasp on the world than you seem to be conveying. |
|
The fix is very simple, extract using the internal datatypes, and then you don't have to worry about introducing undefined behavior using the library functions. What you might want for range checking behavior isn't necessarily the same as someone else wants for range checking behavior, and it can vary from element to element.
And yet there are users of this library that are 100% certain of the data that they are reading, or they ALREADY do their own range checking, and thus for performance reasons absolutely do not want the extra range checking. |
|
No, pushing the fix onto users of the library means re-implementing parts of the library to do the right thing. Why should the user have to re-implement parts of the library? Why can't the library do the right thing in the first place? Why ship a library with known UB errors? Can users trust the library code if maintainers think that UB shouldn't be fixed and should require third party code to replace parts of the library? Your "logic" is beyond broken, and you are defending bad code and bad practices. |
|
Please define "the right thing" for every possible set of circumstances experienced by every user of this library. |
|
You can start with "don't generate errors from UBSan", and "don't return random unexpected values". Please stop defending broken code, and start working on how to fix it. |
|
I stand with #288. The library has a well-defined and documented behavior of storing data. When you decide to ask for conversions like Also, it's documented how the library is handling number types. There is no surprise, and we have functions to query the stored type. This means there is some caution and work needed by clients when numbers are read from the library - but all this is possible by calling public library functions. Finally, I would like to mention that the the UBSAN errors mentioned here are not caused by the library, but by client code making the wrong assumptions. It's basically an example of casting |
|
I strongly disagree. The library is causing the UBsan errors, because the library exposes functionality that has undefined behavior and fails to take precautions against that undefined behavior. |
Fixes #5206
from_json.hpp
unit-conversions.cpp
Checklist
I'm not sure about this - this change doesn't really fit the current number_handling.md documentation.
make amalgamate.Read the Contribution Guidelines for detailed information.
Note
DCO does not appear to know about digital signatures - it should be removed or replaced with smarter tests.