Add erlang:crc32/1,2 and erlang:crc32_combine/3 #2209
bettio wants to merge 0 commit into atomvm:main from
src/libAtomVM/term.h
Outdated
     * by caller using \c term_is_int()
     * @note Only extracts from unboxed integers (28-bit on 32-bit builds,
     * 60-bit on 64-bit builds)
     * @note Safe conversions: \c size_t s = term_to_int(t) is always valid
Not sure this is true. size_t is unsigned while term_to_int(t) returns a signed value.
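The concern can be illustrated with a small sketch. `checked_int_to_size` and `unchecked_int_to_size` are hypothetical helpers, not AtomVM functions, and `long` is only an assumed stand-in for the signed type returned by `term_to_int`. A negative value converted to `size_t` wraps around to a very large unsigned number, so the conversion is safe only after a sign check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of AtomVM): convert a signed value to
 * size_t only when it is representable. Returns false otherwise. */
static bool checked_int_to_size(long value, size_t *out)
{
    if (value < 0 || (unsigned long) value > SIZE_MAX) {
        return false;
    }
    *out = (size_t) value;
    return true;
}

/* What an unchecked assignment does: the value is converted modulo
 * SIZE_MAX + 1, so -1 becomes SIZE_MAX. */
static size_t unchecked_int_to_size(long value)
{
    return (size_t) value;
}
```

With this shape, a caller that has only verified `term_is_int()` would still need to reject negative values before using the result as a size.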
    ok = expect_badarg(fun() -> erlang:crc32_combine(?MODULE:id(16#100000000), 0, 0) end),
    ok = expect_badarg(fun() -> erlang:crc32_combine(0, ?MODULE:id(16#100000000), 0) end),
    ok = expect_badarg(fun() -> erlang:crc32_combine(0, 0, ?MODULE:id(16#100000000)) end),
    ok.
We could also test crc32/2 with an invalid second argument ([-1], [256], etc.).
src/libAtomVM/nifs.c
Outdated
        .base.type = NIFFunctionType,
        .nif_ptr = nif_erlang_crc32_1
    };
    static const struct Nif crc32_old_nif = {
The name crc32_old_nif is a little confusing. How about crc32_2_nif? Also, is there any reason not to merge it with nif_erlang_crc32_1 and check argc?
    const void *data_ptr;
    size_t data_size;
    char *alloc_ptr = NULL;
Part of this could be factored out if both NIFs were combined.
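A rough sketch of what the combined entry point could look like follows. It is not the PR's code: `struct crc_args` and `crc32_nif_sketch` are hypothetical stand-ins for the real `Context`/`term` based NIF signature, and the CRC routine is a plain bitwise implementation included only to keep the example self-contained. The point is that arity 1 is just arity 2 with an initial CRC of 0, so the data handling can be shared.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 update (reflected polynomial 0xEDB88320). */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* 0u - (crc & 1u) is all ones when the low bit is set */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical stand-in for the decoded NIF arguments. */
struct crc_args
{
    uint32_t old_crc;    /* only meaningful when argc == 2 */
    const uint8_t *data; /* already-flattened iodata */
    size_t data_size;
};

/* One function for both arities: crc32/1 starts from 0,
 * crc32/2 starts from the caller-supplied CRC. */
static uint32_t crc32_nif_sketch(int argc, const struct crc_args *args)
{
    uint32_t initial = (argc == 2) ? args->old_crc : 0;
    return crc32_update(initial, args->data, args->data_size);
}
```

In the real NIF the only arity-specific part would be validating and decoding the first argument; buffer extraction and cleanup would be written once.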
https://ampcode.com/threads/T-019cfc3d-3f9e-73c3-8602-d78da75e5b70 PR Review: Add
| Area | Assessment |
|---|---|
| Memory safety | ✅ No leaks — alloc_ptr is freed before make_maybe_boxed_int64; iolist_to_buffer cleans up on failure; empty data handled safely |
| Input validation | ✅ term_is_uint32 + VALIDATE_VALUE correctly rejects negatives, floats, atoms, tuples, oversized ints |
| Name collision | ✅ No conflict — crc32_uint and nif_erlang_crc32 don't collide with zlib's crc32 C symbol |
| Software fallback | ✅ Standard reflected polynomial 0xEDB88320 with proper init/final inversion; combine uses correct GF(2) matrix-squaring (matches zlib's algorithm) |
| argc dispatch | ✅ Single nif_erlang_crc32 handles both arities cleanly via argc == 2 check |
| Test coverage | ✅ Good: covers binary, iolist, empty data, incremental update, combine, and extensive badarg cases |
| CHANGELOG / docs | ✅ Present and accurate |
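For readers unfamiliar with the "Software fallback" row, here is a minimal table-driven sketch of a CRC-32 using the reflected polynomial 0xEDB88320 with initial and final inversion. It is an illustration only, not the PR's implementation; the names `crc_table_init` and `crc32_sw` are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];
static int crc_table_ready = 0;

/* Build the 256-entry lookup table for the reflected polynomial. */
static void crc_table_init(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* Update a running CRC with len bytes; pass 0 as crc to start. */
static uint32_t crc32_sw(uint32_t crc, const uint8_t *data, size_t len)
{
    if (!crc_table_ready) {
        crc_table_init();
    }
    crc ^= 0xFFFFFFFFu; /* initial inversion */
    for (size_t i = 0; i < len; i++) {
        crc = crc_table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu; /* final inversion */
}
```

The standard check input `"123456789"` yields 0xCBF43926 with this routine, and updating any CRC with zero bytes returns it unchanged.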
Test Coverage Gaps
The following additional test cases would strengthen confidence:
- No-zlib build: Run the same test module with `WITH_ZLIB=off` in CI. This is the biggest gap, since the software fallback is otherwise unexercised.
- Boundary values: Test the max uint32 CRC: `16#FFFFFFFF = erlang:crc32(16#FFFFFFFF, <<>>).`
- Empty update/combine identity: Verify the CRC is unchanged when updated or combined with empty data: `Old = erlang:crc32(<<"abc">>), Old = erlang:crc32(Old, <<>>).` and `C1 = erlang:crc32(<<"abc">>), C1 = erlang:crc32_combine(C1, erlang:crc32(<<>>), 0).`
- Nested iodata: e.g. `[<<"He">>, [$l, $l], [<<"o">>]]`
- Float badargs: Add `1.0`/`1.5` as CRC and size arguments.
- Cross-check property: Verify the incremental CRC matches the combined CRC for more input shapes: `erlang:crc32([A, B]) =:= erlang:crc32(erlang:crc32(A), B).` and `erlang:crc32_combine(erlang:crc32(A), erlang:crc32(B), byte_size(B)) =:= erlang:crc32(<<A/binary, B/binary>>).`
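The cross-check and identity properties can also be sketched at the C level. The code below is a self-contained illustration, not the PR's code: `crc32_bitwise` and `crc32_combine_sketch` are names chosen for this example, and the combine routine follows the classic zlib GF(2) matrix-squaring approach as best understood here.

```c
#include <stddef.h>
#include <stdint.h>

#define GF2_DIM 32

/* Bitwise CRC-32 update (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bitwise(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Multiply a 32x32 GF(2) matrix (one column per entry) by a vector. */
static uint32_t gf2_matrix_times(const uint32_t *mat, uint32_t vec)
{
    uint32_t sum = 0;
    while (vec) {
        if (vec & 1u) {
            sum ^= *mat;
        }
        vec >>= 1;
        mat++;
    }
    return sum;
}

static void gf2_matrix_square(uint32_t *square, const uint32_t *mat)
{
    for (int n = 0; n < GF2_DIM; n++) {
        square[n] = gf2_matrix_times(mat, mat[n]);
    }
}

/* Combine crc1 (of A) and crc2 (of B, len2 bytes) into the CRC of A ++ B. */
static uint32_t crc32_combine_sketch(uint32_t crc1, uint32_t crc2, size_t len2)
{
    uint32_t even[GF2_DIM];
    uint32_t odd[GF2_DIM];

    if (len2 == 0) {
        return crc1;
    }

    /* odd starts as the operator for appending one zero bit */
    odd[0] = 0xEDB88320u;
    uint32_t row = 1;
    for (int n = 1; n < GF2_DIM; n++) {
        odd[n] = row;
        row <<= 1;
    }
    gf2_matrix_square(even, odd); /* two zero bits */
    gf2_matrix_square(odd, even); /* four zero bits */

    /* apply len2 zero bytes to crc1, one bit of len2 at a time */
    do {
        gf2_matrix_square(even, odd);
        if (len2 & 1u) {
            crc1 = gf2_matrix_times(even, crc1);
        }
        len2 >>= 1;
        if (len2 == 0) {
            break;
        }
        gf2_matrix_square(odd, even);
        if (len2 & 1u) {
            crc1 = gf2_matrix_times(odd, crc1);
        }
        len2 >>= 1;
    } while (len2 != 0);

    return crc1 ^ crc2;
}
```

Splitting `"123456789"` into `"1234"` and `"56789"`, both the incremental update and the combine call should reproduce the CRC of the whole string.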
Minor Observations
- iolist flattening allocates O(n): `nif_erlang_crc32` flattens non-binary iodata into a temporary malloc'd buffer via `iolist_to_buffer`. This matches existing AtomVM patterns but can be expensive on constrained targets for large iolists. A streaming approach over iodata segments would avoid the copy but is more complex; fine as a future optimization.
- `make_maybe_boxed_int64` for a uint32 result: Semantically broader than needed, but safe. CRC32 values fit comfortably, and boxing support is useful when values exceed the immediate-int range on some targets.
- `crc32_combine/3` cannot validate `SecondSize`: If the caller passes a size that doesn't match the data that produced `SecondCrc`, the result will be wrong but well-defined. This matches OTP/zlib semantics; the function has no way to verify it.
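The streaming idea from the first observation can be sketched as follows. This is only an illustration under simplifying assumptions: `struct segment` and `crc32_segments` are hypothetical, and real iodata traversal in AtomVM (nested lists, integers, binaries) is reduced here to a flat array of byte ranges. Because the CRC update is incremental, feeding segments one at a time gives the same result as hashing a flattened copy.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 update (reflected polynomial 0xEDB88320). */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical flat view of iodata: a list of byte ranges. */
struct segment
{
    const uint8_t *data;
    size_t len;
};

/* Fold the CRC over each segment in order, without copying the
 * segments into one contiguous buffer first. */
static uint32_t crc32_segments(uint32_t crc, const struct segment *segs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        crc = crc32_update(crc, segs[i].data, segs[i].len);
    }
    return crc;
}
```

The trade-off noted above still applies: the real traversal has to walk nested terms, which is where the extra complexity would come from.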
Add `erlang:crc32/1,2` and `erlang:crc32_combine/3`
Add CRC32 (IEEE 802.3) checksum support to the erlang module,
matching OTP's API. When zlib is available the implementation
delegates to it, otherwise a standalone software fallback is used.
Also fix term_is_int32 and term_is_uint32 which returned true
unconditionally for unboxed integers on 64-bit builds, skipping
the range check needed for values above 32-bit limits.
These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
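The kind of range check the `term_is_int32`/`term_is_uint32` fix refers to can be sketched like this. `fits_int32` and `fits_uint32` are hypothetical helpers written for this example, not the AtomVM functions. They assume the unboxed integer has already been extracted into an `int64_t`, which on 64-bit builds can hold values well outside the 32-bit range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when v is representable as int32_t. */
static bool fits_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Hypothetical helper: true when v is representable as uint32_t. */
static bool fits_uint32(int64_t v)
{
    return v >= 0 && v <= (int64_t) UINT32_MAX;
}
```

Under this check, a value such as `16#100000000` (one past the uint32 maximum) is rejected, which is what the `expect_badarg` tests in this PR rely on.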