Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement binary:match/2 and binary:match/3 #1621

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jakub-gonet
Copy link
Contributor

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also rebase on latest main. I added few comments on binary pos len functions introduced with the previous PR. Let's also handle them.

static term nif_binary_match(Context *ctx, int argc, term argv[])
{
term scope_atom = globalcontext_make_atom(ctx->global, ATOM_STR("\x5", "scope"));
term nomatch_atom = globalcontext_make_atom(ctx->global, ATOM_STR("\x7", "nomatch"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the builtins NIFs (that are part of libAtomVM) should make use of defaultatoms.def instead of calling globalcontext_make_atom.


static inline BinaryPosLen term_invalid_binary_pos_len(void)
{
return (BinaryPosLen) { .pos = -1, .len = -1 };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C allows returning a struct (by value), but this make the compiler generating memory copy code.
Here you are using an inline function, so actually it might be optimized out, but I', not sure about this.
So if you want to initialize this struct in an always optimized manner let's use macros.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing, (void) is not necessary.


static inline bool term_is_invalid_binary_pos_len(BinaryPosLen pos_len)
{
return pos_len.pos == -1 && pos_len.len == -1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use pointers in order to avoid memory copies: BinaryPosLen pos_len -> BinaryPosLen *pos_len.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is (pos_len.pos == -1) || (pos_len.len == -1);` a valid one?

@@ -60,6 +60,7 @@ get_non_networking_tests(OTPVersion) when
get_non_networking_tests(_OTPVersion) ->
[
test_apply,
test_binary,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually test stuff that is implemented using NIFs in tests/test.c (those tests are located in tests/erlang_tests/).
So if test_binary contains only tests for NIFs it should be moved elsewhere.

@@ -1106,6 +1111,33 @@ static inline term term_create_empty_binary(size_t size, Heap *heap, GlobalConte
return t;
}

static inline bool term_normalize_binary_pos_len(term binary, avm_int_t pos, avm_int_t len, BinaryPosLen *pos_len)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function have any purpose outside binary:match or is it just a helper function for it?

@@ -136,6 +136,11 @@ struct PrinterFun
printer_function_t print;
};

typedef struct BinaryPosLen {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this struct have any purpose outside binary:match or is it just a helper function for it?

size_t pattern_size = term_binary_size(pattern_term);

BinaryPosLen pattern_slice = term_invalid_binary_pos_len();
const char *sub_binary = memmem(binary, size, pattern, pattern_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to access the content of sub_binary using [ ]? if not we can just use const void *.

RAISE_ERROR(OUT_OF_MEMORY_ATOM);
}
term result_tuple = term_alloc_tuple(2, &ctx->heap);
term_put_tuple_element(result_tuple, 0, term_from_int32(match_slice.pos));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's prefer using term_from_int() that is the right choice with avm_int_t.

match_slice = find_pattern_in_binary(binary_term, scope_slice, pattern_term);
} else {
term patterns = pattern_or_patterns_term;
while (!term_is_nil(patterns)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dangerous when an improper list is provided. Let's use while (term_is_nonempty_list(patterns))

BinaryPosLen new_match_slice = find_pattern_in_binary(binary_term, scope_slice, pattern_term);
match_slice = select_earlier_slice(match_slice, new_match_slice);
patterns = term_get_list_tail(patterns);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's raise here at the end if patterns is not a list. (that means that it is improper).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants