[Acero] Not support type like Fixed Size List in asof join node #44729

Open
mroz45 opened this issue Nov 14, 2024 · 0 comments

Describe the enhancement requested

The asof join node in Acero does not currently support the fixed-size list type. When a table containing a column of this type is passed to an asof join, even only as a payload column that is not part of the join keys, the following error occurs:

Invalid: Unsupported data type fixed_size_list<item: int32>[3] for field List

Example code that reproduces the error

#include <arrow/api.h>
#include <arrow/acero/exec_plan.h>
#include <arrow/acero/options.h>

namespace ac = arrow::acero;

// Wrapped in a function so that the ARROW_* macros can return a Status.
arrow::Status RunAsofJoinRepro() {
    const int32_t list_size = 3;
    const int num_rows = 5;

    auto list_type = arrow::fixed_size_list(arrow::int32(), list_size);
    auto inner_builder = std::make_shared<arrow::Int32Builder>(arrow::default_memory_pool());

    arrow::Int16Builder idx_builder;
    arrow::FixedSizeListBuilder builder(arrow::default_memory_pool(), inner_builder, list_type);
    ARROW_RETURN_NOT_OK(builder.Reserve(num_rows));

    // Left table data: idx = 0..4, each list holds three consecutive values.
    for (int i = 0; i < num_rows; ++i) {
        ARROW_RETURN_NOT_OK(inner_builder->AppendValues({i * 10, i * 10 + 1, i * 10 + 2}));
        ARROW_RETURN_NOT_OK(builder.Append());
        ARROW_RETURN_NOT_OK(idx_builder.Append(i));
    }

    // Finish() resets the builders, so they can be reused for the right table.
    ARROW_ASSIGN_OR_RAISE(auto idx_list, idx_builder.Finish());
    ARROW_ASSIGN_OR_RAISE(auto result1, builder.Finish());

    // Right table data: idx = 0, 2, 4, 6, 8.
    for (int j = 0; j < num_rows; ++j) {
        ARROW_RETURN_NOT_OK(inner_builder->AppendValues({j * 20, j * 20 + 4, j * 20 + 5}));
        ARROW_RETURN_NOT_OK(builder.Append());
        ARROW_RETURN_NOT_OK(idx_builder.Append(j * 2));
    }

    ARROW_ASSIGN_OR_RAISE(auto idx_list2, idx_builder.Finish());
    ARROW_ASSIGN_OR_RAISE(auto result2, builder.Finish());

    // First (left) table
    auto schema1 = arrow::schema({arrow::field("List", list_type),
                                  arrow::field("idx", arrow::int16())});
    auto l_table = arrow::Table::Make(schema1, {result1, idx_list}, num_rows);

    // Second (right) table
    auto schema2 = arrow::schema({arrow::field("List", list_type),
                                  arrow::field("idx", arrow::int16())});
    auto r_table = arrow::Table::Make(schema2, {result2, idx_list2}, num_rows);

    auto table_source_options_l = ac::TableSourceNodeOptions{l_table};
    ac::Declaration left{"table_source", std::move(table_source_options_l), "lTable"};

    auto table_source_options_r = ac::TableSourceNodeOptions{r_table};
    ac::Declaration right{"table_source", std::move(table_source_options_r), "rTable"};

    // Join on "idx" only; "List" is a payload column.
    ac::AsofJoinNodeOptions::Keys left_keys;
    left_keys.on_key = arrow::FieldRef("idx");

    ac::AsofJoinNodeOptions::Keys right_keys;
    right_keys.on_key = arrow::FieldRef("idx");

    // Tolerance of 3 on the "idx" key.
    ac::AsofJoinNodeOptions asof_opt{{left_keys, right_keys}, 3};

    ac::Declaration asof{"asofjoin", {std::move(left), std::move(right)}, std::move(asof_opt)};

    // Fails with: Invalid: Unsupported data type fixed_size_list<item: int32>[3] for field List
    ARROW_ASSIGN_OR_RAISE(auto output_table, ac::DeclarationToTable(std::move(asof)));
    return arrow::Status::OK();
}
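
For reference, here is a minimal standard-library sketch of the matching that the example above would presumably perform if fixed-size list payloads were supported. It does not use Arrow at all. `Row`, `AsofJoinForward`, `MakeLeft` and `MakeRight` are hypothetical names introduced only for illustration, and the sketch assumes that a positive tolerance means a forward-looking match (the nearest right row with `0 <= right.idx - left.idx <= tolerance`), which is my reading of the Acero documentation rather than something verified against the implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one table row: an int16 "idx" key plus a
// fixed_size_list<int32>[3] payload, modeled here as std::array.
struct Row {
    int16_t idx;
    std::array<int32_t, 3> list;
};

using Payload = std::optional<std::array<int32_t, 3>>;

// Assumed forward as-of semantics: for each left row, take the first right
// row with 0 <= right.idx - left.idx <= tolerance, or no match otherwise.
// Both inputs must be sorted by idx in ascending order.
std::vector<Payload> AsofJoinForward(const std::vector<Row>& left,
                                     const std::vector<Row>& right,
                                     int tolerance) {
    assert(tolerance >= 0);
    std::vector<Payload> matched;
    matched.reserve(left.size());
    std::size_t r = 0;
    for (const Row& l : left) {
        // Skip right rows whose key lies before the current left key.
        while (r < right.size() && right[r].idx < l.idx) ++r;
        if (r < right.size() && right[r].idx - l.idx <= tolerance) {
            matched.push_back(right[r].list);
        } else {
            matched.push_back(std::nullopt);  // nothing within tolerance
        }
    }
    return matched;
}

// Same values as the left table in the reproduction: idx = i.
std::vector<Row> MakeLeft(int num_rows) {
    std::vector<Row> rows;
    for (int i = 0; i < num_rows; ++i) {
        rows.push_back(Row{static_cast<int16_t>(i), {i * 10, i * 10 + 1, i * 10 + 2}});
    }
    return rows;
}

// Same values as the right table in the reproduction: idx = 2 * j.
std::vector<Row> MakeRight(int num_rows) {
    std::vector<Row> rows;
    for (int j = 0; j < num_rows; ++j) {
        rows.push_back(Row{static_cast<int16_t>(j * 2), {j * 20, j * 20 + 4, j * 20 + 5}});
    }
    return rows;
}
```

Under these assumptions, `AsofJoinForward(MakeLeft(5), MakeRight(5), 3)` pairs the left row with `idx` 1 with the right row with `idx` 2, so the list payload is carried through the join unchanged and never takes part in the key comparison.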

Component(s)

C++
