fix: Address slicing for pyarrow array in data_color #741

Merged
rich-iannone merged 12 commits into posit-dev:main from FBruzzesi:fix/data-color-pyarrow
Mar 11, 2026

Conversation

@FBruzzesi
Contributor

Summary

Follow-up to the disclaimer section in #736.

For pyarrow-backed `tbl_data`, `data_color` would break a few lines further down when performing the following operation:
column_vals = data_table[col][row_pos].to_list()

In fact, indexing a chunked array with a list raises:

*** TypeError: 'list' object cannot be interpreted as an integer

I wanted this PR to be atomic enough to solve a single issue. I can follow up on the other one separately.

Checklist

Contributor Author

I manually checked a few cases against pandas/polars, but the more eyes the better.

@codecov

codecov bot commented Jul 30, 2025

Codecov Report

❌ Patch coverage is 93.75000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.31%. Comparing base (3596881) to head (0ed698e).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| great_tables/_tbl_data.py | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #741      +/-   ##
==========================================
- Coverage   92.31%   92.31%   -0.01%     
==========================================
  Files          48       48              
  Lines        6039     6052      +13     
==========================================
+ Hits         5575     5587      +12     
- Misses        464      465       +1     


@get_rows.register
def _(ser: PdSeries, indexes: list[int]) -> PdSeries:
    return ser.iloc[indexes]
Contributor Author

I think positional indexing (`iloc`) was the intent in the first place, but the issue never arose?
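
To illustrate why the distinction matters, here is a small sketch with a made-up Series whose integer index is not in the default order. Plain `ser[indexes]` then looks rows up by label, while `ser.iloc[indexes]` selects by position; with a default `RangeIndex` the two coincide, which may be why the difference went unnoticed:

```python
import pandas as pd

# Illustrative data: integer labels that do not match row positions.
ser = pd.Series(["a", "b", "c"], index=[2, 0, 1])
indexes = [0, 2]

# Label-based lookup: rows labelled 0 and 2.
by_label = ser[indexes].tolist()

# Position-based lookup: first and third rows.
by_position = ser.iloc[indexes].tolist()

print(by_label, by_position)
```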

@machow machow self-assigned this Mar 3, 2026
Collaborator

@machow machow left a comment

This LGTM. @rich-iannone if CI passes and it seems okay, want to merge?

@rich-iannone rich-iannone merged commit 813770a into posit-dev:main Mar 11, 2026
14 checks passed
@FBruzzesi FBruzzesi deleted the fix/data-color-pyarrow branch March 12, 2026 08:18

3 participants