Issue 283 nano plot scaling #404

Open
minimav wants to merge 6 commits into posit-dev:main from minimav:issue-283-nano-plot-scaling

minimav commented Jul 18, 2024

Summary

This PR fixes a bug whereby values from unselected rows affect the scaling in nanoplots. In addition to the single values case given in #283, a simple example of the bug in the context of multiple values per row is below (when autoscale is used):

import pandas as pd
from great_tables import GT

multiple_vals_df = pd.DataFrame(
    {
        "i": list(range(1, 3)),
        "lines_small": ["12.44 6.34", "5.2 -8.2 10"],
        "lines_large": ["12.44, 6.34", "5.2 -8.2 100"],
    }
)
(
    GT(multiple_vals_df, rowname_col="i")
    .fmt_nanoplot(columns="lines_small", rows=[0], plot_type="line", autoscale=True)
    .fmt_nanoplot(columns="lines_large", rows=[0], plot_type="line", autoscale=True)
)
Screenshot 2024-07-18 at 20 49 20

Single values example post-fix:

Screenshot 2024-07-18 at 19 53 06

Multiple values example post-fix:

Screenshot 2024-07-18 at 19 53 29

I have added a couple of unit tests based on existing ones; let me know if they work. Thanks for your work on this library. I have been using it at work and really enjoying it 🙂

Related GitHub Issues and PRs

Checklist

codecov bot commented Jul 19, 2024

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.50%. Comparing base (9874b7c) to head (13b63b0).

Files with missing lines | Patch % | Lines
great_tables/_formats.py | 83.33% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
+ Coverage   92.31%   92.50%   +0.18%     
==========================================
  Files          48       48              
  Lines        6039     6043       +4     
==========================================
+ Hits         5575     5590      +15     
+ Misses        464      453      -11     

machow (Collaborator) commented Feb 13, 2026

I'm so sorry it took us so long to get to this. @rich-iannone, do you mind looking through it? If needed, we may just need to start a new PR to address this, since we've made a lot of changes since then.

@rich-iannone (Member)

This PR fixes an issue where fmt_nanoplot() used all rows when scaling single-value nanoplots (one value per cell), even when rows= restricted which rows received nanoplots. This meant that an extreme value in a non-selected row would distort the scale of the selected rows' plots.
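
The effect can be illustrated with a minimal, library-independent sketch. This is not the great_tables implementation: `scaled_heights` is a hypothetical helper, and it assumes bars are scaled by the largest absolute value among the rows used for scaling.

```python
def scaled_heights(values, rows=None, scale_rows=None):
    """Return bar heights in [-1, 1] for the selected rows.

    rows: indices that receive a plot (None means all rows).
    scale_rows: indices used to compute the scale (None means all rows).
    Hypothetical helper for illustration only.
    """
    rows = range(len(values)) if rows is None else rows
    scale_rows = range(len(values)) if scale_rows is None else scale_rows
    max_abs = max(abs(values[i]) for i in scale_rows)
    return [values[i] / max_abs for i in rows]


vals = [-5.3, 6.3, 5.2, -8.2, 500.0]
selected = [0, 1, 2, 3]

# Assumed pre-fix behavior: every row contributes to the scale,
# so the unselected 500.0 squashes the selected bars toward zero.
before = scaled_heights(vals, rows=selected)

# Assumed post-fix behavior: only the selected rows contribute.
after = scaled_heights(vals, rows=selected, scale_rows=selected)
```

Under these assumptions, the heights in `before` are all tiny because they are divided by 500, while the heights in `after` use the full range because the largest selected magnitude (8.2) sets the scale.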

With this PR, here's an example where we generate nanoplots for a column whose first four values are the same in two DataFrames, while excluding the last value (which differs between the two DFs and is very large in df_extreme).

import polars as pl
from great_tables import GT

df_normal = pl.DataFrame({"vals": [-5.3, 6.3, 5.2, -8.2, 10]})
df_extreme = pl.DataFrame({"vals": [-5.3, 6.3, 5.2, -8.2, 500.0]})

gt_normal = GT(df_normal).fmt_nanoplot(
    columns="vals",
    plot_type="bar",
    rows=[0, 1, 2, 3],
)

gt_extreme = GT(df_extreme).fmt_nanoplot(
    columns="vals",
    plot_type="bar",
    rows=[0, 1, 2, 3],
)

In the first table (gt_normal), the excluded value is 10 and so it's difficult to notice any scale effects.

gt_normal.show("browser")
image

However, the second table (gt_extreme), which excludes a large value of 500, shows the same bar plots. The extreme value of 500 isn't considered in the scaling of the other values (otherwise they would appear compressed toward the zero line).

gt_extreme.show("browser")
image

The unit tests compare single SVGs from two different DataFrames (with different unselected values). The equivalence check of the first row's SVG across the DFs indirectly shows that there aren't scaling effects from unselected cells.
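
The shape of that equivalence check can be sketched without the library. The renderer below is a toy stand-in, not great_tables' SVG output: `render_bar_svg` and `render_selected` are hypothetical names, and the scale is assumed to come from the selected rows only.

```python
def render_bar_svg(value, max_abs, width=100):
    """Toy stand-in for a nanoplot renderer: one <rect> whose width depends on the scale."""
    bar = round(abs(value) / max_abs * width, 2)
    return f'<svg><rect width="{bar}"/></svg>'


def render_selected(values, rows):
    # Scale from the selected rows only, mirroring the assumed post-fix behavior
    max_abs = max(abs(values[i]) for i in rows)
    return [render_bar_svg(values[i], max_abs) for i in rows]


normal = [-5.3, 6.3, 5.2, -8.2, 10]
extreme = [-5.3, 6.3, 5.2, -8.2, 500.0]
rows = [0, 1, 2, 3]

svgs_normal = render_selected(normal, rows)
svgs_extreme = render_selected(extreme, rows)

# The first row's SVG must not depend on the unselected last value
assert svgs_normal[0] == svgs_extreme[0]
```

If the scale were computed from all rows instead, the two first-row strings would differ, which is the regression such a test is meant to catch.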

rich-iannone (Member) left a comment

LGTM!
