HERD.add_ref: only warn on entity_uri mismatch, not on re-passing the same URI #1513

Open
bendichter wants to merge 2 commits into hdmf-dev:dev from bendichter:herd-add-ref-warn-on-uri-mismatch

@bendichter

Contributor

Motivation

From the HERD tutorial review in pynwb (NeurodataWithoutBorders/pynwb#2200):

let's check to see if these cases are handled. We want to ensure that the tables are always normalized, so adding an identical entry should not duplicate a row

NeurodataWithoutBorders/pynwb#2200 (comment)

I verified the dedup behavior empirically: when add_ref is called with an entity_id that already exists, get_entity finds it and the existing entity row is reused — no duplicate is ever inserted, so the tables are normalized as intended.

The remaining wrinkle is ergonomic. Previously, add_ref emitted this warning:

This entity already exists. Ignoring new entity uri

whenever an entity_uri was passed for an existing entity_id, even when the URI was identical to the stored one. Re-passing the same entity_uri is the common case when annotating many objects or files with the same entity (e.g. tagging the species of every file in a dandiset), so this produced spurious warnings and pushed the streaming tutorial under review to special-case it:

entity = herd.get_entity(entity_id="NCBI_TAXON:10090")
entity_uri = None if entity is not None else "https://..."   # only to dodge the warning
herd.add_ref(..., entity_id="NCBI_TAXON:10090", entity_uri=entity_uri)

Change

add_ref now warns only when a different entity_uri is provided for an existing entity_id. The existing URI is always kept (unchanged behavior); re-passing the same URI is silent. The mismatch warning is also more informative (it names both URIs and the entity_id).

This lets callers drop the entity is not None workaround and simply always pass entity_uri=....
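
The rule described above can be sketched with a small stand-in. This is not the hdmf implementation: `MiniEntityTable`, `add_entity`, the warning text, and the `example.org` URI are all hypothetical and exist only to illustrate "reuse the existing row, warn only on a different URI, keep the stored URI".

```python
import warnings


class MiniEntityTable:
    """Toy stand-in for an entity table; NOT the hdmf HERD implementation."""

    def __init__(self):
        # One row per entity_id keeps the table normalized.
        self._uris = {}

    def add_entity(self, entity_id, entity_uri=None):
        """Register an entity and return the URI that ends up stored for it."""
        if entity_id not in self._uris:
            if entity_uri is None:
                raise ValueError("entity_uri is required for a new entity_id")
            self._uris[entity_id] = entity_uri
        elif entity_uri is not None and entity_uri != self._uris[entity_id]:
            # Only a *different* URI is worth a warning; the stored one is kept.
            warnings.warn(
                f"Entity '{entity_id}' already exists with URI "
                f"'{self._uris[entity_id]}'; ignoring new URI '{entity_uri}'.",
                stacklevel=2,
            )
        return self._uris[entity_id]


table = MiniEntityTable()
uri = "https://example.org/taxon/10090"  # placeholder URI, not a real resolver
for _ in range(3):
    # Always passing the same URI is silent and never adds a second row.
    table.add_entity("NCBI_TAXON:10090", uri)
```

With this shape, a caller annotating many files can pass the same URI on every call without first checking whether the entity already exists.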

Tests

  • test_entity_uri_warning_on_mismatch — a different URI for an existing entity warns (asserts the exact message) and keeps the stored URI; the entity is not duplicated.
  • test_entity_uri_no_warning_when_same — re-passing the same URI raises no warning (warnings.simplefilter("error")) and does not duplicate the entity row.
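
The two test patterns listed above can be sketched with the standard `warnings` module alone. The `register` helper, its message, and the `example.org` URIs are hypothetical stand-ins, not the PR's test code or the hdmf API; the point is only how "no warning" and "exactly one warning" can each be asserted.

```python
import warnings


def register(store, entity_id, entity_uri):
    """Hypothetical helper: keep the first URI per entity_id, warn only on a mismatch."""
    stored = store.setdefault(entity_id, entity_uri)
    if stored != entity_uri:
        warnings.warn(
            f"URI mismatch for {entity_id}: keeping {stored}, ignoring {entity_uri}"
        )
    return stored


def check_no_warning_when_same():
    store = {}
    register(store, "NCBI_TAXON:10090", "https://example.org/a")
    with warnings.catch_warnings():
        # Promote every warning to an exception, so a spurious warning fails loudly.
        warnings.simplefilter("error")
        register(store, "NCBI_TAXON:10090", "https://example.org/a")
    assert len(store) == 1  # entity not duplicated


def check_warning_on_mismatch():
    store = {}
    register(store, "NCBI_TAXON:10090", "https://example.org/a")
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record the warning instead of raising
        kept = register(store, "NCBI_TAXON:10090", "https://example.org/b")
    assert len(caught) == 1
    message = str(caught[0].message)
    assert "https://example.org/a" in message and "https://example.org/b" in message
    assert kept == "https://example.org/a"  # stored URI is kept
    assert len(store) == 1


check_no_warning_when_same()
check_warning_on_mismatch()
```

Using `simplefilter("error")` for the silent case means the check fails on any warning, not just the one being removed, which is a stricter guarantee than matching on a message string.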

All 80 test_resources tests pass; ruff clean.

Note: this PR is independent of #1512 (the file-argument removal) — both touch add_ref but in different sections and should merge cleanly in either order.

🤖 Generated with Claude Code

When add_ref is called with an entity_id that already exists in the HERD,
the entity tables are normalized and the existing entity row is reused -- no
duplicate is ever created. Previously add_ref warned ("This entity already
exists. Ignoring new entity uri") whenever an entity_uri was passed for an
existing entity, even when the URI was identical. That made re-passing the
same entity_uri -- a common pattern when annotating many objects or files
with the same entity -- noisy, and forced callers to special-case it (e.g.
`entity_uri=None if entity is not None else uri`).

Now add_ref only warns when a *different* entity_uri is provided for an
existing entity_id; the existing URI is always kept. Re-passing the same URI
is silent.

Addresses NeurodataWithoutBorders/pynwb#2200 review:
NeurodataWithoutBorders/pynwb#2200 (comment)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.22%. Comparing base (af7879a) to head (2e5dbe8).

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #1513   +/-   ##
=======================================
  Coverage   93.22%   93.22%           
=======================================
  Files          41       41           
  Lines       10224    10224           
  Branches     2109     2109           
=======================================
  Hits         9531     9531           
  Misses        417      417           
  Partials      276      276           

@bendichter bendichter marked this pull request as ready for review June 23, 2026 19:49

@oruebel oruebel left a comment

Contributor

Looks good to me. I'll let @rly approve since Ryan is working on HERD things right now.
