HERD.add_ref: only warn on entity_uri mismatch, not on re-passing the same URI#1513
Open
bendichter wants to merge 2 commits into
Open
HERD.add_ref: only warn on entity_uri mismatch, not on re-passing the same URI#1513bendichter wants to merge 2 commits into
bendichter wants to merge 2 commits into
Conversation
When add_ref is called with an entity_id that already exists in the HERD,
the entity tables are normalized and the existing entity row is reused -- no
duplicate is ever created. Previously add_ref warned ("This entity already
exists. Ignoring new entity uri") whenever an entity_uri was passed for an
existing entity, even when the URI was identical. That made re-passing the
same entity_uri -- a common pattern when annotating many objects or files
with the same entity -- noisy, and forced callers to special-case it (e.g.
`entity_uri=None if entity is not None else uri`).
Now add_ref only warns when a *different* entity_uri is provided for an
existing entity_id; the existing URI is always kept. Re-passing the same URI
is silent.
Addresses NeurodataWithoutBorders/pynwb#2200 review:
NeurodataWithoutBorders/pynwb#2200 (comment)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #1513 +/- ##
=======================================
Coverage 93.22% 93.22%
=======================================
Files 41 41
Lines 10224 10224
Branches 2109 2109
=======================================
Hits 9531 9531
Misses 417 417
Partials 276 276 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
From the HERD tutorial review in pynwb (NeurodataWithoutBorders/pynwb#2200):
— NeurodataWithoutBorders/pynwb#2200 (comment)
I verified the dedup behavior empirically: when
add_refis called with anentity_idthat already exists,get_entityfinds it and the existing entity row is reused — no duplicate is ever inserted, so the tables are normalized as intended.The remaining wrinkle is ergonomic. Previously
add_refwarned —— whenever an
entity_uriwas passed for an existingentity_id, even when the URI was identical to the stored one. Re-passing the sameentity_uriis the common case when annotating many objects/files with the same entity (e.g. tagging the species of every file in a dandiset), so this produced spurious warnings and pushed that streaming tutorial to special-case it:Change
add_refnow warns only when a differententity_uriis provided for an existingentity_id. The existing URI is always kept (unchanged behavior); re-passing the same URI is silent. The mismatch warning is also more informative (it names both URIs and the entity_id).This lets callers drop the
entity is not Noneworkaround and simply always passentity_uri=....Tests
test_entity_uri_warning_on_mismatch— a different URI for an existing entity warns (asserts the exact message) and keeps the stored URI; the entity is not duplicated.test_entity_uri_no_warning_when_same— re-passing the same URI raises no warning (warnings.simplefilter("error")) and does not duplicate the entity row.All 80
test_resourcestests pass;ruffclean.Note: this PR is independent of #1512 (the
file-argument removal) — both touchadd_refbut in different sections and should merge cleanly in either order.🤖 Generated with Claude Code