TL;DR: A `set` locates its members by hash, so a member's hash must not change while it is in the set. This means you can't modify an object once it's in the set if you want the `__contains__` functionality to keep working. We need to come up with another container for CycloneDX objects.

I noticed this:
```python
class HashableClass:
    def __init__(self, the_attribute=""):
        self.the_attribute = the_attribute

    def __hash__(self) -> int:
        return hash(self.the_attribute)


s = set([HashableClass(the_attribute="Foobar")])

for h in s:
    # prints True, as expected
    print(h in s)

for h in s:
    h.the_attribute = None

for h in s:
    # prints False; not expected
    print(h in s)

for h in s:
    # prints True...also not expected
    print(h in list(s))
```
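To make the mechanism behind that result easier to see, here is a minimal sketch. The names `Item` and `update_in_set` are hypothetical and exist only for illustration; they are not part of any CycloneDX library. It compares the hash at insertion with the hash after mutation, then shows one possible workaround: remove the object before changing it and add it back afterwards.

```python
class Item:
    """Hypothetical stand-in for a mutable object whose hash follows an attribute."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __hash__(self) -> int:
        return hash(self.name)


item = Item("Foobar")
s = {item}
hash_at_insert = hash(item)

# Mutate the object while it is still a member of the set.
item.name = "Changed"
hash_changed = hash(item) != hash_at_insert

# The set looks the object up using its *current* hash, which no longer
# matches the hash recorded at insertion, so the lookup misses.
found_by_hash = item in s

# A linear scan over the members does not rely on the hash at all.
found_by_scan = any(member is item for member in s)


def update_in_set(container: set, member: object, **changes: object) -> None:
    """Hypothetical helper: remove, mutate, then re-insert under the new hash."""
    container.remove(member)  # must run while the stored hash is still valid
    for attribute, value in changes.items():
        setattr(member, attribute, value)
    container.add(member)


other = Item("a")
s2 = {other}
update_in_set(s2, other, name="b")
found_after_update = other in s2
```

The workaround only helps when every mutation goes through such a helper. Anything that changes a member directly, for example while iterating over the set, runs into the same stale-hash problem.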
@jkowalleck and @bitkeks joined in the conversation and we tried to come up with ways to rectify this.
A suggestion was to use a list, but it was pointed out that it would not "automatically de-dupe" if an identical object was inserted.
Another idea: extend SortedSet as CDXSortedSet, and override how it tracks and checks membership. In other words, use `__eq__` (most likely) both to check whether an object is already in the set on insertion and for the `__contains__` operation. Yes, this would be slower (O(n) instead of O(1)), but it would mean you could iterate over a set, modify the objects, and `in` checks would still work afterwards.
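As a rough illustration of the equality-based idea, here is a minimal sketch of a list-backed container that de-dupes on insertion and answers `in` through `__eq__`. It is not the proposed CDXSortedSet and does not subclass SortedSet: `EqualityBasedSet` and `Component` are hypothetical names, sorting is left out, and the only base class assumed is `collections.abc.MutableSet` from the standard library.

```python
from collections.abc import MutableSet
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class EqualityBasedSet(MutableSet):
    """Hypothetical list-backed set: membership uses __eq__, never __hash__."""

    def __init__(self, items: Iterable[T] = ()) -> None:
        self._items: List[T] = []
        for item in items:
            self.add(item)

    def __contains__(self, item: object) -> bool:
        # O(n) scan that compares against each member's current state.
        return any(existing == item for existing in self._items)

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def add(self, item: T) -> None:
        # De-dupe at insertion time, which a plain list would not do.
        if item not in self:
            self._items.append(item)

    def discard(self, item: T) -> None:
        for index, existing in enumerate(self._items):
            if existing == item:
                del self._items[index]
                return


class Component:
    """Hypothetical mutable element; it defines __eq__ only, so it is unhashable."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Component) and self.name == other.name


components = EqualityBasedSet([Component("a"), Component("a"), Component("b")])
size_after_dedupe = len(components)

# Mutate every member in place while iterating.
for component in components:
    component.name = component.name.upper()

still_members = all(component in components for component in components)
```

One caveat with this sketch: de-duplication happens only in `add`, so if a later mutation makes two members equal, both stay in the container until something removes one of them.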
For background, this started with a rather long discussion on Slack: https://cyclonedx.slack.com/archives/CVA0QJEVA/p1740618170768969

It turns out that `set` is hash-based internally, much like a dictionary, so it files each object under the hash the object had at insertion. If the object's hash changes afterwards, a later `in` test can return `False`. This follows from how Python defines sets and hashable objects: https://docs.python.org/3/library/stdtypes.html#set and https://docs.python.org/3/glossary.html#term-hashable

Other ideas and suggestions are welcome!