hashtable: add bounds check in hashtableNext iterator #2611

uriyage · 2025-09-15T10:54:18Z

Prevent iteration beyond hashtable bounds by checking if iterator is already at the end before proceeding with the main iteration loop. This adds defensive bounds checking for table index and bucket index.
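As a rough illustration of the kind of guard described here, the toy iterator below checks whether it is already past the last table before entering its main loop. The names `toyTable`, `toyIter`, and `toyNext`, and the flat array-of-ints layout, are invented for this sketch and do not mirror the real structures in `hashtable.c`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_TABLES 2 /* hypothetical: primary table plus rehash target */

typedef struct {
    const int *buckets[TOY_NUM_TABLES]; /* one int per bucket; 0 means empty */
    size_t num_buckets[TOY_NUM_TABLES];
} toyTable;

typedef struct {
    const toyTable *t;
    int table;    /* current table index */
    size_t index; /* next bucket index to visit in that table */
} toyIter;

static void toyInitIter(toyIter *it, const toyTable *t) {
    it->t = t;
    it->table = 0;
    it->index = 0;
}

/* Stores the next non-empty bucket value in *out and returns true,
 * or returns false once every table has been walked. */
static bool toyNext(toyIter *it, int *out) {
    assert(it != NULL && out != NULL);
    /* Defensive guard: the iterator already ran off the end, so do not
     * touch num_buckets[] with an out-of-range table index. */
    if (it->table >= TOY_NUM_TABLES) return false;
    for (;;) {
        if (it->index >= it->t->num_buckets[it->table]) {
            /* Current table exhausted: move on to the next one. */
            it->index = 0;
            if (++it->table >= TOY_NUM_TABLES) return false;
            continue;
        }
        int v = it->t->buckets[it->table][it->index++];
        if (v != 0) {
            *out = v;
            return true;
        }
    }
}
```

Without the guard, a second call after the end would index `num_buckets` with `table == TOY_NUM_TABLES`; with it, repeated calls simply keep returning false.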

codecov · 2025-09-15T11:15:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.03%. Comparing base (a47e8fa) to head (c57878c).
⚠️ Report is 22 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2611      +/-   ##
============================================
- Coverage     72.24%   72.03%   -0.22%     
============================================
  Files           127      128       +1     
  Lines         70820    71041     +221     
============================================
+ Hits          51167    51175       +8     
- Misses        19653    19866     +213

| Files with missing lines | Coverage Δ |
|---|---|
| src/hashtable.c | `89.35% <100.00%> (+0.04%)` ⬆️ |

... and 28 files with indirect coverage changes

zuiderkwast

Thanks Uri!

Note that this is just for making it safe to call hashtableNext again when it has already returned false. It's not really needed, but it makes it easier to detect bad usage, or is it for convenience?

Isn't calling hashtableNext() after the end a bug that we want to detect? Why not do an assert instead of graceful return then?

ranshid · 2025-09-17T08:53:08Z

@uriyage let's also add unittest for this?

src/hashtable.c

Signed-off-by: Uri Yagelnik <[email protected]>

uriyage · 2025-10-05T10:23:45Z

> @uriyage let's also add unittest for this?

Done

ranshid

LGTM

zuiderkwast · 2025-10-06T08:24:43Z

Many times I've tried to reject adding dependencies from the low-level datastructures to the whole of the server globals and everything. Dependencies should be in the other direction, from high level to lower level components.

madolson · 2025-10-11T01:58:39Z

I somewhat agree with Viktor. We can add a debugAssert to serverassert.h if we want, that is just a no-op.

I was also dubious of this PR, since it's just adding a defensive check. @uriyage @ranshid Are we still moving forward with this?

zuiderkwast · 2025-10-11T13:31:32Z

> I somewhat agree with Viktor. We can add a debugAssert to serverassert.h if we want, that is just a no-op.

We can just use assert instead. The checks in the asserts are very lightweight, no heavier than checking whether debugging is enabled.

Debug assert depends on a server config. That would prevent usage of this hashtable in non-server code (valkey-cli, valkey-benchmark).
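To illustrate the dependency argument, a low-level container can enforce the "don't call next after the end" rule with nothing but the standard `<assert.h>` macro, so the same file would still build into tools that carry no server configuration. The `rangeIter` type and its functions are hypothetical stand-ins for the real iterator, and the sketch makes no claim about how `assert` is wired up inside the server build.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int next;      /* next value to hand out */
    int end;       /* one past the last value */
    bool finished; /* set once rangeNext has returned false */
} rangeIter;

static void rangeInit(rangeIter *it, int begin, int end) {
    it->next = begin;
    it->end = end;
    it->finished = false;
}

static bool rangeNext(rangeIter *it, int *out) {
    /* Misuse check: plain C assert, no server headers or config needed. */
    assert(!it->finished && "rangeNext called after iteration ended");
    if (it->next >= it->end) {
        it->finished = true;
        return false;
    }
    *out = it->next++;
    return true;
}
```

With the standard macro, building with `-DNDEBUG` compiles the check out entirely, which is one way to get a no-op variant without a runtime config lookup.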

zuiderkwast · 2025-10-11T13:37:56Z

@madolson According to a comment above, Uri explained why:

> We have a use case where we don't save the iterator state and attempt to run it again.

I suppose we could wait for this use case to be upstreamed and then include this change at the same time.

madolson · 2025-10-11T16:57:35Z

Couldn't they then just save the state?

I agree, I feel like I would rather close this and re-open if that use case becomes public. I'm not sure what that case is.

JimB123 · 2025-10-23T20:44:47Z

src/hashtable.c

+#define HASHTABLE_ITER_PRIMARY_TABLE 0
+#define HASHTABLE_ITER_REHASH_TABLE 1


These defines are unnecessary and create an inconsistency with the rest of the code. If we want to create these (which I don't recommend), the entire file should be refactored to use them consistently.

JimB123 · 2025-10-23T21:26:12Z

Can someone more clearly lay out the scenario here? I'm thinking that this is a case of: 1) creating a safe iterator on the hashtable, 2) deleting or moving the underlying table, 3) continuing to call hashtableNext. Is this right? If so, I don't think this solves the problem.

With a safe iterator, it's necessary to ensure that the iterator gets deleted before the underlying hashtable is deleted. If not, when we delete the iterator, we will try to re-enable rehashing on the (deleted) hashtable.

One possible way to address this is to keep a list of all "safe" iterators associated with each hashtable. When the hashtable is deleted, it could then mark the iterators as invalid. This could also be done for unsafe iterators, but I don't think that's necessary as unsafe iterators, by their nature, are created/used/deleted in a short time.

JimB123 · 2025-10-24T16:39:40Z

I've done a review on the iterator, and these are the things that need to be addressed. There are a few potential issues with long-lived (safe) iterators. One use case for this is for defrag which iterates over hashes in an incremental manner - spanning across timer invocations.

Issue 1: Rehashing isn't paused immediately

Ref:

valkey/src/hashtable.c

Lines 2036 to 2037 in 1cf0df9

    if (isSafe(iter)) {
        hashtablePauseRehashing(iter->hashtable);

At this code reference, a safe iterator pauses rehashing on the first call to hashtableNext rather than when the iterator is created. Deferring the pause is potentially dangerous. Code such as kvstore looks at the current rehashing-paused state to determine whether it's safe to delete the table. This makes the following sequence possible: 1) an iterator is created, 2) kvstore deletes the table, 3) iteration begins, and the isSafe() clause dereferences the deleted hash table.
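One way to close that window, sketched here under simplifying assumptions, is to take the pause when a safe iterator is initialized, so an owner that consults the pause count sees the iterator immediately. `miniTable`, `miniIter`, and `miniCanDelete` are hypothetical names, and the single `pause_rehash` counter stands in for whatever state the real table and kvstore inspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int pause_rehash; /* > 0 means rehashing is paused */
} miniTable;

typedef struct {
    miniTable *t;
    bool safe;
} miniIter;

static void miniInitIter(miniIter *it, miniTable *t, bool safe) {
    assert(it != NULL && t != NULL);
    it->t = t;
    it->safe = safe;
    /* Pause at creation time rather than on the first "next" call. */
    if (safe) t->pause_rehash++;
}

static void miniResetIter(miniIter *it) {
    /* Undo the pause taken in miniInitIter, exactly once. */
    if (it->safe && it->t != NULL) it->t->pause_rehash--;
    it->t = NULL;
}

/* Owner-side check of the kind described above: only delete the table
 * when nothing is holding rehashing paused. */
static bool miniCanDelete(const miniTable *t) {
    return t->pause_rehash == 0;
}
```

With this ordering, a safe iterator that has been created but not yet advanced already blocks deletion in `miniCanDelete`.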

Issue 2: Empty hashtables remain in inconsistent state

Ref:

valkey/src/hashtable.c

Lines 2042 to 2044 in 1cf0df9

    if (iter->hashtable->tables[0] == NULL) {
        /* Empty hashtable. We're done. */
        break;

With an empty hash table, the iterator returns false (indicating that iteration has completed). However, index/table still remain in the initial state - as if iteration has not yet begun. This makes it possible for an iterator to return false (indicating completion) but then later, if called again, begin returning newly added values. This likely isn't dangerous, because the caller is unlikely to call hashtableNext a second time after it has already returned false. But it is inconsistent, and easily fixed.
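A minimal sketch of the "easily fixed" part, assuming the iterator can carry an explicit finished flag: once next has reported completion, including for an empty table, later calls keep returning false even if elements are added afterwards. The `slotTable` and `slotIter` types are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_CAP 8 /* hypothetical fixed capacity for the toy table */

typedef struct {
    int items[SLOT_CAP];
    size_t count;
} slotTable;

typedef struct {
    const slotTable *t;
    size_t pos;
    bool finished; /* latched once slotNext has returned false */
} slotIter;

static void slotInitIter(slotIter *it, const slotTable *t) {
    it->t = t;
    it->pos = 0;
    it->finished = false;
}

static bool slotNext(slotIter *it, int *out) {
    assert(it != NULL && out != NULL);
    if (it->finished) return false;
    if (it->pos >= it->t->count) {
        /* This branch is also taken on the first call for an empty table,
         * so the iterator never stays in its "not started" state. */
        it->finished = true;
        return false;
    }
    *out = it->t->items[it->pos++];
    return true;
}
```

An iterator that finished on an empty table stays finished; a fresh iterator created after the insert sees the new element.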

Issue 3: Deletion of hashtable makes `hashtableNext` unsafe

Ref:

valkey/src/hashtable.c

Lines 2054 to 2063 in 1cf0df9

    /* Advance to the next position within the bucket, or to the next
     * child bucket in a chain, or to the next bucket index, or to the
     * next table. */
    iter->pos_in_bucket++;
    if (iter->bucket->chained && iter->pos_in_bucket >= ENTRIES_PER_BUCKET - 1) {
        iter->pos_in_bucket = 0;
        iter->bucket = getChildBucket(iter->bucket);
    } else if (iter->pos_in_bucket >= ENTRIES_PER_BUCKET) {
        /* Bucket index done. */
        if (isSafe(iter)) {

Once the iterator has started, hashtableNext relies on both iter->bucket and iter->hashtable to remain valid. It looks like iter->bucket is ok - it's handled internally that buckets aren't deleted if rehashing is paused. However there's no protection for iter->hashtable. This requires that every user of a safe iterator must be aware of hashtable deletion and avoid calling hashtableNext once the table has been deleted.

I'll suggest that the proper solution is to have a hashtable maintain a list of safe iterators. If the hashtable gets deleted, each of the iterators should be marked as completed such that future calls to hashtableNext will simply return false, without accessing iter->hashtable.
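The proposal could look roughly like the following, assuming an intrusive singly linked list of safe iterators hanging off the table. Everything here (`trackedTable`, `trackedIter`, the flat `items` array) is a hypothetical stand-in; the sketch only shows the invalidation path, not the real bucket layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct trackedIter trackedIter;

typedef struct {
    int *items;
    size_t count;
    trackedIter *safe_iters; /* head of the list of live safe iterators */
} trackedTable;

struct trackedIter {
    trackedTable *t;   /* set to NULL once the table has been released */
    size_t pos;
    trackedIter *next; /* next safe iterator registered on the same table */
};

static trackedTable *trackedCreate(const int *src, size_t n) {
    trackedTable *t = malloc(sizeof(*t));
    if (t == NULL) return NULL;
    t->items = malloc(n * sizeof(*t->items));
    if (t->items == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->items, src, n * sizeof(*t->items));
    t->count = n;
    t->safe_iters = NULL;
    return t;
}

static void trackedInitSafeIter(trackedIter *it, trackedTable *t) {
    it->t = t;
    it->pos = 0;
    it->next = t->safe_iters; /* register with the table */
    t->safe_iters = it;
}

/* Releasing the table first marks every registered iterator as completed. */
static void trackedRelease(trackedTable *t) {
    for (trackedIter *it = t->safe_iters; it != NULL;) {
        trackedIter *following = it->next;
        it->t = NULL;
        it->next = NULL;
        it = following;
    }
    free(t->items);
    free(t);
}

static bool trackedNext(trackedIter *it, int *out) {
    if (it->t == NULL) return false; /* invalidated: never touch the table */
    if (it->pos >= it->t->count) return false;
    *out = it->t->items[it->pos++];
    return true;
}
```

A complete version would also unlink an iterator that is released before its table; that step is left out here to keep the invalidation path in focus.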

Issue 4: Deletion of hashtable prevents deletion of the iterator

Ref:

valkey/src/hashtable.c

Lines 2023 to 2024 in 1cf0df9

    void hashtableReleaseIterator(hashtableIterator *iterator) {
        hashtableResetIterator(iterator);

This is fairly serious. As it stands, every user of a safe iterator must be aware of a hashtable deletion BEFORE it happens. Once the hashtable has been deleted, it becomes impossible to safely delete the iterator. The iterator deletion attempts to unpause rehashing and accesses the (deleted) hashtable.

I'll suggest that the solution for this is identical to the proposed solution for issue 3. Once the iterator is marked as completed, it should not attempt to access iter->hashtable at deletion time.
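For the release side, a hedged sketch of what "should not attempt to access iter->hashtable" could mean in practice: the release function unlinks the iterator and unpauses rehashing only when the table pointer is still set. The `pausedTable` and `pausedIter` types and the `pause_rehash` counter are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct pausedIter pausedIter;

typedef struct {
    int pause_rehash;  /* number of safe iterators holding rehashing paused */
    pausedIter *iters; /* live safe iterators */
} pausedTable;

struct pausedIter {
    pausedTable *t; /* NULL once the table is gone */
    pausedIter *next;
};

static void pausedIterInit(pausedIter *it, pausedTable *t) {
    it->t = t;
    it->next = t->iters;
    t->iters = it;
    t->pause_rehash++;
}

/* Called by the table's owner just before the table memory is freed. */
static void pausedTableInvalidateIters(pausedTable *t) {
    while (t->iters != NULL) {
        pausedIter *it = t->iters;
        t->iters = it->next;
        it->t = NULL;
        it->next = NULL;
    }
    t->pause_rehash = 0;
}

/* Returns true if a live table was unpaused, false if the table had
 * already been deleted and there was nothing left to touch. */
static bool pausedIterRelease(pausedIter *it) {
    if (it->t == NULL) return false;
    /* Unlink this iterator from the table's list. */
    pausedIter **link = &it->t->iters;
    while (*link != NULL && *link != it) link = &(*link)->next;
    assert(*link == it);
    *link = it->next;
    it->t->pause_rehash--;
    it->t = NULL;
    it->next = NULL;
    return true;
}
```

The linear unlink is fine for a sketch; a real implementation with many concurrent safe iterators might prefer a doubly linked list.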

@zuiderkwast @rainsupreme - please review.

rainsupreme · 2025-10-24T22:01:37Z

That's a pretty thorough analysis Jim! I guess there are maybe a few scenarios in OSS where it feels like we might delete the hashtable while the iterator exists, though I haven't fully investigated these:

- pubsub: client disconnect while iterating unsubscribe all?
- vset: could be deleted while processing expiry?
- modules: subcommands hashtable could be freed while being iterated - can modules add/remove commands or be loaded/unloaded at any time?

Adding tracking for safe iterator invalidation should be pretty lightweight in terms of performance and memory. I think we should add it, with unit tests. Also, I'm pretty sure this issue exists for dict's safe iterator as well - dict will be worth fixing if hashtable is.

rainsupreme

We should invalidate safe iterators for hashtable and dict when the underlying collection is deleted, to prevent invalid memory access and undefined states.

JimB123 · 2025-10-27T14:02:32Z

One other potential issue is with defragmentation. If the hashtable is moved during defragmentation, any active safe iterators will access the deleted memory. I don't think this can happen with defrag's own iterator, but if there's a potential for other iterators, this might be a concern. (e.g. forkless save?)

zuiderkwast · 2025-10-27T15:17:43Z

I think we should fix these bugs found by Jim in a separate PR. Essentially, each hashtable needs to keep track of a list of the safe iterators. (The same problems exist with dict, I imagine.)

However, I don't think we allow long-lived iterators currently. Only a few operations are allowed while iterating (i.e. while the iterator exists) so letting the event loop run and later resume iteration is not currently supported, as I read these doc comments: https://github.com/valkey-io/valkey/blob/9.0/src/hashtable.c#L1960-L1965

rainsupreme · 2025-10-28T23:12:28Z

I think this falls under the same category as ensuring someone doesn't call hashtableNext after it's returned false, but I don't mind if it's a separate PR.

At least then let's fix the doc comments to clarify this: Other activities (like event loop, defrag, threads) that could defrag/delete the hashtable must not be allowed to run while the iterator is considered valid.

github-actions bot assigned uriyage Sep 15, 2025

ranshid requested a review from zuiderkwast September 15, 2025 10:54

uriyage force-pushed the hashtable-iterator-bounds branch from e54a9cf to 31e5445 on September 15, 2025 10:55

zuiderkwast reviewed Sep 15, 2025

ranshid reviewed Sep 15, 2025

uriyage force-pushed the hashtable-iterator-bounds branch from 31e5445 to 6df1a9a on September 15, 2025 11:44

zuiderkwast reviewed Sep 15, 2025

ranshid reviewed Sep 17, 2025

zuiderkwast reviewed Sep 17, 2025

Address PR comments

c57878c

Signed-off-by: Uri Yagelnik <[email protected]>

uriyage force-pushed the hashtable-iterator-bounds branch from 39f159b to c57878c on October 5, 2025 10:21

ranshid approved these changes Oct 6, 2025

JimB123 reviewed Oct 23, 2025

JimB123 requested a review from rainsupreme October 24, 2025 16:41

rainsupreme suggested changes Oct 24, 2025
