Add incremental invalidation engine #641

Open
st0012 wants to merge 1 commit into main from add-incremental-invalidation-pr2

Member

st0012 commented Mar 4, 2026

Replaces the old remove_definitions_for_document + invalidate_ancestor_chains approach with a targeted invalidation engine. When a file is updated or deleted, the engine traces through the name_dependents reverse index to invalidate only the affected declarations, names, and references — instead of requiring a full graph rebuild.

How it works

consume_document_changes() (renamed from update()) and delete_document() run a three-step pipeline:

  1. invalidate — read-only scan that seeds a worklist from old/new definitions and references, building a pending_detachments side table for definitions that need to be detached from their declarations
  2. remove_document_data — cleans up phase 1 data (definitions, references, names, strings) from maps
  3. extend — merges new content and queues work items for the resolver

The resolver still does clear_declarations + full rebuild. Wiring it to drain pending_work incrementally is a follow-up.
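The three-step pipeline can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the method names `invalidate`, `remove_document_data`, `extend`, `consume_document_changes` and `delete_document` come from the description above, while the `Graph` and `Document` shapes, the ID types, and the use of plain definition IDs as work items are simplifying assumptions.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical, simplified stand-ins for the real graph types.
type UriId = u32;
type DefinitionId = u32;

#[derive(Default)]
struct Document {
    definitions: Vec<DefinitionId>,
}

#[derive(Default)]
struct Graph {
    documents: HashMap<UriId, Document>,
    definitions: HashSet<DefinitionId>,
    // Side table: definitions that must be detached from their declarations.
    pending_detachments: Vec<DefinitionId>,
    // Work items queued for the resolver (assumed here to be definition IDs).
    pending_work: Vec<DefinitionId>,
}

impl Graph {
    // Step 1: scan the old document and record what has to be detached.
    // The new document is unused in this sketch.
    fn invalidate(&mut self, old: Option<&Document>, _new: Option<&Document>) {
        if let Some(old) = old {
            self.pending_detachments.extend(old.definitions.iter().copied());
        }
    }

    // Step 2: drop the old document's data from the maps.
    fn remove_document_data(&mut self, old: &Document) {
        for id in &old.definitions {
            self.definitions.remove(id);
        }
    }

    // Step 3: merge the new content and queue work for the resolver.
    fn extend(&mut self, uri_id: UriId, new: Document) {
        for &id in &new.definitions {
            self.definitions.insert(id);
            self.pending_work.push(id);
        }
        self.documents.insert(uri_id, new);
    }

    fn consume_document_changes(&mut self, uri_id: UriId, new: Document) {
        let old = self.documents.remove(&uri_id);
        self.invalidate(old.as_ref(), Some(&new));
        if let Some(old) = &old {
            self.remove_document_data(old);
        }
        self.extend(uri_id, new);
    }

    fn delete_document(&mut self, uri_id: UriId) {
        let old = self.documents.remove(&uri_id);
        self.invalidate(old.as_ref(), None);
        if let Some(old) = &old {
            self.remove_document_data(old);
        }
    }
}
```

On a first-time insert there is no old document, so steps 1 and 2 do nothing and only `extend` runs.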

Invalidation worklist

The worklist processes three item types:

  • Declaration — two modes:
    • Remove: no definitions remain or owner was already removed (orphaned). Cascades to members, singleton class, and descendants. Orphaned definitions are re-queued for re-resolution (e.g. class Foo::Bar survives even if Foo changes from a module to an alias).
    • Update: declaration survives but its ancestor chain may have changed (mixin added/removed, superclass changed). Clears ancestors/descendants and re-queues ancestor resolution.
  • Name — structural dependency broken (name's nesting or parent scope removed). Unresolves the name and cascades to all dependents.
  • References — ancestor context changed, but the name itself is still valid. Needed for mixin-related invalidation:
class Foo < Bar
  CONST
end

# Another file adds:
class Foo
  include Baz # Foo's ancestors changed, so references like CONST need re-evaluation
end
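As a rough sketch of how such a worklist might be drained, the following models the three item types with a queue and a visited set. Everything here is an assumption made for illustration: the `Item`, `Decl` and `Graph` shapes, the field names, and the `run_worklist` function are hypothetical, and the re-queuing of orphaned definitions and the cascade to singleton classes and descendants are left out.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type DeclId = u32;
type NameId = u32;

// Hypothetical worklist item, mirroring the three item types listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Item {
    Declaration(DeclId),
    Name(NameId),
    References(NameId),
}

#[derive(Default)]
struct Decl {
    definition_count: usize,
    owner: Option<DeclId>,
    members: Vec<DeclId>,
    ancestors: Vec<DeclId>,
}

#[derive(Default)]
struct Graph {
    declarations: HashMap<DeclId, Decl>,
    resolved_names: HashSet<NameId>,
    name_dependents: HashMap<NameId, Vec<NameId>>,
    stale_references: HashSet<NameId>,
}

impl Graph {
    fn run_worklist(&mut self, seeds: Vec<Item>) {
        let mut queue: VecDeque<Item> = seeds.into();
        let mut seen: HashSet<Item> = HashSet::new();
        while let Some(item) = queue.pop_front() {
            if !seen.insert(item) {
                continue; // already processed
            }
            match item {
                Item::Declaration(id) => {
                    // Remove mode: no definitions remain, or the owner is gone.
                    let remove = match self.declarations.get(&id) {
                        Some(d) => {
                            d.definition_count == 0
                                || d.owner.map_or(false, |o| !self.declarations.contains_key(&o))
                        }
                        None => false,
                    };
                    if remove {
                        if let Some(decl) = self.declarations.remove(&id) {
                            // Cascade to members, which are now orphaned.
                            queue.extend(decl.members.into_iter().map(Item::Declaration));
                        }
                    } else if let Some(decl) = self.declarations.get_mut(&id) {
                        // Update mode: the declaration survives; drop its cached ancestors.
                        decl.ancestors.clear();
                    }
                }
                Item::Name(id) => {
                    // Unresolve the name and cascade to all dependents.
                    self.resolved_names.remove(&id);
                    if let Some(deps) = self.name_dependents.get(&id) {
                        queue.extend(deps.iter().copied().map(Item::Name));
                    }
                }
                Item::References(id) => {
                    // The name stays valid; only its references need rechecking.
                    self.stale_references.insert(id);
                }
            }
        }
    }
}
```

The visited set keeps a widely shared name from being processed more than once when several cascades reach it.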

Cascade differentiation

The name_dependents reverse index distinguishes ChildName (compact syntax Foo::Bar) from NestedName (nested syntax module Foo; class Bar; end; end):

  • Structural cascade (name removed): both ChildName and NestedName → Name
  • Ancestor-triggered cascade (mixin changed): ChildName → Name (resolves through parent), NestedName → References (only references need rechecking)
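One way to picture the differentiation is as a small mapping from the cascade trigger and the dependent kind to a worklist item: structural cascades queue every dependent as a Name item, while ancestor-triggered cascades queue ChildName dependents as Name items and NestedName dependents as References items. The enums and the `cascade` function below are hypothetical names for illustration; only ChildName, NestedName, Name and References come from the description.

```rust
// Hypothetical edge kinds stored in the name_dependents reverse index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Dependent {
    ChildName(u32),  // compact syntax: Foo::Bar
    NestedName(u32), // nested syntax: module Foo; class Bar; end; end
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Trigger {
    Structural,     // the name was removed
    AncestorChange, // a mixin or superclass changed
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Item {
    Name(u32),
    References(u32),
}

// Decide, at queue time, which worklist item each dependent becomes.
fn cascade(trigger: Trigger, dependents: &[Dependent]) -> Vec<Item> {
    dependents
        .iter()
        .map(|dep| match (trigger, *dep) {
            // Structural: every dependent name must be unresolved.
            (Trigger::Structural, Dependent::ChildName(n))
            | (Trigger::Structural, Dependent::NestedName(n)) => Item::Name(n),
            // Ancestor change: compact names resolve through the parent.
            (Trigger::AncestorChange, Dependent::ChildName(n)) => Item::Name(n),
            // Nested names stay valid; only their references are rechecked.
            (Trigger::AncestorChange, Dependent::NestedName(n)) => Item::References(n),
        })
        .collect()
}
```

Choosing the variant when the item is queued means the worklist loop never has to inspect the graph to work out why an item was enqueued.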

st0012 force-pushed the add-incremental-invalidation-pr2 branch from 4a7fc18 to f39ef05 (March 4, 2026 23:00)
st0012 marked this pull request as ready for review (March 4, 2026 23:45)
st0012 requested a review from a team as a code owner (March 4, 2026 23:45)
st0012 force-pushed the add-incremental-invalidation-pr2 branch from f39ef05 to 267785b (March 4, 2026 23:50)
self.remove_definitions_for_document(&document);
let old_document = self.documents.remove(&uri_id);

self.invalidate(old_document.as_ref(), Some(&other));

Member

It might be worth adding a fast path for the initial indexing on boot (which can trigger no invalidation and no removal of data).

Maybe a boolean flag for skipping invalidation.

Comment on lines +865 to +866
&& let Some(nesting_id) = name_ref.nesting()
&& let Some(NameRef::Resolved(resolved)) = self.names.get(nesting_id)

Member

Can you clarify what document change this is accounting for? I'm having a hard time understanding it.

A constant reference was changed and we're enqueuing invalidation for the reference's nesting declaration.

Member Author

st0012 commented Mar 9, 2026

This is for

class Foo
  include Bar # constant reference, when added we invalidate Foo entirely for now
end

I've added comments for it.


self.declarations.remove(&decl_id);
} else {
// Ancestor-stale mode

Member

What does this mean?

Member Author

Added more comments. Basically, it's triggered in

class Foo
  include Bar # this is added/removed
end

So in this branch we only update the declaration's ancestors/descendants.

}
}

self.declarations.remove(&decl_id);

Member

This is also doing data removal. Maybe this is fine, but I'm calling it out because the method documentation mentions a separation between invalidation and removal.

Member Author

I currently treat declaration removal as invalidation, kinda similar to unresolving a name. We remove (unresolve) the declaration here if we found it has no underlying definitions anymore.

The underlying materials (definitions, constant references, names, etc.) are only removed in remove_document_data.

return;
};

// Remove self from each ancestor's descendant set

Member

I could be misunderstanding and maybe this was an existing bug, but removing self is not enough. The entire ancestor chain of self must be removed from descendants. However, we should absolutely not try to perform this removal because there are module deduping rules that you cannot possibly account for with a removal.

Whenever a declaration gets invalidated, we always need to invalidate the ancestors of all descendants. In both branches of this method.

Member Author

Both paths will update the ancestors. I've updated the documentation to make this clearer: invalidate_declaration either removes/rebuilds or updates a declaration. In both paths we update ancestors.

This also means there are optimization opportunities in both paths we can do later, which I also included in comments.
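The rule raised in this thread (when a declaration is invalidated, the ancestors of all its descendants are invalidated too, instead of surgically editing descendant sets) could look something like the sketch below. It assumes each declaration caches a linearized ancestor list and a set of transitive descendants; `Decl`, its fields, and `invalidate_ancestors` are hypothetical names, not the PR's API.

```rust
use std::collections::{HashMap, HashSet};

type DeclId = u32;

// Hypothetical declaration record. `ancestors == None` means the
// linearized chain is stale and must be recomputed by the resolver.
#[derive(Default)]
struct Decl {
    ancestors: Option<Vec<DeclId>>,
    // Assumed to hold transitive descendants.
    descendants: HashSet<DeclId>,
}

// Mark the ancestor chain of `id` and of every descendant as stale, and
// return the affected IDs so the caller can re-queue ancestor resolution.
fn invalidate_ancestors(decls: &mut HashMap<DeclId, Decl>, id: DeclId) -> Vec<DeclId> {
    let mut stale = vec![id];
    if let Some(decl) = decls.get(&id) {
        stale.extend(decl.descendants.iter().copied());
    }
    stale.sort_unstable(); // deterministic order for the re-queue
    for stale_id in &stale {
        if let Some(decl) = decls.get_mut(stale_id) {
            decl.ancestors = None;
        }
    }
    stale
}
```

Nothing is removed from any descendant set here; the chains are recomputed from scratch, which sidesteps the module deduplication rules that a targeted removal cannot account for.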

st0012 force-pushed the add-incremental-invalidation-pr2 branch 4 times, most recently from de13a32 to d560ba6 (March 9, 2026 23:14)
st0012 requested a review from vinistock (March 9, 2026 23:16)
st0012 self-assigned this (Mar 9, 2026)
st0012 force-pushed the add-incremental-invalidation-pr2 branch from d560ba6 to 6862eca (March 10, 2026 17:24)

Member Author

st0012 commented Mar 10, 2026

If we adopt #654, we should carry the memoized name_depth caching into the incremental path here. The depth sort in prepare_units is a correctness requirement — without it, 13 resolution tests fail because the resolution loop's made_progress check gives up when children are processed before parents. This means the incremental prepare_units must also sort by depth, and for large invalidation cascades (e.g., a change to a widely-used module rippling through name_dependents), the non-memoized recursive name_depth in the sort comparator would hit the same bottleneck we fixed in #654. The fix is straightforward: call compute_name_depths before sorting the pending units, same as the full resolution path.
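A minimal sketch of the idea, assuming each name has at most one parent scope and the parent links are acyclic: compute every depth once with memoization, then sort the pending units with a cheap lookup instead of recursing inside the comparator. The `parent_of` map and `sort_units_by_depth` are hypothetical; `compute_name_depths` is named in the comment above, but its real signature is not shown in this PR.

```rust
use std::collections::HashMap;

type NameId = u32;

// Compute the depth of every name once. `parent_of` is a hypothetical map
// from a name to its parent scope (None for top-level names).
fn compute_name_depths(parent_of: &HashMap<NameId, Option<NameId>>) -> HashMap<NameId, usize> {
    fn depth(
        id: NameId,
        parent_of: &HashMap<NameId, Option<NameId>>,
        memo: &mut HashMap<NameId, usize>,
    ) -> usize {
        if let Some(&known) = memo.get(&id) {
            return known; // already computed
        }
        let value = match parent_of.get(&id).copied().flatten() {
            Some(parent) => depth(parent, parent_of, memo) + 1,
            None => 0,
        };
        memo.insert(id, value);
        value
    }

    let mut memo = HashMap::new();
    for &id in parent_of.keys() {
        depth(id, parent_of, &mut memo);
    }
    memo
}

// Sort pending units so parents are resolved before their children.
// The sort is stable, so units at equal depth keep their relative order.
fn sort_units_by_depth(units: &mut [NameId], depths: &HashMap<NameId, usize>) {
    units.sort_by_key(|id| depths.get(id).copied().unwrap_or(0));
}
```

With the depths precomputed, each comparison during the sort is a hash lookup, so the cost of a large cascade is dominated by the sort itself.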

Introduces a worklist-based invalidation engine that cascades changes
through the graph when documents are updated or deleted. Uses
ChildName/NestedName edges from the name_dependents index to propagate
invalidation with two distinct modes:

- Structural cascade (UnresolveName): declaration removed or scope broken
- Ancestor cascade (UnresolveReferences): ancestor chain changed

Replaces the has_unresolved_dependency runtime check with explicit
invalidation variants determined at queue time.
st0012 force-pushed the add-incremental-invalidation-pr2 branch from 6862eca to e37a74a (March 11, 2026 21:55)