Skip to content

baml_language(feat): implement generic parameter declarations#3196

Open
miguelcsx wants to merge 4 commits intoBoundaryML:canaryfrom
miguelcsx:feat/generic-parameters
Open

baml_language(feat): implement generic parameter declarations#3196
miguelcsx wants to merge 4 commits intoBoundaryML:canaryfrom
miguelcsx:feat/generic-parameters

Conversation

@miguelcsx
Copy link
Contributor

@miguelcsx miguelcsx commented Mar 1, 2026

We already supports call-site generic typing for a few built-in functions (e.g., baml.fetch_as<Todo>(...) and baml.deep_copy<T>(x)), and the type checker actively resolves those substitutions.

However, BAML lacked support for declaration-site generics on user-defined items. If a user tried to declare function Foo<T>(...), class Box<T>, or enum Option<T>, the parser failed or ignored it. In the HIR layer, queries like class_generic_params were merely stubbed out, hard-coded to return empty parameter lists with a TODO: Extract generic parameters from CST when BAML adds generic syntax.

This PR bridges that gap by adding the ability to parse generic syntax in declarations and correctly lower those definitions into the HIR.

TODOs

  • Address generic parameters in TIR

Summary by CodeRabbit

  • New Features

    • Generic-parameter lists are exposed and queryable on functions, classes, enums, type aliases, and methods.
    • New public type metadata for generic parameters is available.
  • Improvements

    • CST-driven extraction of generics with improved handling for duplicate names and hash collisions.
    • Collision-aware local ID allocation to make local item identification more robust.
  • Tests

    • Added end-to-end tests for generic extraction, duplicate-name resolution, and collision scenarios.

@vercel
Copy link

vercel bot commented Mar 1, 2026

@miguelcsx is attempting to deploy a commit to the Boundary Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 1, 2026

📝 Walkthrough

Walkthrough

Adds end-to-end generic-parameter support: parser recognizes <...> generic lists, syntax/AST expose GENERIC_PARAM_LIST and GenericParamList, HIR extracts generic params from the CST via new scanners/helpers, and a collision-aware LocalIdAllocator is introduced for local item ID allocation.

Changes

Cohort / File(s) Summary
Parser & CST/AST
baml_language/crates/baml_compiler_parser/src/parser.rs, baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs, baml_language/crates/baml_compiler_syntax/src/ast.rs
Adds parse_generic_param_list invocation in declaration parsers; introduces GENERIC_PARAM_LIST syntax kind and GenericParamList AST node with params() and generic_param_list() accessors on FunctionDef, ClassDef, EnumDef, TypeAliasDef. Tests for parsing and recovery added.
HIR: generics extraction
baml_language/crates/baml_compiler_hir/src/generics.rs, baml_language/crates/baml_compiler_hir/src/lib.rs
Implements CST-driven extraction functions (*_generic_params_from_cst) to derive GenericParams/TypeParam from CST/AST; public HIR query entrypoints now delegate to these CST-based implementations; adds helpers to traverse CST and match LocalId allocation.
Local ID allocation
baml_language/crates/baml_compiler_hir/src/ids.rs, baml_language/crates/baml_compiler_hir/src/item_tree.rs
Adds LocalIdAllocator with collision-aware allocation and allocate_local_id helper; replaces inline id-generation in ItemTree to use allocator and shared allocation logic.
Tests & utilities
baml_language/crates/baml_tests/src/incremental/scenarios.rs
Adds name-based lookup helpers, collision finder, and extensive tests covering generic-parameter extraction for functions, methods, classes, enums, type aliases, duplicate generic names, and hash-collision scenarios.

Sequence Diagram(s)

sequenceDiagram
    participant Parser
    participant CST
    participant AST
    participant HIR_Query
    participant Extractor
    participant Allocator

    Parser->>CST: parse source (detect `<`)
    CST->>AST: attach GENERIC_PARAM_LIST node
    HIR_Query->>Extractor: request generic params for item
    Extractor->>AST: call generic_param_list() / iterate params()
    Extractor->>Allocator: alloc_id(kind, name)
    Allocator-->>Extractor: LocalItemId
    Extractor->>Extractor: build Arc<GenericParams>
    Extractor-->>HIR_Query: return Arc<GenericParams>
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 73.61% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'baml_language(feat): implement generic parameter declarations' accurately and specifically summarizes the main change: adding support for generic parameter declarations in BAML user-defined items.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@hellovai
Copy link
Contributor

hellovai commented Mar 1, 2026

@miguelcsx if you've got time, i'd love to hop on a call with you! been enjoying the PRs you're building out and i wonder if we can better support you. Either tag me on discord or want to grab a time on my calendar?

vbv@boundaryml.com

if the timeslot is open, you can book it! (I have open read access).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6


ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac50196 and d102c72.

📒 Files selected for processing (7)
  • baml_language/crates/baml_compiler_hir/src/generics.rs
  • baml_language/crates/baml_compiler_hir/src/ids.rs
  • baml_language/crates/baml_compiler_hir/src/lib.rs
  • baml_language/crates/baml_compiler_parser/src/parser.rs
  • baml_language/crates/baml_compiler_syntax/src/ast.rs
  • baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
  • baml_language/crates/baml_tests/src/incremental/scenarios.rs

Comment on lines +157 to +294
pub(crate) fn function_generic_params_from_cst(
db: &dyn Db,
func: FunctionId<'_>,
) -> Arc<GenericParams> {
let file = func.file(db);
let target_id = func.id(db).as_u32();

let result = with_source_file(db, file, |source_file| {
let mut allocator = LocalIdAllocator::new();
for item in source_file.items() {
match item {
ast::Item::Function(func_node) => {
if let Some(params) =
scan_function_item_for_generics(&mut allocator, &func_node, target_id)
{
return params;
}
}
ast::Item::Class(class_node) => {
if let Some(params) =
scan_class_methods_for_generics(&mut allocator, &class_node, target_id)
{
return params;
}
}
_ => {}
}
}
empty_generic_params()
});

match result {
Some(params) => params,
None => empty_generic_params(),
}
}

pub(crate) fn class_generic_params_from_cst(db: &dyn Db, class: ClassId<'_>) -> Arc<GenericParams> {
let file = class.file(db);
let target_id = class.id(db).as_u32();

let result = with_source_file(db, file, |source_file| {
let mut allocator = LocalIdAllocator::new();
find_top_level_generic_params(
&source_file,
&mut allocator,
target_id,
ItemKind::Class,
|item| {
if let ast::Item::Class(class_node) = item {
class_node.name().map(|name_token| {
(
Name::new(name_token.text()),
class_node.generic_param_list(),
)
})
} else {
None
}
},
)
});

if let Some(Some(params)) = result {
params
} else {
empty_generic_params()
}
}

pub(crate) fn enum_generic_params_from_cst(
db: &dyn Db,
enum_def: EnumId<'_>,
) -> Arc<GenericParams> {
let file = enum_def.file(db);
let target_id = enum_def.id(db).as_u32();

let result = with_source_file(db, file, |source_file| {
let mut allocator = LocalIdAllocator::new();
find_top_level_generic_params(
&source_file,
&mut allocator,
target_id,
ItemKind::Enum,
|item| {
if let ast::Item::Enum(enum_node) = item {
enum_node.name().map(|name_token| {
(Name::new(name_token.text()), enum_node.generic_param_list())
})
} else {
None
}
},
)
});

if let Some(Some(params)) = result {
params
} else {
empty_generic_params()
}
}

pub(crate) fn type_alias_generic_params_from_cst(
db: &dyn Db,
alias: TypeAliasId<'_>,
) -> Arc<GenericParams> {
let file = alias.file(db);
let target_id = alias.id(db).as_u32();

let result = with_source_file(db, file, |source_file| {
let mut allocator = LocalIdAllocator::new();
find_top_level_generic_params(
&source_file,
&mut allocator,
target_id,
ItemKind::TypeAlias,
|item| {
if let ast::Item::TypeAlias(alias_node) = item {
alias_node.name().map(|name_token| {
(
Name::new(name_token.text()),
alias_node.generic_param_list(),
)
})
} else {
None
}
},
)
});

if let Some(Some(params)) = result {
params
} else {
empty_generic_params()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Add focused unit tests for CST generic-param extraction and share cargo test --lib results.

Please add unit tests in this crate for:

  • function ID alignment with compiler-generated <client>.resolve,
  • class-method scanning with malformed/unnamed methods,
  • LLM synthetic function variants inheriting the base generic list.

Please also include local cargo test --lib output for this change set.

As per coding guidelines, **/*.rs: Prefer writing Rust unit tests over integration tests where possible, and always run cargo test --lib if Rust code changes.

@miguelcsx
Copy link
Contributor Author

Hey @hellovai! Thanks so much for reaching out.

I saw your calendar link. Let me check my schedule and pick a time that works.

I'll send you a quick message with some options. Looking forward to chatting!

- add support for generic parameter declarations on functions, classes, enums, and type aliases

- Extend the parser to recognize `<...>` after declaration keywords and introduce AST-level `GenericParamList`
- fix CST generic param replay for class methods and client resolve
@miguelcsx miguelcsx force-pushed the feat/generic-parameters branch from d102c72 to 97e30fb Compare March 5, 2026 17:07
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8e09698b-32cc-486d-b29c-f5cd3e5c4649

📥 Commits

Reviewing files that changed from the base of the PR and between d102c72 and 97e30fb.

📒 Files selected for processing (8)
  • baml_language/crates/baml_compiler_hir/src/generics.rs
  • baml_language/crates/baml_compiler_hir/src/ids.rs
  • baml_language/crates/baml_compiler_hir/src/item_tree.rs
  • baml_language/crates/baml_compiler_hir/src/lib.rs
  • baml_language/crates/baml_compiler_parser/src/parser.rs
  • baml_language/crates/baml_compiler_syntax/src/ast.rs
  • baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
  • baml_language/crates/baml_tests/src/incremental/scenarios.rs

- simple refactor to use helper functions
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
baml_language/crates/baml_compiler_hir/src/generics.rs (1)

292-296: ⚠️ Potential issue | 🟠 Major

Keep enum/type-alias ID allocation rules exactly aligned with lowering.

Line [293] and Line [324] currently drop nameless items, and the enum path at Line [292] does not filter bodyless recovery nodes. That can desynchronize LocalIdAllocator ordering and return generic params for the wrong EnumId/TypeAliasId.

🔧 Suggested fix
         find_top_level_generic_params(
             &source_file,
             &mut allocator,
             target_id,
             ItemKind::Enum,
             |item| {
                 if let ast::Item::Enum(enum_node) = item {
-                    item_name_and_list(
-                        enum_node.name().map(|token| Name::new(token.text())),
-                        enum_node.generic_param_list(),
-                    )
+                    if !enum_node.has_body() {
+                        return None;
+                    }
+                    let name = enum_node
+                        .name()
+                        .map(|token| Name::new(token.text()))
+                        .unwrap_or_else(|| Name::new("UnnamedEnum"));
+                    Some((name, enum_node.generic_param_list()))
                 } else {
                     None
                 }
             },
         )
@@
         find_top_level_generic_params(
             &source_file,
             &mut allocator,
             target_id,
             ItemKind::TypeAlias,
             |item| {
                 if let ast::Item::TypeAlias(alias_node) = item {
-                    item_name_and_list(
-                        alias_node.name().map(|token| Name::new(token.text())),
-                        alias_node.generic_param_list(),
-                    )
+                    let name = alias_node
+                        .name()
+                        .map(|token| Name::new(token.text()))
+                        .unwrap_or_else(|| Name::new("UnnamedTypeAlias"));
+                    Some((name, alias_node.generic_param_list()))
                 } else {
                     None
                 }
             },
         )
#!/bin/bash
set -euo pipefail

# Verify lowering and CST scanning use identical enum/type-alias allocation rules.
rg -n -C4 'ItemKind::Enum|ItemKind::TypeAlias|UnnamedEnum|UnnamedTypeAlias|has_body\(' \
  baml_language/crates/baml_compiler_hir/src/lib.rs \
  baml_language/crates/baml_compiler_hir/src/generics.rs

Also applies to: 322-326


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 538ab30f-24e7-42b3-b717-914e22d72d89

📥 Commits

Reviewing files that changed from the base of the PR and between 97e30fb and 2557934.

📒 Files selected for processing (1)
  • baml_language/crates/baml_compiler_hir/src/generics.rs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
baml_language/crates/baml_compiler_parser/src/parser.rs (1)

2107-2110: ⚠️ Potential issue | 🟠 Major

Declaration generic parsing can still leak pending_greaters across contexts.

When expect_greater() consumes >>, declaration parsing may leave pending_greaters > 0, which later interferes with unrelated type parsing. This is a parser-state corruption path for malformed declarations.

♻️ Suggested guard/flush after declaration generic close
             p.expect_greater();
+            if p.pending_greaters > 0 {
+                if let Some(span) = p.pending_greater_span {
+                    p.error(
+                        format!(
+                            "Unmatched '>' in generic parameter list (found {} extra)",
+                            p.pending_greaters
+                        ),
+                        span,
+                    );
+                }
+                for _ in 0..p.pending_greaters {
+                    p.events.push(Event::Token {
+                        kind: SyntaxKind::GREATER,
+                        text: ">".to_string(),
+                    });
+                }
+                p.pending_greaters = 0;
+                p.pending_greater_span = None;
+            }

Also applies to: 2136-2137


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 03e13c59-fa7f-4fae-bd92-4d4d07452605

📥 Commits

Reviewing files that changed from the base of the PR and between 2557934 and ede07d6.

📒 Files selected for processing (4)
  • baml_language/crates/baml_compiler_hir/src/ids.rs
  • baml_language/crates/baml_compiler_hir/src/lib.rs
  • baml_language/crates/baml_compiler_parser/src/parser.rs
  • baml_language/crates/baml_tests/src/incremental/scenarios.rs

Comment on lines +195 to +242
pub(crate) fn allocate_local_id<T>(
next_index: &mut FxHashMap<(ItemKind, u16), u16>,
kind: ItemKind,
name: &Name,
) -> LocalItemId<T> {
let hash = hash_name(name);
let index = next_index.entry((kind, hash)).or_insert(0);
let id = LocalItemId::new(hash, *index);
// Saturating add prevents silent wraparound in release builds.
// Reaching u16::MAX collisions for a single (kind, hash) bucket is
// practically impossible (requires 65 535 items with the same 16-bit
// name hash), so saturation is a safe sentinel rather than a recoverable
// error path.
debug_assert!(
*index < u16::MAX,
"LocalItemId collision index saturated for {name:?}"
);
*index = index.saturating_add(1);
id
}

/// Allocator for `LocalItemId`s with collision handling.
///
/// Replays the same hashing and collision-indexing logic as `ItemTree`.
/// This allows queries to reproduce the same local IDs when scanning CST.
pub struct LocalIdAllocator {
next_index: FxHashMap<(ItemKind, u16), u16>,
}

impl LocalIdAllocator {
/// Create a new allocator with empty collision state.
pub fn new() -> Self {
Self {
next_index: FxHashMap::default(),
}
}

/// Allocate a `LocalItemId` for a named item, updating collision state.
pub fn alloc_id<T>(&mut self, kind: ItemKind, name: &Name) -> LocalItemId<T> {
allocate_local_id(&mut self.next_index, kind, name)
}
}

impl Default for LocalIdAllocator {
fn default() -> Self {
Self::new()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Add focused unit tests for allocator collision progression/overflow semantics.

allocate_local_id and LocalIdAllocator introduce critical ID-stability logic; this should be covered with unit tests in this crate (normal increment, hash collision bucket separation by ItemKind, overflow behavior).

As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible".

Comment on lines +203 to +213
// Saturating add prevents silent wraparound in release builds.
// Reaching u16::MAX collisions for a single (kind, hash) bucket is
// practically impossible (requires 65 535 items with the same 16-bit
// name hash), so saturation is a safe sentinel rather than a recoverable
// error path.
debug_assert!(
*index < u16::MAX,
"LocalItemId collision index saturated for {name:?}"
);
*index = index.saturating_add(1);
id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n baml_language/crates/baml_compiler_hir/src/ids.rs | sed -n '190,220p'

Repository: BoundaryML/baml

Length of output: 1325


🏁 Script executed:

rg -A 20 "fn allocate_local_id" baml_language/crates/baml_compiler_hir/src/ids.rs

Repository: BoundaryML/baml

Length of output: 818


🏁 Script executed:

rg -B 5 -A 15 "struct LocalIdAllocator" baml_language/crates/baml_compiler_hir/src/ids.rs

Repository: BoundaryML/baml

Length of output: 765


🏁 Script executed:

rg "mod test|#\[test\]|#\[cfg.*test" baml_language/crates/baml_compiler_hir/src/ids.rs

Repository: BoundaryML/baml

Length of output: 41


🏁 Script executed:

find baml_language/crates/baml_compiler_hir -type f -name "*.rs" | xargs grep -l "LocalIdAllocator\|allocate_local_id" | head -10

Repository: BoundaryML/baml

Length of output: 219


🏁 Script executed:

rg -B 5 -A 10 "struct LocalItemId|impl LocalItemId" baml_language/crates/baml_compiler_hir/src/ids.rs | head -40

Repository: BoundaryML/baml

Length of output: 653


🏁 Script executed:

fd -e rs . baml_language/crates/baml_compiler_hir/tests/ 2>/dev/null | head -5

Repository: BoundaryML/baml

Length of output: 41


🏁 Script executed:

rg "test.*allocat|allocat.*test" baml_language/crates/baml_compiler_hir/ -i

Repository: BoundaryML/baml

Length of output: 41


🏁 Script executed:

rg -A 5 "LocalIdAllocator" baml_language/crates/baml_compiler_hir/src/item_tree.rs | head -20

Repository: BoundaryML/baml

Length of output: 41


🏁 Script executed:

rg "practically impossible|safe sentinel|u16::MAX" baml_language/crates/baml_compiler_hir/src/ids.rs

Repository: BoundaryML/baml

Length of output: 309


Missing unit test coverage for LocalIdAllocator and underdocumented saturation behavior.

The saturating_add on line 212 will cause ID aliasing when a single (kind, hash) bucket exceeds 65,535 collisions. While the comment acknowledges this is "practically impossible," the code provides only a debug-mode assertion and no test coverage. Per coding guidelines, Rust unit tests should be added for allocator edge cases.

Test coverage needed

Add unit tests for:

  • Allocator behavior at and beyond collision saturation
  • Multiple items with forced hash collisions
  • Verify debug_assert triggers (or add test for saturation behavior in release mode)

The proposed fix using checked_add with expect is valid but changes the design intent (panic vs. silent saturation); either approach should be explicitly tested and documented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants