Skip to content

Conversation

@Zettroke
Copy link
Contributor

@Zettroke Zettroke commented May 8, 2025

Summary

Function closures not updating TypeState during the compilation, which can result in incorrect types for local variables.

Minimal reproducible example causing runtime error.
Trying to handle this error with coalesce operator causes compilation error.
playground

v = ""

for_each({"a": 1}) -> |key, value| {
    v = 0
}

.res = ("!" + v) # ?? "fallback"

This bug was found while using our local fork with CoW optimization as it heavily relies on resolve_constant.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

There might be cases where type checks were passed because of this behavior

Does this PR include user facing changes?

Dunno :)

  • Yes. Please add a changelog fragment based on
    our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Our CONTRIBUTING.md is a good starting place.
  • If this PR introduces changes to LICENSE-3rdparty.csv, please
    run dd-rust-license-tool write and commit the changes. More details here.
  • For new VRL functions, please also create a sibling PR in Vector to document the new function.

Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Zettroke!

Can you add a VRL test case and a changelog?

@Zettroke
Copy link
Contributor Author

Zettroke commented May 8, 2025

@pront The deed is done

@pront
Copy link
Member

pront commented May 9, 2025

This looks great @Zettroke. I will fix the failing CI check, it's the first time it ran against a branch from a fork.

let variable_details = state
.local
.remove_variable(ident)
.expect("Closure variable must be present");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid expects in this fucntion. See example: e2a0009.

Also, do you have any example where this change can break existing behavior? I wonder if we should mark this as a breaking PR.

Copy link
Contributor Author

@Zettroke Zettroke May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of an invariant I tried to document by using expect.
The code in Builder::check_closure is called earlier and adds these variables to TypeState. So if they are absent it means there is a bug.
Maybe I should specify it in expect message?

As for examples of broken code, there is an example I found by scanning or VRL codebase.

elements = []

for_each(array!(.x)) -> |_index, value| {
    elements = push(elements, string!(value))
}

# log_state()

command_string = ""
for_each(elements) -> |_index, value| {
    command_string = command_string + " " + string(value)
}

Error:

error[E103]: unhandled fallible assignment
   ┌─ :14:26
   │
14 │         command_string = command_string + " " + string(value)
   │         ---------------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this expression is fallible because at least one argument's type cannot be verified to be valid
   │         │
   │         or change this to an infallible assignment:
   │         command_string, err = command_string + " " + string(value)
   │

The difference is in Kind of elements probed at log_state() point:
current-vrl:

Ident(elements): Details {
    type_def: TypeDef {
        fallibility: CannotFail,
        kind: "array",
        purity: Impure,
        returns: "any",
    },
    value: Some(
        Array(
            [],
    ),
}

fixed-vrl:

Ident(elements): Details {
    type_def: TypeDef {
        fallibility: CannotFail,
        kind: "[string or undefined]",
        purity: Impure,
        returns: "any",
    },
    value: None,
}

So the second for_each doesn't type check. I can probably make simpler example, but too tired right now 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to explain this. We should mark this as a breaking change and provide an example in the changelog. As always, there is no rush.

Copy link
Contributor Author

@Zettroke Zettroke May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pront After some work of adapting our VRL code base to this fix I think I have a better example of broken code.

parsed_value = {}

for_each(object!(.value)) -> |key, value| {
    kv = parse_json!(value)

    parsed_value = set!(parsed_value, [key], kv)
}

result = ""

for_each(object!(parsed_value)) -> |_asd, value| {
  result = result + value.id + value.level + " " # <- Now this is fallible because type of `value` is unknown
}

.result = result

Error:

error[E103]: unhandled fallible assignment
   ┌─ :12:12
   │                                                                                                                                                                                                                                                                                                                
12 │   result = result + value.id + value.level + " " # <- Now this is fallible because type of `value` is unknown
   │   -------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this expression is fallible because at least one argument's type cannot be verified to be valid                                                                                                                                                               
   │   │                                                                                                                                                                                                                                                                                                            
   │   or change this to an infallible assignment:                                                                                                                                                                                                                                                                  
   │   result, err = result + value.id + value.level + " "                                                                                                                                                                                                                                                          
   │                                                                                                                                                                                                                                                                                                                
   = see documentation about error handling at https://errors.vrl.dev/#handling
   = see functions characteristics documentation at https://vrl.dev/expressions/#function-call-characteristics
   = learn more about error code 103 at https://errors.vrl.dev/103
   = see language documentation at https://vrl.dev
   = try your code in the VRL REPL, learn more at https://vrl.dev/examples

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also found a trick to keep added error handling compatible with older vrl versions, though it's very ugly 😅
parsed_value = parse_json!("{}")
Couldn't think of anything better :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I would be more comfortable if we included this in the next-next release. BTW I fixed .github/workflows/vector_integration_check.yaml on main so you can just undo those changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the before and after types for the value (inside the second for_each closure) here? And how does parsed_value = parse_json!("{}") line changes the type? Just trying to understand if this behavior breakage makes sense before asking all users to change their VRL programs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier example:

	"t1": {
		"bytes": true
	},
	"t2": {
		"never": true
	}

Copy link
Contributor Author

@Zettroke Zettroke Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_json!("{}") makes it any, so for_each can't deduce the type of (key, value) pair. Therefore,
it's type is never which forces the error handling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Optional) Can you please add a "before and after" in the changelog? It would be very helpful for the next breaking guide and VRL users.

@pront pront added the meta: awaiting author Pull requests that are awaiting their author. label Jun 24, 2025
@simplepad
Copy link
Contributor

Fixes #1257

@pront
Copy link
Member

pront commented Jul 17, 2025

Hi @Zettroke, apologies for the delayed review. I wanted to double check the user impact on this but I am more convinced that this is a strict improvement. There are a few merge conflicts. Can you fix them whenever you get a chance?

Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug:

	"t1": {
		"bytes": true
	},
	"t2": {
		"never": true
	}

The type is clearly a string. After some internal discussion, I realized that never is just the equivalent of Rust's ! type, which means the compiler can guarantee this value will never exist .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

meta: awaiting author Pull requests that are awaiting their author.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants