[onnx_importer] Disambiguate empty string: optional none vs tensor name#4551
Conversation
The NodeImporter cached torch.constant.none under _nv_map[""], matching ONNX's convention that an empty string in Node.input denotes an omitted optional input. Some producers (e.g. Microsoft's SkipSimplifiedLayerNormalization) also bind real intermediate results to outputs whose names are the empty string. Each such output overwrote _nv_map[""], so later nodes that use "" for omitted optionals (e.g. GroupQueryAttention's trailing inputs) incorrectly received those tensor SSA values instead of torch.constant.none.

Behavior changes:
- Cache the shared none value under _OPTIONAL_NONE_CACHE_KEY instead of "".
- When resolving node inputs, treat input_name == "" as an omitted optional: append get_none() and an empty onnx.TypeProto without indexing _nv_map[""].
- Register outputs named "" under unique keys __torch_mlir_onnx_importer_anon_<n> so multiple anonymous outputs do not overwrite each other.

Adds test/python/onnx_importer/test_empty_string_optional_inputs.py: a minimal Identity -> custom op graph whose optional inputs are "" and must import as %none operands, not as tensor values stored under "".

Symptom fixed: GroupQueryAttention previously showed duplicated operands such as (%10#2, %10#2, %10#2) instead of (%none, %none, %none) for optional slots.
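The rules described above (and the final behavior after review, see the closing comment below) can be sketched with a plain dict standing in for _nv_map. This is an illustrative, stdlib-only model, not the actual torch-mlir NodeImporter API; only the names _nv_map and get_none come from the PR text:

```python
class NodeImporterSketch:
    """Toy model of the fixed resolution/binding rules (strings stand in
    for MLIR SSA values)."""

    def __init__(self):
        self._nv_map = {}        # ONNX value name -> imported SSA value
        self._none_value = None  # lazily created torch.constant.none

    def get_none(self):
        # The shared none lives in importer state, not under _nv_map[""].
        if self._none_value is None:
            self._none_value = "%none"  # stand-in for the real none op
        return self._none_value

    def resolve_input(self, input_name):
        # Per the ONNX IR convention, "" in Node.input means the optional
        # input was omitted, so it must never be looked up in _nv_map.
        if input_name == "":
            return self.get_none()
        return self._nv_map[input_name]

    def bind_output(self, output_name, value):
        # Outputs named "" are real values but can never be referenced by
        # name; mapping them would only recreate the collision, so skip.
        if output_name != "":
            self._nv_map[output_name] = value
```

With this model, a producer binding a tensor to an output named "" no longer poisons later omitted-optional lookups: resolve_input("") still yields the none value.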
Hi @IanWood1, sorry to bother you. It looks like this repository doesn't automatically assign reviewers, and I don't have permission to request reviews from others. Would you be able to review this PR when you get a chance?
IanWood1
left a comment
Thank you! The core idea looks right: empty string inputs should be imported as omitted optionals. I think there are still possible _nv_map collisions from storing importer private sentinel values in the map, so it would be good to address those before merging.
    if output_name == "":
        key = (
            f"__torch_mlir_onnx_importer_anon_{self._anon_output_counter}"
        )
        self._anon_output_counter += 1
        self._nv_map[key] = output_value
    else:
        self._nv_map[output_name] = output_value
Does this work? I don't see why we need to store the __torch_mlir_onnx_importer_anon_ keys at all; nothing ever looks them up, so the empty-named outputs can simply be left unmapped.
Suggested change (drop the anonymous-key branch entirely):

    if output_name != "":
        self._nv_map[output_name] = output_value
    if _OPTIONAL_NONE_CACHE_KEY in self._nv_map:
        return self._nv_map[_OPTIONAL_NONE_CACHE_KEY]
Is "__torch_mlir_onnx_importer_optional_none__" guaranteed to not collide with any ONNX names? Since _nv_map is the mapping from ONNX value names to imported MLIR values, the cached none value is not really part of that namespace. It seems cleaner and less error-prone to store it as importer state, e.g. self._none_value, rather than as a synthetic entry in _nv_map.
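To make the reviewer's point concrete: a plain dict standing in for _nv_map shows how any in-namespace sentinel can be silently clobbered, while importer state cannot. This is a hypothetical demonstration; only the sentinel string comes from the diff above, the class is illustrative:

```python
# ONNX value names are arbitrary strings, so a (legal) model may use the
# exact sentinel string as a real tensor name and overwrite the cache.
_OPTIONAL_NONE_CACHE_KEY = "__torch_mlir_onnx_importer_optional_none__"

nv_map = {_OPTIONAL_NONE_CACHE_KEY: "%none"}  # importer caches none in-band
nv_map[_OPTIONAL_NONE_CACHE_KEY] = "%tensor"  # model's name clobbers it
clobbered = nv_map[_OPTIONAL_NONE_CACHE_KEY]  # now "%tensor", not "%none"

# Keeping the cached value as importer state separates the two namespaces:
class ImporterState:
    """Illustrative stand-in; the real importer would hold an MLIR Value."""

    def __init__(self):
        self._nv_map = {}       # only real ONNX value names go here
        self._none_value = None

    def get_none(self):
        if self._none_value is None:
            self._none_value = "%none"
        return self._none_value

state = ImporterState()
state._nv_map[_OPTIONAL_NONE_CACHE_KEY] = "%tensor"  # harmless now
```

Here state.get_none() still returns the cached none value regardless of what names the model binds into _nv_map.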
    """Regression for NodeImporter: ONNX input name '' means omitted optional.

    The importer must not conflate that with _nv_map[\"\"] when an earlier node binds
Suggested change:

    The importer must not conflate that with _nv_map[""] when an earlier node binds
    def _minimal_collision_model() -> onnx.ModelProto:
        """Identity writes to output \"\"; second node lists '', '' as omitted inputs."""
Suggested change:

    """Identity writes to output ""; second node lists '', '' as omitted inputs."""
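Under the fixed rules, the two-node graph the test describes resolves cleanly. The stdlib-only walk-through below mimics the import loop; op and value names are illustrative, not the actual importer code:

```python
NONE = "%none"  # stand-in for the torch.constant.none SSA value

def import_graph(graph_inputs, nodes):
    """Mimic the fixed import loop: '' inputs -> none, '' outputs unmapped.

    graph_inputs: dict of ONNX graph input name -> SSA value.
    nodes: list of (op, input_names, output_names, result_values).
    """
    nv_map = dict(graph_inputs)  # ONNX value name -> SSA value
    imported = []
    for op, inputs, outputs, results in nodes:
        # "" in Node.input denotes an omitted optional, never a map lookup.
        operands = [NONE if name == "" else nv_map[name] for name in inputs]
        imported.append((op, operands))
        for name, value in zip(outputs, results):
            if name != "":  # outputs named "" are never registered
                nv_map[name] = value
    return imported

ops = import_graph(
    {"x0": "%x0"},
    [
        ("Identity", ["x0"], [""], ["%anon"]),        # writes to output named ""
        ("CustomOp", ["x0", "", ""], ["y"], ["%y"]),  # "" marks omitted optionals
    ],
)
```

The CustomOp operands come out as ["%x0", "%none", "%none"], i.e. the omitted optionals import as none rather than picking up the Identity result that was bound to "".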
Thanks, fixed. _nv_map now only stores real ONNX value names: omitted optionals use NodeImporter._none_value, and outputs with empty names are not stored at all. I also applied the nit fixes.
Fixes #4550