[Docs]: NPU Plugin high level design diagram

### Documentation link

https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/README.md

### Description

I have some questions about the [high-level architecture diagram](https://github.com/openvinotoolkit/openvino/raw/master/src/plugins/intel_npu/docs/img/high_level_design.png) in the README.md that shows the OpenVINO NPU design:

- Why doesn't the compilation part on the left side of the architecture diagram call Level Zero interfaces? Can the compiled model be executed directly by the NPU driver?

I think it should use Level Zero interfaces to load pre-compiled models, similar to the execution part on the right side.

- Do the left and right sides of the architecture diagram represent compilation and execution steps respectively?

In reality, compilation and execution sometimes operate sequentially. OpenVINO NPU can load OpenVINO IR models, compile them and pass them to the NPU driver for execution, or it can directly load pre-compiled blob models. I noticed that Level Zero's ze_graph can load pre-compiled models - is this one of the messages that the architecture diagram is trying to convey?

Based on the code provided, we can see that ze_graph supports loading pre-compiled models through the `ZE_GRAPH_FORMAT_NATIVE` format:


```50:55:level-zero-npu-extensions/ze_graph_ext.h
typedef enum _ze_graph_format_t
{
    ZE_GRAPH_FORMAT_NATIVE = 0x1,                   ///< Format is pre-compiled blob (elf, flatbuffers)
    ZE_GRAPH_FORMAT_NGRAPH_LITE = 0x2,              ///< Format is ngraph lite IR

} ze_graph_format_t;
```


And the graph descriptor allows loading both pre-compiled blobs and IR models:


```162:180:level-zero-npu-extensions/ze_graph_ext.h
typedef struct _ze_graph_desc_t
{
    ze_structure_type_graph_ext_t stype;            ///< [in] type of this structure
    void* pNext;                                    ///< [in,out][optional] must be null or a pointer to an extension-specific
    ze_graph_format_t format;                       ///< [in] Graph format passed in with input
    size_t inputSize;                               ///< [in] Size of input buffer in bytes
    const uint8_t* pInput;                          ///< [in] Pointer to input buffer
    const char* pBuildFlags;                        ///< [in][optional] Null terminated string containing build flags. Options:
                                                    ///< - '--inputs_precisions="<arg>:<precision> <arg2>:<precision> ..."'
                                                    ///<   '--outputs_precisions="<arg>:<precision> <arg2>:<precision> ..."'
                                                    ///<   - Set input and output arguments precision. Supported precisions:
                                                    ///<     FP64, FP32, FP16, BF16, U64, U32, U16, U8, U4, I64, I32, I16, I8, I4, BIN
                                                    ///< - '--inputs_layouts="<arg>:<layout> <arg2>:<layout> ..."'
                                                    ///<   '--outputs_layouts="<arg>:<layout> <arg2>:<layout> ..."'
                                                    ///<   - Set input and output arguments layout. Supported layouts:
                                                    ///<     NCHW, NHWC, NCDHW, NDHWC, OIHW, C, CHW, HW, NC, CN
                                                    ///< - '--config PARAM="VALUE" PARAM2="VALUE" ...'
                                                    ///<   - compile options string passed directly to compiler
} ze_graph_desc_t;
```


This suggests that Level Zero provides interfaces for both compilation and execution phases, though the architecture diagram may be simplifying the relationship between these components.

### Issue submission checklist

- [X] I'm reporting a documentation issue. It's not a question.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Docs]: NPU Plugin high level design diagram #27512

Documentation link

Description

Issue submission checklist

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Docs]: NPU Plugin high level design diagram #27512

Description

Documentation link

Description

Issue submission checklist

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions