oneDNN v3.10 release notes #4106
base: rls-v3.10
Conversation
@@ -0,0 +1,69 @@
# Performance Optimizations
## Intel Architecture Processors
@tprimak Could you please review and update the section if required?
[Grouped Query Attention (GQA)]: https://uxlfoundation.github.io/oneDNN/v3.10/dev_guide_graph_gqa.html#gqa-for-training-forward-propagation
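For readers unfamiliar with the pattern behind the linked guide: in grouped query attention, several query heads share a single key/value head, shrinking the KV cache. Below is a minimal pure-Python sketch of the math only; it is illustrative, not the oneDNN Graph API, and the function name and shapes are assumptions:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def gqa(q, k, v, num_kv_heads):
    # q: [num_q_heads][seq][d], k/v: [num_kv_heads][seq][d]
    num_q_heads = len(q)
    group = num_q_heads // num_kv_heads  # query heads per KV head
    d = len(q[0][0])
    out = []
    for h in range(num_q_heads):
        kv = h // group  # all query heads in a group share this KV head
        head_out = []
        for qi in q[h]:
            scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                      for kj in k[kv]]
            w = softmax(scores)
            head_out.append([sum(wi * vj[c] for wi, vj in zip(w, v[kv]))
                             for c in range(d)])
        out.append(head_out)
    return out
```

With `num_kv_heads == num_q_heads` this reduces to standard multi-head attention; with `num_kv_heads == 1` it is multi-query attention.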
## AArch64-based Processors
@theComputeKid @Sqvid could you please help summarize the AArch64 improvements?
Thanks, I will draft something shortly. If I remember correctly, the etiquette is to just push directly to this branch, right?
P.S. @Ryo-not-rio is currently more active than @theComputeKid and should probably be tagged instead (or additionally) going forward. Thanks.
Yes, you can just push changes directly or mention them in the comments here.
* Improved performance of `int8` matmul and inner product primitives with `fp16` destination.
* Improved performance of subgraphs containing a sequence of multiple binary ops with Graph API.
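The first bullet pairs `int8` inputs with an `fp16` destination. A minimal Python sketch of the numerics involved follows: integer accumulation of `int8` products, dequantization with per-tensor scales, and rounding of the result to half precision at the destination. This is purely illustrative, not the oneDNN API; the function name and scale handling are assumptions:

```python
import struct

def to_fp16(x):
    # round a Python float to IEEE half precision and back
    return struct.unpack('e', struct.pack('e', x))[0]

def int8_matmul_fp16_dst(a, b, scale_a, scale_b):
    # a: [M][K] int8 values, b: [K][N] int8 values.
    # Products are accumulated exactly in integer arithmetic
    # (as an int32 accumulator would), then the scaled result
    # is stored in half precision.
    M, K, N = len(a), len(b), len(b[0])
    out = []
    for i in range(M):
        row = []
        for j in range(N):
            acc = sum(a[i][kk] * b[kk][j] for kk in range(K))
            row.append(to_fp16(acc * scale_a * scale_b))
        out.append(row)
    return out
```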
## Intel Graphics Products
@karturov Could you please review and update the section if required?
Pull Request Overview
This PR adds release notes for oneDNN v3.10, documenting performance optimizations, new functionality, and breaking changes for the upcoming release.
- Comprehensive release notes covering performance improvements across Intel processors and graphics products
- Documentation of new functional and Graph API features including host-side scalar support
- Acknowledgment of contributors and deprecation notices for BLAS-like API
RELEASE_NOTES.md (outdated)
* Improved performance of subgraphs containing sequence of multiple binary ops with Graph API.
## Intel Graphics Products
*Improve GEMM performance for small batch size on Intel Core Ultra processors (Series 2) (formerly Lunar Lake).
Copilot (AI) commented on Oct 10, 2025:
Missing bullet point formatting. Should start with '* ' instead of '*'.
*Improve GEMM performance for small batch size on Intel Core Ultra processors (Series 2) (formerly Lunar Lake).
* Improve GEMM performance for small batch size on Intel Core Ultra processors (Series 2) (formerly Lunar Lake).
RELEASE_NOTES.md (outdated)
* Improved `int8` matmul performance with `int4` weights and per-tensor zero-points.
* Improved `bf16` matmul performance with `fp8` weights.
* Graph API optimizations:
  * Improved Scaled Dot Product Attention (SDPA) subgraph performance when relaxed accumulation mode is enabled on Intel Core Ultra processors (formerly Meteor Lake).
RELEASE_NOTES.md (outdated)
## Intel Graphics Products
* Introduced support for `fp4` weights in matmul primitive.
* Introduced support for grouped quantization with group size 16 in matmul with int8 compressed weights on Intel GPUs.
* Introduced support group size16 `int8` for decompressed weight with regular weights decompression.
Looks like duplication of the previous line.
* Introduced support group size16 `int8` for decompressed weight with regular weights decompression.
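To illustrate the grouped-quantization scheme discussed above (one scale per group of 16 consecutive weights, rather than one per tensor), here is a small pure-Python sketch. The function names and symmetric `int8` scheme are assumptions for illustration, not the oneDNN API:

```python
def quantize_grouped(weights, group_size=16):
    # Symmetric int8 quantization with one scale per group of
    # group_size consecutive weights.
    scales, q = [], []
    for g in range(0, len(weights), group_size):
        grp = weights[g:g + group_size]
        amax = max(abs(w) for w in grp) or 1.0
        scale = amax / 127.0
        scales.append(scale)
        q.extend(max(-128, min(127, round(w / scale))) for w in grp)
    return q, scales

def dequantize_grouped(q, scales, group_size=16):
    # Reconstruct each weight from its int8 value and its group's scale.
    return [q[i] * scales[i // group_size] for i in range(len(q))]
```

Smaller groups track local weight ranges more closely than a per-tensor scale, at the cost of storing one scale per group.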
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Vadim Pirogov <[email protected]>
* Introduced [`host_scalar` property] for logical tensors. This functionality allows passing host-side scalars instead of device memory objects when using oneDNN with OpenCL or SYCL runtimes.
* Introduced [accumulation mode attribute] support in `Matmul` op. This attribute allows relaxing `fp32` accumulation requirements to achieve performance benefits on some platforms.
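To show why relaxing `fp32` accumulation can change results, here is a small pure-Python illustration of strict (full-precision) versus half-precision accumulation of the same dot product. This is conceptual only, not the oneDNN API; the function names are invented for the example:

```python
import struct

def to_fp16(x):
    # round a Python float to IEEE half precision and back
    return struct.unpack('e', struct.pack('e', x))[0]

def dot_strict(a, b):
    # fp32-style accumulation: the running sum keeps full precision
    return sum(x * y for x, y in zip(a, b))

def dot_relaxed(a, b):
    # relaxed accumulation: the running sum itself is rounded each step
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(x * y))
    return acc

a = [to_fp16(1.0)] * 4096
b = [to_fp16(0.25)] * 4096
# strict gives 4096 * 0.25 = 1024.0 exactly; relaxed stalls at 512.0,
# because once the fp16 running sum reaches 512 the 0.25 increments
# fall below the representable spacing and round away.
print(dot_strict(a, b), dot_relaxed(a, b))
```

Lower-precision accumulators are cheaper on hardware without fast `fp32` accumulation paths, which is the trade-off the attribute exposes.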
Introduced fusion support for GQA training forward and backward propagation.
Thanks for the addition! This one belongs in the performance optimizations section though.
Hi @vpirogov, I removed the one in the performance optimizations section and added this one here. I was referring to the v3.9 release notes, where SDPA training forward and backward propagation is mentioned in the functionality section.
Co-authored-by: Tao Lv <[email protected]>
Co-authored-by: YixinBao <[email protected]>
This PR includes a release notes draft based on the information from the PRs, for contributors to review. Your additions and corrections are highly appreciated.