Add NV low latency support #3690

esullivan-nvidia · 2023-10-14T07:07:26Z

This pull request implements a new device interface called ID3DLowLatencyDevice using the VK_NV_low_latency2 extension. The purpose of this interface is to give dxvk-nvapi a way to implement the nvapi Reflex interface.

ishitatsuyuki · 2023-10-16T08:11:52Z

src/d3d11/d3d11_swapchain.cpp

-      m_frameId += 1;
+    if (!Repeat) {
+      m_frameId = m_presenter->lowLatencyEnabled() ?
+        m_device->getLatencyMarkers().present :


Would this work with games that are not using latency markers, but only the sleep API? (e.g. Apex Legends)

This was fixed in a previous update to this PR to only use the present latency marker if it is valid.
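
A minimal sketch of the fallback described in this exchange, assuming a present marker of 0 means the application never set one. `selectFrameId` and its parameters are hypothetical names for illustration, not the code in the PR.

```cpp
#include <cstdint>

// Hypothetical helper, not DXVK code: picks the frame ID for the next present.
// A marker value of 0 is treated as "the application never set a present marker".
uint64_t selectFrameId(uint64_t currentFrameId, bool lowLatencyEnabled, uint64_t presentMarker) {
  if (lowLatencyEnabled && presentMarker != 0)
    return presentMarker;     // the application drives frame IDs through latency markers
  return currentFrameId + 1;  // sleep-only or non-Reflex applications: count presents
}
```

With this shape, a game that only uses the sleep API keeps getting incrementing frame IDs, because its present marker never becomes non-zero.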

adamnv · 2023-10-17T21:10:51Z

Support for the low-latency extensions just got released in the 545.23.06 NVIDIA driver so this is usable/testable now. (Propagating this note to all three project PRs, apologies for any duplication! 😸 )

ishitatsuyuki · 2023-10-20T16:30:40Z

src/dxvk/dxvk_device.cpp

@@ -274,6 +275,7 @@ namespace dxvk {
          DxvkSubmitStatus*         status) {
    DxvkSubmitInfo submitInfo = { };
    submitInfo.cmdList = commandList;
+    submitInfo.frameId = m_latencyMarkers.render;


Note that resource initialization (zeroing) submissions can happen independently from the immediate context or CS thread. submitCommandList is used by every submission, including init submissions, and this can have slightly awkward consequences, such as a submission being tied to a particular frame ID after present.

It might be better to exclude InitContext from low latency tracking completely to avoid edge cases.

Updated to disable passing frame ids at vkQueueSubmit time for work submitted from the initializer context.
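
One way to picture the resolution, as a sketch only: the submission path tags work with the render marker unless it comes from the initializer context. `SubmitSource`, `SubmitInfo` and `makeSubmitInfo` are hypothetical stand-ins, and treating a frame ID of 0 as "untracked" is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the actual DXVK types.
enum class SubmitSource { ImmediateContext, InitContext };

struct SubmitInfo {
  uint64_t frameId;  // 0 means "not associated with any frame" in this sketch
};

SubmitInfo makeSubmitInfo(SubmitSource source, uint64_t renderMarker) {
  SubmitInfo info = { 0 };
  // Init (resource zeroing) work can be submitted at any time, so tagging it
  // with the current render marker could attach it to the wrong frame.
  if (source != SubmitSource::InitContext)
    info.frameId = renderMarker;
  return info;
}
```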

doitsujin · 2023-11-13T12:25:36Z

src/d3d11/d3d11_device.h

+    }
+
+    D3D11SwapChain* GetLowLatencySwapChain() {
+      return (m_swapchains.size()) == 1 ? m_swapchains[0] : nullptr; 


There's an inherent race condition here if the app destroys the swap chain while also calling any of the LowLatencyDevice methods that access it, in which case we're likely to just crash. Is this something we should be robust against?

doitsujin · 2023-11-13T12:25:58Z

src/d3d11/d3d11_device.h

+      std::remove(m_swapchains.begin(), m_swapchains.end(), swapchain);
+    }
+
+    UINT GetSwapchainCount() {


This doesn't seem to be used.

src/d3d11/d3d11_interfaces.h

@ptr1337

Requires at least Nvidia 545.xx drivers. The patches are from the following pull requests: [wine](ValveSoftware/wine#200) [dxvk-nvapi](jp7677/dxvk-nvapi#147) [vkd3d-proton](HansKristian-Work/vkd3d-proton#1739) [dxvk](doitsujin/dxvk#3690) Thanks also goes to @ptr1337 for initially building and testing the patchset.

ishitatsuyuki · 2023-11-26T02:55:27Z

src/dxvk/dxvk_presenter.cpp

@@ -48,6 +57,7 @@ namespace dxvk {


  VkResult Presenter::acquireNextImage(PresenterSync& sync, uint32_t& index) {
+    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);


AcquireNextImage can block. I think holding a mutex for the entire duration of acquire can cause potential trouble.

I see the same concern was also raised in vkd3d-proton review, so I suppose this can be addressed in a similar way (only hold the lock in case of recreation).

With the locking in acquire and present removed, my layer-based LL2 implementation works with significantly fewer stutters, so this is probably something that has to be fixed. I think the locks aren't really necessary here anyway, since access to the swapchain is externally synchronized (which implies it can't be destroyed for the duration of the call).

The lock should only be taken when the swapchain is destroyed now. So I think this issue can be considered resolved.

ishitatsuyuki · 2023-11-26T02:56:25Z

src/dxvk/dxvk_presenter.cpp

@@ -68,11 +78,13 @@ namespace dxvk {
  VkResult Presenter::presentImage(
          VkPresentModeKHR  mode,
          uint64_t          frameId) {
+    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);


QueuePresent is also allowed to block, although this might be less of a concern if the driver is implemented in a way that doesn't block.

Updated this to only take the lock when destroying the swapchain.

ishitatsuyuki · 2023-11-30T17:31:28Z

src/dxvk/dxvk_presenter.cpp

@@ -151,6 +163,8 @@ namespace dxvk {


  VkResult Presenter::recreateSwapChain(const PresenterDesc& desc) {
+    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);


recreateSurface can also result in the swapchain getting destroyed. I think we currently lack locking there (and this might be a tricky one since recreateSurface will keep m_swapchain NULL until the next call to recreateSwapchain).

Updated this to only take the lock when destroying the swapchain. Whenever dxvk calls one of the other LL2 entry points it holds the low latency lock, and confirms the swapchain handle is valid. This approach should prevent any usage of the destroyed swapchain handle.
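
The locking scheme described in this reply can be modelled roughly as follows. This is a sketch under stated assumptions, not the PR's code: `PresenterModel` is hypothetical, the swapchain handle is a plain integer with 0 meaning "no swapchain", and acquire/present are omitted because they no longer take the lock.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical model of the scheme; not the actual dxvk::Presenter.
class PresenterModel {
public:
  void createSwapchain(uint64_t handle) {
    std::lock_guard<std::mutex> lock(m_lowLatencyMutex);
    m_swapchain = handle;
  }

  // Destruction takes the lock so no low-latency call can be in flight.
  void destroySwapchain() {
    std::lock_guard<std::mutex> lock(m_lowLatencyMutex);
    m_swapchain = 0;
  }

  // Stand-in for a low-latency entry point (sleep, set marker, get timings).
  // Returns false instead of touching a destroyed swapchain.
  bool lowLatencyCall() {
    std::lock_guard<std::mutex> lock(m_lowLatencyMutex);
    if (m_swapchain == 0)
      return false;
    m_callCount += 1;  // a real implementation would call into the driver here
    return true;
  }

  uint32_t callCount() const {
    std::lock_guard<std::mutex> lock(m_lowLatencyMutex);
    return m_callCount;
  }

private:
  mutable std::mutex m_lowLatencyMutex;
  uint64_t m_swapchain = 0;
  uint32_t m_callCount = 0;
};
```

Because the handle stays 0 between surface recreation and the next swapchain recreation, low-latency calls made in that window are simply rejected in this model.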

Saancreed · 2023-12-22T02:02:36Z

src/dxvk/dxvk_presenter.cpp

+    uint32_t timingCount = 0;
+
+    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);
+    m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &timingCount, &markerInfo);
+
+    if (timingCount != 0) {
+      frameReports.resize(timingCount, { VK_STRUCTURE_TYPE_GET_LATENCY_MARKER_INFO_NV });
+      markerInfo.pTimings = frameReports.data();
+
+      m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &timingCount, &markerInfo);


I think the VkStructureType passed to resize is wrong here (VK_STRUCTURE_TYPE_GET_LATENCY_MARKER_INFO_NV → VK_STRUCTURE_TYPE_LATENCY_TIMINGS_FRAME_REPORT_NV)? Together with the updated Vulkan headers for spec version 2 of NV_ll2, this should look more like:

Suggested change

-    uint32_t timingCount = 0;
-
-    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);
-    m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &timingCount, &markerInfo);
-
-    if (timingCount != 0) {
-      frameReports.resize(timingCount, { VK_STRUCTURE_TYPE_GET_LATENCY_MARKER_INFO_NV });
-      markerInfo.pTimings = frameReports.data();
-
-      m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &timingCount, &markerInfo);
+    std::lock_guard<dxvk::mutex> lock(m_lowLatencyMutex);
+    m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &markerInfo);
+
+    if (markerInfo.timingCount != 0) {
+      frameReports.resize(markerInfo.timingCount, { VK_STRUCTURE_TYPE_LATENCY_TIMINGS_FRAME_REPORT_NV });
+      markerInfo.pTimings = frameReports.data();
+
+      m_vkd->vkGetLatencyTimingsNV(m_vkd->device(), m_swapchain, &markerInfo);

Good catch, this should be fixed now.
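
The two-call pattern in the suggestion (query the count, size the array, query again) can be exercised without a driver. The sketch below uses hypothetical mock types and a mock query function; it only mirrors the shape of the revised call, where the count lives in the info struct and each array element carries its own structure type.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mock types; not the Vulkan structures themselves.
struct FrameReport { uint32_t sType; uint64_t presentId; };
struct MarkerInfo  { uint32_t timingCount; FrameReport* pTimings; };

constexpr uint32_t kFrameReportType = 1;  // stand-in for the report's structure type

// Mock query: with pTimings == nullptr it only reports how many entries are
// available; otherwise it fills at most timingCount entries.
void mockGetTimings(const std::vector<uint64_t>& available, MarkerInfo* info) {
  const uint32_t total = static_cast<uint32_t>(available.size());
  if (info->pTimings == nullptr) {
    info->timingCount = total;
    return;
  }
  const uint32_t n = info->timingCount < total ? info->timingCount : total;
  for (uint32_t i = 0; i < n; i++)
    info->pTimings[i].presentId = available[i];
  info->timingCount = n;
}

std::vector<FrameReport> queryTimings(const std::vector<uint64_t>& available) {
  MarkerInfo info = { 0, nullptr };
  std::vector<FrameReport> reports;
  mockGetTimings(available, &info);  // first call: count only
  if (info.timingCount != 0) {
    // Each element is initialized with the element's type, not the parent's.
    reports.resize(info.timingCount, FrameReport{ kFrameReportType, 0 });
    info.pTimings = reports.data();
    mockGetTimings(available, &info);  // second call: fill the array
    reports.resize(info.timingCount);
  }
  return reports;
}
```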

ishitatsuyuki · 2024-01-13T09:34:25Z

src/d3d11/d3d11_device.h

+    }
+
+    void RemoveSwapchain(D3D11SwapChain* swapchain) {
+      std::remove(m_swapchains.begin(), m_swapchains.end(), swapchain);


Suggested change

-      std::remove(m_swapchains.begin(), m_swapchains.end(), swapchain);
+      m_swapchains.erase(std::remove(m_swapchains.begin(), m_swapchains.end(), swapchain));

Found by clang-cl.

Fixed, thanks for running this change through clang-cl!
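
For reference, a standalone sketch of the erase-remove idiom this suggestion is based on. `removeAll` is a hypothetical helper, not part of the PR. It uses the two-iterator form of erase: the single-iterator form removes exactly one element, and is undefined when the value is absent because std::remove then returns end().

```cpp
#include <algorithm>
#include <vector>

// Generic helper for illustration; not part of the PR.
template<typename T>
void removeAll(std::vector<T>& items, const T& value) {
  // std::remove only shifts the kept elements forward and returns the new
  // logical end; erase then drops the leftover tail.
  items.erase(std::remove(items.begin(), items.end(), value), items.end());
}
```

Calling it with a value that is not in the vector is a no-op, since the erased range is empty.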

ishitatsuyuki · 2024-01-13T09:34:43Z

src/d3d11/d3d11_device.cpp

-
+
+      if (m_device->GetDXVKDevice()->features().nvLowLatency2) {
+        m_device->AddSwapchain(presenter.ref());


This seems to miss a matching RemoveSwapchain call.

This commit adds support for the VK_NV_low_latency2 extension, and implements the ID3DLowLatencyDevice interface.

esullivan-nvidia · 2024-02-06T10:17:35Z

Just like I mentioned in the vkd3d-proton PR, thank you for your patience with the slow updates. Feel free to provide any additional feedback or concerns, and I will address them as quickly as possible. I am aware there are a couple of outstanding issues; I will get to those tomorrow. I think most of the major concerns have been resolved, though.

ishitatsuyuki · 2024-02-06T13:38:29Z

src/d3d11/d3d11_swapchain.cpp

-    if (!Repeat)
-      m_frameId += 1;
+    if (!Repeat) {
+      m_frameId = (m_presenter->lowLatencyEnabled() && m_device->getLatencyMarkers().present) ?


Two things mandate that m_frameId be monotonically increasing:

VK_KHR_present_id: A non-zero presentId must be greater than any non-zero presentId passed previously by the application for the same swapchain.

CallbackFence: The helper works similarly to a timeline semaphore so the timeline value must be increasing.

I think we need to ensure that the value increases here. For present IDs, recreating the swapchain will do the job when we see a smaller ID. CallbackFence will need some more modification; I'll describe my idea below.

Make D3D9Swapchain use CallbackFence and remove Fence; the latter seems to be a superset of the former.

Move CallbackFence into Presenter, so that it can be easily recreated by the Presenter. Change accesses to e.g. D3D11Swapchain::m_frameLatencySignal to something like m_presenter->frameLatencySignal().

Make Presenter::destroySwapchain destroy the signal too, then also recreate the signal in Presenter::recreateSwapchain. (destroySwapchain waits for the current frame to be signaled.) Add a new parameter to Presenter::recreateSwapchain for the frame ID the signal should be initialized with. Also reset m_lastFrameId to the specified value.

cc @doitsujin in case they have feedback for the proposed refactor.
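
The monotonicity requirement can be illustrated with a small tracker that reports when an incoming frame ID would violate it. This is a sketch of the idea only: `FrameIdTracker` is a hypothetical type, and `reset` merely models "recreate the swapchain and signal, then restart from a given ID" as proposed above.

```cpp
#include <cstdint>

// Hypothetical tracker, not DXVK code.
struct FrameIdTracker {
  uint64_t lastFrameId;

  // Returns true if the ID is strictly greater than every ID seen so far and
  // can be used as-is; false means the swapchain and frame signal would have
  // to be recreated before this ID is presented.
  bool accept(uint64_t frameId) {
    if (frameId > lastFrameId) {
      lastFrameId = frameId;
      return true;
    }
    return false;
  }

  // Models recreation: tracking restarts from the given initial value.
  void reset(uint64_t initialFrameId) {
    lastFrameId = initialFrameId;
  }
};
```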

ptr1337 · 2024-02-16T13:49:13Z

Hi,

We have updated the nvidia-reflex patches on proton-cachyos. One user is running into an issue in Apex Legends.
The issue happens as soon as Reflex + Boost gets enabled. Reflex only is okay.
The error message is:
There was a problem processing game logic

I have attached a Screenshot.

Reflex without boost is working without problems. 545 drivers.

Changes compared to last version:
CachyOS/CachyOS-PKGBUILDS@c09fd8f

If you need any further information or testing, feel free to ping me or @A1RM4X.

Edit: No issue.

ishitatsuyuki · 2024-02-16T14:49:30Z

We have updated the nvidia-reflex patches on proton-cachyos. One user is running into an issue in Apex Legends. The issue happens as soon as Reflex + Boost gets enabled. Reflex only is okay. The error message is: There was a problem processing game logic

I have attached a Screenshot.

Reflex without boost is working without problems. 545 drivers.

Toggling Reflex causing a game logic crash feels pretty unlikely, so it might be good to test this multiple times to confirm that it's not a fluke.

If it's reproducible, I think attaching logs with PROTON_LOG=1 would be useful for confirming or ruling out exceptions within the driver.

A1RM4X · 2024-02-16T19:36:58Z

Fluke confirmed.

It is working now. It just crashed the first time for whatever reason.

A1RM4X · 2024-02-23T21:41:21Z

Tested with the latest 550 drivers: Cyberpunk 2077 shows the NVIDIA Reflex option.

The option was not showing with the 545 drivers.

oscarbg · 2024-03-06T06:20:32Z

Curious about the state of merging this PR, as the equivalent VKD3D and DXVK-NVAPI Reflex patches got merged already.

ishitatsuyuki · 2024-03-06T06:22:20Z

Please don't make "state of merging" type of comments. The PR will be merged once all the concerns are addressed. Currently, the frame ID interaction with swapchain and CallbackFence needs to be fixed.

ptr1337 · 2024-03-06T06:57:32Z

Please don't make "state of merging" type of comments. The PR will be merged once all the concerns are addressed. Currently, the frame ID interaction with swapchain and CallbackFence needs to be fixed.

We also have a user who reports problems with this MR when VRR is used together with Reflex in Overwatch. I can provide some logs from him later today.

ivyl mentioned this pull request Oct 16, 2023

Support VK_NV_low_latency2 ValveSoftware/wine#200

Open

flibitijibibo mentioned this pull request Oct 16, 2023

Add support for exclusive fullscreen to the D3D11 driver to reduce input latency FNA-XNA/FNA3D#180

Draft

esullivan-nvidia force-pushed the nv_low_latency2 branch from 74c1ba4 to 2518e1c · October 24, 2023 07:34

loathingKernel mentioned this pull request Nov 22, 2023

Add Nvidia Reflex patches. GloriousEggroll/proton-ge-custom#104

Merged

Update Vulkan headers.

0cd8e71

esullivan-nvidia force-pushed the nv_low_latency2 branch 2 times, most recently from a7cdb7b to 88b323b · February 6, 2024 08:53

Add VK_NV_low_latency2 support

d0e232c

This commit adds support for the VK_NV_low_latency2 extension, and implements the ID3DLowLatencyDevice interface.

esullivan-nvidia force-pushed the nv_low_latency2 branch from 88b323b to d0e232c · February 6, 2024 09:05

ishitatsuyuki mentioned this pull request Feb 7, 2024

Add NV low latency support HansKristian-Work/vkd3d-proton#1739

Closed

Saancreed mentioned this pull request May 31, 2024

Crash with DLSS Enabled in Star Citizen jp7677/dxvk-nvapi#174

Open

SveSop mentioned this pull request Jun 18, 2024

Enabling DLSS in War Thunder freezes the game jp7677/dxvk-nvapi#171

Open

doitsujin force-pushed the master branch from c860fda to 990c7f5 · October 17, 2024 12:27
