Add functions to support Standard Swizzle textures #515

Open
wants to merge 23 commits into
base: main

@isplunke isplunke commented Sep 17, 2024

Implement z-order curve "swizzling" for 2D and 3D textures to support the pixel order used for "Standard Swizzle" as defined by D3D12_TEXTURE_LAYOUT_64KB_STANDARD_SWIZZLE.

These functions only perform the row-major <=> z-order curve operation. 64KB alignment, miptail packing, and other requirements are handled in other distinct operations.

@isplunke
Author

@microsoft-github-policy-service agree company="Microsoft"

@walbourn walbourn self-assigned this Sep 18, 2024
@walbourn
Member

Note for testing, I'll add coverage to the test suite which will likely just do a swizzle and deswizzle and make sure the bits match.
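
A round-trip check of that kind could look roughly like the sketch below. It is an illustration only, not the PR's code: it uses a plain bit-interleaved Morton order on a square power-of-two image instead of the Standard Swizzle masks, and the names `Part1By1`, `MortonIndex`, `Reorder`, and `RoundTripMatches` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread the low 16 bits of v so that they occupy the even bit positions.
constexpr uint32_t Part1By1(uint32_t v) noexcept
{
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// Plain 2D Morton index: x bits in even positions, y bits in odd positions.
constexpr uint32_t MortonIndex(uint32_t x, uint32_t y) noexcept
{
    return Part1By1(x) | (Part1By1(y) << 1);
}

// Reorders a dim x dim 8bpp image. toSwizzle selects the direction,
// similar in spirit to merging the to/from functions behind one flag.
// dim is assumed to be a power of two.
std::vector<uint8_t> Reorder(const std::vector<uint8_t>& src, uint32_t dim, bool toSwizzle)
{
    std::vector<uint8_t> dst(src.size());
    for (uint32_t y = 0; y < dim; ++y)
    {
        for (uint32_t x = 0; x < dim; ++x)
        {
            const size_t linear = size_t(y) * dim + x;
            const size_t morton = MortonIndex(x, y);
            if (toSwizzle)
                dst[morton] = src[linear];
            else
                dst[linear] = src[morton];
        }
    }
    return dst;
}

// Swizzle, deswizzle, and compare the bytes with the original.
bool RoundTripMatches(uint32_t dim)
{
    std::vector<uint8_t> original(size_t(dim) * dim);
    for (size_t i = 0; i < original.size(); ++i)
        original[i] = static_cast<uint8_t>(i * 37 + 11);

    const std::vector<uint8_t> swizzled = Reorder(original, dim, true);
    return Reorder(swizzled, dim, false) == original;
}
```

A round trip only shows that the two directions are inverses of each other. It cannot show that the swizzle pattern itself matches what the hardware expects, which is why a separate rendering test would still be needed.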

As for determining if the swizzle is correct, I probably need to make a test app that renders using WARP which supports standard swizzle.

@walbourn walbourn changed the title [Draft] Add functions to swizzle textures [Draft] Add functions to support Standard Swizzle textures Sep 18, 2024
@walbourn walbourn changed the title [Draft] Add functions to support Standard Swizzle textures Add functions to support Standard Swizzle textures Sep 25, 2024
@walbourn walbourn added the dx12 Direct3D 12 label Sep 25, 2024
- Needs to be tested. Where to get DX12 boilerplate? Test just that the texture still looks correct, or also test that performance improves when rotated?
- Maybe merge the functions? Would then pass a boolean/enum to select rowToSwizzle versus swizzleToRow.

Added functions that convert pixel order from row major to standard swizzle and from standard swizzle to row major.

Functions are provided for both a single Image and an array of Images.

Following the standard DirectXTex pattern: uses Image and TexMetadata as input, and outputs/initializes a ScratchImage.
Added the new file to .vcxproj, other.vcxproj, and CMake.

Merged to/from functions together
memcpy src is const
Added non-AVX2 deposit_bits
TODO: what flags/threshold to use to re-compress?
@walbourn walbourn marked this pull request as ready for review March 9, 2025 18:18
@walbourn
Member

I'm going to wrap up the March 2025 release, and then finish this PR out. That way I have some time to iterate on the new method. The current implementation only works for 'tile-sized' textures.

@fdwr fdwr left a comment

👀

uint32_t extract_bits(uint32_t val, int mask) noexcept
{
uint32_t res = 0;
for (uint32_t bb = 1; mask !=0; bb += bb)
Suggested change
- for (uint32_t bb = 1; mask !=0; bb += bb)
+ for (uint32_t bb = 1; mask != 0; bb += bb)

Also, unless there's a perf advantage to left shifting via addition (maybe there is), just saying bb <<= 1 would probably be clearer to readers.
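
For reference, a portable pair written with `bb <<= 1` might look like the sketch below. This is an assumption about what the non-AVX2 fallback is meant to do (software stand-ins for the BMI2 PDEP/PEXT instructions), not a copy of the PR's code, and it takes the mask as `uint32_t` where the PR currently uses `int`.

```cpp
#include <cassert>
#include <cstdint>

// Software stand-in for PDEP: scatters the low bits of val into the
// positions of the set bits of mask, lowest set bit first.
constexpr uint32_t deposit_bits(uint32_t val, uint32_t mask) noexcept
{
    uint32_t res = 0;
    for (uint32_t bb = 1; mask != 0; bb <<= 1)
    {
        if (val & bb)
            res |= mask & (0u - mask); // lowest set bit of mask
        mask &= mask - 1;              // clear that bit
    }
    return res;
}

// Software stand-in for PEXT: gathers the bits of val selected by mask
// into the low bits of the result.
constexpr uint32_t extract_bits(uint32_t val, uint32_t mask) noexcept
{
    uint32_t res = 0;
    for (uint32_t bb = 1; mask != 0; bb <<= 1)
    {
        if (val & mask & (0u - mask))
            res |= bb;
        mask &= mask - 1;
    }
    return res;
}
```

With these definitions, `extract_bits(deposit_bits(v, mask), mask)` returns the low popcount(mask) bits of `v`, which is the property the swizzle and deswizzle paths rely on.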

//---------------------------------------------------------------------------------
// row-major to z-order curve
//---------------------------------------------------------------------------------
template<int xBytesMask, size_t bytesPerPixel>
Seeing how bytesPerPixel is used below, would naming this bytesPerBlock make more sense (where that block might be a 4x4 block or a 1x1 block, aka single pixel)? Otherwise it's weird for BC formats like BC1 because there are actually 0.5 bytes per pixel (8 bytes per block).
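
To make the distinction concrete, here is a small sketch of sizes expressed per block instead of per pixel. The `BlockLayout` struct and both helper names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a format in block terms:
// blockDim is 1 for ordinary formats and 4 for BC formats.
struct BlockLayout
{
    size_t blockDim;
    size_t bytesPerBlock;
};

// Number of blocks needed to cover the given number of texels.
constexpr size_t BlocksAcross(size_t texels, size_t blockDim) noexcept
{
    return (texels + blockDim - 1) / blockDim;
}

// Bytes in one row of blocks. No fractional bytes-per-pixel value is needed.
constexpr size_t RowPitchBytes(size_t widthTexels, BlockLayout layout) noexcept
{
    return BlocksAcross(widthTexels, layout.blockDim) * layout.bytesPerBlock;
}
```

For a 512-texel-wide BC1 image this gives 128 blocks of 8 bytes, so 1024 bytes per block row, whereas a 32bpp format of the same width gives 512 blocks of 4 bytes.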

for (size_t x = 0; x < width; ++x)
{
const uint32_t swizzleIndex = deposit_bits(static_cast<uint32_t>(x), xBytesMask) + deposit_bits(static_cast<uint32_t>(y), ~xBytesMask);
const size_t swizzleOffset = swizzleIndex * bytesPerPixel;
🤔 Given the 16-bit masks above like 0b1010101011001111, would multiplying by bytesPerPixel yield a swizzled byte offset outside the current 64KB per page? From the docs, all formats from 8bpp to 64bpp are swizzled within each 64KB page.

"The standard swizzle formats applies within each page-sized chunk, and pages are laid out in linear order with respect to one another. A 16-bit interleave pattern defines the conversion from pre-swizzled intra-page location to the post-swizzled location." ... "layout of texels within the tile is defined" ... "Tiles are arranged in row-major order."

So for, say, BC1, it sounds like each page would be a 512x256 pixel Morton tile, and that every 512x256 tile is then linearly appended in classic left-to-right then top-to-bottom order.

| Texel block size | Mask | Tile size per 64KB page |
|---|---|---|
| 8bpp 1x1 pixel (e.g. DXGI_FORMAT_R8_UNORM) | XYXY XYXY YYYY XXXX | 256x256 pixels (2^8 x 2^8) |
| 16bpp 1x1 pixel (e.g. DXGI_FORMAT_B5G5R5A1_UNORM) | XYXY XYXY XYYY XXX- | 256x128 pixels (2^8 x 2^7) |
| 32bpp 1x1 pixel (e.g. DXGI_FORMAT_B8G8R8A8_UNORM) | XYXY XYXY XYYY XX-- | 128x128 pixels (2^7 x 2^7) |
| 64bpp 1x1 pixel (e.g. DXGI_FORMAT_R16G16B16A16_UNORM) | XYXY XYXY XXYY X--- | 128x64 pixels (2^7 x 2^6) |
| 64bpb 4x4 block (e.g. DXGI_FORMAT_BC1_UNORM) | XYXY XYXY XXYY X--- | 512x256 pixels (2^7*4 x 2^6*4) |
| 128bpp 1x1 pixel (e.g. DXGI_FORMAT_R32G32B32A32_FLOAT) | XYXY XYXY XXYY ---- | 64x64 pixels (2^6 x 2^6) |
| 128bpb 4x4 block (e.g. DXGI_FORMAT_BC2_UNORM / BC3) | XYXY XYXY XXYY ---- | 256x256 pixels (2^6*4 x 2^6*4) |
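
The tile sizes follow mechanically from the patterns listed above: count the X bits and the Y bits, raise 2 to each count, and multiply by the block dimension. A minimal sketch of that arithmetic, with the hypothetical names `TileSize` and `TileSizeFromPattern`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct TileSize
{
    size_t width;
    size_t height;
};

// Derives the tile size in pixels from an interleave pattern such as
// "XYXY XYXY YYYY XXXX". Spaces and '-' (bits that address bytes inside
// one texel or block) are ignored. blockDim is 4 for BC formats, else 1.
TileSize TileSizeFromPattern(const std::string& pattern, size_t blockDim = 1)
{
    size_t xBits = 0;
    size_t yBits = 0;
    for (char c : pattern)
    {
        if (c == 'X')
            ++xBits;
        else if (c == 'Y')
            ++yBits;
    }
    return { (size_t(1) << xBits) * blockDim, (size_t(1) << yBits) * blockDim };
}
```

For example, `TileSizeFromPattern("XYXY XYXY XXYY X---", 4)` has 7 X bits and 6 Y bits, giving the 512x256 BC1 tile.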

If correct, then a 1024x512 BC1 texture would be laid out like:

| | left | right |
|---|---|---|
| top | page 0 (512x256) | page 1 (512x256) |
| bottom | page 2 (512x256) | page 3 (512x256) |

That means this process could be in terms of page tiles (x / 512, y / 256) with an inner 1D 64KB loop that doesn't actually care about x,y coordinates anymore - it's just remapping from byte addresses to byte addresses via the bitmasks.
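
A rough sketch of that structure for the simplest case, an 8bpp texture whose dimensions are multiples of 256. Everything here is illustrative and rests on assumptions: the pattern XYXY XYXY YYYY XXXX is read most-significant bit first (so the X mask is 0xAA0F and the Y mask is 0x55F0), and the names `Deposit` and `SwizzleR8` are hypothetical, not the PR's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software PDEP: scatter the low bits of val into the set bits of mask.
static uint32_t Deposit(uint32_t val, uint32_t mask) noexcept
{
    uint32_t res = 0;
    for (uint32_t bb = 1; mask != 0; bb <<= 1)
    {
        if (val & bb)
            res |= mask & (0u - mask);
        mask &= mask - 1;
    }
    return res;
}

constexpr uint32_t kTileDim = 256;     // 8bpp tile: 256x256 texels
constexpr uint32_t kPageBytes = 65536; // one 64KB page per tile
constexpr uint32_t kXMask = 0xAA0F;    // assumed MSB-first reading of the pattern
constexpr uint32_t kYMask = 0x55F0;

// Pages are written in row-major order; inside each page a single 1D loop
// remaps linear intra-tile offsets to swizzled offsets.
std::vector<uint8_t> SwizzleR8(const std::vector<uint8_t>& linear, size_t width, size_t height)
{
    assert(width % kTileDim == 0 && height % kTileDim == 0);
    assert(linear.size() == width * height);

    std::vector<uint8_t> out(linear.size());
    const size_t tilesX = width / kTileDim;
    const size_t tilesY = height / kTileDim;

    for (size_t ty = 0; ty < tilesY; ++ty)
    {
        for (size_t tx = 0; tx < tilesX; ++tx)
        {
            uint8_t* page = out.data() + (ty * tilesX + tx) * kPageBytes;
            for (uint32_t i = 0; i < kPageBytes; ++i)
            {
                const uint32_t xLocal = i % kTileDim;
                const uint32_t yLocal = i / kTileDim;
                const size_t srcX = tx * kTileDim + xLocal;
                const size_t srcY = ty * kTileDim + yLocal;
                page[Deposit(xLocal, kXMask) | Deposit(yLocal, kYMask)] = linear[srcY * width + srcX];
            }
        }
    }
    return out;
}
```

Since the swizzled offset depends only on the intra-tile index `i`, the 65536-entry remap could be computed once and reused for every page. Textures that are not a multiple of the tile size would need padding, which this sketch does not attempt.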

(but I could be all wrong too 😅🤷‍♂️, without seeing how D3D actually implemented D3D12_TEXTURE_LAYOUT_64KB_STANDARD_SWIZZLE)

constexpr uint64_t MAX_TEXTURE_SIZE = UINT32_MAX;
#endif

// Standard Swizzle is not defined for these formats.
@fdwr fdwr Mar 27, 2025

Q: I'm curious where this list came from, from documentation or trial-and-error calling the API? Is DXGI_FORMAT_R8_SNORM valid, given it's not in the switch? I see mention in the documentation that DXGI_FORMAT_R32G32B32_TYPELESS is excluded (along with MSAA, 1D, depth and stencil), but otherwise I'm not seeing any particular list.

Since the swizzle is defined solely in terms of the BitsPerPixel/BytesPerBlock, I wouldn't bother artificially restricting which formats DirectXTex considers valid based on whatever D3D may or may not currently consider valid. D3D is welcome to relax these restrictions in the future or on certain SKUs, which would unnecessarily prevent this helper function from working, and trying to mirror the API just creates opportunities for this code and D3D to get out of sync.
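
As a sketch of what keying on size alone could look like: the helper below selects masks purely from the bytes per block and rejects only sizes that have no pattern. The name `GetSwizzleMasks` is hypothetical, and the mask constants are my own transcription of the patterns in my earlier comment, read most-significant bit first, so they should be treated as assumptions to verify against the documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte-address masks within one 64KB page. The low bits covered by neither
// mask address the bytes inside a single texel or block.
struct SwizzleMasks
{
    uint32_t x;
    uint32_t y;
};

// Returns false when no pattern exists for the given size
// (for example 12 bytes, as with 96bpp formats).
inline bool GetSwizzleMasks(size_t bytesPerBlock, SwizzleMasks& masks) noexcept
{
    switch (bytesPerBlock)
    {
    case 1:  masks = { 0xAA0F, 0x55F0 }; return true; // XYXY XYXY YYYY XXXX
    case 2:  masks = { 0xAA8E, 0x5570 }; return true; // XYXY XYXY XYYY XXX-
    case 4:  masks = { 0xAA8C, 0x5570 }; return true; // XYXY XYXY XYYY XX--
    case 8:  masks = { 0xAAC8, 0x5530 }; return true; // XYXY XYXY XXYY X---
    case 16: masks = { 0xAAC0, 0x5530 }; return true; // XYXY XYXY XXYY ----
    default: return false;
    }
}
```

A caller would then need no per-format switch at all: any format whose block size has a pattern is accepted, and D3D remains the authority on which formats it will actually create.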

Successfully merging this pull request may close these issues.

Add function to convert a texture to Standard Swizzle and vice versa
3 participants