Segment layering/blending #4550

Open
blazoncek opened this issue Feb 13, 2025 · 23 comments

@blazoncek
Collaborator

With 0.14 we got proper blending of colors and palettes across all segments; 0.15 improved on that and allowed effects to be blended as well. Recent blending styles (#4158) introduced effect transitions with different styles.

What is missing IMO is correct segment blending/layering with the different blending modes known to Photoshop users, e.g. lighten, darken, multiply, add, subtract, difference, etc.

I've recently come across a snippet that I find useful and relatively simple to implement in WLED. Unfortunately it would require re-implementation of pixel drawing functions.

Let's first see the code and then discuss the nuances:

void WS2812FX::blendSegments(const Segment &top, const Segment &bottom) {
  // lambdas renamed _top/_bottom so they do not shadow the function parameters;
  // explicit uint8_t casts so each lambda's deduced return type matches FuncType below
  constexpr auto _top       = [](uint8_t a, uint8_t b){ return a; };
  constexpr auto _bottom    = [](uint8_t a, uint8_t b){ return b; };
  constexpr auto add        = [](uint8_t a, uint8_t b){ unsigned s = a + b; return uint8_t(s > 255 ? 255 : s); }; // clamped to avoid overflow
  constexpr auto subtract   = [](uint8_t a, uint8_t b){ return uint8_t(b > a ? (b - a) : 0); };
  constexpr auto difference = [](uint8_t a, uint8_t b){ return uint8_t(b > a ? (b - a) : (a - b)); };
  constexpr auto average    = [](uint8_t a, uint8_t b){ return uint8_t((a + b) >> 1); };
  constexpr auto multiply   = [](uint8_t a, uint8_t b){ return uint8_t((a * b) / 255); }; // a*b scaled back into 0-255
  constexpr auto divide     = [](uint8_t a, uint8_t b){ unsigned q = b ? (a * 255) / b : 255; return uint8_t(q > 255 ? 255 : q); }; // a/b scaled to 0-255, clamped
  constexpr auto lighten    = [](uint8_t a, uint8_t b){ return uint8_t(a > b ? a : b); };
  constexpr auto darken     = [](uint8_t a, uint8_t b){ return uint8_t(a < b ? a : b); };
  constexpr auto screen     = [](uint8_t a, uint8_t b){ return uint8_t(255 - ((255 - a) * (255 - b)) / 255); }; // 255 - (255-a)*(255-b)/255
  constexpr auto overlay    = [](uint8_t a, uint8_t b){ return uint8_t(a < 0x80 ? (2 * a * b) / 255 : 255 - (2 * (255 - a) * (255 - b)) / 255); };

  using FuncType = uint8_t(*)(uint8_t, uint8_t);
  const FuncType funcs[] = {
    _top, _bottom, add, subtract, difference, average, multiply, divide, lighten, darken, screen, overlay
  };

  const uint8_t blendMode = top.blendMode;
  auto func = funcs[blendMode % (sizeof(funcs) / sizeof(FuncType))];

  for (unsigned i = 0; i < top.length(); i++) {
    uint32_t c_a = top.getPixelColor(i);
    uint32_t c_b = bottom.getPixelColor(i);
    pixels[i] = RGBW32(func(R(c_a), R(c_b)),
                       func(G(c_a), G(c_b)),
                       func(B(c_a), B(c_b)),
                       func(W(c_a), W(c_b)));
  }
}

This assumes that Segment::getPixelColor() returns the unmodified value set by Segment::setPixelColor() during effect execution. To achieve that, each segment must maintain its own pixel drawing buffer (as setUpLeds() did in the past).
Next, it also assumes the WS2812FX instance will maintain a canvas buffer for the entire strip (called pixels[]; similar to the global buffer).

The process with which segments/layers would be blended is (a minimal sketch follows the list):

  • render each effect in its segment buffer (entirely independent and unaware of other segments)
  • blend all segment buffers into canvas buffer starting with black canvas and adding segments from bottom to top
  • transfer canvas buffer to LEDs
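
A minimal sketch of that per-frame flow (canvas, runEffect() and renderToCanvas() are illustrative names, not existing WLED API):

void WS2812FX::service() {
  for (Segment &seg : _segments) seg.runEffect();            // 1: each effect draws only into its own buffer
  for (unsigned i = 0; i < canvasLength; i++) canvas[i] = 0; // 2: start from a black canvas ...
  for (Segment &seg : _segments) seg.renderToCanvas(canvas); //    ... and blend segments bottom to top
  show();                                                    // 3: transfer canvas to the LEDs
}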

This process does not handle color/palette blending (which does not change from the current behaviour) or effect transitions (from one effect to another); it only allows users to stack segments one atop another, which mixes the content of two segments even if the effect function does not support layering.

The price is of course memory and (possibly) speed degradation, as there will be more operations per pixel. However, the segment's setPixelColor()/getPixelColor() could be simplified and there would be no need for WS2812FX::setPixelColor().

I would like to get some feedback and thoughts about layering and the implementation. And whether it is worth the effort even if speed would be impaired.

@willmmiles
Member

Your proposed pipeline is more-or-less how I'd've expected WLED to be implemented, before I got into the actual code. I'd thought of it more as: a "segment" renders to a texture buffer; textures are then rendered to the frame buffer (with grouping, spacing, any other stretching/scaling transformations going on here); and finally the frame buffer is mapped to pixels on the output bus(es).

Memory costs aside: we've observed in prototype implementations that having the FX write directly to local texture buffers (setUpLeds() style) can often yield an FPS speedup. There's potential that pipelining the other aspects (texture to frame buffer, and frame to physical) may also yield net speed improvements as the code for each phase becomes simpler -- less indirection, fewer conditionals, and easier to fit in cache. Some ESP models (S3, I'm looking at you!) also offer SIMD instructions that can further accelerate pipelined simple transforms.

Lastly I think the pipelined approach will also make the code simpler and easier to follow. The key is breaking "Segment" down into the component parts -- the FX renderer, the texture->frame renderer, and the physical mapper can all be pulled into individual stages and components instead of packing everything into one giant class.

@blazoncek
Collaborator Author

textures are then rendered to the frame buffer (with grouping, spacing, any other stretching/scaling transformations going on here);

Thanks. This is an interesting approach which indeed simplifies single-segment processing (and reduces memory requirements) but will inevitably make blending into the "frame buffer" more complex (imagine two partially overlapping segments with spacing and grouping set).

I would leave frame-buffer-to-physical mapping to the bus-level logic, but that would make anything but a Cartesian coordinate system difficult to implement (mapping arbitrary 3D locations of pixels into the frame rendering logic, especially in sparse set-ups). But we can leave this mapping out of the scope of this discussion.

@DedeHai
Collaborator

DedeHai commented Feb 14, 2025

In general I think this is a good approach!
Some thoughts:

  • this would greatly simplify the code logic and offer potential speed improvements (as we saw in Soap FX optimization #4543)
  • the question of white-channel use arises: we discussed using it as a transparency/opacity channel before which would allow FX to be used as masks to underlying segments
  • should the buffers be 24-bit or 32-bit (or maybe even just 16-bit, reducing color accuracy and adding some shift-logic overhead)?
  • you already mentioned partial segment overlap: how to handle layering in that case? I ran into the same issue in the PS, where I chose the simple approach of just using "color_add()" when detecting partial overlap, which is not the cleanest solution: ideally, "color_add()" should be used if there is an underlying segment, "set_color()" if not. This requires a buffer mask or per-pixel scanning of all segment borders (or a more clever approach I could not think of)
  • what would be the fallback solution if buffer allocation fails? For example, on large, sparse-mapped setups the buffer is huge
  • should the segment buffers be allocated in FX-data memory (like I have done in the PS)?

Just an idea: use a "mask" checkmark in segment settings to treat an FX as a transparency mask instead of using its colors; the mask could be R+G+B or even something more elaborate.

@willmmiles
Member

Thanks. This is an interesting approach which indeed simplifies single-segment processing (and reduces memory requirements) but will inevitably make blending into the "frame buffer" more complex (imagine two partially overlapping segments with spacing and grouping set).

By "frame buffer", I mean what you are calling a "canvas buffer": a single global buffer of all LEDs in the virtual space to which all segments are rendered, top to bottom, for each output frame (eg. show() call). I would expect that segment spacing and grouping would be best implemented as part of the segment->canvas render process -- if output canvas coordinates overlap from one segment to the next, you blend; if not, they'll end up interleaved as expected.

Mostly I was trying to highlight that I had expected an implementation with the same sequence of concepts, but I'd've used different names -- I don't think there's any significant difference between what I described and what you proposed.

@willmmiles
Member

  • the question of white-channel use arises: we discussed using it as a transparency/opacity channel before which would allow FX to be used as masks to underlying segments
  • should the buffers be 24-bit or 32-bit (or maybe even just 16-bit, reducing color accuracy and adding some shift-logic overhead)?

RGBA at the FX level would be very practical if we're serious about considering blending, I think...

If we really want to get radical, I'd also float the idea of struct-of-arrays instead of array-of-structs, eg. struct pixel_buf { vector<uint8_t> R, G, B, A; }; I've heard tell that some algorithms can operate much faster in this space. Loop unrolling can make it quite efficient, and it vectorizes to SIMD instructions well.
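
For illustration, a struct-of-arrays buffer and a per-channel blend loop might look like this (a sketch only; the names are not proposed API):

#include <cstdint>
#include <cstddef>
#include <vector>

struct pixel_buf { std::vector<uint8_t> R, G, B, A; }; // one planar array per channel

// four tight loops over contiguous bytes unroll and vectorize far better
// than one loop that unpacks packed uint32_t pixels
static void blendChannel(uint8_t *out, const uint8_t *top, const uint8_t *bottom,
                         size_t n, uint8_t (*f)(uint8_t, uint8_t)) {
  for (size_t i = 0; i < n; i++) out[i] = f(top[i], bottom[i]);
}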

  • what would be the fallback solution if buffer allocation fails? For example, on large, sparse-mapped setups the buffer is huge

This is the true crux of this approach. If we move to this architecture, the solution might be to explicitly develop a sparse canvas buffer abstraction. Sparse matrices have a lot of prior art in numeric processing code, I'm sure there's some insights there that could be helpful.

@blazoncek
Collaborator Author

If we introduce an alpha channel (instead of W in effect functions) I would assume it only applies to that segment in regards to blending it with the lower segment. Right?
Would it only be used when blending into the frame/canvas buffer alongside the segment's opacity? Even though it would work on pixel level, that's sort of duplicate functionality IMO.

I do admit that it will open new possibilities when writing new effects, but none of the current effects are written to utilise it, so there will be no immediate gain.

So, if I summarize what we've defined so far:

  • segment will contain a pixel buffer (uint32_t; RGBA, but A may be interpreted as W for some effects)
  • segment's setPixelColor() operates on the "condensed" segment without grouping and spacing (current virtualLength() or virtualWidth()/virtualHeight()), writing only to the pixel buffer
  • the effect function uses the W channel as the alpha channel used in blending (we'll need to make an exception for the Solid effect, perhaps also for all single-pixel effects)
  • strip will feature a canvas/frame buffer large enough to encompass all virtual pixels (including pixels that are actually missing; it also needs to take into account combinations of matrix+strip)
  • canvas/frame buffer might be RGB only; if it includes W, it is not alpha but White
  • segments are blended in the strip.show() function
  • segments are blended into the canvas/frame buffer using segment opacity and the pixel alpha channel
  • segment's blending mode is applied from segment to canvas, expanding, skipping, reversing and mirroring pixels as needed (the lowest segment, usually 0, is blended with the black canvas/frame first, followed by segments above it)
  • W channel will need to be maintained for some effects (clashes with alpha channel)
  • when all segments are processed, the canvas/frame buffer is sent to the bus manager for display (converting RGB into RGBW if necessary)
  • bus manager will handle virtual-to-physical mapping (we will need to move mapping functions from WS2812FX class)
  • handling sparse set-ups will need a follow-up

Caveats:

  • pixel buffer memory reallocation when segment dimensions change
  • canvas/frame buffer memory reallocation when LED count or 2D geometry changes
  • some effects may benefit using W channel instead of alpha
  • do we share pixel buffer when segment is in transition (copy/move constructors & operator=)
  • do we share pixel buffer when segment is copied
  • prevent segment changes during async JSON calls (this one is very important) or cancel any effect drawing immediately
  • moving ledmap handling into bus manager may be challenging

@willmmiles
Member

If we introduce an alpha channel (instead of W in effect functions) I would assume it only applies to that segment in regards to blending it with the lower segment. Right? Would it only be used when blending into the frame/canvas buffer alongside the segment's opacity? Even though it would work on pixel level, that's sort of duplicate functionality IMO.

Yes -- I'd expect segment-level opacity to apply "on top" of the FX-computed alpha channel, much the same way segment-level brightness applies "on top" of the FX-computed colors. IIRC the computation is something like blend_A = pixel[index].A * segment.A / 255 -- we don't have to blend twice.
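
In code, folding segment opacity into the per-pixel alpha might look like this (a sketch; effectiveAlpha is an illustrative helper, with the FX alpha assumed to live in the W byte):

// effective alpha = per-pixel FX alpha scaled by segment-level opacity; blend once with this
static inline uint8_t effectiveAlpha(uint32_t pixel, uint8_t segmentOpacity) {
  return uint8_t((W(pixel) * segmentOpacity) / 255);
}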

  • canvas/frame buffer might be RGB only; if it includes W, it is not alpha but White
  • W channel will need to be maintained for some effects (clashes with alpha channel)

Maybe Segment wants a "buffer type" indicator, set or carried by the FX selection, to inform the blending layer how to composite the segment to the canvas.

  • bus manager will handle virtual-to-physical mapping (we will need to move mapping functions from WS2812FX class)
  • moving ledmap handling into bus manager may be challenging

Bikeshedding a bit, but I'd probably put the canvas-to-bus functionality in its own free function(s) to start off with. Some renderToBus(Bus&, canvas, ledmap) type call. This operation is not really involved in managing buses, so IMHO it doesn't belong in BusManager.
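
A minimal sketch of such a free function (the exact signature and ledmap representation are assumptions):

// renders the finished canvas onto one bus; deliberately not a BusManager member
void renderToBus(Bus &bus, const uint32_t *canvas, const uint16_t *ledmap, size_t count) {
  for (size_t i = 0; i < count; i++)
    bus.setPixelColor(ledmap ? ledmap[i] : i, canvas[i]);
}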

  • canvas/frame buffer memory reallocation when LED count or 2D geometry changes

I think it's reasonable to flush everything (segments, transitions, etc.) and start clean if the bus configuration changes. If we can't re-allocate everything from a clean slate, the config is in trouble anyways...

  • pixel buffer memory reallocation when segment dimensions change
  • do we share pixel buffer when segment is in transition (copy/move constructors & operator=)
  • do we share pixel buffer when segment is copied

One neat feature of this architecture is that transitions can be implemented as part of canvas composition -- each segment need only contain enough information for the compositor to know which pixels to draw. So I'd suggest trying a broad approach of "don't re-use segments or buffers at all" as the place to start.

  • Move "outgoing" segments to an "active transition" list, and store their local transition properties; they are kept there while their transition counts go down
  • Attempt to allocate new segments for the new FX selection or size definition (pixel buffers and all)
  • If allocation fails, fall back by purging something from the transition list (sketched after this list)
  • If the active transition list is empty, you're just SOL anyway; engage the next fallback behaviour
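
In pseudocode, that fallback loop might look like this (SegmentConfig, tryCreate() and activeTransitions are assumptions, not existing WLED API):

Segment *allocateForNewState(const SegmentConfig &cfg) {
  for (;;) {
    if (Segment *seg = Segment::tryCreate(cfg)) return seg;  // pixel buffers and all
    if (activeTransitions.empty()) return nullptr;           // SOL: engage next fallback behaviour
    activeTransitions.pop_front();                           // purge a transition to free RAM, retry
  }
}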

From there we can explore progressively more sophisticated fallbacks to handle cases where memory is tight. Some ideas:

  • Migrating segment pixel buffers to SPI RAM if needed;
  • A custom allocator for pixel buffers, independent from the system heap; this could even permit re-arranging buffers after creation to support more advantageous physical layouts
  • Using the MMU to allow for discontiguous buffer allocation - see esp_mmu_map()

The segment object itself shouldn't be big enough to matter (compared to the pixel buffers), so we can just allocate new ones whenever it's convenient.

  • prevent segment changes during async JSON calls (this one is very important) or cancel any effect drawing immediately

This is on my todo list already -- the render loop really needs to be mutexed with any source of segment state changes. Every platform we support has multiple tasks, even ESP8266! I believe this may be responsible for some crash cases in some configurations in the current code. I haven't had time to look at it yet -- either we want a new "render state lock", or we can expand the scope of the current JSON buffer lock to cover any case of reading/writing the core state.

@blazoncek
Collaborator Author

Just a thought that sparked: we have to distinguish effect transitions (i.e. change from one effect to another), color transitions and palette transitions. Each of those applies to one segment only. With transition styles, two effect runs may be necessary (this part needs tweaking). The Transition struct is responsible for holding old/temporary values and buffers.
These have nothing to do with layer/segment blending.

@willmmiles I will read your post again when I have more time and properly reply.

@blazoncek
Collaborator Author

I pushed a brief, untested 1D POC. 2D shouldn't be more difficult, but that was all I managed to conjure today.
https://github.com/blazoncek/WLED/tree/layer-blend

@blazoncek
Collaborator Author

I think blending is now completely implemented.
All of the rendering to the frame buffer happens in blendSegment(), which is called in service() just before show().

It may need a review from @DedeHai as the raw setPixelColorXY had his optimisation, which I moved into blendSegment().

Segment's setPixelColor() now only stores the pixel value in the local pixels array. It does no mirroring or reversing (or transposing in the XY variant), or spacing or grouping. Some effects that were modified to keep previous pixel values in the _data buffer may need another rewrite to remove those modifications, as they have become obsolete.

@willmmiles if you have the time please take a look. There are 3 relevant commits.

@willmmiles
Member

Just a thought that sparked: we have to distinguish effect transitions (i.e. change from one effect to another), color transitions and palette transitions. Each of those applies to one segment only. With transition styles, two effect runs may be necessary (this part needs tweaking). The Transition struct is responsible for holding old/temporary values and buffers.
These have nothing to do with layer/segment blending.

I wish I had more time to spend - code would speak more precisely than words. :(

I'm suggesting implementing all transitions as running the effect twice with different settings, and using dynamic blending to actually effect the visible transition. For example, for a brightness change: the old copy keeps running with the old brightness, the new copy runs with the new brightness, and the alpha used for blending shifts over time from one to the other.

The basic implementation is some "struct transition_instructions" on each segment that the blending layer uses to modify the segment parameters before blending is executed. Traditional blending is done by altering the blending alpha; push/swipe/etc. alter the target canvas coordinates; and so forth.
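
One possible shape for that struct (purely illustrative; the fields are assumptions):

// per-segment transition state consulted by the blending layer every frame;
// "old" and "new" segments hold opposite-direction instances of this
struct transition_instructions {
  uint8_t  alpha;               // time-varying blend alpha (FADE-style transitions)
  int16_t  x_offset, y_offset;  // canvas offsets (PUSH/SWIPE-style transitions)
  uint32_t start, duration;     // timing used to derive the values above
};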

It makes the code simple and the transitions easy to implement, no matter what the actual transition style is. (In fact you also get clever things like transitions that work even if segment boundaries change!) Each transition style is implemented as a time-varying change to the blending, where "old" and "new" segments get opposite transition states (eg. one counts down, the other counts up).

Then, once that's working well, allow FX to opt in to optimized transitions when the right conditions match. Each FX gets to decide what transition optimizations it supports, eg. single FX call for color/palette blending, smooth settings changes, etc. -- only for certain blending styles, or only if the segment size hasn't changed, and so on and so forth. The neat implementation trick is that these optimized blendings can be "just another bit of state" in the transition_instructions that adjusts the segment parameters before the FX renderer is called.

You may note I am glossing over what happens to FX state data during a transition. Truthfully, we need to move past allocateData() and on to proper FX setup/copy/move routines; @DedeHai's particle system clearly demonstrates the need for and value of having more structure there. Letting each FX setup know about the type of, and interact with, the data from the previous FX state (copy it, move it, or whatever makes sense!) offers a number of opportunities for improved displays. (And will also likely offer yet more small FX speedups, as state data checking can be moved to Segment creation time.)

@willmmiles if you have the time please take a look. There are 3 relevant commits.

Thanks, I'll take a look when I've got time to do a proper review - probably won't be until later in the week. I'm rather behind at the moment. :(

@blazoncek
Collaborator Author

I wish I had more time to spend

That's why all my replies tend to be short. 😄

I'm suggesting implementing all transitions as running the effect twice with different settings, and using dynamic blending to actually effect the visible transition.

That's how it is implemented in my branch.

  • new (or current) effect is run (covering only non-clipped pixels), filling the segment's pixel buffer
  • if the segment is in transition, the old effect is run (inverting the clipped region), filling the (remaining) segment's pixel buffer
  • all segments are processed/rendered using the two steps above (if they are in transition)
  • frame buffer is cleared
  • segments are blended into the frame buffer if they are active, starting with segment 0
  • frame buffer is sent to NeoPixelBus after undergoing ledmap remapping

Clipping of pixels happens in Segment::setPixelColor(). Segment::setPixelColor() no longer calls WS2812FX::setPixelColor(); it only fills the segment's pixel buffer. On the other hand, Segment::getPixelColor() returns the value from the pixel buffer.

All of the magic happens in WS2812FX::service(), with a bit of help from Segment::setClippingRect().
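
A simplified sketch of that Segment::setPixelColor() (the branch code is authoritative):

void Segment::setPixelColor(int i, uint32_t col) {
  if (isPixelClipped(i)) return; // transition clipping decides which effect run owns this pixel
  pixels[i] = col;               // no mirroring/reversing/grouping here; that happens at blend time
}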

Caveat: This approach may interfere with live data (i.e. DDP, Art-Net, E1.31) as that still directly writes to NPB.

@blazoncek
Collaborator Author

And ... it works!!!!

@DedeHai
Collaborator

DedeHai commented Feb 20, 2025

I tested the layer blending today, awesome work!
It really adds much value, as effects can be combined properly, and using "multiply" even masking is possible: when setting the top FX color to white, it acts as a mask to the colors drawn below.
Here are a few things I found (tested on C3, 16x16, 2D only):

  • FPS drop due to blend function is almost negligible: 62FPS (main) vs. 58FPS (layer-blend branch) with 3 overlapping segments
  • in the blend functions, the /255 could be replaced by >>8; although less accurate, it makes a speed difference, at least on the C3: 66 FPS vs. 69 FPS in my test when using multiply with shifts, and I saw no visual difference. Could be an #ifdef for the C3 (see [INFO] Code execution speed considerations for developers #4206, and the snippet after this list)
  • partial overlap is handled flawlessly
  • Mirroring does not work yet (it cuts the segment size in half)
  • Freeze does not work yet, it just turns a segment black
  • Fade transition renders an FX twice, which is incorrect IMHO. Some FX speed up (Android for example) and others show visible artefacts: try Octopus, then change palette -> the double-rendering is visible
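
For reference, the two scalings plus a common compromise (a standalone snippet, not code from the branch):

uint8_t mul_div255(uint8_t a, uint8_t b) { return uint8_t((a * b) / 255); } // exact, but divides
uint8_t mul_shift8(uint8_t a, uint8_t b) { return uint8_t((a * b) >> 8); }  // off by at most 1, single shift
// exact division by 255 using shifts and adds only (valid for products of two 8-bit values)
uint8_t mul_scale8(uint8_t a, uint8_t b) { unsigned p = a * b; return uint8_t((p + 1 + (p >> 8)) >> 8); }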

We briefly discussed the white channel issue on Discord: the question is still whether the buffers should be 32-bit or 24-bit. From my point of view, using RGB instead of RGBW buffers:
Advantages:

  • smaller buffers, less RAM, faster calculations (3 vs. 4 bytes per calculation)
  • particle system uses CRGB buffers for speed and could use segment buffers instead (would need to test if CRGBW affects speed much)
  • faster blending/blurring
  • direct 8-bit array access instead of using R(x) G(x) B(x)

Disadvantages:

  • zero or equal checking is slower, i.e. col==0 or col1==col2
  • RGBW strips may be more difficult to handle (not entirely clear to me yet)

@blazoncek
Collaborator Author

@DedeHai mirroring has (hopefully) been fixed.
As for speed optimisations: they are due later when functionality is flawless.
For effect transition bugs: We will handle them elsewhere.

@blazoncek
Collaborator Author

blazoncek commented Feb 26, 2025

Freeze & double effect run fixed.
https://github.com/blazoncek/WLED/tree/remove-global

@blazoncek
Collaborator Author

@willmmiles @DedeHai @TripleWhy my recent changes (to prevent running effect twice in FADE transition mode if it didn't change) prompted me to a new thinking about temporary/transition data.

Since the blending of segments is now relatively simple and quick (in my POC), it would make sense to completely ditch the TemporarySegmentData and Transition structures (and swapSegenv() and restoreSegenv()) and rely on segment copies instead. This would make the drawing logic simple (always draw the entire segment) and only do clipping when blending segments.

The workflow would be as follows (a condensed sketch follows the list):

  • when starting a transition, make a copy of the segment (using copy constructor/assignment) [json.cpp]
    • store this copy's address either in strip or the parent segment (for reference) if needed
  • update the segment with new values (color, palette, effect, etc.) [json.cpp]
  • when servicing the strip, check if the segment is in transition (has a corresponding copy) [FX_fcn.cpp]
    • select the segment into SEGENV and SEGMENT (as currently implemented)
    • run the effect function (using the current segment)
    • temporarily select the segment copy into SEGENV and SEGMENT (if the segment has a copy)
    • run the effect function (using the copy of the segment, i.e. run the old effect)
  • (when blending segments) blend the segment and its copy using _top() or _bottom() depending on isPixelClipped() state
  • when the transition ends, destroy the segment copy
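
A condensed sketch of the servicing step (transitionCopy is an illustrative member name):

// inside WS2812FX::service(), per segment
seg.runEffect();                          // current effect fills seg's own pixel buffer
if (Segment *old = seg.transitionCopy) {  // segment is in transition
  old->runEffect();                       // old effect fills the copy's pixel buffer
}
// at blend time, _top()/_bottom() pick seg or its copy per pixel via isPixelClipped()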

The drawback is increased RAM usage, as the entire segment is copied, including its data and pixels (this may allow some really funny transitions though), but the benefit is a simplified and quick drawing process.

Did I write this clearly enough? Does it make sense? What are your thoughts?

@DedeHai
Collaborator

DedeHai commented Feb 27, 2025

My understanding is that the temporary segment already does most of that, right? At least in the PS, I can run two systems in parallel during transitions, meaning the data is already copied, and all its properties too. So the big difference in your proposal is not to name it temporary but copy, and also to copy its buffer, which I thought is also needed for transitions.
In a nutshell: I am fully in favour of making it simpler if there are no major drawbacks.
If you change anything in the behaviour: the calls from json.cpp should not be asynchronous but in sync with service(). Currently there are some bugs where temporary data is destroyed while an FX is accessing it; in general this is not a huge issue and goes mostly unnoticed, but in the PS it leads to a crash, as it will access dangling pointers.

@blazoncek
Collaborator Author

There is an issue when painting a segment while in transition and the transition mode is push style.
During each setPixelColor() the pixel coordinate has to be shifted so that painting occurs at the right location. This needlessly slows down painting. It also produces an odd visual outcome if the effect remains unchanged.

With a segment copy, each iteration of the effect runs normally, as it would if there were no transitions. It should paint its own pixel buffer, not needing to care what lies underneath (or above) or whether it is shifted. The effect should not be aware it is in transition, as your current PS is (which I opposed).

Once segment blending is invoked (during show()), if a segment is in transition it will blend the old segment into the frame buffer and then the new segment. What I haven't figured out ATM is how to choose the blend mode for blending these two, as they are blended into the frame buffer directly. For PIXEL_DUST and all SWIPE modes it is relatively simple, but PUSH and FADE modes pose a challenge.

the calls from json.cpp should not be asynchronous but in sync with service()

This is already in place, check the code above.

@willmmiles
Member

Since the blending of segments is now relatively simple and quick (in my POC), it would make sense to completely ditch the TemporarySegmentData and Transition structures (and swapSegenv() and restoreSegenv()) and rely on segment copies instead. This would make the drawing logic simple (always draw the entire segment) and only do clipping when blending segments.

Yes, this is what I was proposing upthread! Transitions can be implemented as temporary Segments copying the original state. (Although I would implement this as the existing segment becomes the transition, and the new state is initialized as the "copy" -- since on an FX change, the old state buffer is relevant only to the temporary transition segment.)

What I haven't figured out ATM is how to choose the blend mode for blending these two, as they are blended into the frame buffer directly. For PIXEL_DUST and all SWIPE modes it is relatively simple, but PUSH and FADE modes pose a challenge.

Consider a function:

void WS2812FX::renderPixelBuffer(
    uint32_t* pixels, size_t rows, size_t cols,  /* source buffer; rows/cols is # of pixels to render from buffer */
    size_t row_size,  /* of source buffer only; allows arbitrary start point selection */
    uint16_t x_start, int x_stride, size_t x_grouping, /* spacing implemented by stride; can be negative for reverse */
    uint16_t y_start, int y_stride, size_t y_grouping,
    bool transpose,
    blend_mode_t blend_mode,
    uint8_t opacity,
    bool* mask /* or equivalent, for pixel_dust; could use a deterministic calculation instead */
);
// NOTE! mirroring is not an argument to the above
// Mirroring is implemented by rendering the same source buffer more than once, 
// with opposite start and reversed sign on stride

Given this function, you can implement FADE by passing in modified opacities, while SWIPE and PUSH are done by computing modified output coordinate arguments and/or supplying &pixels[nonzero][nonzero] to the source argument.
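
For instance, FADE might be expressed as two calls with complementary opacities (hypothetical call sites; progress and BLEND_NORMAL are illustrative names):

// FADE: both runs target the same coordinates; only the opacity differs,
// shifting from 0 to 255 over the course of the transition
renderPixelBuffer(oldPixels, rows, cols, row_size, 0, 1, 1, 0, 1, 1, false, BLEND_NORMAL, 255 - progress, nullptr);
renderPixelBuffer(newPixels, rows, cols, row_size, 0, 1, 1, 0, 1, 1, false, BLEND_NORMAL, progress, nullptr);
// PUSH: full opacity on both runs; instead, x_start (and/or the &pixels[...] source
// offset) varies over time so the old content slides out while the new slides in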

Does this make sense?

the calls from json.cpp should not be asynchronous but in sync with service(). Currently there are some bugs where temporary data is destroyed while an FX is accessing it; in general this is not a huge issue and goes mostly unnoticed, but in the PS it leads to a crash, as it will access dangling pointers.

It's not just json.cpp; all of the network interfaces should get checked. Really, IMO any calls to update the FX state ought to be explicitly mutexed against the render loop. We've gotten away with a lot thus far because only the network interface runs on a separate thread, but I think it'd offer a lot more flexibility and reliability to be rigorous about mutexing state access. Taking a look at this has been on my list for a while but I haven't had much time - got caught up trying to upstream my AsyncTCP work. :( (In fact, I'd like to give serious thought to moving the render loop to its own task in the future. Yes, this can be done on 8266 too, through a couple of mechanisms, not all of which require a separate stack.)

@DedeHai
Collaborator

DedeHai commented Feb 27, 2025

I need to read this a few more times and think about it; I am not sure I understand the old or the new proposed way correctly, but I trust your judgment on it.

The effect should not be aware it is in transition, as your current PS is (which I opposed).

Effects are actually not aware of transitions; just the particle system is, and for a good reason: each effect takes up to 20k of RAM, so if you run transitions, that means 40k for a single (large) segment. Sharing the buffer halves the RAM use, enabling more segments to run PS effects. The PS can easily be adjusted to render each FX to its segment buffer for the same behaviour, but each effect having its own particle buffer will crash an ESP quite quickly.

@blazoncek
Collaborator Author

blazoncek commented Feb 27, 2025

Untested POC: https://github.com/blazoncek/WLED/tree/unified-blend

EDIT: I think the POC is ready for testing.
EDIT2: Apparently it works.

@TripleWhy
Contributor

it would make sense to completely ditch the TemporarySegmentData and Transition structures (and swapSegenv() and restoreSegenv())

Awesome!

and rely on segment copies instead.

Not so awesome..? ^^

I believe segment copies are currently only used to detect if something changed after applying a JSON command. So I'd be careful, as (I assume) the copy support has never been tested to a functional degree. Also note that segments contain a lot of information.

The drawback is increased RAM usage, as the entire segment is copied

Is it though? You already create a copy when applying changes via JSON (which I believe covers all changes?). What if you extended the lifetime of this copy that already exists?

Btw. where would you store such a copy? In the segment, I assume? If you do that, make sure you do not copy the copy when making a copy; otherwise you could end up with a lot of copies.

I have actually considered the opposite before: making Segment uncopyable, in my modular effect classes research. I did not get rid of TemporarySegmentData, but I did get rid of the swap functions. If you were to move the complete state of an effect into the effect (instead of reading it from the config each frame in each effect), TemporarySegmentData should be unnecessary, though. The idea there was a little different: Segment has a (unique) Effect pointer; when something changes, the Effect isn't immediately destroyed but moved to an oldEffect pointer, then destroyed later when the transition is done. Now you have two effects that you can use independently, similar to your approach, but without copies, and with smaller types.

but PUSH and FADE modes pose a challenge.

If only there was a solution that made this simple ;)
