-
Hi @sgazagnes, thanks for giving hls4ml a try and reporting your feedback. In this case, the behavior you see is actually fully expected. In your M1 case (the default), you see the effect of reducing the variable precision after training, i.e. post-training quantization (PTQ). If you continue with the tutorial until part 4, you will see how this gets fixed by using quantization-aware training (QAT) with the QKeras package to train directly at the final model precision. What is happening with your M2 and M3 models is that, with the large variable precision you chose, the output of the dense layer can't be mapped into the narrow range of values of the lookup table that the softmax implementation in hls4ml uses, which basically destroys the precision. Hope this helps,
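For reference, a minimal sketch of what the QAT approach in part 4 of the tutorial looks like with QKeras (the layer sizes and bit widths below are illustrative assumptions, not the tutorial's exact values):

```python
# Minimal QAT sketch with QKeras (illustrative; sizes and bit widths are assumptions).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

model = Sequential([
    # Weights and biases are quantized to 6 bits during training
    QDense(64, input_shape=(16,),
           kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
    QActivation(quantized_relu(6)),
    QDense(5,
           kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
    # The final softmax is left unquantized; hls4ml implements it with a lookup table
    Activation('softmax', name='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```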
-

All of these issues are related to the softmax. You can see changes in the tails if you make any of these changes:
- Adding `config['LayerName']['softmax']['TableSize'] = 4096` to the config.
- Setting `config['LayerName']['softmax']['Implementation'] = 'latency'`.
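For completeness, a sketch of how those tweaks slot into the usual hls4ml conversion flow (assuming a trained Keras model named `model` with a layer named `softmax`; the output directory and FPGA part below are placeholders):

```python
# Sketch of applying the softmax tweaks above during conversion (placeholders noted below).
import hls4ml

# Per-layer granularity so the softmax layer can be configured individually
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Either enlarge the softmax lookup table...
config['LayerName']['softmax']['TableSize'] = 4096
# ...or switch to the latency softmax implementation.
config['LayerName']['softmax']['Implementation'] = 'latency'

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='model_1/hls4ml_prj',      # placeholder
    part='xcu250-figd2104-2L-e',          # placeholder FPGA part
)
hls_model.compile()
```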