ads131m0x: Optimized timer scheduling to attain higher sampling rates #7273
dmbutyugin wants to merge 1 commit into
Signed-off-by: Dmitry Butyugin <dmbutyugin@google.com>
Interesting. I'm a little surprised though, because I'd expect the current interface to generally be better at handling high update rates. I'm guessing that the 2-entry buffer in the adc chip is confusing the scheduling, as it'll sleep the timer for 8/10ths of an update if it detects activity. That amount of time may be too much if there is a buffered entry. If so, though, I think something like the following would be simpler and more scalable:

--- a/src/sensor_ads131m0x.c
+++ b/src/sensor_ads131m0x.c
@@ -172,6 +172,11 @@ ads131m0x_read_adc(struct ads131m0x_adc *ads131m0x, uint8_t oid)
     trigger_analog_update(ads131m0x->ta, raw);
     buffer_append_int32(&ads131m0x->sb, raw);
     ads131m0x_flush_buffer(ads131m0x, oid);
+
+    if (ads131m0x_is_data_ready(ads131m0x)) {
+        sched_wake_task(&wake_ads131m0x);
+        ads131m0x->pending_flag = 1;
+    }
 }

 // Create an ads131m0x sensor

That is, I think one could merge the benefits of high frequency data ready checks with an additional buffer flushing check.
-Kevin
-ads131m0x_crc16_ccitt(const uint8_t *data, uint8_t len)
+ads131m0x_crc16_ccitt(const uint8_t *data, uint_fast8_t len)
 {
     uint16_t crc = 0xFFFF;
     while (len--) {
-        uint8_t x = (crc >> 8) ^ *data++;
+        uint16_t x = (crc >> 8) ^ *data++;
         x ^= x >> 4;
-        crc = (crc << 8) ^ ((uint16_t)x << 12) ^ ((uint16_t)x << 5)
-              ^ (uint16_t)x;
+        crc = (crc << 8) ^ (x << 12) ^ (x << 5) ^ x;
     }
     return crc;
 }
Just FYI, modern MCUs are really slow when doing math on uint8_t and uint16_t. The ARM MCUs don't have 8-bit or 16-bit math operations (like add, subtract, shift, multiply). So, for any math done on a uint16_t or uint8_t, the resulting assembly has to extend the value to a uint32_t, perform the actual math, and then truncate the result back to 8 bits (or 16 bits).
So, in a nutshell, you almost assuredly want to make all these values uint32_t and change the last line to be return crc & 0xffff. Alternatively, if you care about performance on AVR (it's hard to imagine someone hooking one of these sensors up to an old 8-bit AVR cpu, but whatever) then you'll likely want to use uint_fast16_t and still change the last line to return crc & 0xffff.
Cheers,
-Kevin
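As a minimal sketch of the suggestion above, the CRC loop can be written with 32-bit working values. The function names `crc16_ccitt_u32` and `crc16_ccitt_bitwise` are hypothetical and not taken from the PR. One detail this sketch assumes is needed: once `crc` is 32 bits wide, its upper bits are no longer discarded on each pass, so `x` has to be masked to 8 bits before use. A bit-by-bit reference is included only for cross-checking.

```c
#include <stddef.h>
#include <stdint.h>

// CRC-16/CCITT (poly 0x1021, init 0xFFFF) using 32-bit working values.
// Hypothetical name; a sketch, not the code from the PR.
static uint16_t
crc16_ccitt_u32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xffff;
    while (len--) {
        // Bits 16..31 of crc hold stale data, so keep only bits 8..15
        // (XORed with the input byte) in x.
        uint32_t x = ((crc >> 8) ^ *data++) & 0xff;
        x ^= x >> 4;
        crc = (crc << 8) ^ (x << 12) ^ (x << 5) ^ x;
    }
    // Truncate once at the end instead of on every operation.
    return crc & 0xffff;
}

// Straightforward bit-by-bit version of the same CRC, for comparison only.
static uint16_t
crc16_ccitt_bitwise(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xffff;
    while (len--) {
        crc ^= (uint16_t)((uint16_t)*data++ << 8);
        for (int i = 0; i < 8; i++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Calling `crc16_ccitt_u32((const uint8_t *)"123456789", 9)` should give the standard check value 0x29B1 for this CRC variant. Whether the 32-bit version is actually faster on a given MCU would still need to be measured.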
Alas, this change alone seems to be insufficient: it still reports occasional buffer overflows on my setup when I request 8 kSPS, as I described above. I suppose this is because of the lines
Basically, this part of the event handler will report overflow errors even though a pending flag may be set by an attempt to reschedule a task by
I'm sorry, I did not get what you meant by that. The current mainline code does indeed do fast, high-frequency data ready checks (when the data_ready signal is not raised by the sensor), and the same is true in my proposal.
Is the issue that it is reporting unnecessary warnings while the data is actually there, or is the issue that it can't keep up with the desired data rate? That is, if you dump 10 seconds of data, do you get a bunch of warnings while having the full 80000 samples - or are you getting 79000 samples which indicates a lot of losses? If the issue is just unnecessary warnings, you might want to loosen the warning - for example, something like:
So, basically only report a possible_overflow if the timer detects 1.6 update times without fifo clear (0.8+0.4+0.4).
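The loosened threshold described above (0.8 + 0.4 + 0.4 = 1.6 update times) could look roughly like the following self-contained sketch. Everything here is an assumption made for illustration: the `poll_state` struct, the function names, the idle polling interval of one tenth of a period, and the use of "tenths of a sample period" as the time unit. It is not the snippet from the original comment and not Klipper's actual timer code.

```c
#include <stdint.h>

// Hypothetical polling state; field names are illustrative only.
struct poll_state {
    uint8_t pending;              // a read was requested but has not finished
    uint8_t stale_polls;          // timer wakeups that found `pending` still set
    uint32_t possible_overflows;  // loosened warning counter
};

enum {
    POLL_TENTHS = 1,   // idle polling interval (assumed value)
    REST_TENTHS = 8,   // sleep after requesting a read: 0.8 of a period
    RETRY_TENTHS = 4,  // grace re-check interval: 0.4 of a period
    GRACE_POLLS = 2,   // re-checks tolerated before warning
};

// Timer callback sketch. `data_ready` stands in for sampling the sensor's
// data ready signal. Returns the delay until the next wakeup, in tenths
// of a sample period.
static uint32_t
poll_timer_event(struct poll_state *s, int data_ready)
{
    if (s->pending) {
        if (s->stale_polls >= GRACE_POLLS)
            s->possible_overflows++;  // 0.8 + 0.4 + 0.4 = 1.6 periods elapsed
        else
            s->stale_polls++;
        return RETRY_TENTHS;
    }
    if (!data_ready)
        return POLL_TENTHS;
    // Real code would also wake the reading task at this point.
    s->pending = 1;
    s->stale_polls = 0;
    return REST_TENTHS;
}

// Called by the reading task once the sample has been fetched.
static void
poll_read_done(struct poll_state *s)
{
    s->pending = 0;
    s->stale_polls = 0;
}
```

In this sketch the counter only increments on the third consecutive wakeup that finds the read still pending, which is 16 tenths of a period after the read was requested. A slow but successful read that completes before then produces no warning.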
I think the querying scheme used in adxl345 and similar works well for chips with pretty large fifos - it enables the mcu code to sleep for prolonged periods with little chance of an overflow. However, if this sensor only has a one sample buffer then I'm not sure it's a good fit. I have some (relatively minor) concerns:
So, I guess, in general having the constant running timer seems like it's less likely to run into issues. (It is a little unnerving to have both the timer and task modifying
Separately, I wouldn't worry too much about false "possible overflow" reports. I'd be much more concerned about lost messages without a warning. In particular, the host timing won't work well if there are lost messages - so a warning is prudent even if no data was actually lost. Maybe that helps a little,
Thank you for your contribution to Klipper. Unfortunately, a reviewer has not assigned themselves to this GitHub Pull Request. All Pull Requests are reviewed before merging, and a reviewer will need to volunteer. Further information is available at: https://www.klipper3d.org/CONTRIBUTING.html There are some steps that you can take now:
Unfortunately, if a reviewer does not assign themselves to this GitHub Pull Request then it will be automatically closed. If this happens, then it is a good idea to move further discussion to the Klipper Discourse server. Reviewers can reach out on that forum to let you know if they are interested and when they are available. Best regards, PS: I'm just an automated script, not a human being.
The current scheduling appears to be suboptimal in that it may miss samples or report FIFO overruns even when the MCU resources would still allow it to attain the corresponding sampling rate. For example, with the Mellow ALPS board
the current code reports FIFO overflows, while the MCU waketime is still below 60% (see the charts below for waketime):
The updated code can maintain that ~8 kSPS without issues. The new code follows the scheduling conventions used by the accelerometers, e.g. ADXL345 or MPU9250, but it retains the same wakeup times as the other load cell sensors to ensure that new data is read quickly once it becomes available.
Here's the comparison of the MCU wake times and load for the old and the new code:
So, as one can see, the new code results in a slightly higher MCU waketime, but the increase is not very large. And while one can argue that being able to get 8 kSPS is not useful, the changes also mean that less capable MCUs will be able to attain higher sampling rates than before, even if those rates are well below 8 kSPS.