audio_i2sin module for espressif and raspberrypi #10990

Open
FoamyGuy wants to merge 6 commits into adafruit:main from FoamyGuy:audio_i2sin

Conversation

@FoamyGuy
Collaborator

@FoamyGuy FoamyGuy commented May 8, 2026

Adds audio_i2sin.I2SIn class. Only enabled/implemented for espressif and raspberrypi ports currently.

Two new manual test scripts are included in the PR that were used to verify the functionality of the module: one records to an SD card, and the other drives sound-reactive neopixels.

Sparkle Motion board def is updated to include the pins that the I2S mic is connected to.

Testing:

  • Tested with mic built-in to sparkle motion and with https://www.adafruit.com/product/6049
  • Record to sdcard works successfully on the espressif port. On raspberrypi, writing data to the sdcard seems to take too long, which makes the recording run long and come out broken up. All testing was with SPI and sdcardio. Might be worth testing sdio?
  • Sound reactive neopixels work under espressif and raspberrypi. I tested with sparkle motion, ESP32 Huzzah feather, ESP32 S3 TFT feather, RP2040 feather, and RP2350 feather.

Member

@tannewt tannewt left a comment

Code seems fine. Do we want it to be audio_i2sin instead of audioi2sin? I think the latter is more Python-style and more what we've done in the core. Python reference: https://peps.python.org/pep-0008/#package-and-module-names

@relic-se

relic-se commented May 12, 2026

Starting to play around with this module. I see there's no option to support signed samples. The audio codec I'm testing with only supports signed samples (or I haven't implemented it in the driver library). I think it'd be beneficial to add that option to the constructor rather than require the user to perform the conversion within Python (which I haven't figured out how best to do yet).

An example of a samples_signed property would be within audiomixer.Mixer (https://docs.circuitpython.org/en/latest/shared-bindings/audiomixer/index.html#audiomixer.Mixer).

Here's the code I'm working with right now with a Pimoroni Pico Plus 2, TLV320AIC3204 audio codec, and SPW2430 microphone.

import array
from audioi2sin import I2SIn
import board
import ulab.numpy as np

import relic_tlv320aic3204

# Initialize codec
codec = relic_tlv320aic3204.TLV320AIC3204(
    i2c=board.STEMMA_I2C(),
    mclk=board.GP17,
    rst=board.GP16,
)
codec.sample_rate = 44100
codec.bit_depth = 16

# Initialize I2S bus
i2sin = I2SIn(
    bit_clock=board.GP18,
    word_select=board.GP19,
    data=board.GP21,
    sample_rate=codec.sample_rate,
    bit_depth=codec.bit_depth,
    mono=True,
)

# Connect IN1L to Left MICPGA and IN1R to Right MICPGA
codec.connect_input(relic_tlv320aic3204.INPUT_1, relic_tlv320aic3204.IMPEDANCE_20K)
codec.input_gain = 6.0  # dB

# Setup ADC Input
codec.dac_enabled = True  # BUG: DAC must be enabled for ADC functionality
codec.adc_volume = 0.0  # dB
codec.adc_enabled = True
codec.adc_muted = False

# Setup buffer
buf = array.array("H", [0] * (codec.sample_rate // 16))

while True:
    
    # Record to buffer
    i2sin.record(buf, len(buf))
    
    # Determine maximum level
    print(np.max(np.array(buf, dtype=np.uint16)))

Right now, the level is all wonky because of the issue with signedness.
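
To illustrate why the level looks wrong: the samples arrive as signed 16-bit values, but the "H" typecode makes Python read the same bytes as unsigned, so a quiet negative sample such as -5108 shows up as 60428 and dominates the maximum. Below is a minimal sketch of reinterpreting such a buffer in desktop Python. The helper name `reinterpret_as_signed16` is hypothetical, and `tobytes`/`frombytes` are CPython `array` methods that are assumed here and may not exist in every CircuitPython build.

```python
import array


def reinterpret_as_signed16(unsigned_buf):
    """Return a copy of an "H" array with the same bytes read as "h"."""
    signed = array.array("h")
    signed.frombytes(unsigned_buf.tobytes())
    return signed


# 60428 is the unsigned reading of the signed sample -5108 (60428 - 65536).
raw = array.array("H", [0, 60428, 65535])
signed = reinterpret_as_signed16(raw)
print(list(signed))  # [0, -5108, -1]

# Peak level measured on the signed view rather than the raw unsigned one.
peak = max(abs(s) for s in signed)
print(peak)  # 5108
```

Measuring the peak on the signed view gives 5108 instead of 65535, which is closer to what a quiet room should produce.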

@FoamyGuy
Collaborator Author

@relic-se The latest commit adds samples_signed argument that is modeled after audiomixer. It defaults to True which is the same as prior behavior and can now be set to False to change behavior. I tested the default case on esp32s3 and rp2350 successfully. I'm not sure I have the right kind of mic breakout to test the new behavior.

@FoamyGuy
Collaborator Author

@tannewt this is refactored to audioi2sin with no underscore in the latest commits. I agree it follows the existing modules naming pattern better this way, only a few have underscores.

@relic-se

I was beginning to do a deeper review of the recent samples_signed addition, but I didn't realize that I2S data is always signed. I think it might be best to omit that property and assume that all data will be signed going forward. Speaking of, is there any reason why unsigned data was the default from the beginning? Or is that a hold-over from PDMIn?

@FoamyGuy
Collaborator Author

@relic-se I believe that assuming signed data was the original behavior. Handling unsigned data was only added with this most recent commit and is only achieved by explicitly passing samples_signed=False. Leaving the default value True results in the same behavior from before that commit where data is assumed to be signed.

I can remove that new argument if it's not necessary to handle unsigned data. I don't have much experience with these (or really any) mics yet, so I am not sure what is normal or which alternatives are used.

@relic-se

I believe my confusion stems from the use of array.array objects with unsigned typecodes to handle the data ("B", "H", and "I"). So, the data may be coming through as signed but it is being injected into a byte array which treats it as unsigned. Then, we are requiring the user to handle the conversion of that data back to its intended form.

I initially thought your original implementation only handled unsigned data which is why I suggested the samples_signed property (ie: I thought the default was samples_signed=False).

Unless I am mistaken on something, the data should always be signed and the user should not be concerned with fixing the signedness of the data. Hence, we should be using the formats "b", "h", and "i". Is there a limitation that I am not aware of regarding this?

@relic-se

So, I can now see some benefit of the conversion implemented in i2sin_convert_to_unsigned for users who specifically are trying to feed into an unsigned destination (WAV file, etc), and I understand that the bus is always treated as signed regardless of the samples_signed property.

I still think that the following changes should be implemented:

  • If samples_signed=True, the required typecodes should be "b", "h", and "i" depending on bit depth.
  • If samples_signed=False, the required typecodes should be "B", "H", and "I" depending on bit depth.

Currently, the module requires unsigned typecodes regardless of samples_signed.
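
The proposed rule can be sketched in plain Python as follows. This is an illustration of the suggestion, not the module's actual validation code: the helper names `required_typecode` and `check_buffer` are hypothetical, and the mapping of 8, 16, and 32 bits to "b", "h", and "i" is an assumption drawn from the typecodes listed above.

```python
import array

# Assumed mapping from bit depth to the signed array typecode.
_SIGNED_TYPECODES = {8: "b", 16: "h", 32: "i"}


def required_typecode(bit_depth, samples_signed=True):
    """Return the typecode a record buffer would need under the proposal."""
    try:
        code = _SIGNED_TYPECODES[bit_depth]
    except KeyError:
        raise ValueError("unsupported bit_depth: %r" % (bit_depth,))
    # Unsigned typecodes are the uppercase form of the signed ones.
    return code if samples_signed else code.upper()


def check_buffer(buf, bit_depth, samples_signed=True):
    """Raise TypeError if buf does not use the expected typecode."""
    expected = required_typecode(bit_depth, samples_signed)
    if buf.typecode != expected:
        raise TypeError(
            "buffer typecode %r does not match expected %r" % (buf.typecode, expected)
        )


check_buffer(array.array("h", [0] * 4), bit_depth=16, samples_signed=True)
check_buffer(array.array("H", [0] * 4), bit_depth=16, samples_signed=False)
```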

I can provide a full code review if that is beneficial.

@FoamyGuy
Collaborator Author

@relic-se yeah, your review would be appreciated. I'll look into the typecodes thing that you mentioned and see about changing that as suggested.

@tannewt
Member

tannewt commented May 13, 2026

I'll do a final review after @relic-se does theirs.

@relic-se

@FoamyGuy Is bit_depth=8 not planned for support on the raspberrypi implementation?

@FoamyGuy
Collaborator Author

@relic-se my primary goal was to unlock support for mics built-in to Sparkle Motion devices and this other breakout: https://www.adafruit.com/product/6049. The aim is to support audio reactive neopixel stuff and audio recording, so that a project like these walkie talkies: https://learn.adafruit.com/esp-now-walkie-talkies can work under CircuitPython.

I'm not opposed to ultimately having more functionality but enabling those things was the main focus of the current implementation.

I'm not sure how bit depth works exactly. Would using bit depth 8 require specific microphone hardware, or can any mic support it?

@relic-se

@FoamyGuy

> I'm not sure how bit depth works exactly. Would using bit depth 8 require specific microphone hardware, or can any mic support it?

I am not aware of any I2S microphones which are limited to or support 8-bit data. I only know of some audio codecs which support that format.

I am only asking this question because the espressif implementation has 8-bit support while the raspberrypi hal does not. I wasn't sure if that lack of functionality on one platform was a concern. It could be added by including another set of PIO programs and making some code alterations, but I am not opposed to leaving that support omitted. This is especially true if flash size is a concern on raspberrypi.

> @relic-se my primary goal was to unlock support for mics built-in to Sparkle Motion devices and this other breakout: https://www.adafruit.com/product/6049. The aim is to support audio reactive neopixel stuff and audio recording, so that a project like these walkie talkies: https://learn.adafruit.com/esp-now-walkie-talkies can work under CircuitPython.

I think that scope is correct for this module. I do have an ICS-43434 module and plan on conducting some tests with that device once we've resolved these array format issues.


@relic-se relic-se left a comment

Most of my comments are related to bit depth and signedness. Currently, I think there is more work to be done to make it so that it doesn't have to be fixed within user code. All users should have to worry about is ensuring they have the correct bit_depth set for their microphone.

On that note, I could see an output_bit_depth option being added which could enable the core to handle conversion between 24-bit and 16-bit samples (for instance) and just give the user back their data in an "h" array where they can write it straight to WAV or output to an I2S speaker, etc. A default value of output_bit_depth=None would respect the provided bit_depth property.
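
As a rough illustration of what such a conversion would do, here is a sketch in desktop Python. It assumes the wider samples arrive left-justified in signed 32-bit containers (an "i" array), so that the top 16 bits carry the most significant part of each sample; that layout is an assumption, not something confirmed for this module, and `narrow_to_16_bit` is a hypothetical helper name.

```python
import array


def narrow_to_16_bit(samples32):
    """Keep the top 16 bits of each left-justified 32-bit sample."""
    # Python's >> on negative ints is an arithmetic shift, so the sign is kept.
    return array.array("h", [s >> 16 for s in samples32])


wide = array.array("i", [0, 65536, -65536, -327680, 2147483647, -2147483648])
narrow = narrow_to_16_bit(wide)
print(list(narrow))  # [0, 1, -1, -5, 32767, -32768]
```

The resulting "h" array could then be written out as 16-bit PCM without the user keeping a second buffer around.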

Comment thread shared-bindings/audioi2sin/I2SIn.c Outdated

// I2S delivers signed PCM. When samples_signed is false, XOR each sample with
// the sign bit for its width to convert to unsigned PCM (WAV convention).
static void i2sin_convert_to_unsigned(void *buffer, uint32_t samples,

For better performance, only process as 32-bit words rather than casting buffer to relevant word size. See shared-module/audiomixer/Mixer.c for reference.

static inline uint32_t tounsigned8(uint32_t val) {

Collaborator Author

updated to use the words approach instead of samples in the latest commit.
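
For readers following along, the difference between the per-sample and the word-based approach can be sketched in desktop Python. This is not the C code from the PR: the function names are hypothetical, and the sketch only shows that XORing a 32-bit word with 0x80008000 flips the sign bit of both packed 16-bit samples at once, with a leftover odd sample handled separately.

```python
import array
import struct


def to_unsigned16_per_sample(samples):
    """XOR each 16-bit sample with its sign bit (0x8000)."""
    return array.array("H", [(s & 0xFFFF) ^ 0x8000 for s in samples])


def to_unsigned16_by_words(samples):
    """Same conversion, flipping two samples per 32-bit word."""
    raw = samples.tobytes()
    n_words = len(raw) // 4
    body = raw[: n_words * 4]
    tail = raw[n_words * 4:]  # 0 or 2 leftover bytes (one odd sample)

    # "=" selects native byte order with standard sizes, matching array's layout.
    words = struct.unpack("=%dI" % n_words, body)
    flipped = struct.pack("=%dI" % n_words, *(w ^ 0x80008000 for w in words))

    if tail:
        (last,) = struct.unpack("=H", tail)
        flipped += struct.pack("=H", last ^ 0x8000)

    out = array.array("H")
    out.frombytes(flipped)
    return out


signed = array.array("h", [-32768, -1, 0, 1, 32767])
print(list(to_unsigned16_by_words(signed)))  # [0, 32767, 32768, 32769, 65535]
```

Both functions produce the same result; the word-based one simply touches half as many elements for 16-bit data.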

Nice catch on processing the tail end of the data, but any thoughts on requiring buffer lengths that are multiples of 2 for 16-bit and 4 for 8-bit so that you wouldn't have to deal with that?
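
The length requirement suggested here amounts to asking that the buffer fill a whole number of 32-bit words. A small sketch of that check, with hypothetical helper names and the assumption that only 8, 16, and 32-bit depths are in play:

```python
def samples_per_word(bit_depth):
    """Number of samples that fit in one 32-bit word."""
    if bit_depth not in (8, 16, 32):
        raise ValueError("unsupported bit_depth: %r" % (bit_depth,))
    return 32 // bit_depth


def is_word_aligned(sample_count, bit_depth):
    """True if sample_count samples fill whole 32-bit words with no tail."""
    return sample_count % samples_per_word(bit_depth) == 0


print(is_word_aligned(128, 16))  # True: 128 is a multiple of 2
print(is_word_aligned(6, 8))     # False: 6 is not a multiple of 4
```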

Comment thread shared-bindings/audioi2sin/I2SIn.c
Comment thread tests/circuitpython-manual/audioi2sin/i2sin_neopixel_reactive.py Outdated
Comment thread tests/circuitpython-manual/audioi2sin/i2sin_neopixel_reactive.py Outdated
Comment thread ports/raspberrypi/common-hal/audioi2sin/I2SIn.c Outdated
Comment thread ports/raspberrypi/common-hal/audioi2sin/I2SIn.c Outdated
Comment thread ports/raspberrypi/common-hal/audioi2sin/I2SIn.c
Comment thread shared-bindings/audioi2sin/I2SIn.c Outdated
Comment thread shared-bindings/audioi2sin/I2SIn.c Outdated
@FoamyGuy
Collaborator Author

@relic-se Thank you! I think I have made all of the requested changes. Let me know if I have missed anything or changes don't align with what you had in mind.

I successfully re-tested the reactive neopixels on both RP2350 and esp32s3 and the sdcard recording on the esp32s3 after these changes.

@relic-se

> @relic-se Thank you! I think I have made all of the requested changes. Let me know if I have missed anything or changes don't align with what you had in mind.

Looking good so far! Any thoughts on the output_bit_depth property that I mentioned at the start of the review? I could even see that being an optional argument in I2SIn.record to make something like mic.record(buf, len(buf), bit_depth=16). That way, you could transfer the data from a 24-bit I2S mic straight into a WAV file without having to handle two arrays like in i2sin_record_sdcard.py.

> I successfully re-tested the reactive neopixels on both RP2350 and esp32s3 and the sdcard recording on the esp32s3 after these changes.

My audio codec example commented above is now working more as expected. I've got more testing to do to validate audio quality, but so far so good!

@relic-se

In the following program, I'm taking a 128-frame sample of data recorded at 16-bit 44.1kHz after a second of delay (to warm up the bus). (Note: the environment I'm in is relatively quiet.)

import array
import time
from audioi2sin import I2SIn

mic = I2SIn(sample_rate=44100, bit_depth=16, mono=True)  # excludes pins
time.sleep(1)  # give the bus a second to warm up
buf = array.array("h", [0] * 128)
mic.record(buf, len(buf))
print(", ".join([str(x) for x in buf]))

With that, I'm getting the following data:

0, 0, 0, 0, 0, -5108, -5116, -5116, -5116, -5103, -5103, -5113, -5113, -5113, -5181, 0, 0, 0, 0, 0, 0, -5121, 0, 0, -5016, 0, 0, -4961, 0, 0, 0, 0, 0, 0, 0, 0, 0, -5141, -5141, -5109, -5109, -5232, -5232, -5232, -5216, -5216, -5078, 0, 0, 0, 0, 0, 0, -5013, 0, 0, -5212, 0, 0, -5137, 0, 0, 0, 0, 0, 0, 0, 0, 0, -5308, -5308, -5213, -5213, -5113, -5113, -5113, -5228, -5228, -5069, 0, 0, 0, 0, 0, 0, -5072, 0, 0, -5180, 0, 0, -5202, 0, 0, 0, 0, 0, 0, 0, 0, 0, -5152, -5152, -4963, -4963, -4963, -5032, -5032, -5141, -5141, -5141, 0, 0, 0, 0, 0, 0, -5392, 0, 0, -5374, 0, 0, -5190, 0, 0, 0, 0

The same demo but using the pio_i2s library results in the following data:

-5125, -5137, -5137, -4934, -4934, -4934, -5060, -5060, -5038, -5038, -5038, -5102, -5102, -5265, -5265, -5265, -5165, -5165, -5096, -5096, -5096, -5055, -5055, -5127, -5127, -5127, -5159, -5159, -5174, -5174, -5174, -5232, -5232, -5272, -5272, -5272, -5161, -5161, -5225, -5225, -5105, -5105, -5105, -5264, -5264, -5264, -5264, -5264, -5025, -5025, -5073, -5073, -5073, -5055, -5055, -5053, -5053, -5053, -5089, -5089, -4960, -4960, -4960, -4962, -4962, -5025, -5025, -5025, -5167, -5167, -5157, -5157, -5157, -5072, -5072, -5179, -5179, -5179, -5104, -5104, -4995, -4995, -5000, -5000, -5000, -5076, -5076, -5068, -5068, -5068, -5086, -5086, -5014, -5014, -5014, -4995, -4995, -5091, -5091, -5091, -5110, -5110, -5132, -5132, -5132, -5221, -5221, -5039, -5039, -5039, -5046, -5046, -5131, -5131, -5131, -5060, -5060, -5169, -5169, -5130, -5130, -5130, -5189, -5189, -5244, -5244, -5244, -5118

The codec I'm using also supports 32-bit data. With that, I'm getting this result:

0, 0, 0, -325218779, 0, 0, 0, 0, 0, 0, -429075753, 0, 0, 0, 0, 0, 0, -247978860, 0, 0, 0, 0, 0, 0, -104241327, 0, 0, 0, 0, 0, 0, -281169933, 0, 0, 0, 0, 0, 0, -543164721, 0, 0, 0, 0, 0, 0, -429250668, 0, 0, 0, 0, 0, 0, -265447697, 0, 0, 0, 0, 0, 0, -199318291, 0, 0, 0, 0, 0, 0, -400510205, 0, 0, 0, 0, 0, 0, -405947633, 0, 0, 0, 0, 0, 0, -242857100, 0, 0, 0, 0, 0, -206727305, -249618896, 0, 0, 0, 0, 0, -374861202, -374861202, 0, 0, 0, 0, 0, -524915452, -524915452, 0, 0, 0, 0, 0, -438181383, -438181383, 0, 0, 0, 0, 0, -346484870, -357297258, 0, 0, 0, 0, 0, -499938737, -603335765, 0, 0, 0, 0, 0

And using pio_i2s once again:

-353766120, -353766120, -348990385, -348990385, -351100251, -351100251, -351100251, -352504128, -352504128, -341765360, -341765360, -341765360, -363757221, -363757221, -345549800, -345549800, -345549800, -356492102, -356492102, -343449048, -343449048, -343449048, -354787394, -354787394, -363637240, -363637240, -363637240, -359110902, -359110902, -362764221, -362764221, -362764221, -348507130, -348507130, -364833200, -364833200, -364833200, -353990207, -353990207, -355420971, -355420971, -358053155, -358053155, -358053155, -352787181, -352787181, -347310581, -347310581, -347310581, -354148546, -354148546, -361199427, -361199427, -361199427, -350211184, -350211184, -379151245, -379151245, -379151245, -337054398, -337054398, -381178281, -381178281, -381178281, -336496161, -336496161, -364800491, -364800491, -364800491, -352118008, -352118008, -366252565, -366252565, -366252565, -347807695, -347807695, -366585660, -366585660, -347415251, -347415251, -347415251, -365918333, -365918333, -341399555, -341399555, -341399555, -361310981, -361310981, -339256388, -339256388, -339256388, -357430132, -357430132, -354700255, -354700255, -354700255, -353860629, -353860629, -356062366, -356062366, -356062366, -349915709, -349915709, -343826813, -343826813, -343826813, -353093656, -353093656, -341844731, -341844731, -341844731, -366945707, -366945707, -340834174, -340834174, -387779050, -387779050, -387779050, -352782530, -352782530, -367289857, -367289857, -367289857, -350193249, -350193249, -367136044, -367136044, -367136044

Looks like audioi2sin.I2SIn is dropping a lot of frames of data. Not sure of the cause just yet.
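
One way to put a number on the gaps is to count zero-valued samples and the longest run of them. The sketch below does that in plain Python on the first 16 values of each 16-bit capture above. Treating an exact zero as a dropped frame is only a proxy (a truly silent input could also read zero), and `dropout_stats` is a hypothetical helper name.

```python
def dropout_stats(samples):
    """Return (zero_count, longest_zero_run, zero_fraction) for a buffer."""
    zero_count = 0
    longest_run = 0
    current_run = 0
    for s in samples:
        if s == 0:
            zero_count += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:
            current_run = 0
    fraction = zero_count / len(samples) if len(samples) else 0.0
    return zero_count, longest_run, fraction


# First 16 values of the 16-bit I2SIn capture quoted above.
i2sin_head = [0, 0, 0, 0, 0, -5108, -5116, -5116, -5116, -5103,
              -5103, -5113, -5113, -5113, -5181, 0]
# First 16 values of the 16-bit pio_i2s capture quoted above.
pio_head = [-5125, -5137, -5137, -4934, -4934, -4934, -5060, -5060,
            -5038, -5038, -5038, -5102, -5102, -5265, -5265, -5265]

print(dropout_stats(i2sin_head))  # (6, 5, 0.375)
print(dropout_stats(pio_head))    # (0, 0, 0.0)
```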

@gamblor21
Member

This may be a future consideration but was there any thought to take the I2S input right to an output? I'm thinking about like a kid's karaoke machine that they sing into and it adds echo or distortion and plays it out right away.

Maybe going to the buffer and feeding that into an audio.play object would work. Just a random thought.

@relic-se

> This may be a future consideration but was there any thought to take the I2S input right to an output? I'm thinking about like a kid's karaoke machine that they sing into and it adds echo or distortion and plays it out right away.
>
> Maybe going to the buffer and feeding that into an audio.play object would work. Just a random thought.

Yes, I think the next step is to implement the audiocore API. I've already made some progress on audiobusio.PDMIn, but it's not ready at this moment.

4 participants