Enable MMU + D-cache: fix sustained host→device WRITE #20

Open
widgetii wants to merge 2 commits into master from feature/mmu-dcache
@widgetii
Member

Summary

Enable ARMv7 MMU with D-cache to fix FIFO overflow during sustained host→device writes.

ARMv7 short-descriptor page tables with 1MB identity-mapped sections:

  • DDR (128MB from RAM_BASE): write-back, write-allocate
  • I/O regions (UART, FMC, CRG, flash window): device/uncached
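For readers unfamiliar with the short-descriptor format, here is a minimal sketch of how such 1MB section entries can be encoded. It assumes TEX remap is disabled (SCTLR.TRE = 0), domain 0 and full read/write access; the helper names (`section_entry`, `map_sections`) and the macro names are hypothetical and not taken from this PR.

```c
#include <stdint.h>

#define SECTION_SHIFT   20u                 /* 1MB sections */
#define NUM_SECTIONS    4096u               /* 4096 x 1MB covers 4GB */

/* First-level section descriptor fields (ARMv7 short-descriptor format). */
#define SEC_TYPE        (1u << 1)           /* bits[1:0] = 0b10: section */
#define SEC_B           (1u << 2)           /* bufferable */
#define SEC_C           (1u << 3)           /* cacheable */
#define SEC_XN          (1u << 4)           /* execute never */
#define SEC_AP_RW       (3u << 10)          /* AP[1:0] = 0b11, AP[2] = 0 */
#define SEC_TEX(x)      ((uint32_t)(x) << 12)

/* Normal memory, write-back, write-allocate: TEX=001, C=1, B=1. */
#define SEC_NORMAL_WBWA (SEC_TYPE | SEC_TEX(1) | SEC_C | SEC_B | SEC_AP_RW)
/* Shareable device memory: TEX=000, C=0, B=1, never executable. */
#define SEC_DEVICE      (SEC_TYPE | SEC_B | SEC_XN | SEC_AP_RW)

/* Identity-mapped entry: physical base equals virtual base. */
static uint32_t section_entry(uint32_t base, uint32_t attrs)
{
    return (base & 0xFFF00000u) | attrs;
}

/* Fill every 1MB section that overlaps [base, base + size). */
static void map_sections(uint32_t *table, uint32_t base, uint32_t size,
                         uint32_t attrs)
{
    uint32_t first = base >> SECTION_SHIFT;
    uint32_t count = (size >> SECTION_SHIFT) +
                     ((size & ((1u << SECTION_SHIFT) - 1u)) ? 1u : 0u);

    for (uint32_t i = 0; i < count && first + i < NUM_SECTIONS; i++)
        table[first + i] = section_entry((first + i) << SECTION_SHIFT, attrs);
}
```

Entries left at zero are fault entries, so any access outside the mapped regions aborts instead of silently reaching unmapped space. With domain 0 in the descriptors, the DACR must set domain 0 to client mode for the AP bits to take effect.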

With the D-cache enabled, COBS decode + CRC32 processing is ~10x faster, so the agent keeps up with the incoming stream and the PL011 FIFO no longer overflows.
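To show why this path is so sensitive to memory latency, here is a minimal COBS decoder sketch. It is a generic textbook decoder, not the agent's actual implementation, and the name `cobs_decode` is an assumption. Every input byte costs at least one load and one store, so with uncached DDR each of those accesses goes out to memory.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one COBS frame (without the trailing 0x00 delimiter).
 * `out` must have room for at least in_len bytes.
 * Returns the decoded length, or 0 if a code byte is zero or points
 * past the end of the input (an empty payload also decodes to 0 bytes).
 */
size_t cobs_decode(const uint8_t *in, size_t in_len, uint8_t *out)
{
    size_t ip = 0, op = 0;

    while (ip < in_len) {
        uint8_t code = in[ip++];

        if (code == 0 || ip + (size_t)(code - 1) > in_len)
            return 0;

        /* Copy the code-1 data bytes that follow the code byte. */
        for (uint8_t i = 1; i < code; i++)
            out[op++] = in[ip++];

        /* A code below 0xFF stands for a zero, except at frame end. */
        if (code != 0xFF && ip < in_len)
            out[op++] = 0;
    }
    return op;
}
```

For example, the frame `03 11 22 02 33` decodes to `11 22 00 33`.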

Before (uncached DDR)

| Size  | Result               |
|-------|----------------------|
| 16KB  | OK                   |
| 64KB  | FAIL (FIFO overflow) |
| 256KB | FAIL                 |

After (D-cache enabled)

| Size  | Speed   | Result |
|-------|---------|--------|
| 16KB  | 49 KB/s | OK     |
| 64KB  | 80 KB/s | OK     |
| 256KB | 79 KB/s | OK     |

All verified with CRC32 read-back.
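As an illustration of what a read-back check compares, here is a bitwise CRC32 sketch. It assumes the common reflected CRC-32 (polynomial 0xEDB88320, as used by zlib and Ethernet); the PR does not state which CRC32 variant the agent uses, and `crc32_update` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected, polynomial 0xEDB88320).
 * Pass 0 as `crc` for the first chunk and the previous return value for
 * later chunks, so a transfer can be checksummed block by block.
 */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

The host computes the checksum over the data it sent, the device computes it over what it reads back from the target memory, and the two values must match. The standard check value for this variant is 0xCBF43926 for the ASCII string `123456789`.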

Test plan

  • All CI checks pass locally (ruff, mypy, pytest 247, C 1604)
  • Self-update to real hi3516ev300 — agent boots with MMU enabled
  • 16KB / 64KB / 256KB WRITE all verified
  • CI on PR

🤖 Generated with Claude Code

widgetii and others added 2 commits March 31, 2026 20:19
ARMv7 short-descriptor page tables with 1MB identity-mapped sections.
DDR (128MB from RAM_BASE) is cacheable write-back/write-allocate.
All I/O regions (UART, FMC, CRG, flash window) are device/uncached.

With D-cache, COBS decode + CRC32 processing is ~10x faster, eliminating
PL011 FIFO overflow during sustained host→device transfers. Previously
WRITE failed after ~16-420KB; now 256KB verified at 79 KB/s.

Page table (16KB) allocated in BSS with 16KB alignment for TTBR0.
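The 16KB figure follows from the format: 4096 first-level entries of 4 bytes each, and with TTBCR.N = 0 the TTBR0 base must have bits [13:0] clear. A minimal sketch of that arithmetic and of the SCTLR bits involved is below. The names `mmu_table`, `ttbr0_base_ok` and `sctlr_with_mmu_dcache` are hypothetical, and the CP15 writes themselves (TTBR0, TTBCR, DACR, TLB invalidate, SCTLR) are omitted.

```c
#include <stdint.h>

#define PT_ENTRIES  4096u                   /* one entry per 1MB of 4GB */
#define PT_ALIGN    16384u                  /* TTBR0 alignment for N = 0 */

/* Zero-initialised, so it lands in BSS; GCC/Clang alignment attribute. */
static uint32_t mmu_table[PT_ENTRIES] __attribute__((aligned(16384)));

_Static_assert(sizeof(mmu_table) == PT_ALIGN,
               "4096 entries x 4 bytes must equal 16KB");

#define SCTLR_M     (1u << 0)               /* MMU enable */
#define SCTLR_C     (1u << 2)               /* data/unified cache enable */

/* TTBR0 base address bits [13:0] must be zero when TTBCR.N = 0. */
static int ttbr0_base_ok(uintptr_t table_addr)
{
    return (table_addr & (PT_ALIGN - 1u)) == 0;
}

/* Value to write back to SCTLR after reading the current one. */
static uint32_t sctlr_with_mmu_dcache(uint32_t sctlr)
{
    return sctlr | SCTLR_M | SCTLR_C;
}
```

Because the table lives in BSS, it has to be filled after BSS is cleared and before the MMU is turned on.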

Tested on hi3516ev300:
- 16KB write: OK (previously OK)
- 64KB write: OK (previously FAILED)
- 256KB write: OK (previously IMPOSSIBLE)
- All verified with CRC32 read-back

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The PL011 RX interrupt handler drains the hardware FIFO into a 4KB ring
buffer automatically. The GIC is configured for the UART0 IRQ (SPI 7 on
ev200/ev300), and an IRQ-mode stack is set up. proto_recv reads from the
ring buffer via uart_getc_safe, so it no longer polls soft_rx_drain.
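A minimal sketch of the single-producer, single-consumer ring buffer pattern described above, assuming a single core where the IRQ handler is the only writer and the main loop is the only reader. The type and function names (`rx_ring_t`, `rx_ring_put`, `rx_ring_get`) are hypothetical and are not the agent's actual API; the `dropped` counter is an addition that would make an overflow directly observable.

```c
#include <stdint.h>

#define RX_RING_SIZE 4096u                  /* must be a power of two */
#define RX_RING_MASK (RX_RING_SIZE - 1u)

typedef struct {
    volatile uint32_t head;                 /* advanced by the ISR only */
    volatile uint32_t tail;                 /* advanced by the reader only */
    volatile uint32_t dropped;              /* bytes lost to overflow */
    uint8_t buf[RX_RING_SIZE];
} rx_ring_t;

/* Called from the RX interrupt for each byte popped from the FIFO.
 * Returns 1 if stored, 0 if the ring was full and the byte was dropped. */
static int rx_ring_put(rx_ring_t *r, uint8_t byte)
{
    if (r->head - r->tail >= RX_RING_SIZE) {
        r->dropped++;
        return 0;
    }
    r->buf[r->head & RX_RING_MASK] = byte;
    r->head++;                              /* publish after the store */
    return 1;
}

/* Called from the main loop. Returns the next byte, or -1 if empty. */
static int rx_ring_get(rx_ring_t *r)
{
    if (r->head == r->tail)
        return -1;
    uint8_t byte = r->buf[r->tail & RX_RING_MASK];
    r->tail++;
    return byte;
}
```

The indices run freely and are masked only on access, so the full and empty states stay distinguishable without sacrificing a slot. On a multi-core or weakly ordered setup, a memory barrier between the buffer store and the index update would also be needed.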

Combined with MMU/D-cache, this should eliminate sustained WRITE
failures. In testing, 3 of 4 blocks transferred correctly, but block 4
lost 3 packets (14848 of 16384 bytes received, 1536 bytes short). Ring
buffer overflow is suspected.

Known issue: an 8KB ring buffer crashes the agent (suspected BSS overlap or GIC issue).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>