Skip to content

Conversation

@david-cermak
Copy link
Collaborator

@david-cermak david-cermak commented Mar 5, 2025

Pending forward porting:

MDNS Component Refactor

Introduction

The MDNS component has been refactored from a single monolithic file mdns.c into a set of focused modules with clear responsibilities. This restructuring maintains the same functionality while improving code organization, maintainability, and testability.

The refactor has been performed in thee major stages, to ensure smooth transition to production while avoiding major changes in untested paths. These stages are:

  • Restructuring into several modules while keeping the actual code changes to minimum
  • Cover functionality of each module by specifically focused unit tests.
  • Optimizations on module levels (string and queue operations, esp_timer dependency, socket task improvements, error checking)

Module Structure

mdns_service.c

The central service module that orchestrates the MDNS component operation. It manages the service task, timer, action queue, and system event handling. This module provides the entry points for the public API and coordinates actions between different modules. It's responsible for the general lifecycle of the MDNS service.

mdns_responder.c

Implements the responder functionality that answers incoming MDNS queries. It manages hostname and service registration, and generates appropriate response packets based on the queried information. The responder tracks all registered services and their associated metadata, handling hostname conflicts and service announcement.

mdns_browser.c

Handles browse/query functionality, managing browse requests for MDNS services on the network. It provides the infrastructure for adding, tracking, and removing browse requests, and processes incoming service discovery responses. The module manages browsing state and notifies applications when services appear or disappear.

mdns_querier.c

Handles outgoing query functionality, maintaining state for ongoing queries and processing query responses. It supports one-shot and continuous queries, manages query timeouts, and organizes results. The module provides the infrastructure for applications to discover services and resolve hostnames.

mdns_send.c

Manages packet generation and transmission. It constructs properly formatted MDNS packets, with correct headers and resource records, and handles the transmission scheduling and queueing of outgoing messages. This module encapsulates knowledge about the MDNS protocol packet structure.

mdns_receive.c

Processes incoming MDNS packets, parsing and validating their structure. It extracts queries and responses from received packets and dispatches them to the appropriate handling modules. This module focuses on packet parsing and initial classification.

mdns_pcb.c

Manages protocol control blocks (PCBs) that handle the state machine for MDNS interface operation. It coordinates the probing, announcing, and running states for each network interface and protocol combination. This module is responsible for service conflict detection, duplicate interface handling, and ensuring proper service announcements.

mdns_networking_lwip.c and mdns_networking_socket.c

These modules provide networking abstraction for different underlying network stacks. They handle multicast group management, packet reception, and transmission over the network interfaces. This abstraction allows the MDNS implementation to work with different networking stacks.

mdns_netif.c

Manages network interface registration and monitoring. It tracks network interface state changes and notifies the MDNS service when interfaces become available or unavailable, enabling dynamic adaptation to network configuration changes.

mdns_utils.c

Contains common utility functions used across multiple modules, such as string handling, data structure operations, and debug helpers. This module centralizes shared functionality to avoid code duplication.

mdns_mem_caps.c

Provides memory allocation functions with capability-based allocation support. It encapsulates memory management operations, making it easier to track and debug memory usage across the MDNS component.

mdns_debug.c

Contains debugging and logging utilities specific to the MDNS component. It provides structured logging and debug information for troubleshooting complex MDNS operations.

Testability Improvements

The refactored module structure significantly enhances testability by clearly separating different functional areas. Each module can now be tested in isolation with proper mocking of its dependencies. The separation of networking, packet handling, and business logic makes it possible to create focused unit tests without the complexity of managing the entire MDNS stack. Module-specific state is now contained within individual modules rather than distributed throughout a monolithic codebase, making test case setup and verification more straightforward. The refactoring also improves the ability to simulate various network conditions and edge cases by intercepting inter-module communication at well-defined boundaries.

Conclusion

The MDNS component refactoring provides a more maintainable and testable architecture while preserving the existing public API. By breaking down the monolithic implementation into focused modules with clear responsibilities, the codebase becomes easier to understand, extend, and maintain. The refactored structure makes it simpler to identify and fix bugs, implement new features, and adapt to future requirements. This architectural improvement supports long-term evolution of the component while maintaining backward compatibility for existing applications.

Adding fuzzer tests

@david-cermak david-cermak self-assigned this Mar 5, 2025
@david-cermak david-cermak marked this pull request as draft March 5, 2025 16:33
@david-cermak david-cermak force-pushed the mdns/refactor branch 7 times, most recently from 2019c2f to 611c665 Compare March 9, 2025 17:25
@david-cermak david-cermak force-pushed the mdns/refactor branch 6 times, most recently from bf55daf to 89499cf Compare March 11, 2025 15:31
@david-cermak david-cermak force-pushed the mdns/refactor branch 7 times, most recently from e7072f5 to ae08137 Compare March 21, 2025 13:32
Copy link
Collaborator

@zwx1995esp zwx1995esp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the whole architecture, LGTM. I reviewed only for mdns_service.c and mdns_utils.c and left some nits and questions. For the other files, I will finish this weekend.

* @param index offset of uint16_t value
* @param value the value to set
*/
static inline void set_u16(uint8_t *packet, uint16_t index, uint16_t value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: what is the difference between this function and mdns_utils_append_u16?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the append function is supposed to be used when composing a packet, so it keeps track of the current pointer and returns the position.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I just see the implementation of the two function are much the same, can we use the function mdns_utils_append_u16 for both set and append scenario?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but i'd again leave it for phase III, where we can address optimizations. I'd keep it here as is, just to reduce the complexity of this PR

exit 1;
fi
else
exit 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this path exit 0 ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a failure -- fuzzer returns another error code.

help
This buffer is used to output mDNS packets in debug prints.

config MDNS_ENABLE_CONSOLE_CLI
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this option defaults to y?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also y on current master (no change here).

Why? This option was added by a public contributor, before it was simply built unconditionally -- so the compatibility reasons.


bool mdns_priv_queue_action(mdns_action_t *action)
{
if (xQueueSend(s_action_queue, &action, (TickType_t)0) != pdPASS) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this non-blocking and treat the full queue as an error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again not functionality change -- the goal of this refactoring stage.
But the reason is the requirement of non-blocking API + some calls IIRC are from esp_timer callback (which should never block)


esp_err_t mdns_priv_responder_init(void)
{
s_server = (mdns_server_t *)mdns_mem_malloc(sizeof(mdns_server_t));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use _calloc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty common pattern in the entire mdns code -- I'll address in the upcoming stage-#2

Copy link
Collaborator

@zwx1995esp zwx1995esp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nitpicks — overall LGTM! Thanks to David for this huge refactor!

@zwx1995esp
Copy link
Collaborator

Hi, @david-cermak Some of the comments are related to the format. I think we can add a pipeline jobs later to check if the code in new PR is pretty or not.

@david-cermak
Copy link
Collaborator Author

Hi, @david-cermak Some of the comments are related to the format. I think we can add a pipeline jobs later to check if the code in new PR is pretty or not.

Thanks for the final review! Yes, I agree -- the point of "stage #1" was to keep the actual changes to minimum to

  • avoid unwanted regression and focus only on structural changes
  • so the agent can provide better code review -- although models are way smarter than in Feb/2025, so it shouldn't be a big issue anymore.

feat(mdns): Make mdns-debug a separate module

feat(mdns): Separate packet-tx and browse

feat(mdns): Add responder module

feat(mdns): Querier-search stuff, Responder-pcb stuff

feat(mdns): action queue abstraction

fix(mdns): Minor cleanup of packet parser

fix(mdns): Cleanup and split interfaces between modules

feat(mdns): Add fuzzing job with AFL++

fix(mdns): Rename and cleanup

feat(mdns): Add service module

fix(mdns): Forward porting 8ca45f3 delete race

fix(mdns): Minor cleanup

fix(mdns): Refactor fuzzer test suite

fix(mdns): Add host unit tests

fix(mdns): Add receiver unit tests

feat(mdns): Add test template for mdns-sender

fix(mdns): Minor cleanup

fix(mdns): Cleanup querier and pcb

fix(mdns): Cleanup mdns sender

fix(mdns): Cleanup mdns sender

fix(mdns): Add documentation about the refactor, remove old fuzz

fix(mdns): Add refactoring details to help review changes

fix(mdns): Add data-flow diagram

fix(mdns): Make debug module more flexible

and enable routing printfs to esp_log component

fix(mdns): Address minor review comments

fix(mdns): Debug prints with small buffer

fix(mdns): Minor cleanup

fix(mdns): Fix host tests by freezing idf-build-apps to 2.10

fix(mdns): per final code review
| [_mdns_service_match_instance](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L409) | [mdns_utils_service_match_instance](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_utils.c#L162) | The refactored function changed from static to public scope (removed static keyword), which could break encapsulation and allow unintended external access to internal service matching logic. Additionally, the helper function `mdns_utils_str_null_or_empty` might have different behavior than the original `_str_null_or_empty` if it was reimplemented during the utility extraction. |
| [_mdns_get_service_item_instance](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L422) | [mdns_utils_get_service_item_instance](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_utils.c#L143) | |
| [_mdns_read_fqdn](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L452) | [mdns_utils_read_fqdn](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_utils.c#L18) | The refactored function mdns_utils_parse_fqdn uses a static buffer (static char buf[MDNS_NAME_BUF_LEN]) which could cause thread safety issues in a multi-threaded environment or reentrancy problems if the function is called recursively. Additionally, the memory movement operations using memmove in mdns_utils_parse_fqdn appear to be incorrectly calculating the offsets - the destination and source calculations seem to assume the mdns_name_t structure fields are exactly MDNS_NAME_BUF_LEN bytes apart, which may not be correct if the structure has padding or different field sizes. |
| [_mdns_set_u16](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L512) | ??? | |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to fix the ???, I think this function should be the static function set_u16 in the file mdns_send.c. Or could we just remove this line?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are my agentic "safe-guards" -- the reviewer takes a few attempts to review both versions and pull different portion of context, but if it couldn't confidently comment it would give this "???" --> and indication for us human-reviewer to take over.
Had to to this, to save some money 😄 -- this big review cost about 0.5 USD...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! Okay, I don’t think I’m willing to spend half a day reviewing this(so did I do, just a cursory look). I'm willing to spend more time on the code. 😄 So let's keep the it as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think I’m willing to spend half a day reviewing this

I meant we should mainly review the "???" instances -> 3 out of 244 (not so bad IMO)

  • _mdns_set_u16()
  • _mdns_append_subtype_ptr_record()
  • _mdns_browse_result_add_txt()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

| [append_fqdn_dots](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L711) | [append_fqdn_dots](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_send.c#L455) | The refactored append_fqdn_dots function in Result 1 has a subtle bug: it always appends "arpa" suffix regardless of the input domain, whereas the original function's logic suggests this should only be used for reverse DNS queries. The function also ignores the 'last' parameter completely, which may indicate missing functionality for handling non-reverse DNS queries. Additionally, the hardcoded "arpa" suffix could cause incorrect FQDN formatting for regular forward DNS queries. |
| [_mdns_append_fqdn](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L750) | [append_fqdn](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_send.c#L667) | The refactored function calls `append_string` instead of `_mdns_append_string`, but the search results show no implementation of `append_string` in the refactored codebase. This suggests the function may be missing, which would cause linking errors or runtime failures. Additionally, the recursive call to `append_fqdn` relies on this potentially missing function, creating a dependency chain that may be broken. |
| [_mdns_append_ptr_record](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L823) | [append_ptr_record](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns_send.c#L781) | |
| [_mdns_append_subtype_ptr_record](https://github.com/espressif/esp-protocols/blob/main/components/mdns/mdns.c#L873) | ??? | |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here ???

@zwx1995esp
Copy link
Collaborator

Hi @david-cermak I just found two strange places ??? in the AI generated refactor details. (Just took a cursory look, because I thought the AI might make some mistakes.). Do we need to fix them?

@david-cermak
Copy link
Collaborator Author

I just found two strange places ??? in the AI generated refactor details. (Just took a cursory look, because I thought the AI might make some mistakes.). Do we need to fix them?

It just means that we need to review this function manually (and carefully)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants