Skip to content

Miscellaneous improvements to mctpd #81

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

khangng-ampere
Copy link
Contributor

Those commits are spawned out of my effort to support Set Endpoint ID, fixing some issues and papercuts in mctpd.

(See e14eaed for my upcoming Set Endpoint ID commit. It might be too big and harder to merge so I split them off from these misc commits)

Currently, when mctpd adds new addresses to the routing table,
unit tests will not receive it.

This adds the missing test infrastructure to handle new addresses added
to the routing table.

Signed-off-by: Khang D Nguyen <[email protected]>
This adds missing code to deallocate an iid after
being used for an response.

Signed-off-by: Khang D Nguyen <[email protected]>
Currently, mctp-netlink requires passing interface name to its API.
(due to mctp.c being written first)

However, in mctpd, we always get ifindex from sockaddr. This leads
to mctpd having to convert ifindex to interface name,
and then mctp-netlink will convert the interface name back to ifindex.

This change removes the unnecessary roundtrip.
ifindex is now the main thing to pass around.
mctp.c will do its own conversion.

Signed-off-by: Khang D Nguyen <[email protected]>
Copy link
Member

@jk-ozlabs jk-ozlabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good - just that the last change is a bit mysterious - why are we trying to bypass the kernel routing tables there?

(I assume this is due to interactions during Set Endpoint ID processing, as the local EID may have changed as a side effect of the command. However, that sounds like something we should special-case in Set Endpint ID, rather than use globally.

Also, won't you need a facility to respond using a zero EID, if the Set Endpoint ID fails?)

Comment on lines 1182 to 1204
int mctp_nl_addr_add(struct mctp_nl *nl, mctp_eid_t eid, int ifindex)
{
struct mctp_addralter_msg msg;
int rc;

rc = fill_addralter_args(nl, &msg, NULL, NULL, eid, ifindex);
if (rc)
return -1;

msg.nh.nlmsg_type = RTM_NEWADDR;

return mctp_nl_send(nl, &msg.nh);
}

int mctp_nl_addr_del(struct mctp_nl *nl, mctp_eid_t eid, int ifindex)
{
struct mctp_addralter_msg msg;
int rc;

rc = fill_addralter_args(nl, &msg, NULL, NULL, eid, ifindex);
if (rc)
return -1;

msg.nh.nlmsg_type = RTM_DELADDR;

return mctp_nl_send(nl, &msg.nh);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the only difference being the type here, could we simplify this into a parameter? This would also mean you don't need the rtm_command check in mctp.c to choose a function...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think API design wise for mctp-netlink.h, separate mctp_nl_addr_add and mctp_nl_addr_del should be superior. I don't think there is any common case that need adding or deleting parameterized. Or i.e, I think

mctp_nl_addr_add(...)

is better than

mctp_nl_addr(..., RTM_DELADDR)

It is also less error-prone to pass something else to it.

I think the problem might be mctp.c. Currently it fuses addr add and addr del into one code path, and then split the code path again with the rtm_command. Let me also clean that up by separating the code path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i'm a bit unclear about the direction you have taken with the new changes; this results in more duplicated code between the add and remove paths. Can you elaborate on your preference here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i'm a bit unclear about the direction you have taken with the new changes; this results in more duplicated code between the add and remove paths. Can you elaborate on your preference here?

What I have in mind is keeping each command its own separate straightforward path, only extract out the common step. Some thing like

def step1(): pass
def step2(): pass

def add():
    step1()
    step2()
    doAdd()

def del():
    step1()
    step2()
    doDel()

Instead of

def step(add):
    step1()
    step2()
    if add:
        doAdd()
    else:
        doDel()

def add():
    step(add)

def del():
    step(del)

There are some benefits that I find helpful:

  • It is easier to understand each command in isolation.
  • It is easier to add a new argument to each command.

Ideally I would like to see cmd_addr_add looks like this:

static int cmd_addr_add(struct ctx *ctx, int argc, const char **argv)
{
	uint8_t eid;
	int ifindex;

	assert_argc_eq(argc, 4);
	eid = parse_eid(argv[1]);
	assert_keyword(argv[2], "dev");
	ifindex = parse_ifindex(ctx, argv[3]);

	mctp_ops_init();
	return mctp_nl_addr_add(ctx->nl, eid, ifindex);
}

static int cmd_addr_del(struct ctx *ctx, int argc, const char **argv)
{
	uint8_t eid;
	int ifindex;

	assert_argc_eq(argc, 4);
	eid = parse_eid(argv[1]);
	assert_keyword(argv[2], "dev");
	ifindex = parse_ifindex(ctx, argv[3]);

	mctp_ops_init();
	return mctp_nl_addr_del(ctx->nl, eid, ifindex);
}

(the parsing functions could exit(EXIT_FAILURE) and print out parsing error diagnostics on failure)

I think it is clear at a glance what each command has to do, than having to jump around to cmd_addr_addremove and unroll each path for add and remove mentally...

Maybe it looks duplicated, but code authoring wise I think you usually modify each path separately. The shape of each path is accidentally similar, but I don't think it should be deduplicated. (maybe if we have binary size constraints, but I don't think we can save much - and the point is to keep the skeleton thin, extract out all the meats or the steps into separate functions, hence all the parse_* and assert_*...)


It would also be nice if all the parsing stuff can go into one central place, so it can be updated alongside with the help message, and functions can work with the parsed values, rather than parsing scattered in places. Like

static int cmd_addr(struct ctx *ctx, int argc, const char **argv)
{
	if (argc == 2 && !strcmp(argv[1], "help")) {
		fprintf(stderr, "%s address\n", ctx->top_cmd);
		fprintf(stderr, "%s address show [IFNAME]\n", ctx->top_cmd);
		fprintf(stderr, "%s address add <eid> dev <IFNAME>\n", ctx->top_cmd);
		fprintf(stderr, "%s address del <eid> dev <IFNAME>\n", ctx->top_cmd);
		return 255;
	}

	// ...

	if (!strcmp(subcmd, "add")) {
		assert_argc_eq(argc, 4);
		eid = parse_eid(argv[1]);
		assert_keyword(argv[2], "dev");
		ifindex = parse_ifindex(ctx, argv[3]);
		
		return do_addr_add(ctx, eid, ifindex);
	}
	
	else if (!strcmp(subcmd, "del")) {
		assert_argc_eq(argc, 4);
		eid = parse_eid(argv[1]);
		assert_keyword(argv[2], "dev");
		ifindex = parse_ifindex(ctx, argv[3]);

		return do_addr_remove(ctx, eid, ifindex);
	}

	warnx("unknown address command '%s'", subcmd);
	return -1;
}

do_addr_* would work with the parsed values, no more parsing errors there.


I hope I got my point across 😄 Just my 2 cents on code organization

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but using your example:

def step1(): pass
def step2(): pass

def add():
    step1()
    step2()
    doAdd()

def del():
    step1()
    step2()
    doDel()

doDel and doAdd are themselves identical except for the one value we pass as the netlink type, so we now need those duplicated all the way down the call stack.

Instead, we could have:

static int cmd_addr(struct ctx *ctx, int argc, const char **argv)
{

	eid = parse_eid(argv[1]);
	ifindex = parse_ifindex(ctx, argv[3]);

	if (!strcmp(subcmd, "add")) {
		type = RTM_NEWADDR;
	} else if (!strcmp(subcmd, "del")) {
		type = RTM_DELADDR;
	} else {
		warnx("unknown address command '%s'", subcmd);
		return -1;
	}

	rc = do_addr(ctx, eid, ifindex, type)

	return rc;
}

... and so avoid multiple duplicated commands in that entire path. We don't need any conditionals on type once it is set, we just pass the value to the netlink message construction.

I'm all for containing the parsing to the top level though.

Copy link
Contributor Author

@khangng-ampere khangng-ampere Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would breaks the assumption of other commands, like route_add and route_del are separate and have different arguments, but then addr receives a RTM_* type.

It would be less surprising, less error-prone to see everything in uniformity, at the expense of slightly increased code size.

(also, if you add mctp addr show, we now have a do_addr_show, which diverges from do_addr)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do understand the desire to unify with the route and neigh code structure, but those are different in that the add and del formats are asymmetric (eg, route add can specify a mtu, but a del cannot). For addresses, both add and del are performed with exactly the same information (the link and eid).

We implement this with all of the common parts in one function, and tiny wrappers to contain the minmal divergence between the two.

Copy link
Contributor Author

@khangng-ampere khangng-ampere Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the solution here is just to have the API at many granularities :) I added a mctp_nl_addr() just as you suggested, alongside with thin wrapper mctp_nl_addr_{add,del}.

(I dropped the parsing changes related to mctp.c, I will do that in another PR)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! thanks.

@khangng-ampere
Copy link
Contributor Author

All looks good - just that the last change is a bit mysterious - why are we trying to bypass the kernel routing tables there?

(I assume this is due to interactions during Set Endpoint ID processing, as the local EID may have changed as a side effect of the command. However, that sounds like something we should special-case in Set Endpint ID, rather than use globally.

Actually, it comes from Get Endpoint ID. mctpd needs to respond to Get Endpoint ID in the case of empty routing table and no EID are assigned. The DSP0236 spec section 12.5 says

Thus, the destination EID in the message will typically be set to the special Physical Addressing Only EID value.

But it is true as you said, I could have added a reply_message_phys and use physical addressing only for Get Endpoint ID. However, correct me if I'm wrong: We have the physical address at all the call sites of reply_message, and using physical address is less overhead.

(mctpd is also supposedly the one to manage and create the routing table, so having the reverse dependency of the routing table decides how mctpd works is a bit... tangled.)

But either way, I should reword the commit to mention this Get Endpoint ID issue, and a testcase to cover this regardless of what direction we head.

Also, won't you need a facility to respond using a zero EID, if the Set Endpoint ID fails?)

That's true... It should work w.r.t current test facility or MCTP specification. But Linux kernel wise, I might need a patch for it alongside supporting Set Endpoint ID.

@jk-ozlabs
Copy link
Member

But it is true as you said, I could have added a reply_message_phys and use physical addressing only for Get Endpoint ID.

I would prefer if we defaulted to EID-based addressing unless we have a specific reason to do otherwise (ie., in scenarios like your example, where we may not have routing established). By bypassing the kernel routing, we may have divergent behaviour between what the kernel understands to be the correct route, and what mctpd will use.

However, correct me if I'm wrong: We have the physical address at all the call sites of reply_message, and using physical address is less overhead.

We have a physical address, which might not be the physical address of that specific endpoint (eg, for not-directly-attached devices), or may not be a valid physical address at that specific point in time. It's slightly less overhead, but slightly more chance of being incorrect.

(emphasis on slightly though; the scenarios in which the tx physaddr differs from the rx physaddr are pretty obscure)

@khangng-ampere khangng-ampere force-pushed the khangng/push-zynrlyzronmk branch from 51e76e4 to 011547c Compare June 19, 2025 04:25
@khangng-ampere
Copy link
Contributor Author

I updated a few things:

  • I limited the physical addressing to Get Endpoint ID only. However, this is still a bit awkward: what about other queries command? (like Get Routing Table, Get Rate Limit, ...). Maybe in the long run we should have a flag to decide what kind of routing we should do? For Bus Owner it should be set when we finished allocating and assigning EIDs to downstream endpoints. For Endpoint it should be when we received Set Endpoint ID (and have a fallback gateway route). Those are when EID-based routing should always work.
  • I tried to split off the mctp addr add and mctp addr del, only extract common function like parse_eid for deduplication. Currently, it returns an error code on parse failure for keeping the current behavior, but it should be a lot cleaner if we can directly exit with the error code, and only return the parsed value. something like
eid = parse_eid(argv[1]);
ifindex = parse_ifindex(argv[3]);

instead of

if (parse_eid(argv[1], &eid) < 0) {
    warnx("invalid address %s", argv[1]);
    return -1;
}

if (parse_ifindex(ctx, argv[3], &ifindex) < 0) {
    warnx("invalid device %s", argv[3]);
    return -1;
}

Thoughts? Also I could do those in a separate PR if this PR is currently acceptable.

Extracts netlink addresses utils from mctp.c
into mctp-netlink.c for reuse in mctpd Set EID.

Signed-off-by: Khang D Nguyen <[email protected]>
Currently, mctpd running as an endpoint with no EID set cannot reply to
Get EID message, due to having no routing knowledge and we currently
use EID-based routing for the response.

This uses physical addressing for Get Endpoint message.

Signed-off-by: Khang D Nguyen <[email protected]>
@khangng-ampere khangng-ampere deleted the khangng/push-zynrlyzronmk branch June 19, 2025 09:44
@khangng-ampere
Copy link
Contributor Author

Dang, I mistakenly deleted the branch, let me see if I can recover it

@khangng-ampere khangng-ampere restored the khangng/push-zynrlyzronmk branch June 19, 2025 09:49
@khangng-ampere
Copy link
Contributor Author

Third time the charm :( Sorry for the noise

@khangng-ampere khangng-ampere force-pushed the khangng/push-zynrlyzronmk branch from 011547c to ebb19f4 Compare June 19, 2025 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants