Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

meta-lxatac-bsp: lxatac-factory-data: reload udev rules on completion #107

Closed
wants to merge 1 commit into from

Conversation

hnez
Copy link
Member

@hnez hnez commented Feb 16, 2024

The lxatac-factory-data script generates a tac-bridge.link file, which sets the MAC address to use on the tac-bridge network interface, which is a bridge created by NetworkManager.

We have noticed that on some boots this MAC address is not correct. On these boots we can also see that udev (which reads the .link files) did not take the link file into account (even though it exists):

Normal boot:

# udevadm info /sys/class/net/tac-bridge | grep LINK_FILE
ID_NET_LINK_FILE=/run/systemd/network/10-tac-bridge.link

Boot with wrong MAC address:

# udevadm info /sys/class/net/tac-bridge | grep LINK_FILE
ID_NET_LINK_FILE=/usr/lib/systemd/network/99-default.link

This hints at some sort of race condition between lxatac-factory-data, udev reading its rules and NetworkManager setting up the bridge device.

Run udevadm control --reload-rules after creating the link file to hopefully solve the race condition.

I did some "statistical" testing of this change by running our labgrid pytest test in a loop, which checks if the MAC is correct:

# Before this change the interface has the correct MAC
# 14 / (14 + 6) = 70% of the time.
14 before/success
 6 before/failure

# After this change the interface has the correct MAC
# in all 21 runs.
21 udevadm-reload/success
 0 udevadm-reload/failure

# When replacing the `udevadm control --reload-rules` with
# just a `sleep 1` the MAC address is still correct in all 22
# runs, but the IPv6 link local address of the interface contains
# the wrong MAC address.
20 sleep/success
 2 sleep/failure

The tests indicate that the fix works, but are not 100% clear as to it working for the correct reasons (because just waiting also helps a bit).

The lxatac-factory-data script generates a tac-bridge.link file,
which sets the MAC address to use on the tac-bridge network interface,
which is a bridge created by NetworkManager.

We have noticed that on some boots this MAC address is not correct.
On these boots we can also see that udev (which reads the .link files)
did not take the link file into account (even though it exists):

Normal boot:

    # udevadm info /sys/class/net/tac-bridge | grep LINK_FILE
    ID_NET_LINK_FILE=/run/systemd/network/10-tac-bridge.link

Boot with wrong MAC address:

    # udevadm info /sys/class/net/tac-bridge | grep LINK_FILE
    ID_NET_LINK_FILE=/usr/lib/systemd/network/99-default.link

This hints at some sort of race condition between lxatac-factory-data,
udev reading its rules and NetworkManager setting up the bridge device.

Run `udevadm control --reload-rules` after creating the link file to
hopefully solve the race condition.

I did some "statistical" testing of this change by running our labgrid pytest
test in a loop, which checks if the MAC is correct:

    # Before this change the interface has the correct MAC
    # 14 / (14 + 6) = 70% of the time.
    14 before/success
     6 before/failure

    # After this change the interface has the correct MAC
    # in all 21 runs.
    21 udevadm-reload/success
     0 udevadm-reload/failure

    # When replacing the `udevadm control --reload-rules` with
    # just a `sleep 1` the MAC address is still correct in all 22
    # runs, but the IPv6 link local address of the interface contains
    # the wrong MAC address.
    20 sleep/success
     2 sleep/failure

The tests indicate that the fix works, but are not 100% clear as to it
working for the correct reasons (because just waiting also helps a bit).

Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
@hnez hnez requested a review from jluebbe February 16, 2024 13:42
@hnez hnez added this to the second stable release milestone Feb 16, 2024
@jluebbe
Copy link
Member

jluebbe commented Feb 20, 2024

As an alternative, perhaps we could avoid the devadm control --reload-rules by running this before udev? That would likely mean setting DefaultDependencies=no. @michaelolbrich what's your opinion?

@michaelolbrich
Copy link
Collaborator

It's not that simple. The same script talks to hostnamed and that's not an early service either.

@hnez
Copy link
Member Author

hnez commented Feb 23, 2024

I think we should just drop the interaction with hostnamed here and instead just write the current hostname into /etc/hostname if /etc/hostname does not yet exist. That allows us to run this script before udev.

Somewhat related: I think we should migrate /etc/hostname in the rauc hook if we don't do that already. That way users get a sensible hostname by default (passed in from barebox) but are also free to set a custom hostname that sticks. That's something we had customers ask about.

@SmithChart
Copy link
Member

👍 for the migration of /etc/hostname

@hnez
Copy link
Member Author

hnez commented Mar 1, 2024

Closed in favor of #115, which works without reload.

@hnez hnez closed this Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants