initrd-network: Be more resilient in identifying interfaces #35

samueldr
Member

It seems that sometimes they're slow to appear.

Let's wait a bit, and retry.
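
As a rough illustration of the wait-and-retry idea (this is not the script from this PR), here is a minimal Python sketch. The function name `wait_for_interface`, its default values, and the `sys_root` parameter are assumptions made for the example; it relies only on the kernel listing network interfaces under /sys/class/net.

```python
import os
import time


def wait_for_interface(name, attempts=10, delay=0.5, sys_root="/sys/class/net"):
    """Return True once `name` appears under sys_root, retrying a few times.

    Hypothetical helper: the name, defaults and sys_root parameter are
    illustrative and are not taken from the PR.
    """
    path = os.path.join(sys_root, name)
    for attempt in range(attempts):
        if os.path.exists(path):
            return True
        # Sleep only between attempts, not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_interface("eth0")` would wait roughly five seconds at most before giving up, at which point the caller decides whether to fail or carry on without networking.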

@samueldr samueldr added the 4. type: enhancement New feature or request label Oct 10, 2019
Contributor

@craigem craigem left a comment

It would be nice to collect workarounds like this somewhere so they can be revisited.

Issues of this nature solved in this way (I have no other way myself) always make me feel like another problem is being masked (I may also be quite wrong about that).

@samueldr
Member Author

Here it's not really a workaround, though we'll need a similar construct for other devices.

Things happen in an asynchronous way. We have to wait for things.

Within upstream NixOS, a similar construct is used around waiting for drives. It may be required to do the same here, especially once LVM and possibly LUKS are added to the mix.

@lheckemann
Member

Could we use udevadm monitor -s net to detect when a network device becomes available, rather than polling?

@samueldr
Member Author

Sounds good. I'll read the manpage to see how this compares.

@samueldr
Member Author

samueldr commented Mar 8, 2020

This targets the old init, so it won't apply anymore. Furthermore, waiting is now "built-in".

@samueldr samueldr closed this Mar 8, 2020
@samueldr samueldr deleted the feature/initrd-networking-resiliency branch March 8, 2020 22:15
@lheckemann
Member

Is it still sleep-and-poll-based? Would be nice to use udev in some way still.

@samueldr
Member Author

samueldr commented Mar 9, 2020

A few months ago I was reading about that and could not find anything confirming that udev can do that, or that it should be relied on. In fact, I found more information advising against udevadm settle than anything else.

The new implementation is kind of sleeping/polling, but it's non-blocking. There is no threading in the new init, so it just waits until the right /sys path exists before doing anything network-related. If we added a way to wait with udev, it would pause until udev returns, which in practice could make part of the boot process hang. For example, if the network task pauses to wait for the network interface before the framebuffer is initialized, splashes or the framebuffer terminal wouldn't show up until the network interface shows up.
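
To make the "polling, but non-blocking" idea concrete, here is a small Python sketch of a single-threaded task loop. It is not the actual init code (which is not written in Python); `Task`, `run_tasks`, the `timeout` and the `tick` values are all hypothetical names and defaults chosen for the illustration. Each task carries a `ready` check, and a task that is not ready is skipped on that pass instead of stalling the others.

```python
import time


class Task:
    """A unit of boot work plus a cheap, non-blocking readiness check."""

    def __init__(self, name, ready, run):
        self.name = name
        self.ready = ready  # callable returning True when the task may run
        self.run = run      # callable doing the actual work


def run_tasks(tasks, timeout=10.0, tick=0.05):
    """Run every task once it is ready; never wait on a single task.

    Returns (names that ran, names still pending at the deadline).
    """
    pending = list(tasks)
    done = []
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        progressed = False
        for task in list(pending):
            if task.ready():
                task.run()
                pending.remove(task)
                done.append(task.name)
                progressed = True
        if not progressed:
            # Nothing could run on this pass; back off briefly, then re-check.
            time.sleep(tick)
    return done, [task.name for task in pending]
```

With a network task whose `ready` is something like `lambda: os.path.exists("/sys/class/net/eth0")` and a framebuffer task that is always ready, the framebuffer task runs on the first pass even if the interface takes a while to appear.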

What could happen is that this gets re-implemented using library bindings for udev that ask udev directly, rather than polling /sys. Though at that point I'm not sure it's better than trusting what the kernel makes available under /sys.

I would need to find proper prior work and documentation about using udev to wait on devices.

@lheckemann
Member

I don't mean based on udevadm settle, but based on the appearance of the device we're interested in, using udev's monitoring API (which is crudely exposed by udevadm monitor) — so that we don't poll at all, instead being able to respond immediately to the device having appeared.

@samueldr
Member Author

Right, then the bit about it not being threaded still applies here. Without threading, it would have to block until the device shows up, rather than being able to do whatever can be done.

I'm not sure adding threading to the init is worth it, with the additional hassle this will bring, compared to doing it in that simplistic way.

With that said, that doesn't mean I'm against it. I'm not seeing what would be gained considering we don't want to block.

@lheckemann
Member

I don't see how threading would be necessary here; udev can give you an fd to select() on like any other, AFAIU.
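
A hedged sketch of what "an fd to select() on" could look like, in Python. Note the swap: instead of libudev's monitor API, this listens to the kernel's own uevent netlink socket, which is the lower-level source that udev itself reads from. The helper names (`open_uevent_socket`, `parse_uevent`, `wait_for_interface`) are made up for the example, and the constant 15 for `NETLINK_KOBJECT_UEVENT` comes from `<linux/netlink.h>` because the Python `socket` module may not export it.

```python
import select
import socket
import time

NETLINK_KOBJECT_UEVENT = 15  # value from <linux/netlink.h>


def open_uevent_socket():
    """Open a socket receiving kernel uevents (Linux only, may need privileges)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_KOBJECT_UEVENT)
    sock.bind((0, 1))  # pid 0: kernel picks an id; group mask 1: kernel uevents
    return sock


def parse_uevent(data):
    """Turn 'add@/devpath\\0KEY=value\\0...' into a dict of the KEY=value pairs."""
    props = {}
    for field in data.split(b"\0")[1:]:  # the first field is the 'action@devpath' header
        if b"=" in field:
            key, value = field.split(b"=", 1)
            props[key.decode(errors="replace")] = value.decode(errors="replace")
    return props


def wait_for_interface(sock, name, timeout):
    """Return True if an 'add' event for network interface `name` arrives in time."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = max(0.0, deadline - time.monotonic())
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            return False  # nothing (more) to read before the deadline
        props = parse_uevent(sock.recv(8192))
        if (props.get("ACTION") == "add"
                and props.get("SUBSYSTEM") == "net"
                and props.get("INTERFACE") == name):
            return True
```

Passing `timeout=0` turns the call into a quick check that drains whatever events are already queued, so it could be called from a single-threaded loop without stalling it. An interface that appeared before the socket was opened produces no event, so a real implementation would presumably still check /sys once after opening the socket.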

@samueldr
Member Author

In that case I don't know.

I think it depends on what benefits this gives compared to the current strategy of polling sysfs. The added complexity should be worth it. One such benefit, if udev allows it, would be being able to ask "will there be 'this' network interface?" rather than "wait for 'this' network interface", so we can knowingly fail the target rather than time out.

I'm still open to this being implemented, but it is not a priority unless it has a clear benefit compared to the current approach.
