[WIP. Don't merge] Resilience to error in packet decoding #285

HeroicKatora · 2019-04-11T09:00:27Z

Addresses #281. Work in progress. Addresses DOS by errors in the decoding step by establishing an error ring-buffer associated to the ethernet interface and socket type. When some decoding step fails, the error is put into the most specific error buffer available but otherwise ignored. This ensures that faulty packets can not affect further packet processing. This is a work in progress, intended for an early preview of ideas. As such, it may be incomplete, not compile at times and/or be buggy. I may possibly rebase or force-push the branch at will.

A second step of the mechanism design rework will try to rebalance the recv/send inbalance in case of a saturated phy. As layed out in the issue report, a full receive buffer can adversely affect send performance by blocking further sends. The error from the full buffer should break the receive part of the loop but it's not completely clear yet how to handle finally handle it. One possibility is putting it into the ring buffer or the ethernet iface, but this of course creates some friction with the return value a caller expects.

Alternatives

Callbacks have been evaluated as inferior for major reasons:

It's missing contextual information resulting from partial decoding. A failed udp decoding could for example still access and log the frames around itself. Re-decoding within the error callback is both inefficient, creates ambiguity and is awkward as it is effectively invoking a processing step that it expects to fail. This seems highly unintuitive.
Invoking the callback during polling means that constraints such as terminiation, timing, resource usage, ... need to be fulfilled by arbitrary user code. Without an extremely extensive documentation, it remains unclear how one could expect the programmer of the callback to know the internals enough to make good engineering decisions about its design.
The callback must fight the borrow checker if it wants to inspect any state such as the socket on which the error originated or the phy interface. Making error inspection a separate call makes this cleaner, less constrained and more efficient.
Ignoring most errors, as many usages likely want to, would become a no-op for the caller.

TODO

Cleanup some of the changes that were made just to quickly iterate over alternative designs for this feature. This is still ongoing
Make the code compile under a subset of features such as (socket-udp, proto-ipv4). This is actually an issue with the current library and I don't see any guide on which feature set should be supported other than all and default.
Implement error buffers for more than udp and decide on a strategy to handle left-over errors that don't belong to any socket.
Evaluate and choose the strategy for recv/send imbalance.
Implement tests to simulate and ensure resilience against the DOS potential presented in Ethernet denial of service #281.

whitequark · 2019-04-11T13:54:58Z

It's an interesting approach. It is a bit awkward, and I don't like that. However, I agree that it seems like our best way going forward, so let's use it.

I think your lifetime changes would need to be a part of a separate PR. They change the public interface as well. I haven't looked at them closely yet as it's hard to follow here.

HeroicKatora · 2019-04-11T14:14:28Z

Yes, it's all a bit raw and several changes are mixed together. I plan to split this up into several independent chunks once I have determined the overall usability.

HeroicKatora · 2019-04-12T18:37:59Z

@whitequark Cleanly rebased only the error buffer related changes. Note that the lifetime within iface and udp is currently linked to other attributes as I wasn't sure what the original motivation was for having some of them overlap and some of them exposed in the generic parameters. Some feedback on this would be helpful. And, this still needs to be performed for the other socket types to ensure that really no error is generated. I finally may want to remove the Result<_, Error> part from the inner methods to make this explicit.

This is a ring buffer storing the failure states of previously received (and dropped) packet. The buffer will always overwrite the oldest errors when it is full. An empty buffer, for example Default constructed, will simply drop all errors. The buffer will keep a count of the number of ignored errors. Overall, this makes it possible to indicate more than one failure condition per call to the library user. This is necessary to not terminate a `poll` call immediately on the first error. Stored errors can later be retrieved and inspected outside and independently of the call that created them. An instance of the buffer is added to the udp socket but largely unused at the moment.

Augments `ErrorBuffer` with a callback based consumer for all pending errors.

Here is why a buffer for errors is needed: The `poll` method of the ethernet interface may potentially produce more than one signalling error. But breaking out of control flow leaves the user without a proper continuation option. A renewed call to `poll` would instead start iterating sockets and packet flow in the previous order again, giving unfair advantage to the flows called first. Specifically, the inbound flow may also be controlled by a malicious party and error during parsing and reception must not affect other flows. Instead, we will try to store these errors inside a buffer provided in the builder so that the programmer can control the number of errors they want to inspect. Overflowing errors are dropped entirely. This is part of a patch series introducing that error handling to all socket types.

Adds methods for interacting with an ErrorBuffer similar to the methods provided on Result itself. But instead of panicking or ignoring an Err variant the contained error is pushed into the buffer.

Parsing incoming packets should not affect control flow for all other packet flows that are to be handled, to guarantee that such packets can not influence throughput adversely. This change will instead push them into the error buffers of the ethernet interface or udp socket, depending on how far parsing succeeded. This is part of a patch series to harden packet handling against adverserial network conditions and an overloaded network stack. Other commits change the handling for the other socket and protocol types.

Where tests previously asserted that `process` completed without an error they will no call process and assert that no error has been created in the error buffer afterwards. The check does not depend on an error buffer with non-zero capacity. But when the check has previously checked for an error case, add an actual error buffer with capacity for a single element and assert the same condition on the result of `pop`. Since tests are short enough, the result of these operations can not wrongly report the error condition of a previous packet processing.

m-labs-homu · 2019-05-08T20:05:56Z

☔ The latest upstream changes (presumably 7c606ac) made this pull request unmergeable. Please resolve the merge conflicts.

HeroicKatora · 2019-08-20T03:52:52Z

Just as a head up: The require changes are far too extensive to justify working on for me. I'm going to abandon this PR for good.

whitequark · 2019-08-20T10:02:21Z

I see. That's a bit unfortunate.

HeroicKatora force-pushed the lifetimes branch from c31ff67 to a444f59 Compare April 12, 2019 18:35

HeroicKatora force-pushed the lifetimes branch from a444f59 to 2d1c76c Compare April 13, 2019 07:36

HeroicKatora added 6 commits April 16, 2019 19:35

Add method to consume all pending errors

d1050ea

Augments `ErrorBuffer` with a callback based consumer for all pending errors.

Utilities for handling errors with the storage type

8c74e4d

Adds methods for interacting with an ErrorBuffer similar to the methods provided on Result itself. But instead of panicking or ignoring an Err variant the contained error is pushed into the buffer.

HeroicKatora force-pushed the lifetimes branch from 2d1c76c to dc44af5 Compare April 16, 2019 17:35

HeroicKatora closed this Aug 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP. Don't merge] Resilience to error in packet decoding #285

[WIP. Don't merge] Resilience to error in packet decoding #285

HeroicKatora commented Apr 11, 2019 •

edited

Loading

whitequark commented Apr 11, 2019

HeroicKatora commented Apr 11, 2019

HeroicKatora commented Apr 12, 2019

m-labs-homu commented May 8, 2019

HeroicKatora commented Aug 20, 2019

whitequark commented Aug 20, 2019

[WIP. Don't merge] Resilience to error in packet decoding #285

[WIP. Don't merge] Resilience to error in packet decoding #285

Conversation

HeroicKatora commented Apr 11, 2019 • edited Loading

Alternatives

TODO

whitequark commented Apr 11, 2019

HeroicKatora commented Apr 11, 2019

HeroicKatora commented Apr 12, 2019

m-labs-homu commented May 8, 2019

HeroicKatora commented Aug 20, 2019

whitequark commented Aug 20, 2019

HeroicKatora commented Apr 11, 2019 •

edited

Loading