Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TcpSocket timeout not respected in SynSent #323

Closed
MabezDev opened this issue Jan 15, 2020 · 8 comments
Closed

TcpSocket timeout not respected in SynSent #323

MabezDev opened this issue Jan 15, 2020 · 8 comments
Labels

Comments

@MabezDev
Copy link
Contributor

MabezDev commented Jan 15, 2020

If a TcpSocket is in SynSent with no means of ingressing or egressing (ethernet unplugged) the socket never leaves this state, even when a socket timeout has been set.

Is this the intented behaviour?

@MabezDev
Copy link
Contributor Author

Following up with a little more info.

It appears the only place the the timeout check is ran is is TcpSocket::dispatch, on my system when the ethernet is disconnected, the socket in SynSent never gets dispatch called on it, but other sockets (Listening ones) do.

I'll try and create a hosted example later on today

@whitequark
Copy link
Contributor

I think it's an oversight.

@MabezDev
Copy link
Contributor Author

I've managed to put together a hosted example, but it required a few changes to the TapInterface such that when the link is down it won't panic (simulating a real link loss).

Steps to reproduce:

  1. cargo run --example client tap0 xxxx xx where xxxx xx is an unreachable ip and port
  2. Take note of the output, every 5 seconds the socket will transition to closed
  3. run ip link set down tap0 to simulate the link loss
  4. Within a minute or too, the socket will get stuck in SynSent indefinitely (or until the link is up again)

Heres the patch you can apply to master to run this.

diff --git a/examples/client.rs b/examples/client.rs
index a9aaf65..a243476 100644
--- a/examples/client.rs
+++ b/examples/client.rs
@@ -13,7 +13,7 @@ use smoltcp::phy::wait as phy_wait;
 use smoltcp::wire::{EthernetAddress, Ipv4Address, IpAddress, IpCidr};
 use smoltcp::iface::{NeighborCache, EthernetInterfaceBuilder, Routes};
 use smoltcp::socket::{SocketSet, TcpSocket, TcpSocketBuffer};
-use smoltcp::time::Instant;
+use smoltcp::time::{Instant, Duration};
 
 fn main() {
     utils::setup_logging("");
@@ -56,6 +56,7 @@ fn main() {
     {
         let mut socket = sockets.get::<TcpSocket>(tcp_handle);
         socket.connect((address, port), 49500).unwrap();
+        socket.set_timeout(Some(Duration::from_secs(5)));
     }
 
     let mut tcp_active = false;
@@ -70,11 +71,15 @@ fn main() {
 
         {
             let mut socket = sockets.get::<TcpSocket>(tcp_handle);
+            let socket_state = socket.state();
+            info!("Socket state: {:?}", socket_state);
             if socket.is_active() && !tcp_active {
-                debug!("connected");
+                info!("connected");
             } else if !socket.is_active() && tcp_active {
-                debug!("disconnected");
-                break
+                info!("disconnected");
+                // break
+                socket.connect((address, port), 49500).unwrap();
+                socket.set_timeout(Some(Duration::from_secs(5)));
             }
             tcp_active = socket.is_active();
 
@@ -91,12 +96,12 @@ fn main() {
                     (data.len(), data)
                 }).unwrap();
                 if socket.can_send() && data.len() > 0 {
-                    debug!("send data: {:?}",
+                    info!("send data: {:?}",
                            str::from_utf8(data.as_ref()).unwrap_or("(invalid utf8)"));
                     socket.send_slice(&data[..]).unwrap();
                 }
             } else if socket.may_send() {
-                debug!("close");
+                info!("close");
                 socket.close();
             }
         }
diff --git a/src/phy/sys/tap_interface.rs b/src/phy/sys/tap_interface.rs
index f89597f..7623b5d 100644
--- a/src/phy/sys/tap_interface.rs
+++ b/src/phy/sys/tap_interface.rs
@@ -62,7 +62,7 @@ impl TapInterfaceDesc {
         unsafe {
             let len = libc::write(self.lower, buffer.as_ptr() as *const libc::c_void,
                                   buffer.len());
-            if len == -1 { Err(io::Error::last_os_error()).unwrap() }
+            if len == -1 { return Err(io::Error::last_os_error()) }
             Ok(len as usize)
         }
     }
diff --git a/src/phy/tap_interface.rs b/src/phy/tap_interface.rs
index dfdfb27..2a5cb6e 100644
--- a/src/phy/tap_interface.rs
+++ b/src/phy/tap_interface.rs
@@ -98,7 +98,7 @@ impl phy::TxToken for TxToken {
         let mut lower = self.lower.borrow_mut();
         let mut buffer = vec![0; len];
         let result = f(&mut buffer);
-        lower.send(&buffer[..]).unwrap();
+        lower.send(&buffer[..]).ok();
         result
     }
 }

@MabezDev
Copy link
Contributor Author

MabezDev commented Jan 17, 2020

I think it's an oversight.

I would tend to agree.

I believe the issue to be in https://github.com/m-labs/smoltcp/blob/ae27235cc4277d035563efb13216191dbd05d561/src/iface/ethernet.rs#L581

With verbose logging on, (socket::meta): #0: neighbor 216.58.206.110 missing, silencing until t+3.000s is emitted before the socket gets stuck. At which point the egress_permitted check fails and no work is done on the socket, including checking for a timeout.

If you are able to reproduce this with the above info, what are the steps to take to fix this do you think?

@MabezDev
Copy link
Contributor Author

Hmmm I think it gets worse than that unfortunately.

https://github.com/m-labs/smoltcp/blob/ae27235cc4277d035563efb13216191dbd05d561/src/iface/ethernet.rs#L630-L648

breaking in this match means that sockets may not get serviced at all.

Replacing the break's with () means all sockets are now dispatched but this still does not fix the problem.

https://github.com/m-labs/smoltcp/blob/ae27235cc4277d035563efb13216191dbd05d561/src/socket/tcp.rs#L1402

It seems everytime the socket in SynSent gets here remote_last_ts is None and hence timed_out returns false everytime, even though it has timed out!

I suspect to fix this issue, we will have to refactor the egress function. I'd appreciate your insight before I go ahead; I don't want to create more problems by PR'ing a naive solution.

@whitequark
Copy link
Contributor

Sorry, I don't think I'll have time to do this any time soon. I have a rather large backlog of issues in other projects and not much time.

@MabezDev
Copy link
Contributor Author

I understand. I can work around this for now.

@MabezDev
Copy link
Contributor Author

Seems this has been fixed by #368 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants