[Xastir] Xastir 1.3.xx and wx200d problems

Brian D Heaton brian.heaton at janusresearch.com
Thu Apr 29 09:37:31 EDT 2004


Stefano,

	Since the list doesn't send replies I've pasted what I think is the
important chunk below for everyone to see.  The important thing is that
a "FIN ACK" isn't sent the other way after the "FIN ACK" (packet number
39) that is visible here.  That should go across as the final part of
the TCP-CLOSE process.

	What I would expect to see at close is something like:

A > B [ACK]
A > B [FIN ACK]
B > A [FIN ACK]
A > B [ACK]
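
	The "B > A [FIN ACK]" step only goes out once the application on B
closes its own socket after seeing the end of the stream.  As a rough
sketch of that pattern (Python for brevity; this is not the actual
wx200d or Xastir code, which I haven't checked, and drain_and_close and
demo are names made up for illustration):

```python
import socket


def drain_and_close(conn: socket.socket) -> bytes:
    """Read until the peer's FIN (recv returns b""), then close our side."""
    received = bytearray()
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # Empty read: the peer has sent its FIN.
                break
            received.extend(chunk)
    finally:
        # Closing here is what makes the kernel send our own FIN.
        conn.close()
    return bytes(received)


def demo() -> bytes:
    """Loopback round trip: client sends, closes; server drains and closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    try:
        client = socket.create_connection(listener.getsockname())
        client.sendall(b"hello")
        client.close()  # client side sends its FIN
        conn, _ = listener.accept()
        return drain_and_close(conn)
    finally:
        listener.close()


if __name__ == "__main__":
    print(demo())
```

	If the conn.close() is skipped, the kernel still ACKs the peer's FIN
but never sends one of its own, and the descriptor stays open.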

	Since we're only seeing the [FIN ACK] pass one way (58791 > rasadv),
that would appear to be the reason the sockets are hanging in CLOSE_WAIT:
the rasadv end ACKs the FIN but never sends its own.
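
	On Linux, one way to confirm this from the daemon's side is to read
/proc/net/tcp and pick out the sockets in state 08 (CLOSE_WAIT) on the
daemon's port.  A small sketch, with assumptions: close_wait_peers is a
hypothetical name, the sample table is made up to resemble the trace,
and 9753 is the port "rasadv" usually maps to in /etc/services, so check
yours:

```python
from typing import List

CLOSE_WAIT = "08"  # Linux kernel state code for CLOSE_WAIT in /proc/net/tcp

# Made-up sample in the /proc/net/tcp layout (only the first columns shown).
# Addresses and ports are hex: 2619 = 9753, E5A7 = 58791, E5B0 = 58800.
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:2619 00000000:0000 0A
   1: 0100007F:2619 0100007F:E5A7 08
   2: 0100007F:2619 0100007F:E5B0 01
   3: 0100007F:E5B0 0100007F:2619 01
"""


def close_wait_peers(proc_net_tcp: str, local_port: int) -> List[int]:
    """Return the remote ports of CLOSE_WAIT sockets bound to local_port."""
    peers = []
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != CLOSE_WAIT:
            continue
        if int(local.rsplit(":", 1)[1], 16) == local_port:
            peers.append(int(remote.rsplit(":", 1)[1], 16))
    return peers


if __name__ == "__main__":
    print(close_wait_peers(SAMPLE, 9753))
```

	On a live machine the first argument would be the contents of
/proc/net/tcp instead of the sample string.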



			THX/BDH
---------------------------------------------------------------
     37 2004-04-28 20:13:45.889018 127.0.0.1 127.0.0.1 TCP rasadv > 58791 [PSH, ACK] Seq=2115122606 Ack=2123443043 Win=32767 Len=5
     38 2004-04-28 20:13:45.889021 127.0.0.1 127.0.0.1 TCP 58791 > rasadv [ACK] Seq=2123443043 Ack=2115122611 Win=32767 Len=0
     39 2004-04-28 20:16:48.303898 127.0.0.1 127.0.0.1 TCP 58791 > rasadv [FIN, ACK] Seq=2123443043 Ack=2115122611 Win=32767 Len=0
     40 2004-04-28 20:16:48.343878 127.0.0.1 127.0.0.1 TCP rasadv > 58791 [ACK] Seq=2115122611 Ack=2123443044 Win=32767 Len=0
     41 2004-04-28 20:16:48.617999 127.0.0.1 127.0.0.1 TCP 58800 > rasadv [SYN] Seq=2314811095 Ack=0 Win=32767 Len=0
     42 2004-04-28 20:16:48.618020 127.0.0.1 127.0.0.1 TCP rasadv > 58800 [SYN, ACK] Seq=2311782633 Ack=2314811096 Win=32767 Len=0
----------------------------------------------------------------
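
	To check a longer capture for the same pattern without reading every
line, a small script can tally which direction sent a FIN.  A sketch,
assuming the text summary format shown above (fin_senders and
missing_fin are hypothetical names, and the regex only understands
"src > dst [FLAGS]" entries):

```python
import re
from collections import Counter

# Matches the "src > dst [FLAGS]" part of a summary line.
FLOW = re.compile(r"(\S+)\s+>\s+(\S+)\s+\[([^\]]*)\]")

# Packets 37-42 from the trace above, one per line.
SAMPLE = """
37 2004-04-28 20:13:45.889018 127.0.0.1 127.0.0.1 TCP rasadv > 58791 [PSH, ACK] Seq=2115122606 Ack=2123443043 Win=32767 Len=5
38 2004-04-28 20:13:45.889021 127.0.0.1 127.0.0.1 TCP 58791 > rasadv [ACK] Seq=2123443043 Ack=2115122611 Win=32767 Len=0
39 2004-04-28 20:16:48.303898 127.0.0.1 127.0.0.1 TCP 58791 > rasadv [FIN, ACK] Seq=2123443043 Ack=2115122611 Win=32767 Len=0
40 2004-04-28 20:16:48.343878 127.0.0.1 127.0.0.1 TCP rasadv > 58791 [ACK] Seq=2115122611 Ack=2123443044 Win=32767 Len=0
41 2004-04-28 20:16:48.617999 127.0.0.1 127.0.0.1 TCP 58800 > rasadv [SYN] Seq=2314811095 Ack=0 Win=32767 Len=0
42 2004-04-28 20:16:48.618020 127.0.0.1 127.0.0.1 TCP rasadv > 58800 [SYN, ACK] Seq=2311782633 Ack=2314811096 Win=32767 Len=0
"""


def fin_senders(trace):
    """Count FIN segments per (source, destination) direction."""
    senders = Counter()
    for src, dst, flags in FLOW.findall(trace):
        if "FIN" in (flag.strip() for flag in flags.split(",")):
            senders[(src, dst)] += 1
    return senders


def missing_fin(trace):
    """List directions that received a FIN but never sent one back."""
    senders = fin_senders(trace)
    return [(dst, src) for (src, dst) in senders if (dst, src) not in senders]


if __name__ == "__main__":
    print(dict(fin_senders(SAMPLE)))
    print(missing_fin(SAMPLE))
```

	This only looks at flags, not sequence numbers, so it is a quick
filter rather than a full check of the close handshake.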


On Wed, 2004-04-28 at 20:21, Stefano Angelo Mario Lassini wrote:
> Here is an ethereal trace of what is happening between xastir and wx200d... 
> looks like a socket is established, a few packets are exchanged and then,
> after a while of inactivity, the socket is closed and immediately 
> re-opened... but netstat shows the socket hanging in a CLOSE_WAIT state. This 
> repeats on and on...
> 
> I am not a TCP expert by any means... can you tell me if the closure of the 
> socket is correct or if there is anything missing?
> 
> Thanks,
> 
> --sam
> 
> On Wednesday 28 April 2004 12:39, Stefano Angelo Mario Lassini wrote:
> > I agree with the analysis you make of this. I will have to wait until I
> > get back home tonight to try ethereal and see what it tells me. I haven't
> > used that tool in the past, but hopefully it is available in the Suse 9.0
> > pro distro, in which case it will only take a minute to install it on my
> > machine.
> >
> > I'll check and report what I find.
> >
> > Thanks,
> >
> > --sam
> >
> > > Sam,
> > >
> > > 	What I'm thinking is that one side or the other isn't fully closing.  I
> > > was hoping that a trace from Ethereal would give some indication as to
> > > where the TCP-CLOSE (FIN?) wasn't being completed.
> > >
> > > 			THX/BDH
> > >
> > > On Wed, 2004-04-28 at 10:14, Stefano Angelo Mario Lassini wrote:
> > >> Brian,
> > >>
> > >> Thanks for the suggestion.
> > >> I am able to ascertain that many sockets on the wx200d side are left
> > >> in a CLOSE_WAIT state that causes them to hang forever (until the
> > >> wx200d process is killed and restarted). This happens both with
> > >> sockets opened from xastir and with sockets opened from the wx200
> > >> command line client for wx200d, so it seems to be a problem that goes
> > >> beyond xastir.
> > >>
> > >> BTW, the connections are local, i.e. both xastir and the wx200d
> > >> daemon are running on the same machine (I might migrate wx200d to
> > >> another machine once I get this issue figured out).
> > >>
> > >> --sam
> > >>
> > >> > Sam,
> > >> >
> > >> > 	Do you have Ethereal on either of the Linux machines in question?  I
> > >> > think you've got something not fully closing a socket.  If you can get
> > >> > a trace of it you should be able to see what is being left hanging.
> > >> >
> > >> > 			THX/BDH
> > >> >
> > >> > On Mon, 2004-04-26 at 21:10, Stefano Angelo Mario Lassini wrote:
> > >> >> Over the past few weeks I have experienced failures in my
> > >> >> xastir/wx200d setup that would cause wx200d to stop responding
> > >> >> after a couple of days or less of uptime. If xastir was not
> > >> >> connected to the network wx port, wx200d would run for many days
> > >> >> at a time without problems.
> > >> >>
> > >> >> I eventually traced the symptoms to the fact that the number of
> > >> >> open sockets to wx200d would increase to the point that several
> > >> >> hundred sockets at a time were open to wx200d (or so it appears
> > >> >> from listing /proc/{wx200d PID}/fd). Apparently every time Xastir
> > >> >> believes that the connection to wx200d is down (due to lack of
> > >> >> wx200d activity) it attempts to re-connect, and in the process a
> > >> >> new socket is created and the old one is left hanging.
> > >> >>
> > >> >> Once the number of sockets grows beyond the number of file
> > >> >> descriptors available to wx200d the daemon hangs, and needs to be
> > >> >> killed and restarted.
> > >> >>
> > >> >> I have tried to look at the code responsible for the networked
> > >> >> weather station, but I have to admit that my understanding of the
> > >> >> structure of the xastir code base is pretty weak...
> > >> >>
> > >> >> Can anyone provide me any insight on what is going on, and where
> > >> >> to look to possibly attempt to trace and fix this behaviour?
> > >> >>
> > >> >> I have an Oregon Scientific WMR-968 wireless weather station that
> > >> >> I do not seem to be able to connect to xastir directly (I tried
> > >> >> several combinations of baud rates with no success), so wx200d is
> > >> >> my only alternative at the moment, and I would also like to be
> > >> >> able to use the wx200d daemon to upload wx data to other
> > >> >> applications in the future.
> > >> >>
> > >> >> The above behaviour happens in 1.3.1, 1.3.2 and in a CVS update
> > >> >> from last week.
> > >> >>
> > >> >> Thanks for the help,
> > >> >>
> > >> >> Sam
> > >> >> N8USY
-- 
------------------------------------------------------------------
Brian D Heaton
Senior Network Engineer
Janus Research Group
(706) 791-8342
GPG Fingerprint: C99E 3E9C E23A 4E47 46F4 0A77 3A45 CB65 9E19 5B0A



