Race condition in DHCP client

Mon Oct 6 21:31:46 UTC 2014

Hi all,

I believe I've found a race condition in the DHCP client shipped
with RTEMS.  This affects setups where there are redundant DHCP
servers.

Background:
We run RTEMS in a fairly large farm of embedded nodes (~20k), most of
which get their network configuration using DHCP.  Since there are
so many nodes, there are multiple DHCP servers configured for load
balancing.  When there are multiple DHCP servers, we see several
failures during negotiation, which cause a delay (sometimes long) in
the start of networking.

Setup:
RTEMS 4.10.0.99 fetched from git:
    commit 88ef740ea89440bdcbf7b5108e033cbe1bf32792, February 14, 2014
Two load balanced DHCP servers

Symptoms:
When starting up DHCP, several messages which state "DHCP server did
not accept the DHCP request" appear on the console.  These messages
come at a one second cadence and delay the system startup.

Adding some debug prints to rtems_dhcp.c [1] shows that dhcp_init() is
returning after it sends its DHCPREQUEST.  It returns because the
first packet after the REQUEST is *not* a DHCPACK; it is actually
the DHCPOFFER from the redundant server.  Looking deeper,
rtems_dhcp.c does not correctly implement the state machine for
clients as documented in RFC 2131 [2].

Here's the relevant portion of Figure 5 of RFC 2131 which shows the
state machine fragment we care about.  This is best viewed with a
fixed width font.

                                         -------
                                        |       |
                  +-------------------->| INIT  |
                  |         +---------->|       |
                  |         |            -------
               DHCPNAK/     |               |
            Discard offer   |      -/Send DHCPDISCOVER
                  |         |               |
                  |      DHCPACK            v
                  |   (not accept.)/   -----------
                  |  Send DHCPDECLINE |           |
                  |         |         | SELECTING |<----+
                  |        /          |           |     |DHCPOFFER/
                  |       /            -----------      |Collect
                  |      /                  |   |       |  replies
                  |     /  +----------------+   +-------+
                  |    |   v   Select offer/
                 ------------  send DHCPREQUEST
         +----->|            |
         |      | REQUESTING |
     DHCPOFFER/ |            |
     Discard     ------------
         |        |        |
         +--------+     DHCPACK/
                    Record lease, set
                      timers T1, T2
                           |
                           v
                        -------
                       |       |
                       | BOUND |
                       |       |
                        -------

The error in the RTEMS implementation [1] is simply that the whole
system is implemented with atomic transactions [3]: send one packet
and expect one and only one packet back.  This is definitely *not*
correct because:

1) The loop at the SELECTING state is not implemented.
    This doesn't really hurt us.

2) The loop at REQUESTING is also not implemented.
    This is what kills us.  A DHCPOFFER from one of the redundant
    servers is interpreted as a DHCPNAK instead of being discarded, and
    the system prints the message and goes back to INIT.
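
To make point 2 concrete, here is a minimal sketch in C of the packet
handling that the Figure 5 fragment implies for SELECTING and
REQUESTING.  It is not taken from rtems_dhcp.c: the names dhcp_msg,
dhcp_state and dhcp_on_packet are invented for illustration.  The
message type values are the option 53 codes from RFC 2132.  The
DHCPDECLINE path for an unacceptable DHCPACK is left out.

```c
#include <assert.h>

/* DHCP message types: option 53 values from RFC 2132. */
enum dhcp_msg {
  DHCPDISCOVER = 1, DHCPOFFER = 2, DHCPREQUEST = 3,
  DHCPDECLINE = 4, DHCPACK = 5, DHCPNAK = 6
};

/* Client states from the Figure 5 fragment (hypothetical names). */
enum dhcp_state { ST_INIT, ST_SELECTING, ST_REQUESTING, ST_BOUND };

/* Return the state the client should be in after receiving message m. */
enum dhcp_state dhcp_on_packet(enum dhcp_state s, enum dhcp_msg m)
{
  switch (s) {
  case ST_SELECTING:
    /* DHCPOFFER/Collect replies: stay put.  Choosing an offer and
       sending DHCPREQUEST is a separate step, not packet driven. */
    return ST_SELECTING;
  case ST_REQUESTING:
    if (m == DHCPACK) return ST_BOUND;  /* record lease, set T1, T2 */
    if (m == DHCPNAK) return ST_INIT;   /* discard offer, start over */
    return ST_REQUESTING;               /* DHCPOFFER etc.: discard */
  case ST_INIT:
  case ST_BOUND:
    return s;                           /* not modeled in this sketch */
  }
  assert(0 && "unknown DHCP state");
  return ST_INIT;
}
```

With this, dhcp_on_packet(ST_REQUESTING, DHCPOFFER) stays in
ST_REQUESTING, so a late offer from the second server no longer sends
the client back to INIT.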

The only reason that we sometimes win is a race condition created by
using bootpc_call() for the send/receive logic.  This function does
the following:
* build the sockets for bootp/dhcp
* send the client packet
* wait for one server packet
* destroy the sockets

dhcp_init is implemented (in pseudocode) as:
* build DHCPDISCOVER packet
* bootpc_call(&discover, &reply)
* if reply != DHCPOFFER, return
* build DHCPREQUEST packet
* bootpc_call(&request, &reply)
* if reply != DHCPACK, return
* initialize network

The race condition is: if the second DHCPOFFER arrives between the
time the first bootpc_call exits and the second one configures the
socket, then no socket is open to receive it, it is discarded, and
everything is fine.  If the second DHCPOFFER is delayed slightly, the
second bootpc_call reads it as the reply to the DHCPREQUEST, and the
error occurs.
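
The two outcomes can be modeled with a small, self-contained C sketch.
It does not call bootpc_call or touch any sockets.  The array simply
stands for the packets that reach the client after the DHCPREQUEST is
sent, and the names pkt_type, request_single_shot and
request_with_discard are made up for this example.  The first function
mimics the behavior described above (look at one packet only); the
second is one possible way to honor the "DHCPOFFER/Discard" loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified packet kinds (hypothetical names, not RTEMS identifiers). */
enum pkt_type { PKT_OFFER, PKT_ACK, PKT_NAK };

/* Behavior as described: only the first packet received is examined.
   Returns 1 if the lease is accepted, 0 otherwise. */
int request_single_shot(const enum pkt_type *pkts, size_t n)
{
  assert(pkts != NULL || n == 0);
  return n > 0 && pkts[0] == PKT_ACK;
}

/* REQUESTING loop per the RFC 2131 figure: discard offers and keep
   reading until a DHCPACK or DHCPNAK shows up.
   Returns 1 if the lease is accepted, 0 otherwise. */
int request_with_discard(const enum pkt_type *pkts, size_t n)
{
  size_t i;

  assert(pkts != NULL || n == 0);
  for (i = 0; i < n; i++) {
    if (pkts[i] == PKT_ACK) return 1;
    if (pkts[i] == PKT_NAK) return 0;
    /* PKT_OFFER: late offer from the other server, discard it */
  }
  return 0;  /* nothing left to read: treated as a timeout */
}
```

For the sequence { PKT_OFFER, PKT_ACK } (late offer), the single-shot
version fails while the discarding version succeeds.  For { PKT_ACK }
alone (offer dropped before the socket existed), both succeed, which
is the "sometimes we win" case.  A real implementation would also need
a receive timeout instead of a fixed array.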

My question is: does anyone have an implementation of a DHCP client
(working with RTEMS) that doesn't exhibit this race condition?

   --Jim Panetta
     panetta at slac.stanford.edu

[1]  http://git.rtems.org/rtems/tree/cpukit/libnetworking/rtems/rtems_dhcp.c
[2]  https://www.ietf.org/rfc/rfc2131.txt
[3]  This atomic transaction is 'bootpc_call(call,reply,procp)'
      implemented in:
      http://git.rtems.org/rtems/tree/cpukit/libnetworking/nfs/bootp_subr.c

-- 
My opinions are mine...not DOE's...not SLAC's...mine.
(except by random, unforeseeable coincidences)
panetta at slac.stanford.edu  --  Save the whales!  Free the mallocs!