dhcp: rtems_event_receive() doesn't return

Tim Cussins timcussins at eml.cc
Wed Apr 30 09:17:31 UTC 2014


Hi Chris,

On 29/04/14 23:55, Chris Johns wrote:
> On 29/04/2014 9:13 pm, Tim Cussins wrote:
>> Hi all,
>>
>> We use the DHCP mechanism provided by RTEMS, and have noticed a curious
>> issue that manifests in long-running devices.
>>
>> The method 'dhcp_task' uses rtems_event_receive() with a timeout of
>> 1000ms to measure time (receiving the event itself causes dhcp_task to
>> exit). We're using 4.9 but with dhcp related files backported from 4.10,
>> so we look like this:
>>
>> http://git.rtems.org/rtems/tree/cpukit/libnetworking/rtems/rtems_dhcp.c?h=4.10#n705
>>
>>
>> Our problem is that the software can get into a state where
>> rtems_event_receive() never returns - in a recent case, the count was
>> 25287 seconds - well into the lease, but not of any particular
>> significance.
>>
>> I had a network monitor interposed, recording all inbound and outbound
>> traffic to the device: There is no network traffic of note at the time
>> of failure.
>>
>> I've scanned bugzilla and the mailing list, but couldn't find a mention
>> of a similar issue.
>>
>> Has anyone spotted behaviour like this before? The rarity of the issue
>> feels like a subtle race to me - better thought would be very welcome :)
>>
>> FWIW: We're using the virtex bsp (ppc405).
>>
>> Any tips or thoughts would be much appreciated!
>>
> 
> Just a simple one. Are the priorities such that it can run ? That is
> nothing else of a higher priority is running all the time.
> 

Great call ... I definitely did *not* check to see if a higher prio task
was stuck in a busy-loop... :/

The device has been rebooted so I can't check this now. Thanks for the
tip :)
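
To make the starvation scenario concrete, here is a rough host-side model
(not RTEMS code) of a fixed-priority scheduler. The task names, the priority
values 10 and 100, and the helpers pick_next() and slices_won() are all
hypothetical; the only thing borrowed from RTEMS is the convention that a
lower number means a higher priority. It is a sketch of the idea, not a
diagnosis of our device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; this is NOT an RTEMS structure. */
typedef struct {
    const char *name;
    unsigned    priority; /* RTEMS convention: lower number = higher priority */
    bool        ready;    /* true if the task wants the CPU */
} model_task;

/* Return the index of the highest-priority ready task, or -1 if none is ready. */
static int pick_next(const model_task *tasks, size_t n)
{
    int best = -1;

    assert(tasks != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0 || tasks[i].priority < tasks[(size_t)best].priority)
            best = (int)i;
    }
    return best;
}

/*
 * Count how many of `slices` scheduling decisions go to the modelled DHCP
 * task. If `spinner_busy` is true, a higher-priority task is always ready,
 * as it would be when stuck in a busy-loop.
 */
static unsigned slices_won(unsigned slices, bool spinner_busy)
{
    model_task tasks[] = {
        { "spinner", 10,  spinner_busy }, /* assumed higher-priority task */
        { "dhcp",    100, true         }, /* timeout expired, wants to run */
    };
    unsigned won = 0;

    for (unsigned s = 0; s < slices; s++) {
        if (pick_next(tasks, 2) == 1)
            won++;
    }
    return won;
}
```

In this model the DHCP task is ready the whole time but never gets picked
while the spinner is busy. From the outside that would look the same as
rtems_event_receive() never returning, even though the timeout itself may
have fired.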

We have Till's GDB stub on device, so there's an opportunity to inspect
some internal structures when the issue appears again.
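
For reference when that happens, the timing pattern described in my original
mail can be modelled roughly as below. This is a host-side sketch and not the
code from rtems_dhcp.c: wait_fn, wait_result, count_seconds_until_event() and
stub_wait() are hypothetical names, and the wait callback merely stands in
for the rtems_event_receive() call with its 1000ms timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one blocking wait: the timeout expired or the event arrived. */
typedef enum { WAIT_TIMEOUT, WAIT_EVENT } wait_result;

/* Hypothetical stand-in for a blocking wait with a one second timeout. */
typedef wait_result (*wait_fn)(void *ctx);

/*
 * Count one "second" per timeout and stop when the event arrives.
 * `max_iterations` bounds the loop so the sketch always terminates.
 */
static unsigned long count_seconds_until_event(wait_fn wait_1s, void *ctx,
                                               unsigned long max_iterations)
{
    unsigned long seconds = 0;

    assert(wait_1s != NULL);
    while (seconds < max_iterations) {
        if (wait_1s(ctx) == WAIT_EVENT)
            break;      /* receiving the event ends the loop */
        seconds++;      /* a timeout means roughly one more second elapsed */
    }
    return seconds;
}

/* Stub: times out `*remaining` times, then reports the event. */
static wait_result stub_wait(void *ctx)
{
    unsigned long *remaining = ctx;

    if (*remaining == 0)
        return WAIT_EVENT;
    (*remaining)--;
    return WAIT_TIMEOUT;
}
```

The counter only advances when the wait returns, so if the wait never comes
back the count simply freezes at its last value, which is consistent with the
stalled count of 25287 we observed.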

Thanks,
Tim


