[PATCH v2 5/5] libtest: Add packet processor

Sebastian Huber sebastian.huber at embedded-brains.de
Tue Feb 13 08:32:24 UTC 2024


Hello Chris,

sorry for the delay.

On 07.02.24 00:56, Chris Johns wrote:
> On 5/2/2024 9:13 pm, Sebastian Huber wrote:
>> Hello Chris,
>>
>> thanks for having a look at it.
>>
>> On 02.02.24 00:14, Chris Johns wrote:
>>> Hi,
>>>
>>> Thanks for the updated documentation, protocol and use cases. It has helped. Now
>>> I understand some of the context I have raised further questions about it I feel
>>> we should resolve. Without a protocol version number being exchanged it limits
>>> how we can grow and develop this protocol beyond where it is.
>>>
>>> On 26/1/2024 6:23 am, Sebastian Huber wrote:
>>>> The RTEMS Test Framework packet processor provides a simple mechanism to
>>>> exchange reliable and in-order data through transmitting and receiving
>>>> one character at a time.
>>>>
>>>> The packet processor does not buffer data.  The processor uses a
>>>> stop-and-wait automatic repeat request method. There is at most one packet
>>>> in transmission.  The data transfer is done using a single character input
>>>> and output method.  The protocol uses 12-bit sequence numbers, so a host
>>>> could use a sliding window method to increase throughput.  All integers and
>>>> data are base64url encoded.  A 24-bit CRC is used to ensure the data
>>>> integrity.  The '{' character starts a packet.  The '}' character terminates
>>>> a packet.  The '#' character prefixes a 24-bit CRC value.  The ':' character
>>>> separates fields.  The '+' character prefixes data fields.
>>>
>>> Is the line encoding a subset of the ASCII character set? Is the line disciple
>>> raw?
>>
>> yes, only ASCII (7-bit) is used. The processor uses a character input/output
>> handler. It is up to the user to do the transfer. It could be for example also
>> CAN or UDP packets.
> 
> I think adding something about this will help the definition of the protocol.

Ok, I will add this to the group description.

> 
>>> I have reviewed the protocol as it is and I have some suggestions. The data link
>>> layer messages are mixed with the pay load messages. If the messages followed:
>>>
>>>    {<12-bit seq><12-bit ack>:<packet>[:[data]]]#<24-bit CRC>}
>>>
>>> where:
>>>
>>> <packet> is `H`, `A`, `R` or `P` are data link packets and `P` for payload can
>>> contain `J`, `L`, `S` etc for the test and chain loader protocol. For example:
>>>
>>>    {<12-bit seq><12-bit ack>:P:J:<64-bit address>#<24-bit CRC>}
>>
>> In my proposal, the payload has its own CRC so that you can first validate the
>> header before you do something with the payload.
> 
> How do you validate the header? Protocols like GFP have a simple length encoded
> into the start of the message to frame it and to allow sync/hunting to happen to
> resync when errors happen and is more robust than pattern matching. GFP is
> unique because it byte aligns bit level streams when implemented in hardware,
> something that is not important here.

The header is validated by the state machine and the 24-bit CRC. The a 
'{' always starts the processing of a new packet.

> 
> Is there an assumption the link is of a good quality and error free?

No, the link can be noisy. We had for example occasional data loss via 
an USB UART in a VM.

> 
>>> Note, `P` could be a protocol id and so `T` could denote "test and chain loader".
>>>
>>> [:] denotes this is optional depending on pay load data being present or it
>>> could mean data link vs payload packets, what ever works here.
>>>
>>>>    The following packets are defined:
>>>>
>>>> * hello: {<12-bit seq><12-bit ack>:H#<24-bit CRC>}
>>>
>>> Are you able to have the `H` be a query and return data? A hello that contains
>>> protocol versioning data is always useful for long lived protocols. Version
>>> numbering should be 1, 2, 3, 4 etc and 0 is reserved.
>>
>> Adding a protocol version to the hello message is a good idea.
>>
>>>> * acknowledge: {<12-bit seq><12-bit ack>:A#<24-bit CRC>}
>>>
>>> Does this ack all seq numbers up to this number?
>>
>> Yes, it is like in TCP, however, currently only a stop-and-wait automatic repeat
>> request method is implemented so that we don't need buffers.
> 
> Great and please add this as a note to the doco.

I think this is already there.

> 
>>>> * reject: {<12-bit seq><12-bit ack>
>>>>     :R:<12-bit seq of rejected packet>#<24-bit CRC>}
>>>
>>> Are the `<>` fields binary and so base64 encoded?
>>
>> Yes, everything in <> is base64 encoded.
> 
> Please add to the doco.

What I have right now:

"All integers and data are base64url encoded."

> 
>>>> * jump: {<12-bit seq><12-bit ack>:J:<64-bit address>#<24-bit CRC>}
>>>>
>>>> * load: {<12-bit seq><12-bit ack>:L:<64-bit load address>:
>>>>     <64-bit load size in bytes>#<24-bit CRC>+<data to load>#<24-bit CRC>}
>>>>
>>>> * signal: {<12-bit seq><12-bit ack>:S:<64-bit signal number>:
>>>>     <64-bit signal value>#<24-bit CRC>}
>>>
>>> Are the signals defined? If so maybe a reference to the enum or whatever?
>>
>> No, the signal and channel numbers are user-defined.
>>
>>>
>>>> * channel: {<12-bit seq><12-bit ack>:C:<64-bit channel number>:
>>>>     <64-bit data size in bytes>#<24-bit CRC>+<channel data>#<24-bit CRC>}
>>>
>>> How would I add a packet type to the protocol? For example a 'D' packet could
>>> transport GDB remote protocol for use on targets with limited resources and the
>>> need to share this channel.
>>
>> You don't need a new packet type for this. You can use a channel. We have a
>> 64-bit channel number, so plenty of channels.
> 
> Then why no use this for the packets that are specific to bootloading/testing?
> Why have this multiplexer then not use it? It seems there is mixing of layering.

Yes, it is a mix of layering. I will restructure this to remove the 'L' 
and 'J' special cases.

> 
>>> The data link and framwwork is useful for our project beyond the test use case
>>> so it would be good to see if something more useful is within reach. :)
>>>
>>>> The intended use case are boot loaders and test runners.  For example, test
>>>> runners may interface with an external test server performing equipment
>>>> handling on request using the packet processor.
>>>>
>>>> Use T_packet_initialize() to initialize the packet processor.  Use
>>>> T_packet_process() to drive the packet processing.  You can enqueue
>>>> packets for transmission with T_packet_enqueue().  You can reliably send
>>>> signals with T_packet_send().  You can reliably transmit and receive
>>>> channel data with T_packet_channel_push() and
>>>> T_packet_channel_exchange().
>>>>
>>>> A simple boot loader could be implemented like this:
>>>
>>> If this is RTEMS running the loader and then jumping to something loaded would
>>> be a chain loader and not a boot loader? The use of the term boot loader
>>> confused me. Chain loading is tricky because you need to handle the address
>>> spaces to avoid clashing or you need to have a way to relocate the loaded data.
>>> How is this handled?
>>
>> I don't know what a chain loader is. If more than one program is involved in the
>> boot process, I would call this first-stage, second-stage, etc. boot loader.
> 
> It is normally referred to as chainloading ..
> https://wiki.gentoo.org/wiki/GRUB/Chainloading. If RTEMS has been boot loaded
> and is loading another RTEMS exe, bootloader or another OS then I see it has
> chainloading.
> 
>> It is up to the user to deal with memory space collisions. The event handler can
>> check the load address for example. I used this packet processor to run a boot
>> loader in internal flash and load an application to external SDRAM.
> 
> This is a nice and simple but important example as it avoids the reflash to test
> debug cycle. I would like to discuss and understand how chainloading could be
> handled. It may effect this protocol or it may not, I do not yet know. My
> concern as things are here is the distance between this protocol and that use
> case it a long way as no concrete and specific example is being provided and I
> do not know of a host PC piece of code that a user can make use of. Is a tool
> for rtems-tools going to be provided?
> 
> As an OS I think chainloading is something we should handle and provide. A BSP
> should be able to specify how this happens. For fully RAM base systems I think
> it is a kernel and hardware resources shutdown, a relocate type copy and then
> entry. That functionality is beyond most RTEMS users. This whole operation would
> be atomic and could not be broken down into a load and jump. A chainloader
> concept would allow a chainloader packet the BSP can interpret to make the
> chaining happen.
> 
> There are other valid use cases for chainloading on other systems. For example
> loading via RTEMS on custom hardware with MACs in FPGA devices.

 From my point of view this is just a multi-stage boot process. I have a 
module in rtems-central which loads a test using this packet protocol.

> 
>>> A chain loader package in libmisc (or where ever) would be a very nice addition.
>>>
>>> Given this I question the prefixing with T_ for the protocol. Again I see it as
>>> more useful than just testing. :) I am sure testing has been the leading use
>>> case for you but the T_ fingerprint assumes it is just for testing. :)
>>
>> In this case, you can think of 'T' as transfer.
>>
>> [...]
>>
>>>> +static void do_type(T_packet_control* self, uint8_t ch) {
>>>> +  update_crc(self, ch);
>>>> +  self->packet_type = ch;
>>>> +
>>>> +  switch (ch) {
>>>> +    case 'C':
>>>> +      self->packet_done_event = T_PACKET_EVENT_CHANNEL_END;
>>>> +      self->state = T_PACKET_STATE_COLON;
>>>> +      break;
>>>> +    case 'S':
>>>> +      self->packet_done_event = T_PACKET_EVENT_SIGNAL;
>>>> +      self->state = T_PACKET_STATE_COLON;
>>>> +      break;
>>>> +    case 'L':
>>>> +      self->packet_done_event = T_PACKET_EVENT_LOAD_END;
>>>> +      self->state = T_PACKET_STATE_COLON;
>>>> +      break;
>>>> +    case 'J':
>>>> +      self->packet_done_event = T_PACKET_EVENT_JUMP;
>>>> +      self->state = T_PACKET_STATE_COLON;
>>>> +      break;
>>>> +    case 'H':
>>>> +      self->packet_done_event = T_PACKET_EVENT_HELLO;
>>>> +      self->state = T_PACKET_STATE_HASH;
>>>> +      break;
>>>> +    case 'A':
>>>> +      self->packet_done_event = T_PACKET_EVENT_ACKNOWLEDGE;
>>>> +      self->state = T_PACKET_STATE_HASH;
>>>> +      break;
>>>> +    default:
>>>> +      self->packet_done_event = T_PACKET_EVENT_REJECT;
>>>> +      self->state = T_PACKET_STATE_REJECT;
>>>> +      break;
>>>> +  }
>>>> +}
>>>
>>> You have handlers for the time, ie clock_monotonic, so why not one to handle the
>>> payload? My quick review would indicate the code is clean enough to handle this
>>> but I am not sure.
>>
>> There is a handler to deal with the payload. This is the event handler. For
>> example, for a channel receive, the event handler has to set a target address
>> for the data.  In the test code:
>>
>> +    case T_PACKET_EVENT_CHANNEL_BEGIN: {
>> +      output_char(base, 'C');
>> +      T_eq_u64(T_packet_get_channel_number(base), UINT64_C(0x1234567887654321));
>> +      size_t size = T_packet_get_channel_size(base);
>> +
>> +      if (size != 0) {
>> +        T_packet_set_channel_target(base, &self->load_buf[0]);
>> +        T_eq_sz(size, 0xd);
>> +      }
>> +
>> +      break;
>> +    }
>>
>> [...]
>>
> 
> That was not apparent in my reading of the protocol and I am with wondering how
> you imagine I or other users would figure this out? Why not separate the
> layering and clean up the protocol?

The high-level function to work with channels is T_packet_channel_push() 
and T_packet_channel_exchange().

> 
> If I have to add an event or channel or some other hack to allow a RAM base
> chainloading as I described above then I feel load and jump should also use that
> mechanism. If you follow that line of reasoning to its end you will have clear
> layering in the protocol. I suggest you consider this path.

I will work on that.

> 
>>>> +static void idle_processing(T_packet_control* self) {
>>>> +  event(self, T_PACKET_EVENT_NOTHING);
>>>> +
>>>> +  if (self->state != T_PACKET_STATE_START) {
>>>> +    return;
>>>> +  }
>>>> +
>>>> +  T_packet_packet* head = self->snd_head;
>>>> +
>>>> +  if (head == NULL) {
>>>> +    return;
>>>> +  }
>>>> +
>>>> +  uint32_t now = (*self->clock_monotonic)(self);
>>>> +
>>>> +  if (self->snd_pending != NULL) {
>>>> +    if ((self->snd_timeout - now) <= UINT32_MAX / 2) {
>>>> +      return;
>>>> +    }
>>>
>>> The now value is 32bits so not nano-seconds and UINT32_MAX / 2 seems like a long
>>> time? There are no commants about this timeout handling and I would have to
>>> guess what is happening here. I would prefer having comments to guide me.
>>
>> The time is defined by the clock monotonic handler of the user. This code
>> detects an unsigned overflow. I will clarify this code block a bit.
> 
> Then this needs more than a comment. A user defined time with hard coded
> timeouts seems to point to future problems for anyone other than you using this
> code. We use 64bit for timestamps in our APIs so why not here?

The stuff should work right in boot_card(). The timeout is based on the 
transmission time of the sent packet.

-- 
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


More information about the devel mailing list