[PATCH v2 5/5] libtest: Add packet processor

Chris Johns chrisj at rtems.org
Tue Feb 6 23:56:19 UTC 2024


On 5/2/2024 9:13 pm, Sebastian Huber wrote:
> Hello Chris,
> 
> thanks for having a look at it.
> 
> On 02.02.24 00:14, Chris Johns wrote:
>> Hi,
>>
>> Thanks for the updated documentation, protocol and use cases. It has helped. Now
>> that I understand some of the context, I have raised further questions about it
>> that I feel we should resolve. Without a protocol version number being exchanged
>> we are limited in how we can grow and develop this protocol beyond where it is.
>>
>> On 26/1/2024 6:23 am, Sebastian Huber wrote:
>>> The RTEMS Test Framework packet processor provides a simple mechanism to
>>> exchange reliable and in-order data through transmitting and receiving
>>> one character at a time.
>>>
>>> The packet processor does not buffer data.  The processor uses a
>>> stop-and-wait automatic repeat request method. There is at most one packet
>>> in transmission.  The data transfer is done using a single character input
>>> and output method.  The protocol uses 12-bit sequence numbers, so a host
>>> could use a sliding window method to increase throughput.  All integers and
>>> data are base64url encoded.  A 24-bit CRC is used to ensure the data
>>> integrity.  The '{' character starts a packet.  The '}' character terminates
>>> a packet.  The '#' character prefixes a 24-bit CRC value.  The ':' character
>>> separates fields.  The '+' character prefixes data fields.
>>
>> Is the line encoding a subset of the ASCII character set? Is the line discipline
>> raw?
> 
> yes, only ASCII (7-bit) is used. The processor uses a character input/output
> handler. It is up to the user to do the transfer. It could be for example also
> CAN or UDP packets.

I think adding something about this to the documentation will help the
definition of the protocol.
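
For example, a short paragraph plus a sketch like this would make it clear the
transport is user supplied. The handler signature below is my guess at what the
patch expects, so treat it as illustrative only:

  #include <stdint.h>

  /* Illustrative only: a user supplied character output handler that writes
   * each byte to a memory mapped UART transmit register.  The handler
   * signature and register handling are assumptions, not what the patch
   * defines. */
  static void my_put_char(void* arg, uint8_t ch) {
    volatile uint8_t* uart_tx = arg;  /* hypothetical UART TX data register */
    *uart_tx = ch;                    /* real code would wait for TX ready */
  }

The same handler could just as easily append the byte to a CAN frame or a UDP
datagram buffer.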

>> I have reviewed the protocol as it is and I have some suggestions. The data link
>> layer messages are mixed with the payload messages. If the messages followed:
>>
>>   {<12-bit seq><12-bit ack>:<packet>[:[data]]]#<24-bit CRC>}
>>
>> where:
>>
>> <packet> is `H`, `A`, `R` or `P`, where `H`, `A` and `R` are data link packets
>> and `P` for payload can contain `J`, `L`, `S` etc for the test and chain loader
>> protocol. For example:
>>
>>   {<12-bit seq><12-bit ack>:P:J:<64-bit address>#<24-bit CRC>}
> 
> In my proposal, the payload has its own CRC so that you can first validate the
> header before you do something with the payload.

How do you validate the header? Protocols like GFP encode a simple length at the
start of the message to frame it and to allow sync/hunting to resync when errors
happen, which is more robust than pattern matching. GFP is unique in that it byte
aligns bit level streams when implemented in hardware, something that is not
important here.
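
To make the comparison concrete, with pure pattern based framing the only way a
receiver can recover after an error is to hunt for the next start character,
roughly like this (illustrative only, not code from the patch):

  #include <stddef.h>
  #include <stdint.h>

  /* Illustrative only: after an error, discard bytes until a '{' start of
   * packet candidate appears.  A length prefixed header, as in GFP, would
   * let the receiver check a candidate frame before committing to it. */
  static size_t hunt_for_start(const uint8_t* buf, size_t len) {
    for (size_t i = 0; i < len; ++i) {
      if (buf[i] == '{') {
        return i;   /* resume header parsing here */
      }
    }
    return len;     /* no start character seen, keep discarding */
  }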

Is there an assumption the link is of a good quality and error free?

>> Note, `P` could be a protocol id and so `T` could denote "test and chain loader".
>>
>> [:] denotes this is optional depending on payload data being present or it
>> could mean data link vs payload packets, whatever works here.
>>
>>>   The following packets are defined:
>>>
>>> * hello: {<12-bit seq><12-bit ack>:H#<24-bit CRC>}
>>
>> Are you able to have the `H` be a query and return data? A hello that contains
>> protocol versioning data is always useful for long lived protocols. Version
>> numbering should be 1, 2, 3, 4 etc and 0 is reserved.
> 
> Adding a protocol version to the hello message is a good idea.
> 
>>> * acknowledge: {<12-bit seq><12-bit ack>:A#<24-bit CRC>}
>>
>> Does this ack all seq numbers up to this number?
> 
> Yes, it is like in TCP, however, currently only a stop-and-wait automatic repeat
> request method is implemented so that we don't need buffers.

Great and please add this as a note to the doco.
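
Something along these lines would do. The sequence numbers in the comment are
made up and the helper is only there to illustrate the rule:

  #include <stdbool.h>
  #include <stdint.h>

  /* My reading of the stop-and-wait rule: there is at most one
   * unacknowledged packet in flight and the next packet may only be sent
   * once the previous one has been acknowledged, e.g.
   *
   *   target -> host: {seq=5,ack=9:S:...}   one packet in flight
   *   host -> target: {seq=9,ack=5:A...}    everything up to 5 is acked
   *   target -> host: {seq=6,ack=9:...}     only now is the next one sent
   */
  static bool may_send_next(uint16_t last_sent_seq, uint16_t last_acked_seq) {
    return last_sent_seq == last_acked_seq;
  }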

>>> * reject: {<12-bit seq><12-bit ack>
>>>    :R:<12-bit seq of rejected packet>#<24-bit CRC>}
>>
>> Are the `<>` fields binary and so base64 encoded?
> 
> Yes, everything in <> is base64 encoded.

Please add to the doco.
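
A small worked example in the doco would also help. Assuming the 12-bit seq and
ack are packed together into 24 bits and encoded as 4 characters, which is only
my guess at the layout, the encoding looks like this:

  #include <stdint.h>
  #include <stdio.h>

  /* Illustrative only: base64url encode a 24-bit value as 4 characters.
   * The packing of seq and ack below is an assumption, not taken from the
   * patch. */
  static const char b64url[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

  static void encode24(uint32_t value, char out[5]) {
    out[0] = b64url[(value >> 18) & 0x3f];
    out[1] = b64url[(value >> 12) & 0x3f];
    out[2] = b64url[(value >> 6) & 0x3f];
    out[3] = b64url[value & 0x3f];
    out[4] = '\0';
  }

  int main(void) {
    char text[5];
    encode24((UINT32_C(1) << 12) | 0, text);  /* seq = 1, ack = 0 */
    printf("%s\n", text);                     /* prints "ABAA" */
    return 0;
  }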

>>> * jump: {<12-bit seq><12-bit ack>:J:<64-bit address>#<24-bit CRC>}
>>>
>>> * load: {<12-bit seq><12-bit ack>:L:<64-bit load address>:
>>>    <64-bit load size in bytes>#<24-bit CRC>+<data to load>#<24-bit CRC>}
>>>
>>> * signal: {<12-bit seq><12-bit ack>:S:<64-bit signal number>:
>>>    <64-bit signal value>#<24-bit CRC>}
>>
>> Are the signals defined? If so maybe a reference to the enum or whatever?
> 
> No, the signal and channel numbers are user-defined.
> 
>>
>>> * channel: {<12-bit seq><12-bit ack>:C:<64-bit channel number>:
>>>    <64-bit data size in bytes>#<24-bit CRC>+<channel data>#<24-bit CRC>}
>>
>> How would I add a packet type to the protocol? For example a 'D' packet could
>> transport GDB remote protocol for use on targets with limited resources and the
>> need to share this channel.
> 
> You don't need a new packet type for this. You can use a channel. We have a
> 64-bit channel number, so plenty of channels.

Then why not use this for the packets that are specific to bootloading/testing?
Why have this multiplexer and then not use it? It seems the layering is being
mixed.

>> The data link and framework are useful for our project beyond the test use case
>> so it would be good to see if something more useful is within reach. :)
>>
>>> The intended use case are boot loaders and test runners.  For example, test
>>> runners may interface with an external test server performing equipment
>>> handling on request using the packet processor.
>>>
>>> Use T_packet_initialize() to initialize the packet processor.  Use
>>> T_packet_process() to drive the packet processing.  You can enqueue
>>> packets for transmission with T_packet_enqueue().  You can reliably send
>>> signals with T_packet_send().  You can reliably transmit and receive
>>> channel data with T_packet_channel_push() and
>>> T_packet_channel_exchange().
>>>
>>> A simple boot loader could be implemented like this:
>>
>> If this is RTEMS running the loader and then jumping to something loaded, would
>> that be a chain loader and not a boot loader? The use of the term boot loader
>> confused me. Chain loading is tricky because you need to handle the address
>> spaces to avoid clashing or you need to have a way to relocate the loaded data.
>> How is this handled?
> 
> I don't know what a chain loader is. If more than one program is involved in the
> boot process, I would call this first-stage, second-stage, etc. boot loader.

It is normally referred to as chainloading ..
https://wiki.gentoo.org/wiki/GRUB/Chainloading. If RTEMS has been boot loaded
and is loading another RTEMS exe, bootloader or another OS then I see it as
chainloading.

> It is up to the user to deal with memory space collisions. The event handler can
> check the load address for example. I used this packet processor to run a boot
> loader in internal flash and load an application to external SDRAM.

This is a nice and simple but important example as it avoids the reflash-to-test
debug cycle. I would like to discuss and understand how chainloading could be
handled. It may affect this protocol or it may not, I do not yet know. My
concern as things stand is that the distance between this protocol and that use
case is a long way, as no concrete and specific example is being provided and I
do not know of a host PC piece of code that a user can make use of. Is a tool
for rtems-tools going to be provided?

As an OS I think chainloading is something we should handle and provide. A BSP
should be able to specify how this happens. For fully RAM based systems I think
it is a shutdown of the kernel and hardware resources, a relocating copy and
then entry. That functionality is beyond most RTEMS users. This whole operation
would be atomic and could not be broken down into a load and a jump. A
chainloader concept would allow a chainloader packet the BSP can interpret to
make the chaining happen, something like the sketch below.
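
To make that concrete, the kind of BSP hook I have in mind would look roughly
like this. Every name here is invented, nothing like this exists in the patch:

  #include <stddef.h>
  #include <string.h>

  typedef void (*chain_entry)(void);

  /* Sketch only: a hypothetical BSP provided chainload operation.  The BSP
   * knows how to quiesce its hardware and where the copy can safely run
   * from, so the whole sequence stays atomic from the protocol's point of
   * view. */
  static void bsp_chainload(void* dest, const void* image, size_t size) {
    /* 1. shut down the kernel and hardware resources (interrupts, caches,
     *    DMA, ...) */
    /* 2. relocating copy of the image to its final address; in a real
     *    implementation the copy code itself must not be overwritten */
    memcpy(dest, image, size);
    /* 3. enter the new image; this never returns */
    ((chain_entry) dest)();
  }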

There are other valid use cases for chainloading on other systems. For example
loading via RTEMS on custom hardware with MACs in FPGA devices.

>> A chain loader package in libmisc (or where ever) would be a very nice addition.
>>
>> Given this I question the prefixing with T_ for the protocol. Again I see it as
>> more useful than just testing. :) I am sure testing has been the leading use
>> case for you but the T_ fingerprint assumes it is just for testing. :)
> 
> In this case, you can think of 'T' as transfer.
> 
> [...]
> 
>>> +static void do_type(T_packet_control* self, uint8_t ch) {
>>> +  update_crc(self, ch);
>>> +  self->packet_type = ch;
>>> +
>>> +  switch (ch) {
>>> +    case 'C':
>>> +      self->packet_done_event = T_PACKET_EVENT_CHANNEL_END;
>>> +      self->state = T_PACKET_STATE_COLON;
>>> +      break;
>>> +    case 'S':
>>> +      self->packet_done_event = T_PACKET_EVENT_SIGNAL;
>>> +      self->state = T_PACKET_STATE_COLON;
>>> +      break;
>>> +    case 'L':
>>> +      self->packet_done_event = T_PACKET_EVENT_LOAD_END;
>>> +      self->state = T_PACKET_STATE_COLON;
>>> +      break;
>>> +    case 'J':
>>> +      self->packet_done_event = T_PACKET_EVENT_JUMP;
>>> +      self->state = T_PACKET_STATE_COLON;
>>> +      break;
>>> +    case 'H':
>>> +      self->packet_done_event = T_PACKET_EVENT_HELLO;
>>> +      self->state = T_PACKET_STATE_HASH;
>>> +      break;
>>> +    case 'A':
>>> +      self->packet_done_event = T_PACKET_EVENT_ACKNOWLEDGE;
>>> +      self->state = T_PACKET_STATE_HASH;
>>> +      break;
>>> +    default:
>>> +      self->packet_done_event = T_PACKET_EVENT_REJECT;
>>> +      self->state = T_PACKET_STATE_REJECT;
>>> +      break;
>>> +  }
>>> +}
>>
>> You have handlers for the time, ie clock_monotonic, so why not one to handle the
>> payload? My quick review would indicate the code is clean enough to handle this
>> but I am not sure.
> 
> There is a handler to deal with the payload. This is the event handler. For
> example, for a channel receive, the event handler has to set a target address
> for the data.  In the test code:
> 
> +    case T_PACKET_EVENT_CHANNEL_BEGIN: {
> +      output_char(base, 'C');
> +      T_eq_u64(T_packet_get_channel_number(base), UINT64_C(0x1234567887654321));
> +      size_t size = T_packet_get_channel_size(base);
> +
> +      if (size != 0) {
> +        T_packet_set_channel_target(base, &self->load_buf[0]);
> +        T_eq_sz(size, 0xd);
> +      }
> +
> +      break;
> +    }
> 
> [...]
> 

That was not apparent in my reading of the protocol and I am left wondering how
you imagine I or other users would figure this out. Why not separate the
layering and clean up the protocol?

If I have to add an event or channel or some other hack to allow RAM based
chainloading as I described above then I feel load and jump should also use that
mechanism. If you follow that line of reasoning to its end you will have clear
layering in the protocol. I suggest you consider this path.
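
As a sketch of the layering I mean, the loader specific payloads could be
dispatched from the channel event handler instead of having their own packet
types. The channel number and the helper below are made up, only
T_packet_get_channel_number() and T_packet_set_channel_target() are from the
patch:

  #include <stdint.h>
  #include <rtems/test.h>  /* header name assumed */

  #define CHANNEL_LOAD UINT64_C(1)  /* made up channel number */

  /* Sketch only: called from the T_PACKET_EVENT_CHANNEL_BEGIN case of the
   * event handler.  The data link layer stays generic; only the payload
   * layer knows what the channel number means. */
  static void on_channel_begin(T_packet_control* base, void* load_buffer) {
    if (T_packet_get_channel_number(base) == CHANNEL_LOAD) {
      T_packet_set_channel_target(base, load_buffer);
    }
  }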

>>> +static void idle_processing(T_packet_control* self) {
>>> +  event(self, T_PACKET_EVENT_NOTHING);
>>> +
>>> +  if (self->state != T_PACKET_STATE_START) {
>>> +    return;
>>> +  }
>>> +
>>> +  T_packet_packet* head = self->snd_head;
>>> +
>>> +  if (head == NULL) {
>>> +    return;
>>> +  }
>>> +
>>> +  uint32_t now = (*self->clock_monotonic)(self);
>>> +
>>> +  if (self->snd_pending != NULL) {
>>> +    if ((self->snd_timeout - now) <= UINT32_MAX / 2) {
>>> +      return;
>>> +    }
>>
>> The now value is 32 bits so not nano-seconds and UINT32_MAX / 2 seems like a long
>> time? There are no comments about this timeout handling and I would have to
>> guess what is happening here. I would prefer having comments to guide me.
> 
> The time is defined by the clock monotonic handler of the user. This code
> detects an unsigned overflow. I will clarify this code block a bit.

Then this needs more than a comment. A user-defined time with hard-coded
timeouts seems to point to future problems for anyone other than you using this
code. We use 64-bit timestamps in our APIs so why not here?
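
For reference, this is how I read the test, written out to check my
understanding; the patch may intend something different:

  #include <stdbool.h>
  #include <stdint.h>

  /* With unsigned wrap-around arithmetic, "deadline - now" is a small value
   * while the deadline is still ahead of us and becomes a huge value once
   * it has passed.  UINT32_MAX / 2 splits the two halves of the 32-bit
   * circle. */
  static bool deadline_still_ahead(uint32_t deadline, uint32_t now) {
    return (uint32_t) (deadline - now) <= UINT32_MAX / 2;
  }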

Chris

