[PATCH v2 5/5] libtest: Add packet processor

Fri Feb 16 02:26:29 UTC 2024

On 13/2/2024 7:32 pm, Sebastian Huber wrote:
> Hello Chris,
> 
> sorry for the delay.
> 
> On 07.02.24 00:56, Chris Johns wrote:
>> On 5/2/2024 9:13 pm, Sebastian Huber wrote:
>>> Hello Chris,
>>>
>>> thanks for having a look at it.
>>>
>>> On 02.02.24 00:14, Chris Johns wrote:
>>>> Hi,
>>>>
>>>> Thanks for the updated documentation, protocol and use cases. It has helped.
>>>> Now
>>>> I understand some of the context I have raised further questions about it I
>>>> feel
>>>> we should resolve. Without a protocol version number being exchanged it limits
>>>> how we can grow and develop this protocol beyond where it is.
>>>>
>>>> On 26/1/2024 6:23 am, Sebastian Huber wrote:
>>>>> The RTEMS Test Framework packet processor provides a simple mechanism to
>>>>> exchange reliable and in-order data through transmitting and receiving
>>>>> one character at a time.
>>>>>
>>>>> The packet processor does not buffer data.  The processor uses a
>>>>> stop-and-wait automatic repeat request method. There is at most one packet
>>>>> in transmission.  The data transfer is done using a single character input
>>>>> and output method.  The protocol uses 12-bit sequence numbers, so a host
>>>>> could use a sliding window method to increase throughput.  All integers and
>>>>> data are base64url encoded.  A 24-bit CRC is used to ensure the data
>>>>> integrity.  The '{' character starts a packet.  The '}' character terminates
>>>>> a packet.  The '#' character prefixes a 24-bit CRC value.  The ':' character
>>>>> separates fields.  The '+' character prefixes data fields.
>>>>
>>>> Is the line encoding a subset of the ASCII character set? Is the line disciple
>>>> raw?
>>>
>>> yes, only ASCII (7-bit) is used. The processor uses a character input/output
>>> handler. It is up to the user to do the transfer. It could be for example also
>>> CAN or UDP packets.
>>
>> I think adding something about this will help the definition of the protocol.
> 
> Ok, I will add this to the group description.
> 
>>
>>>> I have reviewed the protocol as it is and I have some suggestions. The data
>>>> link
>>>> layer messages are mixed with the pay load messages. If the messages followed:
>>>>
>>>>    {<12-bit seq><12-bit ack>:<packet>[:[data]]]#<24-bit CRC>}
>>>>
>>>> where:
>>>>
>>>> <packet> is `H`, `A`, `R` or `P` are data link packets and `P` for payload can
>>>> contain `J`, `L`, `S` etc for the test and chain loader protocol. For example:
>>>>
>>>>    {<12-bit seq><12-bit ack>:P:J:<64-bit address>#<24-bit CRC>}
>>>
>>> In my proposal, the payload has its own CRC so that you can first validate the
>>> header before you do something with the payload.
>>
>> How do you validate the header? Protocols like GFP have a simple length encoded
>> into the start of the message to frame it and to allow sync/hunting to happen to
>> resync when errors happen and is more robust than pattern matching. GFP is
>> unique because it byte aligns bit level streams when implemented in hardware,
>> something that is not important here.
> 
> The header is validated by the state machine and the 24-bit CRC. The a '{'
> always starts the processing of a new packet.

Oh I had not considered the `{` was an actual character on the line and thought
it was some syntax used to describe the protocol. Seeing it I am reminded of the
MS DAP protocol and it's header ...
https://microsoft.github.io/debug-adapter-protocol/overview and scroll down to
Content Part and Example.

>> Is there an assumption the link is of a good quality and error free?
> 
> No, the link can be noisy. We had for example occasional data loss via an USB
> UART in a VM.

OK, so being able to resync is important. Now I understand `{` is unique and a
starting delimiter it is easier see the recovery process.

>>>> Note, `P` could be a protocol id and so `T` could denote "test and chain
>>>> loader".
>>>>
>>>> [:] denotes this is optional depending on pay load data being present or it
>>>> could mean data link vs payload packets, what ever works here.
>>>>
>>>>>    The following packets are defined:
>>>>>
>>>>> * hello: {<12-bit seq><12-bit ack>:H#<24-bit CRC>}
>>>>
>>>> Are you able to have the `H` be a query and return data? A hello that contains
>>>> protocol versioning data is always useful for long lived protocols. Version
>>>> numbering should be 1, 2, 3, 4 etc and 0 is reserved.
>>>
>>> Adding a protocol version to the hello message is a good idea.
>>>
>>>>> * acknowledge: {<12-bit seq><12-bit ack>:A#<24-bit CRC>}
>>>>
>>>> Does this ack all seq numbers up to this number?
>>>
>>> Yes, it is like in TCP, however, currently only a stop-and-wait automatic repeat
>>> request method is implemented so that we don't need buffers.
>>
>> Great and please add this as a note to the doco.
> 
> I think this is already there.
> 
>>
>>>>> * reject: {<12-bit seq><12-bit ack>
>>>>>     :R:<12-bit seq of rejected packet>#<24-bit CRC>}
>>>>
>>>> Are the `<>` fields binary and so base64 encoded?
>>>
>>> Yes, everything in <> is base64 encoded.
>>
>> Please add to the doco.
> 
> What I have right now:
> 
> "All integers and data are base64url encoded."

Is the base64 wrapped in literal `<` and `>` characters?

>>>>> * jump: {<12-bit seq><12-bit ack>:J:<64-bit address>#<24-bit CRC>}
>>>>>
>>>>> * load: {<12-bit seq><12-bit ack>:L:<64-bit load address>:
>>>>>     <64-bit load size in bytes>#<24-bit CRC>+<data to load>#<24-bit CRC>}
>>>>>
>>>>> * signal: {<12-bit seq><12-bit ack>:S:<64-bit signal number>:
>>>>>     <64-bit signal value>#<24-bit CRC>}
>>>>
>>>> Are the signals defined? If so maybe a reference to the enum or whatever?
>>>
>>> No, the signal and channel numbers are user-defined.
>>>
>>>>
>>>>> * channel: {<12-bit seq><12-bit ack>:C:<64-bit channel number>:
>>>>>     <64-bit data size in bytes>#<24-bit CRC>+<channel data>#<24-bit CRC>}
>>>>
>>>> How would I add a packet type to the protocol? For example a 'D' packet could
>>>> transport GDB remote protocol for use on targets with limited resources and the
>>>> need to share this channel.
>>>
>>> You don't need a new packet type for this. You can use a channel. We have a
>>> 64-bit channel number, so plenty of channels.
>>
>> Then why no use this for the packets that are specific to bootloading/testing?
>> Why have this multiplexer then not use it? It seems there is mixing of layering.
> 
> Yes, it is a mix of layering. I will restructure this to remove the 'L' and 'J'
> special cases.
> 
>>
>>>> The data link and framwwork is useful for our project beyond the test use case
>>>> so it would be good to see if something more useful is within reach. :)
>>>>
>>>>> The intended use case are boot loaders and test runners.  For example, test
>>>>> runners may interface with an external test server performing equipment
>>>>> handling on request using the packet processor.
>>>>>
>>>>> Use T_packet_initialize() to initialize the packet processor.  Use
>>>>> T_packet_process() to drive the packet processing.  You can enqueue
>>>>> packets for transmission with T_packet_enqueue().  You can reliably send
>>>>> signals with T_packet_send().  You can reliably transmit and receive
>>>>> channel data with T_packet_channel_push() and
>>>>> T_packet_channel_exchange().
>>>>>
>>>>> A simple boot loader could be implemented like this:
>>>>
>>>> If this is RTEMS running the loader and then jumping to something loaded would
>>>> be a chain loader and not a boot loader? The use of the term boot loader
>>>> confused me. Chain loading is tricky because you need to handle the address
>>>> spaces to avoid clashing or you need to have a way to relocate the loaded data.
>>>> How is this handled?
>>>
>>> I don't know what a chain loader is. If more than one program is involved in the
>>> boot process, I would call this first-stage, second-stage, etc. boot loader.
>>
>> It is normally referred to as chainloading ..
>> https://wiki.gentoo.org/wiki/GRUB/Chainloading. If RTEMS has been boot loaded
>> and is loading another RTEMS exe, bootloader or another OS then I see it has
>> chainloading.
>>
>>> It is up to the user to deal with memory space collisions. The event handler can
>>> check the load address for example. I used this packet processor to run a boot
>>> loader in internal flash and load an application to external SDRAM.
>>
>> This is a nice and simple but important example as it avoids the reflash to test
>> debug cycle. I would like to discuss and understand how chainloading could be
>> handled. It may effect this protocol or it may not, I do not yet know. My
>> concern as things are here is the distance between this protocol and that use
>> case it a long way as no concrete and specific example is being provided and I
>> do not know of a host PC piece of code that a user can make use of. Is a tool
>> for rtems-tools going to be provided?
>>
>> As an OS I think chainloading is something we should handle and provide. A BSP
>> should be able to specify how this happens. For fully RAM base systems I think
>> it is a kernel and hardware resources shutdown, a relocate type copy and then
>> entry. That functionality is beyond most RTEMS users. This whole operation would
>> be atomic and could not be broken down into a load and jump. A chainloader
>> concept would allow a chainloader packet the BSP can interpret to make the
>> chaining happen.
>>
>> There are other valid use cases for chainloading on other systems. For example
>> loading via RTEMS on custom hardware with MACs in FPGA devices.
> 
> From my point of view this is just a multi-stage boot process.

I have only seen it referred to as chainloading.

> I have a module
> in rtems-central which loads a test using this packet protocol.

Can this be moved to rtems-tools?

>>>> A chain loader package in libmisc (or where ever) would be a very nice
>>>> addition.
>>>>
>>>> Given this I question the prefixing with T_ for the protocol. Again I see it as
>>>> more useful than just testing. :) I am sure testing has been the leading use
>>>> case for you but the T_ fingerprint assumes it is just for testing. :)
>>>
>>> In this case, you can think of 'T' as transfer.
>>>
>>> [...]
>>>
>>>>> +static void do_type(T_packet_control* self, uint8_t ch) {
>>>>> +  update_crc(self, ch);
>>>>> +  self->packet_type = ch;
>>>>> +
>>>>> +  switch (ch) {
>>>>> +    case 'C':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_CHANNEL_END;
>>>>> +      self->state = T_PACKET_STATE_COLON;
>>>>> +      break;
>>>>> +    case 'S':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_SIGNAL;
>>>>> +      self->state = T_PACKET_STATE_COLON;
>>>>> +      break;
>>>>> +    case 'L':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_LOAD_END;
>>>>> +      self->state = T_PACKET_STATE_COLON;
>>>>> +      break;
>>>>> +    case 'J':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_JUMP;
>>>>> +      self->state = T_PACKET_STATE_COLON;
>>>>> +      break;
>>>>> +    case 'H':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_HELLO;
>>>>> +      self->state = T_PACKET_STATE_HASH;
>>>>> +      break;
>>>>> +    case 'A':
>>>>> +      self->packet_done_event = T_PACKET_EVENT_ACKNOWLEDGE;
>>>>> +      self->state = T_PACKET_STATE_HASH;
>>>>> +      break;
>>>>> +    default:
>>>>> +      self->packet_done_event = T_PACKET_EVENT_REJECT;
>>>>> +      self->state = T_PACKET_STATE_REJECT;
>>>>> +      break;
>>>>> +  }
>>>>> +}
>>>>
>>>> You have handlers for the time, ie clock_monotonic, so why not one to handle
>>>> the
>>>> payload? My quick review would indicate the code is clean enough to handle this
>>>> but I am not sure.
>>>
>>> There is a handler to deal with the payload. This is the event handler. For
>>> example, for a channel receive, the event handler has to set a target address
>>> for the data.  In the test code:
>>>
>>> +    case T_PACKET_EVENT_CHANNEL_BEGIN: {
>>> +      output_char(base, 'C');
>>> +      T_eq_u64(T_packet_get_channel_number(base),
>>> UINT64_C(0x1234567887654321));
>>> +      size_t size = T_packet_get_channel_size(base);
>>> +
>>> +      if (size != 0) {
>>> +        T_packet_set_channel_target(base, &self->load_buf[0]);
>>> +        T_eq_sz(size, 0xd);
>>> +      }
>>> +
>>> +      break;
>>> +    }
>>>
>>> [...]
>>>
>>
>> That was not apparent in my reading of the protocol and I am with wondering how
>> you imagine I or other users would figure this out? Why not separate the
>> layering and clean up the protocol?
> 
> The high-level function to work with channels is T_packet_channel_push() and
> T_packet_channel_exchange().
> 
>>
>> If I have to add an event or channel or some other hack to allow a RAM base
>> chainloading as I described above then I feel load and jump should also use that
>> mechanism. If you follow that line of reasoning to its end you will have clear
>> layering in the protocol. I suggest you consider this path.
> 
> I will work on that.
> 
>>
>>>>> +static void idle_processing(T_packet_control* self) {
>>>>> +  event(self, T_PACKET_EVENT_NOTHING);
>>>>> +
>>>>> +  if (self->state != T_PACKET_STATE_START) {
>>>>> +    return;
>>>>> +  }
>>>>> +
>>>>> +  T_packet_packet* head = self->snd_head;
>>>>> +
>>>>> +  if (head == NULL) {
>>>>> +    return;
>>>>> +  }
>>>>> +
>>>>> +  uint32_t now = (*self->clock_monotonic)(self);
>>>>> +
>>>>> +  if (self->snd_pending != NULL) {
>>>>> +    if ((self->snd_timeout - now) <= UINT32_MAX / 2) {
>>>>> +      return;
>>>>> +    }
>>>>
>>>> The now value is 32bits so not nano-seconds and UINT32_MAX / 2 seems like a
>>>> long
>>>> time? There are no commants about this timeout handling and I would have to
>>>> guess what is happening here. I would prefer having comments to guide me.
>>>
>>> The time is defined by the clock monotonic handler of the user. This code
>>> detects an unsigned overflow. I will clarify this code block a bit.
>>
>> Then this needs more than a comment. A user defined time with hard coded
>> timeouts seems to point to future problems for anyone other than you using this
>> code. We use 64bit for timestamps in our APIs so why not here?
> 
> The stuff should work right in boot_card(). The timeout is based on the
> transmission time of the sent packet.

I do not follow how "UINT32_MAX / 2" relates?

Chris