[RFC] generic CAN/CAN FD susbsytem for RTEMS from scratch - online documentation

Christian MAUDERER christian.mauderer at embedded-brains.de
Mon May 13 15:40:22 UTC 2024


Hello Pavel and Michal,

sorry for the late reply. I was on vacation last week.

On 2024-05-06 11:27, Pavel Pisa wrote:
> Dear Christian,
> 
> On Tuesday 30 of April 2024 08:40:43 Christian MAUDERER wrote:
>>> For others, code under review hosted in CTU university GitLab
>>> server
>>>     https://gitlab.fel.cvut.cz/otrees/rtems/rtems-canfd
>>> Documentation
>>>     https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/can/can-html/can.html
>>>     https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/doxygen/html/index.html
>>>
>>> Main developer behind extension to CAN FD and switch to RTEMS
>>> is Michal Lenc.
>>>
>>> The intention is to (hopefully) reach state when it meets criteria
>>> to mainlining int RTEMS CPU kit under
>>>
>>>     cpukit/dev/can
> ...
>>> I agree, that it is compromise. But adding yet another file descriptor
>>> like multiplexor for queues to each file descriptor seems to me as
>>> too much complexity. But it can be added. even later as IOCTL to remove
>>> individual queues based on CAN ID matches or queues IDs if create
>>> is modified to return internal queue IDs...
>>
>> I somehow missed that you can open the device multiple times and get
>> independent queues. With that, it's completely OK and should be flexible
>> enough for most applications.
>>
>> It's great that you already have put some thought into how it could be
>> extended later if some application needs more flexibility.
> ...
>>>> Did you check with
>>>> some other hardware controller, whether the whole structures / defines
>>>> / flags close to the hardware do work well for other controllers too?
>>>
>>> The code/concept is based on my previous LinCAN and OrtCAN work
>>>
>>> https://ortcan.sourceforge.net/lincan/
> ...
>> I didn't want to doubt your competence. Like I said it's some trap that
>> I have fallen into often enough myself (like when guiding Prashanths
>> GSoC project). But it's clear that you have put a lot of thought into
>> that. So I would expect that there shouldn't be much trouble with most
>> controllers. Maybe except for the ones where a semiconductor vendor
>> thought it would be a good idea to create a completely different
>> concept. But these are always difficult.
> 
> I agree with discussion and searching for hard arguments.
> 
> The solution is compromise and in general CAN bus concept
> is optimized for direct replacement of wires in car
> going between distinc units and its use as general
> communication solution has some difficulties and requires
> some compromises.
> 
> For small devices with predefined purpose and Autosar,
> it is ideal to allocate for each CAN ID (wire signal)
> to be sent one communication object on the controller.
> Same for each received signal value or their set in the
> single frame. The most controllers are equipped by filters
> and mechanism to do so including selection of the
> Tx message object for physical bus-link arbitration
> according to the priority. Then sending side updates
> signal value in corresponding Tx object and receiving
> side sees most actual one usually on the best effort basis,
> older unread frames are overwritten by updated value.
> 
> But even in simple ECU, there are obstacles to use this
> principle in all kind of the communication. CAN bus is used
> for firmware updates and general configuration. In this
> case, the reliable delivery of all messages with given
> CAN ID is required because whole sequence has to be
> received and processed and the state evolution is associated
> to the sequence. If a single message is lost, then all
> data are unusable. Because sequence requires exact ordering
> it is typical that only single Tx object is used. On Rx
> side there can be problem to capture all frames without
> overwrite by single Rx object so some controllers ad FIFO
> which can be attached to each object or some mechanism
> how to allocate more Rx objects and pass them to the user
> in FIFO order.
> 
> That works for small ECUs with single purpose firmware.
> But on general purpose operating system which should
> allow even complete monitoring of the CAN bus, allows
> dynamically started applications and even whole virtual
> CAN/CANopen nodes, allocation the controller Tx/Rx message
> objects for each specific purpose is impossible.
> 
> That is why all generic CAN subsystems which I know
> (CAN4Linux, LinCAN, SockteCAN, NuttX char device CAN,
> windows Peak's drivers etc.) define API based on
> opening driver and presenting received messages
> in FIFO order to application (with options for software
> filtering but usually not propagated to controller,
> HW - LinCAN has some option to union user FIFOs to
> mask and ID propagated to HW, but you usually end with
> fully end with need to receve all anyway and it has not been
> used at the end). The Tx FIFO order is required for messages
> with same ID or even sometimes between same stream of mesages
> even wit altering ID for correct realization of some higher
> level protocols.
> 
> The result is that even on hardware equipped with multiple
> Tx objects but without special Tx FIFO order preserving
> cyclic queue only single Tx object is used to realize
> transmission of all messages, for example SocketCAN on
> XCAN controller. So only part of the CAN bus media
> badwidth can be utilized by single node. May it be, it is sometimes
> a luck, because CAN IDs are not correctly allocated according
> to priority even on cars critical subsystems. On the Rx side original
> buffers approach is hard to use in order preserving FIFO concept,
> but the most of today controllers add some option to keep order
> and leave processing and distribution on software side.
> See evolution from CCAN to DCAN to overcome that problem.
> We have even made LinCAN for CCAN many many years ago
> which somehow kept required properties but it was headache.
> 
> So back to generic OS can interfaces, all I know are FIFO(s)
> based. Most of them keep strict FIFO order on Tx side
> which results in HoL (head-of-the-line) blocking and priority
> inversion on bus loaded by middle priority from other node.
> 
> That is why SocketCAN adds alloc_candev_mqs (multiple-queues) alternative
> for drivers
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/net/can/dev/dev.c#L249
> 
> but as I know, no mainline kernel driver is using that.
> We have done some work to research and even a little extend
> Linux networking QoS subsystem to solve buffer bloat by old
> messages for traffic requiring best effort (most up to date
> data for control) for given IDs and to limit badwidth
> of others or virtual guests connected through QEMU to
> physical bus etc. may years ago at time when multi-queue
> has not been available on Linux side. I have long time plan
> to extend CTU CAN FD mainline Linux driver for this support
> and probably to be the first example how to overcome HoL/priority
> inversion in Linux CAN subsystem. It has been planned in original
> LinCAN before SoketCAN and it is now implemented in proposed
> RTEMS CAN/FD framework where application can setup multiple
> queues even for single open instance with different Tx priority
> class and when used and mapped correctly to CAN IDs, it can
> prevent priority inversion. It is not generic, because it is
> quite expensive for deeper FIFOs and even mutual order of
> Tx messages has to be preserved for many protocols as discussed
> earlier. CTU CAN FD IP core interface to software has been architected
> by me to allow maximal utilization of the Tx buffers and their
> reallocation when needed for higher priority message.
> Wait for DTP processing and publication of our international CAN
> Conference 2024 article or come and meet next week in Baden-Baden
> 
>    https://www.can-cia.org/icc/
> 
> There are two branches of the thought from this point
> 
> 1) how it maps to other controllers
> 
> For these equipped by single Tx object only (i.e. SJA1000),
> it maps well because attempt to repeat Tx and arbitration
> can be disabled when higher priority queue becomes ready
> and our CAN infrastructure allows to push back lower
> priority message and schedule higher one to be sent.
> 
> For more complex one, if they do not allow to control Tx objects
> order then only single Tx object can be used. Bad, link underutilization,
> but it is what is standard in SocketCAN and other CAN solutions
> for general purpose operating systems today. All controllers
> which I know allows to stop Tx attempt repeat and I hope to
> seen at all option t check if the latest attempt has been
> successful or not. So newt RTEMS CAN can use them same
> as on SJA1000. On Rx side, most have FIFO preserving
> option to use multiple buffers. Sometimes partially
> broken, burdened by erratas etc. (like iMX RT where
> we overcome these problems in NuttX drivers).
> When number of Tx priority classes is limited (for proposed
> system by default 3 but compile time configurable) then
> we can allocate one Tx buffer for each class, easy and
> preserves HoL priority inversion even on simple controllers.
> If there is option to order Tx according to the buffer
> index in the controller, then there is option for a little
> more performant solution when multiple Tx buffers are allocated
> for each class and they are sequentially filled till highest
> allocated buffer index is filled. Then there is some gap till
> all these buffers in given priority are sent because
> cyclic filling of the minimal index would result in reordering
> with possible break of some protocol requirements.
> Some controllers allows to attach DMA realized FIFOs to more
> Tx objects, in such case it would map to proposed design well
> too. Some newer controllers adds local priority bits above
> CAN ID ones (i.e. new NXP FlexCAN). This could allow cyclic
> use of some Tx objects/buffers similar to CTU CAN FD.
> There will be problems because multiple Tx buffers priorities
> are not reachable by single atomic operation like in CTU CAN FD
> case. But I have some idea how to implement sequential
> updates to ensure order in the class. There would be problem,
> that most controllers do not allow to update this information
> on the objects participating actively in arbitration. So it would
> lead to much more acrobation between eggs and some gap time,
> where none message is offered in the link arbitration even that
> there are pending user requests will be inevitable in some
> scenarios after some number of messages sent. That cannot
> be on the bus side worse that considering fixed order according
> to index. May be, it can be found that overhead does not worth
> that. But we preserve API in variants in all cases...
> 
> 2) use of the CAN bus in applications requiring maximal bus
> transparency with minimal latency and SW load. This is
> totally opposite of the general CAN bus subsystem for
> general purpose RTOS. The API in this case should allocated
> Tx and Rx controller objects for the individual purposes/CAN IDs.
> Rx side SW processing can be considered as alternative and proposed
> framework allows to setup queues, but it has overhead and under
> extreme load it can lost some messages if HW is not performant enough.
> On Tx side it is even more problematic.
> 
> But if this type of use of RTEMS for example for Autosar or Simulink
> generated code is considered then it is possible to extend actual
> proposed API by IOCTLs which allows to reserve some controller
> objects for specific purposes and allows to access them directly
> for minimal overhead and use under direct application control or attach
> separated controller side "canque_ends_dev_t" to such objects and
> propagate them to some clients to standard CAN read and write API.
> 
> So I think that the proposed framework provides what is expected
> bu most of general purpose CAN/CAN FD framework users, tries to
> perpare a little even for come of CAN XL, solves problems which
> may be practically unsolved by all other generic approaches still.
> And we have some clue how to extend support for most/all other
> controllers and even some open doors to offer even ECU style
> API for applications which benefit from direct controller
> buffers use/allocation which is possible on controllers
> with abundant number of buffers (not case of SJA1000
> and very limited on CTU CAN FD - max 8 can be configured
> to silicon under actual registers map).
> 
> I understand that the text is long but you have asked for
> it in the fact and I provide complete thought dump
> to analyes it.

Thanks for the (very) detailed explanation. My intention was to express 
that I'm completely OK with only one driver because you clearly have 
thought about other hardware too. Your explanation just makes it even 
more clear how much thought you put into it ;)

> 
> I would be happy if you and or others find time to look
> into actual code implementation to identify what could
> be issue for mainlining as soon as possible because
> after May 24 changes do not propagate into Michal Lenc's
> thesis text which can be alternative and more in depth
> documentation and analysis than what fits into official
> RTEMS one. The full document has already 47 pages and
> 34 of the actual text without content and appendices.
> Document includes benchmarks under RTEMS load by HTTP
> traffic, priority inversion prevention confirmation
> by measurements with performance data etc.
> It will be published on CTU in May or June
>    https://dspace.cvut.cz/
> and links will be added to
>    https://canbus.pages.fel.cvut.cz/
> same as for much shorter iCC article and presentation.
> 

Code review without patches or a review system is always a bit more 
effort because there is nothing to add comments directly. It seems that 
I can't register on the gitlab instance that you provided. So let's try 
it here.

I'll mainly take a look at the headers because they define the 
interface. That's the most important part if you ask me. Bugs in the 
code should be fixable later.

I'll try to categorize my comments a bit. If it has a *Style* or *Typo* 
in front of it, you can just ignore it. It's not really important. It's 
just something that I noted while reading through the code. *Question* 
or *Note* are more important.

And please note: You know CAN a lot better than me. So quite possible 
that I don't see a lot of stuff and that I might have some odd odd 
questions.


### First the ones that I plan to more or less ignore so that I can 
concentrate on the important parts:

Test or demo apps. So most likely not relevant for a review:

./rtems_can_test/can_test.h
./rtems_can_test/app_def.h
./rtems_can_test/system.h
./rtems_can_test/can_register.h
./zynq_template/app_def.h
./zynq_template/zynq_reg.h
./zynq_template/system.h

Seems to be left over from some tests:

./lib/libbar/bar.h

Driver specific files. I think these are not that high priority either:

./lib/candrv/ctucanfd/ctucanfd_txb.h
./lib/candrv/ctucanfd/ctucanfd_kframe.h
./lib/candrv/ctucanfd/ctucanfd_kregs.h
./lib/candrv/dev/can/ctucanfd.h


### Now the more important ones: The interfaces

#### ./lib/candrv/dev/can/can.h

*Style*: I would suggest to group defines a bit more. You already used 
prefixes like RTEMS_CAN_QUEUE_* which is great. You can improve that a 
bit more if you use Doxygen "@name" and "@{" ... "@}". For an example 
take a look at

https://gitlab.rtems.org/rtems/rtos/rtems/-/blob/main/cpukit/include/dev/i2c/i2c.h?ref_type=heads#L80

Which leads to a group in the doxygen output:

https://docs.rtems.org/doxygen/branches/master/cpukit_2include_2dev_2i2c_2i2c_8h.html

The same is true for some other defines in other files. I won't mention 
it every time.


*Question*: Why do you prefix some defines with RTEMS (like 
RTEMS_CAN_CHIP_MODE) and others don't have that prefix (like 
CAN_CTRLMODE_LOOPBACK)? The same is true for some other defines in other 
files. I won't mention it every time.


*Style*: Sometimes you use Doxygen @brief. Sometimes \ref. I think it 
works, but it's a bit odd.


*Question*: You have ioctls like RTEMS_CAN_DISCARD_QUEUES. According to 
the description, that ioctl has a parameter. Why is it an _IO and not an 
_IOW? The same is true for some more of the _IO ioctls.


*Typo*: The description of RTEMS_CAN_GET_BITTIMING has a "geets" in it's 
comment.


*Note*: struct can_chip_ops doesn't have a description. I'm not entirely 
sure what every member should do. For example: check_bittiming: From the 
name I would expect that it only checks a bit timing. From the 
ctucanfd.c it also sets a bit timing. I think it would be good if you 
would add short descriptions here like "Check and set a bittiming. 
Returns 0 on success or negated errno on error."


*Detail*: can_bus_register(...) and can_bitrate2bittiming are missing a 
description. If they are a public interface, it would be good if they 
would have one.


#### ./lib/candrv/dev/can/can-frame.h

*Detail*: Description of the can_frame_header. You have field 
descriptions like "This member holds the CAN ID value". My first thought 
was that it is some kind of address. But with taking a look at the code, 
it seems that it is a bit mask that is combined out of CAN_ERR_ID_* 
defines. If a field is expected to contain a certain group of defines: 
Can you add a note regarding that?


*Style*: You have a group of defines like CAN_ERR_*_DATA_BYTE. On the 
first glance, I thought it would be the same group as CAN_ERR_ID_*. I 
think I would have used CAN_ERR_DATA_BYTE_* instead. Of course that's a 
style question and there are always good arguments for any style. Again: 
A doxygen group using @{ and @} might achieve the same.


*Typo*: You have a CAN_ERR_PROT_LOC_DARA_BYTE instead of ..._DATA_BYTE.


*Question*: There are a lot of defines in can-frame.h like 
CAN_ERR_PROT_LOC_DATA or CAN_ERR_TRX_UNSPEC. Is it clear for someone 
more used to CAN, how to use these? Or would a description like 
"Possible values for the Byte xyz in a can message" help?


#### ./lib/candrv/dev/can/can-filter.h

*Detail*: Again: I'm not happy with the descriptions of the fields of 
the structure. A field "flags" that is described as "This member holds 
CAN flags" isn't really helpful. Which values can I assign to that 
field? Is it a bit mask? Is it a field defined according to some 
standard? In that case even a "Holds standard CAN flags" would be useful 
because then I know that I just have to take a look at any CAN 
documentation.


#### ./lib/candrv/dev/can/can-devcommon.h

OK.


#### ./lib/candrv/dev/can/can-helpers.h

*Note*: I don't like global defines like MAX_MSGOBJS without a prefix. 
That's polluting the name space. Is there a reason that it doesn't have 
the CAN or RTEMS_CAN prefix like all other defines?

Similar: There are defines like "BIT". Is there a reason for using such 
a generic name? If it (for example) helps porting existing drivers from 
another stack, that's great. Otherwise, I don't like these names.

Some more in this file are: len2dlc, GENMASK, FIELD_PREP, FIELD_GET


Now I'm running out of time. I'll try to take a look at the following 
files later or tomorrow:

#### ./lib/candrv/dev/can/can-queue.h
#### ./lib/candrv/dev/can/can-stats.h
#### ./lib/candrv/dev/can/can-virtual.h
#### ./lib/candrv/dev/can/can-bittiming.h

Best regards

Christian

> Best wishes,
> 
>                  Pavel
> --
>                  Pavel Pisa
> 
>      phone:      +420 603531357
>      e-mail:     pisa at cmp.felk.cvut.cz
>      Department of Control Engineering FEE CVUT
>      Karlovo namesti 13, 121 35, Prague 2
>      university: http://control.fel.cvut.cz/
>      personal:   http://cmp.felk.cvut.cz/~pisa
>      company:    https://pikron.com/ PiKRON s.r.o.
>      Kankovskeho 1235, 182 00 Praha 8, Czech Republic
>      projects:   https://www.openhub.net/accounts/ppisa
>      social:     https://social.kernel.org/ppisa
>      CAN related:http://canbus.pages.fel.cvut.cz/
>      RISC-V education: https://comparch.edu.cvut.cz/
>      Open Technologies Research Education and Exchange Services
>      https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home

-- 
--------------------------------------------
embedded brains GmbH & Co. KG
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email:  christian.mauderer at embedded-brains.de
phone:  +49-89-18 94 741 - 18
mobile: +49-176-152 206 08

Registergericht: Amtsgericht München
Registernummer: HRA 117265
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


More information about the devel mailing list