[RFC] generic CAN/CAN FD susbsytem for RTEMS from scratch - online documentation

Tue Apr 30 06:40:43 UTC 2024

Hello Pavel,

thanks for your explanations.

On 2024-04-29 21:23, Pavel Pisa wrote:
> Dear Christian,
> 
> thanks a lot for finding time to read through documentation.
> 
> On Monday 29 of April 2024 10:56:29 Christian MAUDERER wrote:
>> it's quite a big work. So I've started to read through the
>> documentation to get an overview.
> 
> For others, code under review hosted in CTU university GitLab
> server
> 
>    https://gitlab.fel.cvut.cz/otrees/rtems/rtems-canfd
> 
> Documentation
> 
>    https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/can/can-html/can.html
>    https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/doxygen/html/index.html
> 
> Main developer behind extension to CAN FD and switch to RTEMS
> is Michal Lenc.
> 
> The intention is to (hopefully) reach state when it meets criteria
> to mainlining int RTEMS CPU kit under
> 
>    cpukit/dev/can
> 
> We plan to prefix all public functions by rtems_ prefix as the
> final step after review and acceptance for mainline to not
> pollute applications namespace. User visible structures
> types for IOCTLs and farmes are planned to stay without
> prefix for readability. They can be hidden by not including
> given header file.
> 
>> Some questions:
>> Do I understand that correctly, that the only user-facing interface is
>> via the "/dev/can*" files. There is no separate interface for special
>> operations, right? That's completely OK for me. I just want to make
>> sure that I understand it correctly.
> 
> Yes, it is the goal to use only character device as the only intreface
> to the user application. LinCAN worked in standard POSIX environment
> with MMU etc. and even on RTEMS we consider user and kernel
> space as fully separated and only read, write, IOCTL exchange
> data controlled way.

OK. That's reasonable.

> 
> There are more functions which are intended for drivers developers
> and for registration from BSP.
> 

I noted that the driver developer has a different interface.

>> Chapter "1.1.1.1: Managing Queues": I assume the limitation that
>> RTEMS_CAN_DISCARD_QUEUES removes all queues and not only selected ones
>> is a limitation due to the nature of the ioctl interface or the driver
>> capabilities? It could be a bit of an annoying limitation in an
>> application that dynamically wants to register or unregister queues.
> 
> The single chip can be opened multiple times, each open instance
> created their own canque_ends_user_t
> 
> https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/doxygen/html/structcanque__ends__user__t.html
> 
> This structure allows to connect FIFO queues (canque_edge_t) at any time
> in any direction including echo to itself or some intermediate node
> to send messages to more links for redundancy etc.
>   
> https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/doxygen/html/structcanque__edge__t.html
> 
> That is internal implementation which is intended to be really flexible
> and with minimal locking intervals - only during move to next queue
> during iterations.
> 
> But we are trying to keep external interface reasonably simple, so we consider
> only queues from single user (file-descriptor) to single chip corresponding
> to /dev/can during chosen during open for now.
> 
> The rationale for discard all only is that we if we allow manipulation of
> individual queues we cannot identify queues by pointers. We take as forbidden
> to expose kernel pointer addresses of canque_edge_t. It can be resolved
> by assigning some unique numbers for each edge created from given user
> ends and returning these to user from RTEMS_CAN_CREATE_QUEUE. The initial
> default connection can have ID 0. Then it would be easy iterate over
> all queues going to/from given user and remove only that queue where
> the specified ID matches. Other option is to limit which queues will
> be removed based on CAN ID and mask... but there can be problem to uniquely
> match such queue etc...
> 
> So the compromise is taken. You can remove all queues to given open instance
> and then add queues one by one as you want with controlled CAN ID
> filters and priority.
> 
> If you need to create dynamically some application which needs specific
> CAN IDs and priorities and then you expect that it should not interact
> with network any more, then you can open /dev/canX again and you have
> private queues pool. When the file representation is closed then all
> these queues will be removed. That all without any influence
> to other open instances to same or other CAN device {if we do not
> consider system load).
> 
> I agree, that it is compromise. But adding yet another file descriptor
> like multiplexor for queues to each file descriptor seems to me as
> too much complexity. But it can be added. even later as IOCTL to remove
> individual queues based on CAN ID matches or queues IDs if create
> is modified to return internal queue IDs...
> 

I somehow missed that you can open the device multiple times and get 
independent queues. With that, it's completely OK and should be flexible 
enough for most applications.

It's great that you already have put some thought into how it could be 
extended later if some application needs more flexibility.

>> Chapter "1.1.1.3: Setting Mode": These look quite controller specific.
>> If I want to add a controller that has a new mode: Would I just add a
>> new flag? What happens if we reach 33 flags? Would an array or maybe a
>> structure with a single uint32_t field be a better choice? I assume
>> that if a controller doesn't support a mode, (like setting FD on a
>> non-FD controller), the IOCTL will just return an error?
> 
> I hope that these modes should be mostly controller independent.
> Mode roughly correspond to same definitions found in Linux SocketCAN.
> There is list of our CAN_CTRLMODE_x
> 
> https://gitlab.fel.cvut.cz/otrees/rtems/rtems-canfd/-/blob/master/lib/candrv/dev/can/can.h?ref_type=heads#L98
> 
> May it be we should do review and remove some.
> I.e. CAN_CTRLMODE_3_SAMPLES fits better into bittiming...
> 

OK. Maybe then it would be good to make some notes in which cases it's 
OK to add another flag to avoid that someone adds a lot of hardware 
dependent flags here.

I assume if something is really very special to a specific hardware, the 
driver should get an extra driver specific IOCTL for that, right?

>> Chapter "1.1.3: Frame Transmission" and "1.1.4: Frame Reception": The
>> last argument of "write" is a count. If I see a write, my first guess
>> is that I have to call it like:
>>
>>     struct can_frame frames[10] = {... /* some values */ ...};
>>     ssize_t rv = write( fd, frames, sizeof(frames) );
> 
> This is intended to be used such way but we try to consider even CAN XL
> for future and require to write 2kB block from user space into kernel
> space even for 8 bytes CAN message is too big overhead.
> You can even limit size to store FD frames in application if you know
> that you do not exceed some data size. So it is possible to specify shorter
> size. Same for RX. But we keep one read/write to one CAN message
> correspondence. There has been some advantage of original LinCAN
> driver that more messages could be received or sent by single fort and back
> supervisor mode switch. But with FD and even XL frames in future
> and cheap "system" calls on RTEMS, I consider such optimization
> for multiple messages for single system call as too errors prone
> and complicating things... So one message, one read, write call.
> 

That's OK. Michal Lenc already adapted the documentation to make that 
more clear. Thank you.

>> But from taking a look at the tests in the repository, the count
>> is calculated by can_framelen(). Is it possible to send or receive
>> multiple CAN frames using write or read?
> 
> Not, not with single call
> 
>> Or is it always a single
>> frame?
> 
> Yes
> 
>> What happens if I pass a wrong length?
> 
> If you attempt to send mesage with data len where header+data do not fit
> into write specified size bytes, -EMSGSIZE is returned.
> 
>> Do I send wrong data,
>> crash the system or the whole CAN bus or do I just get an error?
> 
> Only error and nothing is send.
> 
>> Can you make that more clear in the documentation?
> 
> For the read you receive -EMSGSIZE and message is not
> disposed from queue, so you can reallocate buffer
> and attempt read again. This can be again big gain for
> CAN XL... You can receive short messages into some smaller
> user space buffers and XL to others etc...
> 
> I agree that it should be documented more explicitly
> in manual.
> 
>> Some details regarding "struct can_chip":
>>
>> * There is a pointer called "type". I would use a "const char *" for
>>     that. I expect that stuff like names will never change and having
>>     them constant allows to use a pointer to a flash memory area.
> 
> ACK
> 
>> * You have an "int irq". That's not fully compatible to the
>>     rtems_vector_number which is an uint32_t (at the moment).
> 
> OK, we should adjust that to match
> 
>> Regarding the CAN drivers: Do I see it correctly, that currently only
>> a single (real) device is supported (ctucanfd)?
> 
> Yes
> 
>> Did you check with
>> some other hardware controller, whether the whole structures / defines
>> / flags close to the hardware do work well for other controllers too?
> 
> The code/concept is based on my previous LinCAN and OrtCAN work
> 
> https://ortcan.sourceforge.net/lincan/
> 
> Adding SJA1000 support should be easy as the code is part of LinCAN
> and have been used and modified even for NXP LPC controllers.
> Whole infrastructure should be still portable to more systems
> when locking, tasks and interrupts are rewritten. On the other
> hand, we have removed LinCAN system adaptation layer to directly
> use RTEMS and be readable for maintainers.
> 
>> Like the mode flags from 1.1.1.3. It doesn't really matter which other
>> controller. From my own attempts to write a driver stack, I just know
>> that sometimes you make assumptions based on one controller that are
>> hard to implement on another one. Usually it's not even necessary to
>> really add a second controller. Just skimming through the manual can
>> be really helpful. On the other hand, the Doxygen documentation
>> mentions, that the concept is based on LinCAN. So maybe that already
>> helps to avoid that trap?
> 
> I cannot 100% guarantee that all is thought around and around.
> 
> I have experience with XCAN, LPC CAN, SJA1000, OpenCores SJA on NuttX
> in ESP32C3 (we have contributed driver into NuttX mainline). I have
> provided common bittrimming to SocketCAN long ago, Michal Lenc implemented
> SocketCAN NXP imxRT driver for NuttX (mainlined) already. I have
> looked even into Zephyr and some Autosar concepts etc... We have
> LinCAN variant including extension for USB based devices
> and for MCP5200 MSCAN. We have helped Honeywell with correcting
> a little RT behavior of SPI connected MCP2515 horror when they
> used it on Raspberry Pi with Simulink generated code on some
> experimental automotive ECU autotuning system. iMX6 with FlexCAN
> is used in Elektroline.cz, where I provide some consultations.
> 
> I have been architect of rapid prototyping platforms developed
> for Porsche Enginering Services based on TMS570LS3137 and we have
> implemented there TTCAN like synchronous messages exchange there
> with no timing and communication schedule execution dependency on CPU.
> The actual proposed CAN subsystem proposal is not optimized
> for such use, but is is very special and needs intercation
> of application with lower levels or provide API to setup schedule
> and update send data. I can think how to pas that by set of additional
> IOCTLs. But RTEMS need some common CAN/CAN FD stack with
> common functionality found on other systems.
> 
> So all in all I hope I have some idea what could appear in
> CAN controllers ZOO... But yes, we can overlook something
> and that is why I keep Oliver Hartkopp (SocketCAN architect)
> in the loop to be opponent to our decision if he thinks that
> our choices could lead to problems. Even decision/compromise
> if SocketCAN (and LibBSD) dependency or simpler approach should
> be our direction on simpler operating system as RTEMS is, has been
> discussed with him and forwarded to the list
> 
> https://lists.rtems.org/pipermail/devel/2022-June/071972.html
> 
> when Prashanth S has started his GSoC. That effort did not solve
> CAN driver design even that I have provided input into that GSOC
> project how to develop stack in the LinCAN queues direction or other
> reasonably usable FIFOs. The implemented buffers has been mess.
> But I hope that some part of his BeagleBoneBlack D-CAN support
> can be reused for D-CAN implementation based on our effort one day.
> 

I didn't want to doubt your competence. Like I said it's some trap that 
I have fallen into often enough myself (like when guiding Prashanths 
GSoC project). But it's clear that you have put a lot of thought into 
that. So I would expect that there shouldn't be much trouble with most 
controllers. Maybe except for the ones where a semiconductor vendor 
thought it would be a good idea to create a completely different 
concept. But these are always difficult.

>> Regarding the device names "/dev/can0" and similar: Currently that
>> seems to be a fixed scheme, correct?
> 
> Not really, the CAN subsystem provides function
> 
>    int can_bus_register( struct can_bus *bus, const char *bus_path );
> 
> so it is on the user, BSP integrator etc to decide naming scheme.
> Some scheme is not part of the  libcandrv.a. We use simple
> unique numbers generator in the test application during interfaces
> registration
> 
>   Atomic_Uint idx = _Atomic_Fetch_add_uint(
>      &candev_sqn,
>      1,
>      ATOMIC_ORDER_SEQ_CST
>    );
> 
>> In my experience, sometimes it
>> can be useful to use longer names instead. For example
>> "/dev/can_ctufd0". That's especially interesting if a system is
>> initialized dynamically.
> 
> Understand
> 
>> For example if you have a USB to CAN adapter
>> that is enumerated more or less randomly during startup. Is that
>> supported or are there some fixed assumptions that a can device is
>> always called "/dev/canX"?
> 
> The /dev/canX is standard and simplest scheme used on many systems.
> But internals of the project do not prevent to use PCI topological
> address as part of the pah name.
> 

Great. Thanks.

Best regards

Christian

> Best wishes,
> 
>                  Pavel
> 
> PS: Michal Lenc has succeed with this weekend experiment to add support
>      of proposed RTEMS CAN interface into evolving yet toy Rapid Prototyping
>      tool
>        https://github.com/robertobucher/pysimCoder
>      on base of experimental RTEMS target templates.
> 
> PS2: I have tested to build complete OrtCAN CANopen code against proposed
>     CAN subsystem as well. It builds but I need to prepare top level
>     application to link it successfully.
>   
> --
>                  Pavel Pisa
> 
>      phone:      +420 603531357
>      e-mail:     pisa at cmp.felk.cvut.cz
>      Department of Control Engineering FEE CVUT
>      Karlovo namesti 13, 121 35, Prague 2
>      university: http://control.fel.cvut.cz/
>      personal:   http://cmp.felk.cvut.cz/~pisa
>      company:    https://pikron.com/ PiKRON s.r.o.
>      Kankovskeho 1235, 182 00 Praha 8, Czech Republic
>      projects:   https://www.openhub.net/accounts/ppisa
>      social:     https://social.kernel.org/ppisa
>      CAN related:http://canbus.pages.fel.cvut.cz/
>      RISC-V education: https://comparch.edu.cvut.cz/
>      Open Technologies Research Education and Exchange Services
>      https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home

-- 
--------------------------------------------
embedded brains GmbH & Co. KG
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email:  christian.mauderer at embedded-brains.de
phone:  +49-89-18 94 741 - 18
mobile: +49-176-152 206 08

Registergericht: Amtsgericht München
Registernummer: HRA 117265
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/