Discussion: How to handle HALs, SDKs and libraries

Christian MAUDERER christian.mauderer at embedded-brains.de
Thu May 25 07:29:15 UTC 2023


On 2023-05-25 01:57, Chris Johns wrote:
> On 24/5/2023 5:07 pm, Christian MAUDERER wrote:
>> Hello Chris,
>>
>> On 2023-05-24 03:44, Chris Johns wrote:
>>> Hi Christian,
>>>
>>> Thanks for raising this topic. It is a tough one.
>>>
>>> On 24/5/2023 12:11 am, Kinsey Moore wrote:
>>>> On Tue, May 23, 2023 at 2:26 AM Christian MAUDERER
>>>> <christian.mauderer at embedded-brains.de
>>>> <mailto:christian.mauderer at embedded-brains.de>> wrote:
>>>>
>>>>       Hello,
>>>>
>>>>       I recently updated the HAL in the i.MXRT BSP. I used the same approach
>>>>       that we use for a lot of similar cases: Import the sources into RTEMS
>>>>       and adapt them slightly so that they work for us. So basically a
>>>>       Clone-and-Own approach.
>>>>
>>>>       During the discussion of the patches, some concerns were raised about
>>>>       whether we should find a better solution to handle HALs, SDKs and similar
>>>>       cases.
>>>>       We should start discussing a solution that can be used after the 6
>>>>       release so that maybe someone can start to work on a prototype.
>>>>
>>>>       Some example cases are:
>>>>
>>>>       - the mcux_sdk in the imxrt BSP
>>>>       - the hal in the stm32h7 BSP
>>>>       - general ARM CMSIS files
>>>>       - zlib
>>>>       - libfdt
>>>>
>>>>       One solution could be to build these libraries externally and only link
>>>>       RTEMS against them. There are disadvantages to this approach:
>>>>
>>>>       - Even though, in my experience, the APIs of the HALs / SDKs / libraries
>>>>       seem to be quite stable, it's possible that there are combinations where
>>>>       some unexpected change breaks a driver or makes it impossible to link the
>>>>       applications.
>>>
>>> Xilinx, with the more complex devices like the Versal, has been moving things
>>> about. The Versal SMC call set is fluid and the PM (platform manager) seems to
>>> functionally align with Xilinx tools releases plus Petalinux versions. For
>>> example, there are stable, defined API calls in Versal Linux (XRT/zocl) that
>>> depend on PM code that is commented in the code as "to be removed".
>>>
>>> When I first used the Zynq I used Xilinx's drivers, like OAR is currently doing
>>> with the MicroBlaze. I could not release the results because of the license at
>>> the time. I quickly found the drivers lacked functionality for general use and
>>> broke under high loads and boundary conditions. The fixes are part of a project
>>> and cannot be released because the license at the time made it impossible. What
>>> I learnt from the exercise is not to depend on their drivers.
>>
>> That sounds like quite a bad case. So it's a good example for this discussion.
>> Thanks for bringing it up.
> 
> I view the repo as open but not open source ... if that sentence makes sense?
> 

I think I understand what you mean. But it's still a good example for 
the discussion. If a solution theoretically works with that case, it 
should work with a lot of other cases too.

>>
>>>
>>> I feel what we consider stable will depend on the origin of the code, and that
>>> will be case by case.
>>
>> Agreed.
>>
>>>
>>>>       - BSPs rely on basic drivers from these libraries (like console or clock
>>>>       driver). If we link against the libraries, the testsuite wouldn't build
>>>>       any more without preinstalled libraries.
>>>
>>> Yes, the mutual dependence, if built externally and before RTEMS, is not easy to
>>> solve. The idea of the HAL code being supplied as .h files and a .a does let a
>>> user update the drivers without needing an RTEMS version update.
>>>
>>>>       Another solution could be to include libraries like that as submodules
>>>>       and build them using the RTEMS build system. We could clone the repos
>>>>       onto the RTEMS git server and add the necessary patches. An advantage
>>>>       would be that it is more similar to the process that we currently have.
>>>>       Another advantage is that we have a known-working version of the files.
>>>>       Upstream updates could either be merged or we could rebase our patches
>>>>       onto a new version.
>>>
>>> See below for the problems this creates.
>>>
>>>>       From my point of view, the second option would be the better one,
>>>>       especially because we have a tested, fixed version of the library
>>>>       instead of telling the user to just use some random version that might
>>>>       or might not work.
>>>
>>> This is important. We need to define what a release is, and it is a requirement
>>> that we provide all code as tarball files. This implies the release process
>>> knows how to create the tarfiles.
>>>
>>>>       Regardless of which approach we use: We have to think about how to
>>>>       handle that in releases. In the link approach (first case), we have to
>>>>       somehow archive source tarballs and some kind of build recipe. In the
>>>>       submodule approach, we could check out all submodules and pack the files
>>>>       into the RTEMS release tarball. So I would expect that the second
>>>>       approach has less impact here too.
>>>>
>>>>       Comments? Improvements? Better suggestions?
>>>>
>>>> I would definitely prefer the submodule approach over the linking approach to
>>>> avoid the test issues since some of these HALs bring core functionality. The
>>>> Xilinx driver framework (embeddedsw repo on GitHub) would be well-suited to the
>>>> submodule approach since it is already broken out into the shared driver space
>>>> because it can apply to at least 3 architectures (ARM, AArch64, MicroBlaze).
>>>
>>> I suggest you avoid making that repo a submodule of anything. The code in that
>>> repo is "over the wall" and there is no continuity. I have it as a submodule in
>>> my XRT repo and a Xilinx push of the next release of tools broke the code. What
>>> I had depended on was removed and moved somewhere else. The Xilinx updates are
>>> based on the release cycle of their tools and they do not respond to issues or
>>> PRs. They are free to make whatever changes they like, and they do that
>>> internally; what appears externally is based on changes across their internal
>>> repos. To make things harder, there is no consistent point at which they update
>>> these public repos, so the code they removed did not reappear for a long time.
>>>
>>>> One issue with either approach is the need to modify the HAL source to suit
>>>> RTEMS. As far as I'm aware, there is no tooling in place in git for applying
>>>> patches to submodules and in the external build scenario we'd end up maintaining
>>>> a branch of the origin repo with patches applied. Upstreaming the changes would
>>>> be ideal, but I wouldn't expect them to accept RTEMS-specific patches. The
>>>> Xilinx NAND driver already requires a minor modification because that driver
>>>> doesn't expose an option and instead has a defined macro that determines how
>>>> many chip selects are usable to address different parts of the NAND chip.
>>>> Technically, this particular change could be worked around with some include
>>>> path trickery to leave the original sources unmodified, but many other changes
>>>> would not be suited to that type of workaround, and it would make the source less
>>>> maintainable. We would need to come up with our own tooling for submodule patch
>>>> application and silencing of warnings about dirty submodule trees due to applied
>>>> patches.
>>>
>>> Direct dependence on external repos we do not control is a long-term maintenance
>>> problem. Repos move and change [1] and this makes maintaining past releases a
>>> challenge. Who is responsible for the long-term release branch maintenance?
>>> Without a working submodule a release cannot be made, and that is not great.
>>> Expecting the release manager to clean up is not going to work given the task is
>>> unfunded.
>>
>> Let's make the dependencies indirect: We clone repos to git.rtems.org and to our
>> mirrors. Then we can either use a submodule URL starting with
>> git://git.rtems.org or even a relative URL if we want to make better use of
>> the mirrors.
>>
>> If necessary, that approach allows adding an RTEMS branch that adds patches.
>> It's more similar to the clone-and-own we do now. But having a clone of the
>> original repo makes it a lot simpler to merge upstream changes. Having an
>> RTEMS branch makes it easier to see what has been changed for RTEMS.
> 
> Separate repos have the advantage of allowing per-repo rules for maintenance and
> ownership, and I like that. It would map nicely into GitLab.
> 
> 
>> We don't have to integrate automatic updates or similar. We only maintain and
>> keep a tested version. If a BSP maintainer or user wants to upgrade, he pulls
>> the changes from the upstream repo and merges them into the branch that includes
>> our patches.
>>
>> That should even work for your extreme case of the Xilinx repo. We have a tested
>> version on our server. If someone wants to update, he has to do the update, find
>> out what Xilinx broke during their updates and adapt to that. Then we can push
>> that new version to our clone of the Xilinx repo.
> 
> If we take a repo into git.rtems.org, the project is undertaking long-term
> maintenance of that code base. Is this what we want if we are using small pieces
> of it?

I don't think that we have to maintain the entire code base. There are a 
lot of unmaintained clones of a lot of software out there. Usually the 
last commits in a repo clearly show whether it hasn't been touched for 5 
years or whether it is actively maintained.

> 
> I am still not sure what role this code base is performing. I know we need code
> and/or drivers to boot RTEMS, e.g. x86 and EFI, which has to be in rtems.git so
> the tests link and run. Such a list would be SMP core start-up, timer, MMU and
> console. What other drivers are being added?

I mentioned libraries in my original mail too. I think libfdt and zlib 
would be two candidates. libfdt is needed by basic drivers in some BSPs 
too.

> 
>>> Allowing submodules in rtems.git is a change in policy. We allow submodules in add-on
>>> packages like libbsd but it has never been something we have allowed with
>>> rtems.git.
>>
>> I agree that it would be a change in policy. But that's the whole point of the
>> discussion:
> 
> Great. I made the statement to make sure we all understand this. :)
> 

Thanks for highlighting that point.

>> The current method makes it hard to maintain library code. Do we
>> find a better solution that fits the current policies, or do we find
>> sensible adaptations to the policies that are OK for everyone?
> 
> Yes, it is not great. Please understand my question is to make sure we understand
> what we take on with whatever path we take.
> 

OK.

>> I don't see submodules as the only valid solution. But it's one that looks
>> promising to me, and therefore I brought it up. It is similar to the approach
>> that has worked well in libbsd. What I currently suggest only tries to avoid the
>> step of copying code between the upstream repo and the local one like we do in
>> libbsd.
> 
> Submodules could be made to work. We need to understand some of the issues they
> bring first. For example, we would need to manage the ecosystem side. One example
> is only needing a submodule if the related BSP or BSP family is being built.
> Another is how we manage the submodule initialisation so users do not end up
> running custom commands which break if a BSP or arch is removed from the tree in
> a future release.
> 

Good point. You are right that we should think about how we want to 
initialize the submodules. From a user's point of view, I only want to 
download the necessary code. So I want to initialize as few submodules 
as possible. That can be tricky.

Maybe we can teach waf to handle that. But that will add a difference 
between development branches and releases.
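
Just to illustrate what I have in mind (the submodule path and the idea 
that waf runs the command are only assumptions on my side, not a 
proposal for the actual layout):

   # Clone without any HAL submodules:
   git clone git://git.rtems.org/rtems.git
   cd rtems

   # A user who builds only one BSP family initializes only its submodule:
   git submodule update --init bsps/arm/imxrt/mcux-sdk

   # Or waf could run the equivalent command itself during
   # "./waf configure" for the selected BSPs.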

> We should consider merge requests. How does a merge request for a submodule get
> checked to make sure it does not break rtems.git? Is it possible to check this
> at the submodule repo level?
> 

I'm not sure whether I understand you correctly here.

Usually if you update a submodule, you have to create a merge request in 
two repos: in the submodule itself and in the main repo that uses the 
submodule.

I think your question is about how we check a merge request in the 
submodule before we merge it, correct?

To be honest: I don't have a good answer for that. Most likely it will 
be similar to what we do now: apply patches locally, build the BSP and 
check whether it works. The difference is that we have to apply the 
patches to multiple repos.

When merging, we have to first merge the submodule patches, then most 
likely ask the patch creator to update the patches for RTEMS or update 
them ourselves (the SHA1 of the submodule will change if we add a merge 
commit), and then merge the patches in RTEMS.
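
Roughly, the second step would then just bump the submodule reference in 
rtems.git. With the same made-up path as above it could look like this:

   cd bsps/arm/imxrt/mcux-sdk
   git fetch origin
   git checkout <commit on the RTEMS branch after the merge>
   cd -
   git add bsps/arm/imxrt/mcux-sdk
   git commit -m "bsps/imxrt: Update mcux-sdk HAL"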

> FYI, a release checks a repo for submodules and, if present, gets that code and
> merges it into the master repo source to make a complete source package. The git
> archive command does not include submodules. The rtems.git release tarfile will
> be the sum of all submodule repos. Is this something we need to consider if
> these repos are large?
> 

Yes it is. It will mean that the release tarballs will grow quite a lot.

If we use waf to build the sources from the submodules (which is 
currently what I expect will be the case), we could use waf to copy only 
the files that are used. That would shrink the size again.

The disadvantage is that we might miss adding included files to the 
build system files. That will work well as long as the complete 
submodule is checked out, but it will break during the release.
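
For reference, I assume the complete-tree case you describe would be 
something like the following (just a rough sketch, the version number is 
made up):

   mkdir -p /tmp/release
   git archive --prefix=rtems-7.1/ HEAD | tar -x -C /tmp/release
   git submodule foreach \
       'git archive --prefix=rtems-7.1/$sm_path/ HEAD | tar -x -C /tmp/release'
   tar -czf rtems-7.1.tar.gz -C /tmp/release rtems-7.1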

>>
>> Do you have a good alternative idea that would need less changes in policy?
>>
> 
> No I do not have good alternatives. I feel what we end up with will be a
> compromise of some form. There is no perfect solution with an open project like
> we have.
> 
> Could we step away from submodules being in rtems.git and maybe there is a BSP
> option that points to a source tree? I have no idea if the build system could
> make this work. We would host the repo for that source tree on git.rtems.org and
> control it so users have a simple means to find the repo and the version they
> need. An advantage is the "driver" repo can be updated independently of RTEMS
> and for some users that may be a good thing. For example, a long life-cycle
> project is stable on a version of RTEMS but the drivers need bug fixes. The
> downside is the extra steps needed to get the code and to set it up. It is an
> alternative but not a good one. :)

Just to make sure that I understand that correctly:

A BSP uses for example

   https://git.rtems.org/hal/foo-hal/plain/driver.c

The build system would note that this file hasn't been downloaded yet 
and download it before using it. Is that correct?

You say that an advantage would be that I can update driver.c 
independently of RTEMS. I would see that as a big and problematic 
disadvantage:

If I update my foo-hal to a version 2 that changes something in the 
interface and that needs adaptations in RTEMS, that will automatically 
break old RTEMS versions.

Even if the interface doesn't change and only some bug is fixed, I can't 
build one RTEMS BSP that has the bug and one that doesn't. The bug will 
just magically vanish. I think as a user I would have a very hard time 
figuring out what happened and why I can't reproduce the bug that was 
there yesterday.

So in my opinion, a commit in RTEMS should always use fixed versions of 
the files in the HAL / library. We could also use URLs with a fixed ID 
like:

   https://git.rtems.org/hal/foo-hal/plain/driver.c?id=698732a6c45424263d9de0ac23850c21383c4154

But I think that would be harder to maintain compared to submodules, 
which already pin exactly that ID.
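
With submodules we would get that pinning for free: the .gitmodules file 
only needs to carry the (possibly relative) URL, and the commit in 
rtems.git records the exact submodule commit. Roughly, reusing the 
made-up foo-hal example from above:

   # .gitmodules in rtems.git, with a relative URL so the mirrors keep working:
   [submodule "bsps/hal/foo-hal"]
           path = bsps/hal/foo-hal
           url = ../hal/foo-hal.git

   # The pinned version is recorded in the rtems.git tree itself:
   $ git ls-tree HEAD bsps/hal/foo-hal
   160000 commit 698732a6c45424263d9de0ac23850c21383c4154    bsps/hal/foo-hal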

> 
> Just as a sanity check on this discussion ... Would GitLab merge requests aid
> the management of the code in rtems.git or are there other factors complicating
> the maintenance task?
> 
> Chris

I don't think GitLab (or any other similar system I know) will help a 
lot with these tasks. But most likely it will also not become more 
difficult to handle them with these systems. If you want, we can set up 
some simple test repos on gitlab.com to check how submodules are handled.

In general: Submodules will make some things harder and some things simpler.

It will be harder to merge patches. A patch will have two parts: one in 
the submodule to (for example) update the HAL, and one in RTEMS to use 
the new version and maybe adapt the files using that HAL.

On the other hand, it will be a lot simpler to update HALs and libs:

- In the best case, for an unchanged library, it's just syncing with 
upstream - that would also result in a straightforward pull request in 
the repo. Then a simple patch that updates the submodule in RTEMS. 
Syncing with upstream could maybe even be done with some automation. At 
least the patches in RTEMS should be reviewed manually.

- If some RTEMS-specific patches have been added, it will be a 'git 
merge upstream/main' and a push to the RTEMS-specific branch, or a 'git 
rebase upstream/main' and a push to a new branch (we have to keep the 
old ones so that old RTEMS versions still find the commits). Then a 
simple patch that updates the submodule in RTEMS. See the sketch after 
this list.

- In the worst case (like your Xilinx repo) it will be merging upstream 
changes, figuring out what has been broken by Xilinx and adapting the 
drivers in RTEMS to that. Then first push the submodule and after that 
push the necessary patches in RTEMS.
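
For the second case, the submodule side could look roughly like this 
(the upstream URL and branch names are only placeholders):

   cd foo-hal
   git remote add upstream https://example.com/vendor/foo-hal.git
   git fetch upstream
   git checkout rtems-7           # the branch that carries our patches
   git merge upstream/main
   # build and test the affected BSPs, then:
   git push origin rtems-7
   # finally, update the submodule reference in rtems.git as usual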

Compared to the current workflow, that's a lot simpler. My current 
workflow is usually something like this:

1. Figure out what has been changed compared to the original HAL version 
(that is hopefully noted somewhere in the commit message or the files).

2. Find all matching new HAL files and copy them over the old ones. 
Maybe throw away files that no longer exist upstream.

3. Re-apply changes from step 1 manually. Hope that I didn't forget some 
important fix.

4. See whether everything still builds and works.

Steps 2 and 3 have to be one commit because otherwise there would be a 
non-working commit in RTEMS. So the next update is even harder because 
there never have been unchanged files in the repo, and figuring out the 
RTEMS-specific changes means that I first have to find the original files.

Best regards

Christian

-- 
--------------------------------------------
embedded brains GmbH & Co. KG
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email:  christian.mauderer at embedded-brains.de
phone:  +49-89-18 94 741 - 18
mobile: +49-176-152 206 08

Registry court: Amtsgericht München
Registration number: HRA 117265
Managing directors authorized to represent: Peter Rasmussen, Thomas Dörfler
You can find our privacy policy here:
https://embedded-brains.de/datenschutzerklaerung/

