[PATCH 2/5] build: Use CSafeLoader if available

Sebastian Huber sebastian.huber at embedded-brains.de
Thu May 4 06:16:11 UTC 2023


On 04.05.23 05:35, Chris Johns wrote:
> On 3/5/2023 7:40 pm, Sebastian Huber wrote:
>> On 03.05.23 05:30, Chris Johns wrote:
>>> On 28/4/2023 3:38 pm, Sebastian Huber wrote:
>>>> On 27.04.23 20:27, Gedare Bloom wrote:
>>>>> On Wed, Apr 26, 2023 at 11:46 PM Sebastian Huber
>>>>> <sebastian.huber at embedded-brains.de>  wrote:
>>>>>> On 27.04.23 02:11, Chris Johns wrote:
>>>>>>> On 26/4/2023 6:04 pm, Sebastian Huber wrote:
>>>>>>>> The CSafeLoader uses the C libyaml library to considerably speed up the
>>>>>>>> loading of YAML files.
>>>>>>> No from me.
>>>>>> What do you mean by "No from me"? You have the CSafeLoader available and
>>>>>> it is slow? Do you have some timings before and after the patch set for
>>>>>> a "./waf configure" and "./waf build"? On my systems the configure needs
>>>>>> less than a second with the CSafeLoader and the waf build setup time is
>>>>>> less than 100ms.
>>>>>>
>>>>>>> I do not agree with conditional states of operation in the build system that
>>>>>>> depend on packages a host has installed. If speed is an important factor for all
>>>>>>> users then I suggest you find a means to have it available automatically on the
>>>>>>> hosts we support (Linux, FreeBSD, MacOS, Windows MINGW64 and Cygwin).
>>>>>> I am not sure if we should automatically install system Python packages
>>>>>> on user machines.
>>>>>>
>>>>>> The fall back is the Python PyYAML package available through the RTEMS
>>>>>> sources. This is what we use currently. For RTEMS users, this is
>>>>>> acceptable since they are not supposed to touch the YAML files. For
>>>>>> RTEMS maintainers, not having the cache makes working with the build
>>>>>> system more efficient.
>>>>>>
>>>>>> If the system PyYAML package is not installed, then you now get a hint
>>>>>> to install it:
>>>>>>
>>>>>> Setting top to                           : /home/EB/sebastian_h/src/rtems
>>>>>> Setting out to                           : /home/EB/sebastian_h/src/rtems/build
>>>>>> Regenerate the build specification cache.  Install the PyYAML Python package to avoid this.  The cache regeneration needs a couple of seconds...
>>>>>> Configure board support package (BSP)    : arm/realview_pbx_a9_qemu
>>>>>>
>>>>> I have two questions, which are related to Chris's concern I think.
>>>>> 1. Are the output of PyYAML and C libyaml guaranteed to be consistent?
>>>>
>>>> I trust the PyYAML maintainers that the SafeLoader and CSafeLoader produce the
>>>> same results. With respect to the alternative ItemCache class implementation in
>>>> the wscript I am quite confident that this produces the same results. This part
>>>> just has to load the item data from the files. The CSafeLoader based ItemCache
>>>> has 53 lines of code.
>>>>
>>>>>
>>>>> 2. Why not make C libyaml part of the RTEMS toolchain?
>>>>>
>>>>> Any dependencies that exist in the build system are (by definition)
>>>>> suitable to be checked/provided by the tool buildset.
>>>>
>>>> Yes, this is an option. If we remove the pickle cache, then we force everyone to
>>>> use the libyaml based PyYAML module. Is this really necessary right now?
>>>
>>> If we leave it who would do it? I would like to understand the next question
>>> before we decide if this is important. The key objective is to have consistent
>>> performance for everyone. If the package is easy to build then we should do it
>>> when we build the tools and the questions we are having go away.
>>
>> The PyYAML package had some security issues in the past. If we ship this
>> package, who will monitor this package, update it, and write security advisories?
> 
> The same way we would handle any security issue. When we become aware we update
> what we provide.

This is a problem from my point of view. Maintenance activities 
(including security-related topics) happen by accident in the RTEMS 
Project. In general, each mandatory host tool makes it harder to install 
RTEMS in certain environments.

> 
> Is PyYAML a pip package or is it provided by a distro package when using Linux?
> My assumption, which may be wrong, is building libyaml (the C part) is all we
> need to do?

You can install it through pip, conda, or whatever your host provides as 
packages. I guess you also need to build some Python bindings for 
libyaml to be able to use it.

> 
>>>> For
>>>> most use cases the Python only solution works fine. If you spend your time
>>>> developing BSPs, then the CSafeLoader pays off.
>>>
>>> Maybe I am not understanding how this works. Why is there a difference for
>>> developers vs a user who does not have this package installed? Does the
>>> difference scale?
>>
>> A user typically just uses a certain version of RTEMS. Then the BSPs of interest
>> are configured and built. A user is not supposed to touch the spec files.
> 
> My experience is different.
> 
> I do not agree with different levels of performance and build experience based
> on the host operating system being used. We need to support all hosts in the
> same way and this seems to favour users who have an OS that can provide the
> package. We have had host biases in other places in RTEMS and it takes a long time
> to remove them. The policy I work to is RTEMS developers and users use the same
> tools and processes and this has been working well through my time with this
> project. I see no reason to move away from this.

I don't see the problem here. PyYAML is a widely used package. When I 
install it through pip, I get the CSafeLoader on my machine. I don't 
have a libyaml development package installed.
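
To check which loader a given PyYAML installation provides, something 
like the following sketch could be used. The helper name 
pick_loader_name is made up for illustration and is not the code from 
the patch. It only assumes that the yaml module exposes a CSafeLoader 
attribute when it was built with the libyaml bindings.

```python
def pick_loader_name():
    """Return the name of the fastest safe loader available, or None."""
    try:
        import yaml  # PyYAML; may not be installed at all
    except ImportError:
        return None
    # CSafeLoader is only present if PyYAML was built with libyaml bindings.
    if hasattr(yaml, "CSafeLoader"):
        return "CSafeLoader"
    # Pure Python fallback, always present when PyYAML is installed.
    return "SafeLoader"


print(pick_loader_name())
```

The printed name depends on the host: a PyYAML with libyaml bindings 
reports CSafeLoader, a pure Python PyYAML reports SafeLoader.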

The pickle cache approach is not that bad; it just doesn't support some 
use cases well.

> 
>> A maintainer adds, modifies, and removes spec files during development. With the
>> item cache, this always involves a wait of several seconds. The wait
>> depends on the total number of spec files. With the CSafeLoader this time
>> is reduced to a fraction of a second.
> 
> If a user downloads a release is the intermediate data present or do they need
> to wait while it is parsed using whatever system they have?
> 
> I am sorry if I am not understanding something in how this all works. I cannot
> tell if your statement implies we are holding intermediate data in the repo or
> releases need some extra processing before being packaged?

The pickle cache is not in the repository. We could add it to the 
release archive, but I am not sure if this is a good idea. The pickle 
format is Python version dependent.
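
If a cache file were ever shared between hosts, one way to guard against 
a Python mismatch would be to store the interpreter version next to the 
data and to treat any other version as a cache miss. The following is a 
hypothetical sketch; load_cache and store_cache are illustrative names 
and not the ItemCache from the wscript.

```python
import os
import pickle
import sys
import tempfile


def store_cache(path, data):
    """Write the data together with the Python major.minor version."""
    with open(path, "wb") as f:
        pickle.dump((tuple(sys.version_info[:2]), data), f)


def load_cache(path):
    """Return the cached data, or None if the cache is missing or unusable."""
    try:
        with open(path, "rb") as f:
            tag, data = pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError, ValueError, TypeError):
        # Missing file, truncated file, or a pickle this Python cannot read.
        return None
    # Reject caches written by a different Python major.minor version.
    if tag != tuple(sys.version_info[:2]):
        return None
    return data


path = os.path.join(tempfile.mkdtemp(), "cache.pickle")
store_cache(path, {"uid": "/example/item"})
print(load_cache(path))
```

A caller would regenerate the cache from the YAML files whenever 
load_cache returns None.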

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

