Performance problem with PyYAML

Sebastian Huber sebastian.huber at embedded-brains.de
Mon Nov 11 07:38:00 UTC 2019


On 09/11/2019 00:10, Chris Johns wrote:
> 
> 
> On 9/11/19 10:02 am, Joel Sherrill wrote:
>>
>>
>> On Fri, Nov 8, 2019 at 4:58 PM Chris Johns <chrisj at rtems.org> wrote:
>>
>>      On 9/11/19 9:52 am, Joel Sherrill wrote:
>>      > On Fri, Nov 8, 2019 at 4:26 PM Chris Johns <chrisj at rtems.org> wrote:
>>      >
>>      >     On 9/11/19 2:57 am, Gedare Bloom wrote:
>>      >     > Ah, pickling makes sense.
>>      >     >
>>      >     > On Fri, Nov 8, 2019 at 4:03 AM Sebastian Huber
>>      >     > <sebastian.huber at embedded-brains.de> wrote:
>>      >     >>
>>      >     >> Sorry, for the frequent updates. It turned out to be a very small
>>      >     >> modification to add a build specification item cache with pickle.
>>      >
>>      >     I agree pickle and generation is a good solution.
>>      >
>>      > I don't mind processing human readable input to pickle. Let's just keep that
>>      > VERY human readable.
>>
>>      I can only suggest reviewing the repo and the changes as "VERY human readable"
>>      is subjective. I am not sure how else we can determine readability as it is a
>>      personal thing. :)
>>
>> Gedare's post earlier to insert "\n\" manually to improve readability takes
>> something unacceptable to marginally acceptable.
> 
> I thought that was in the json version of the data? My understanding is the
> format is YAML as per the repo and pickle is generated once when the build first
> runs or during the configure stage. I do not think anyone will look at the
> pickle data. I am now confused.

Yes, this is how I do it now. Originally, I wanted to avoid any
bootstrapping, but unfortunately it turned out that the YAML is too slow
to parse.

The wscript generates the pickle cache from the build specification 
items in YAML. The cache is rebuilt if the modification time of an item 
changes. I moved the pickle cache into the build directory
"build/c4che", so it gets removed by a "./waf distclean".

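A minimal sketch of such a cache, to illustrate the idea only; it is not
the actual wscript code. The function name, the ".yml" suffix, the cache
layout and the "parse" callable are all assumptions made for this example.

```python
import os
import pickle


def load_items(spec_dir, cache_file, parse):
    """Return {path: item}, reparsing only files whose mtime changed.

    Hypothetical helper: 'parse' turns the text of one item file into data.
    """
    try:
        with open(cache_file, "rb") as f:
            cache = pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        cache = {}  # no usable cache yet: parse everything

    fresh = {}
    dirty = False
    for root, _dirs, files in os.walk(spec_dir):
        for name in sorted(files):
            if not name.endswith(".yml"):  # assumed item file suffix
                continue
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            entry = cache.get(path)
            if entry is None or entry[0] != mtime:
                # New or modified item: parse it again.
                with open(path, encoding="utf-8") as f:
                    entry = (mtime, parse(f.read()))
                dirty = True
            fresh[path] = entry

    # Rewrite the cache if an item changed, appeared, or disappeared.
    if dirty or fresh.keys() != cache.keys():
        os.makedirs(os.path.dirname(cache_file) or ".", exist_ok=True)
        with open(cache_file, "wb") as f:
            pickle.dump(fresh, f, pickle.HIGHEST_PROTOCOL)

    return {path: data for path, (_mtime, data) in fresh.items()}
```

With PyYAML installed, "parse" could be yaml.safe_load. Only the first run,
or a run after an item was edited, pays the YAML parsing cost; every other
run just unpickles the cache.
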
> 
>> It is subjective but editing long strings of magic input with no normal line or
>> paragraph breaks is bad bad bad.
> 
> I reviewed some of the YAML files in the repo and they looked fine.

I think it is very important that the specification raw data is in a
good human and machine readable format. Multiline text in a one liner is
definitely not human readable. It also makes diffs hard to read.

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

