Performance problem with PyYAML
Sebastian Huber
sebastian.huber at embedded-brains.de
Fri Nov 8 08:26:51 UTC 2019
Hello,
I added the build specifications for most of the test programs. This
resulted in about 656 *.yml files. This seems to be a bit too much for
the PyYAML module, which is written purely in Python: it needs three to
four seconds on my machine to load the files. The BSPs will add another
couple of hundred files. Converting the format to JSON solves the
performance issue. The load time with JSON files drops to 0.2s to
0.3s. JSON also has the benefit that the json module is part of the
Python standard library.
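For reference, a measurement like the one above could be taken roughly as follows. This is only a sketch: the helper name time_load is made up for illustration, and the loader is passed in so the same function can time json.load or, if PyYAML is installed, yaml.safe_load.

```python
import time


def time_load(paths, load):
    """Parse every file in `paths` with `load` and time the whole run.

    `load` is any callable that takes an open text file and returns the
    parsed item, e.g. json.load or yaml.safe_load (hypothetical usage,
    the second one needs PyYAML).
    Returns a tuple (elapsed_seconds, list_of_parsed_items).
    """
    start = time.perf_counter()
    items = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            items.append(load(f))
    elapsed = time.perf_counter() - start
    return elapsed, items
```

Calling it once with the list of *.yml files and yaml.safe_load, and once with the converted *.json files and json.load, gives two numbers that can be compared directly.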
There are two problems with JSON:
1. Doorstop currently supports only YAML. I guess support for JSON can
be added in principle, but it is not a small change.
2. Multi-line strings in JSON are less readable, e.g.
cat spec/build/bsps/riscv/riscv/RTEMS-BUILD-BSP-RISCV-RISCV-004.json
{
"active": true,
"build-type": "config-file",
"content": "MEMORY {\n RAM : ORIGIN = ${RISCV_RAM_REGION_BEGIN}, LENGTH = ${RISCV_RAM_REGION_SIZE}\n}\n\nREGION_ALIAS (\"REGION_START\", RAM);\nREGION_ALIAS (\"REGION_TEXT\", RAM);\nREGION_ALIAS (\"REGION_TEXT_LOAD\", RAM);\nREGION_ALIAS (\"REGION_FAST_TEXT\", RAM);\nREGION_ALIAS (\"REGION_FAST_TEXT_LOAD\", RAM);\nREGION_ALIAS (\"REGION_RODATA\", RAM);\nREGION_ALIAS (\"REGION_RODATA_LOAD\", RAM);\nREGION_ALIAS (\"REGION_DATA\", RAM);\nREGION_ALIAS (\"REGION_DATA_LOAD\", RAM);\nREGION_ALIAS (\"REGION_FAST_DATA\", RAM);\nREGION_ALIAS (\"REGION_FAST_DATA_LOAD\", RAM);\nREGION_ALIAS (\"REGION_RTEMSSTACK\", RAM);\nREGION_ALIAS (\"REGION_WORK\", RAM);\n\nINCLUDE linkcmds.base\n",
"derived": false,
"destination": "${BSP_LIBDIR}/linkcmds",
"enabled-by": [],
"header": "",
"level": 1.3,
"links": [],
"normative": true,
"order": 1000,
"ref": "",
"reviewed": "E3oxPkiXxl6OF-CbAPybZ3Uj-yDa-gX0TNlCe8KI_AE=",
"target": "linkcmds",
"text": "",
"type": "build"
}
vs.
cat spec/build/bsps/riscv/riscv/RTEMS-BUILD-BSP-RISCV-RISCV-004.yml
active: true
build-type: config-file
content: |
  MEMORY {
   RAM : ORIGIN = ${RISCV_RAM_REGION_BEGIN}, LENGTH = ${RISCV_RAM_REGION_SIZE}
  }

  REGION_ALIAS ("REGION_START", RAM);
  REGION_ALIAS ("REGION_TEXT", RAM);
  REGION_ALIAS ("REGION_TEXT_LOAD", RAM);
  REGION_ALIAS ("REGION_FAST_TEXT", RAM);
  REGION_ALIAS ("REGION_FAST_TEXT_LOAD", RAM);
  REGION_ALIAS ("REGION_RODATA", RAM);
  REGION_ALIAS ("REGION_RODATA_LOAD", RAM);
  REGION_ALIAS ("REGION_DATA", RAM);
  REGION_ALIAS ("REGION_DATA_LOAD", RAM);
  REGION_ALIAS ("REGION_FAST_DATA", RAM);
  REGION_ALIAS ("REGION_FAST_DATA_LOAD", RAM);
  REGION_ALIAS ("REGION_RTEMSSTACK", RAM);
  REGION_ALIAS ("REGION_WORK", RAM);

  INCLUDE linkcmds.base
derived: false
destination: ${BSP_LIBDIR}/linkcmds
enabled-by: []
header: ''
level: 1.3
links: []
normative: true
order: 1000
ref: ''
reviewed: E3oxPkiXxl6OF-CbAPybZ3Uj-yDa-gX0TNlCe8KI_AE=
target: linkcmds
text: ''
type: build
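The readability difference comes from the fact that JSON has no multi-line string syntax, so every newline inside a value is written as an escape. A small sketch with the standard json module shows this; the item below is a shortened, made-up excerpt of the example above, not the real file.

```python
import json

# Shortened stand-in for a build item with a multi-line "content" value
# (illustrative data only).
item = {
    "build-type": "config-file",
    "content": 'MEMORY {\n}\n\nREGION_ALIAS ("REGION_START", RAM);\nINCLUDE linkcmds.base\n',
    "target": "linkcmds",
}

# Even with indentation enabled, the whole "content" string stays on one
# physical line: newlines become \n and double quotes become \".
text = json.dumps(item, indent=2, sort_keys=True)
print(text)

# The escaping is lossless, so loading the text gives back the same item.
assert json.loads(text) == item
```

So the information is identical in both formats; only the on-disk representation of the string differs.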
An alternative to using JSON would be the addition of a post-processed file
which gathers all build specification items included in the RTEMS
sources. PyYAML is then only necessary if external build
specification items are used (there should not be hundreds of these). For
example, we could store the information of all the build specification
items in a file generated by the Python marshal module. Each time a build
specification item is added, changed, or removed, we would have to update
this file as well (it would be stored in the repository).
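A minimal sketch of such a cache file, assuming the items are already loaded into a dictionary keyed by item name. The function names write_cache and read_cache and the dictionary layout are made up for illustration; marshal.dump and marshal.load are the standard library calls.

```python
import marshal


def write_cache(items, cache_path):
    """Store all build items in one binary file.

    `items` is assumed to be a dict mapping an item name to the dict
    loaded from its YAML file. marshal handles the plain types found in
    such items (dict, list, str, bool, int, float, None).
    """
    with open(cache_path, "wb") as f:
        marshal.dump(items, f)


def read_cache(cache_path):
    """Load all build items from the cache file in one call."""
    with open(cache_path, "rb") as f:
        return marshal.load(f)
```

One point to consider: the Python documentation does not guarantee that the marshal format is compatible between Python versions, which matters if the generated file is stored in the repository.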
--
Sebastian Huber, embedded brains GmbH
Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail : sebastian.huber at embedded-brains.de
PGP : Public key available on request.
This message is not a business communication within the meaning of the EHUG.