GSOC'18 project description

Chris Johns chrisj at rtems.org
Tue May 1 10:53:07 UTC 2018


On 1/5/18 7:34 pm, Sebastian Huber wrote:
> ----- Am 1. Mai 2018 um 5:38 schrieb Chris Johns chrisj at rtems.org:
> 
>> On 30/04/2018 20:05, Sebastian Huber wrote:
>>>> 3) Investigate the importance of live tracing functionality and addition of
>>>> kernel level tracing. I will design both possibilities and decide which one to
>>>> work on by the end of phase 1 of my evaluation.
>>>
>>> I work currently on a new low level infrastructure to support gathering of
>>> thread (create, delete, terminate, switch) and interrupt (entry, exit) events.
>>> Similar to the capture engine, but with less overhead.
>>
>> Why not improve the existing code?
> 
> There are some problems with the capture engine in terms of data structures, synchronization, timestamping and the API.
> 
> * The sizeof(rtems_capture_record) is 20 bytes; I use only 12 bytes (16 on a 64-bit machine). I want to record high-frequency events, so this 40% reduction makes a difference.
> 
> * Synchronization is done via atomic read-modify-write operations (ticket locks). The alternative implementation simply uses interrupt disable/enable.
> 
> * The timestamps use 64-bits and by default rtems_clock_get_uptime_nanoseconds() is used. This function has a high overhead for this purpose. The alternative implementation uses a 32-bit timestamp and the CPU counter. The uptime and CPU counter values are synchronized via a periodic event.
> 
> * Variable size records need more logic (branches) than fixed size records.
> 

Thank you, these points are reasonable.

I think it would be good to present a path for what you propose and how it
integrates into the ecosystem. The issues you have raised with the capture
engine are valid, and I feel there may be a case for more than one method of
capturing data, so integrating them would be good. For example, what you have
seems ideal for the low-level critical paths, while other externally
instrumented approaches can also exist, e.g. tracing malloc in newlib. The user
should have a unified means of gathering this data.

I think this topic should be moved to the devel list; a detailed design is off
topic for this list.

Chris

>>
>> Does this create a new interface and management structure?
>> Do you have the requirements for this design documented anywhere?
> 
> There is not much to design. I just need a single-producer, single-consumer queue (ring buffer) per processor. Here are the interface declarations and the assembler code of the main function:
> 
> typedef CPU_Uint32ptr rtems_record_data;
> 
> #define RTEMS_RECORD_EVENT_EXTRA( domain, extra ) \
>   ( ( ( extra ) << 6 ) | ( domain ) )
> 
> typedef enum {
> #define RECORD_EVENT_SYSTEM_0( extra ) \
>   RTEMS_RECORD_EVENT_EXTRA( 0, extra )
> 
>   RTEMS_RECORD_EVENT_OVERFLOW = RECORD_EVENT_SYSTEM_0( 0 ),
>   RTEMS_RECORD_EVENT_UPTIME_LOW = RECORD_EVENT_SYSTEM_0( 1 ),
>   RTEMS_RECORD_EVENT_UPTIME_HIGH = RECORD_EVENT_SYSTEM_0( 2 ),
>   RTEMS_RECORD_EVENT_PROCESSOR = RECORD_EVENT_SYSTEM_0( 3 ),
>   RTEMS_RECORD_EVENT_THREAD_CREATED = RECORD_EVENT_SYSTEM_0( 4 ),
>   RTEMS_RECORD_EVENT_THREAD_CREATED_BY = RECORD_EVENT_SYSTEM_0( 5 ),
>   RTEMS_RECORD_EVENT_THREAD_SWITCH = RECORD_EVENT_SYSTEM_0( 6 ),
>   RTEMS_RECORD_EVENT_THREAD_DELETED = RECORD_EVENT_SYSTEM_0( 7 ),
>   RTEMS_RECORD_EVENT_THREAD_DELETED_BY = RECORD_EVENT_SYSTEM_0( 8 ),
>   RTEMS_RECORD_EVENT_THREAD_TERMINATED = RECORD_EVENT_SYSTEM_0( 9 ),
>   RTEMS_RECORD_EVENT_INTERRUPT_INSTALL = RECORD_EVENT_SYSTEM_0( 10 ),
>   RTEMS_RECORD_EVENT_INTERRUPT_REMOVE = RECORD_EVENT_SYSTEM_0( 11 ),
>   RTEMS_RECORD_EVENT_INTERRUPT_BEGIN = RECORD_EVENT_SYSTEM_0( 12 ),
>   RTEMS_RECORD_EVENT_INTERRUPT_END = RECORD_EVENT_SYSTEM_0( 13 ),
>
>   /* Reserved system event domains */
>   RTEMS_RECORD_EVENT_SYSTEM_0 = 0,
>   RTEMS_RECORD_EVENT_SYSTEM_1,
>   RTEMS_RECORD_EVENT_SYSTEM_2,
>   RTEMS_RECORD_EVENT_SYSTEM_3,
>   RTEMS_RECORD_EVENT_SYSTEM_4,
>   RTEMS_RECORD_EVENT_SYSTEM_5,
>   RTEMS_RECORD_EVENT_SYSTEM_6,
>   RTEMS_RECORD_EVENT_SYSTEM_7,
>   RTEMS_RECORD_EVENT_SYSTEM_8,
>   RTEMS_RECORD_EVENT_SYSTEM_9,
>   RTEMS_RECORD_EVENT_SYSTEM_10,
>   RTEMS_RECORD_EVENT_SYSTEM_11,
>   RTEMS_RECORD_EVENT_SYSTEM_12,
>   RTEMS_RECORD_EVENT_SYSTEM_13,
>   RTEMS_RECORD_EVENT_SYSTEM_14,
>   RTEMS_RECORD_EVENT_SYSTEM_15,
> 
>   /* User event domains */
>   RTEMS_RECORD_EVENT_USER_0,
>   RTEMS_RECORD_EVENT_USER_1,
>   RTEMS_RECORD_EVENT_USER_2,
>   RTEMS_RECORD_EVENT_USER_3,
>   RTEMS_RECORD_EVENT_USER_4,
>   RTEMS_RECORD_EVENT_USER_5,
>   RTEMS_RECORD_EVENT_USER_6,
>   RTEMS_RECORD_EVENT_USER_7,
>   RTEMS_RECORD_EVENT_USER_8,
>   RTEMS_RECORD_EVENT_USER_9,
>   RTEMS_RECORD_EVENT_USER_10,
>   RTEMS_RECORD_EVENT_USER_11,
>   RTEMS_RECORD_EVENT_USER_12,
>   RTEMS_RECORD_EVENT_USER_13,
>   RTEMS_RECORD_EVENT_USER_14,
>   RTEMS_RECORD_EVENT_USER_15,
>   RTEMS_RECORD_EVENT_USER_16,
>   RTEMS_RECORD_EVENT_USER_17,
>   RTEMS_RECORD_EVENT_USER_18,
>   RTEMS_RECORD_EVENT_USER_19,
>   RTEMS_RECORD_EVENT_USER_20,
>   RTEMS_RECORD_EVENT_USER_21,
>   RTEMS_RECORD_EVENT_USER_22,
>   RTEMS_RECORD_EVENT_USER_23,
>   RTEMS_RECORD_EVENT_USER_24,
>   RTEMS_RECORD_EVENT_USER_25,
>   RTEMS_RECORD_EVENT_USER_26,
>   RTEMS_RECORD_EVENT_USER_27,
>   RTEMS_RECORD_EVENT_USER_28,
>   RTEMS_RECORD_EVENT_USER_29,
>   RTEMS_RECORD_EVENT_USER_30,
>   RTEMS_RECORD_EVENT_USER_31,
>   RTEMS_RECORD_EVENT_USER_32,
>   RTEMS_RECORD_EVENT_USER_33,
>   RTEMS_RECORD_EVENT_USER_34,
>   RTEMS_RECORD_EVENT_USER_35,
>   RTEMS_RECORD_EVENT_USER_36,
>   RTEMS_RECORD_EVENT_USER_37,
>   RTEMS_RECORD_EVENT_USER_38,
>   RTEMS_RECORD_EVENT_USER_39,
>   RTEMS_RECORD_EVENT_USER_40,
>   RTEMS_RECORD_EVENT_USER_41,
>   RTEMS_RECORD_EVENT_USER_42,
>   RTEMS_RECORD_EVENT_USER_43,
>   RTEMS_RECORD_EVENT_USER_44,
>   RTEMS_RECORD_EVENT_USER_45,
>   RTEMS_RECORD_EVENT_USER_46,
>   RTEMS_RECORD_EVENT_USER_47,
> 
>   RTEMS_RECORD_EVENT_LAST = 0xffffffff
> } rtems_record_event;
> 
> void rtems_record_produce( rtems_record_event event, rtems_record_data data );
> 
> 00000000 <rtems_record_produce>:
>    0:   7c a0 00 a6     mfmsr   r5
>    4:   7c 00 01 46     wrteei  0
>    8:   7d 6e 82 a6     mfspr   r11,526
>    c:   7d 30 42 a6     mfsprg  r9,0
>   10:   81 09 02 b0     lwz     r8,688(r9)
>   14:   80 e8 00 00     lwz     r7,0(r8)
>   18:   81 28 00 04     lwz     r9,4(r8)
>   1c:   80 c8 00 08     lwz     r6,8(r8)
>   20:   39 29 ff ff     addi    r9,r9,-1
>   24:   7d 27 48 50     subf    r9,r7,r9
>   28:   7d 29 30 38     and     r9,r9,r6
>   2c:   2b 89 00 01     cmplwi  cr7,r9,1
> 
> This is the only branch in the function:
> 
>   30:   41 dd 00 40     bgt-    cr7,70 <rtems_record_produce+0x70>
>   34:   54 ea 08 3c     rlwinm  r10,r7,1,0,30
>   38:   38 60 00 00     li      r3,0
>   3c:   7d 4a 3a 14     add     r10,r10,r7
>   40:   7d 27 4a 14     add     r9,r7,r9
>   44:   55 4a 10 3a     rlwinm  r10,r10,2,0,29
>   48:   7d 29 30 38     and     r9,r9,r6
>   4c:   7d 48 52 14     add     r10,r8,r10
>   50:   90 6a 00 0c     stw     r3,12(r10)
>   54:   91 6a 00 10     stw     r11,16(r10)
>   58:   90 8a 00 14     stw     r4,20(r10)
>   5c:   7c 20 04 ac     lwsync
>   60:   91 28 00 00     stw     r9,0(r8)
>   64:   7c a0 01 06     wrtee   r5
>   68:   4e 80 00 20     blr
> 
> Only 27 instructions in the common path (no ring buffer overflow, overflows are special events).
> 
>   6c:   60 00 00 00     nop
>   70:   54 ea 08 3c     rlwinm  r10,r7,1,0,30
>   74:   39 20 00 01     li      r9,1
>   78:   7d 4a 3a 14     add     r10,r10,r7
>   7c:   7d 27 4a 14     add     r9,r7,r9
>   80:   55 4a 10 3a     rlwinm  r10,r10,2,0,29
>   84:   7d 29 30 38     and     r9,r9,r6
>   88:   7d 48 52 14     add     r10,r8,r10
>   8c:   90 6a 00 0c     stw     r3,12(r10)
>   90:   91 6a 00 10     stw     r11,16(r10)
>   94:   90 8a 00 14     stw     r4,20(r10)
>   98:   7c 20 04 ac     lwsync
>   9c:   91 28 00 00     stw     r9,0(r8)
>   a0:   7c a0 01 06     wrtee   r5
>   a4:   4e 80 00 20     blr
> 


