student for gsoc2008

Chris Johns chrisj at rtems.org
Thu Mar 27 02:41:22 UTC 2008


Joel Sherrill wrote:
> 
> Reng Zeng wrote:
>> Hi Joel, Chris,
>>
>> I do not understand your "class" and data along with it below, Could 
>> you please explain it a little more?
>>
> Did my separate post help?
>> That is still TBD.  It could be a "class" indicator byte followed by a
>> class specific
>> event.  Something like:
>>
>> + class
>> + class specific message indicator
>> + size of data following
>> + data
>>
>> For the trace points to add, I understand as below:
>>
>> 1. Write code to generate the wrapped version for each function to 
>> trace, such as
>> void *__wrap_symbol(arguments)
>> {
>>   void *p;
>>   void *__real_symbol(arguments);   /* declaration of the real function */
>>
>>   Event_ID eid = Get_Event_ID(symbol);
>>
>>   if (TriggerSet(eid))
>>     pass all the arguments to the SOLE handler in the capture engine,
>>     together with the data referred to by the arguments.
>>
>>   p = __real_symbol(arguments);
>>
>>   if (TriggerSet(eid))
>>     pass the return value to the SOLE handler in the capture engine,
>>     together with the data referred to by the return value.
>>
>>   return p;
>> }
> I wish Chris was back from his honeymoon.

Wish we could have had one but we have no time at the moment. Anyway, I am back and here.

> 
> But as far I understand that is generally correct.  You want to log entry
> and exit status from every wrapped call. 
> 

Yes, this is correct. We need to flatten the argument list into a block of memory 
to capture. The wrapper code generator will know all the details of the 
function's signature, so it can determine the amount of stack used to hold the 
arguments. Variable argument functions such as printf are something I have 
yet to figure out. This is part of the work.
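
To make the flattening idea concrete, here is a minimal sketch of what a
generated wrapper for malloc could look like. The capture calls, the event id
and the argument block layout are placeholders I have invented for
illustration, not the final capture engine interface:

  /* Hypothetical generated wrapper for malloc; the capture_* calls and the
   * event id are placeholders, not a committed interface. */
  #include <stddef.h>
  #include <string.h>
  #include <stdbool.h>

  extern bool capture_trigger_set(int event_id);
  extern void capture_function_entry(int event_id, const void *args, size_t size);
  extern void capture_function_exit(int event_id, const void *ret, size_t size);

  #define MALLOC_EVENT_ID 1          /* id assigned by the wrapper generator */

  void *__real_malloc(size_t size);  /* the linker resolves this to malloc */

  void *__wrap_malloc(size_t size)
  {
    unsigned char args[sizeof(size_t)];  /* flattened argument block */
    void         *p;

    memcpy(args, &size, sizeof(size));

    if (capture_trigger_set(MALLOC_EVENT_ID))
      capture_function_entry(MALLOC_EVENT_ID, args, sizeof(args));

    p = __real_malloc(size);

    if (capture_trigger_set(MALLOC_EVENT_ID))
      capture_function_exit(MALLOC_EVENT_ID, &p, sizeof(p));

    return p;
  }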

The capture engine's role changes a little when logging wrapped calls. In the 
context switch path performance is key, while in the wrapped mode the 
overhead of the block copy of arguments is something we know has to happen and 
have to factor in.

The example code above is close. We need to decide if we log only once, when 
we have the return value, with two time stamps (entry/exit). This lowers the 
overhead but complicates the data stream on the host side, as nested 
traced calls will appear before the current call.
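
A record for the log-once approach could carry both time stamps plus the
class/size/data fields discussed above. This is only a sketch; the field names
and sizes are illustrative, not a decided format:

  /* Sketch of a single per-call record carrying both time stamps. */
  #include <stdint.h>

  typedef struct {
    uint8_t  class;        /* event class, e.g. function trace point     */
    uint8_t  event;        /* class specific event indicator             */
    uint16_t size;         /* bytes of argument/return data that follow  */
    uint32_t task_id;      /* id of the calling thread                   */
    uint64_t entry_time;   /* time stamp taken on entry                  */
    uint64_t exit_time;    /* time stamp taken after the real call       */
    uint8_t  data[];       /* flattened arguments then the return value  */
  } function_trace_record;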

> I believe Chris has an idea where the trigger is actually in the capture
> engine and users can provide boolean expressions.  Greatly simplifies
> the logging.

We need to generate some sort of id which is specific to a function's signature 
and then have a way to match this with the running target image and finally 
the resulting data. Once we have this the capture engine can handle the 
trigger and filter processing, keeping this code in one place.
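
As a first cut the id could simply be a hash of the prototype string, computed
the same way by the wrapper generator, the target and the host decoder. This
is just one possible scheme:

  /* Possible id scheme: hash the prototype string so the generator, the
   * target image and the host all derive the same value. */
  #include <stdint.h>

  static uint32_t trace_point_id(const char *prototype)
  {
    uint32_t hash = 5381;                       /* djb2 style string hash */
    while (*prototype)
      hash = (hash * 33) ^ (uint32_t) *prototype++;
    return hash;
  }

  /* e.g. trace_point_id("void *malloc(size_t size)") */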

We may have the tool generate code that is compiled into the application to 
handle static filters and triggers, rather than having them set through a user 
interface. This code just makes calls to the capture engine.

Let me explain. We could have a target that is connected to via TCP. The user 
has wrapped a list of functions and then interacts with the running target to 
set filters and triggers dynamically in a series of runs to isolate a bug. The 
second example is a system where the list of wrapped functions is generated 
from the static filter and trigger commands, which are also turned into target 
code run during initialisation. Both systems use the same capture engine code.
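
In the second, static case the generated initialisation code would just be the
same calls a user could make interactively. The control calls shown here are
placeholders for whatever interface the capture engine ends up providing:

  /* Sketch of tool generated initialisation code; the capture_* calls are
   * placeholders, not an existing interface. */
  extern void capture_filter_add(const char *pattern, int task_id);
  extern void capture_trigger_add(const char *pattern, int task_id);
  extern void capture_start(void);

  void application_trace_init(void)
  {
    capture_filter_add("rtems_semaphore_*", 32);  /* filter rtems_semaphore_* from 32 */
    capture_trigger_add("malloc", 75);            /* trigger on malloc from task 75   */
    capture_start();
  }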

> Plus with the wrapping idea, you turn the wrapper completely off via
> ld arguments.  That part can be done manually or via a GUI to generate
> a file that is passed to the ld command line.

This is something we need to figure out. For example, do I build my RTEMS 
application as I do now and then have the capture tool relink it into a traceable 
executable, or do users' application build systems need to know what to do? The 
issue here is how we manage or help the user with the build process when wrapping.
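
The underlying mechanism is the standard GNU ld --wrap option, so whether a
symbol is traced comes down to what the final link is given. The file names
here are only an example:

  gcc -o app.exe app.o wrappers.o -Wl,--wrap=malloc   # traced build
  gcc -o app.exe app.o                                # normal build, no wrapper linked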

>> 2. Triggers are to be managed by the event ID.

I call "event ID" trace points. We can have system trace points like we 
currently have in the capture enigine, function trace points (wrapped 
functions) and user trace points where a user makes an explicit call.
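
A user trace point would just be an explicit call placed in the application,
something along these lines (the call name and event id are again only
placeholders):

  /* Sketch of an explicit user trace point. */
  #include <stddef.h>

  #define USER_STATE_EVENT 100   /* example user event id */

  extern void capture_user_trace(int event_id, const void *data, size_t size);

  void worker_step(int state)
  {
    capture_user_trace(USER_STATE_EVENT, &state, sizeof(state));
  }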

>>
>> 3. I has no idea about the filter for these events. For existing task 
>> related one, we could filter by priority, but how about the wrapped 
>> one? Could you tell some example for me to consider?
>>
> Based upon the id of the executing thread. Try this example:
> 
> log all semaphore manager calls for thread 32.
> log all context switches
> log all malloc/free calls for thread 75.
> 
> Start tracing after 10 seconds and log for 15 seconds.

The first examples are filters while the last one is a trigger. The way to 
think about this is to consider a hardware logic analyser. You can filter the 
signals you want to look at then you specify a trigger. Once triggered you 
capture based on the filtering.

The detail of the implementation is the way in which a user says "malloc" and 
we match that to a specific trace point (id) which has a set of flags we check 
against. The example filter:

  log all semaphore manager calls for thread 32

is rather complex because we need to pattern match "rtems_semaphore_*" to find 
all the calls and their ids, then add the thread id to each trace point's 
control data. The check of the thread id is similar to the "by" fields 
currently in the capture control structure. The above could be written as:

  filter rtems_semaphore_* from 32

To filter another task as well, run the command again with a new thread id.
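
Inside the engine the check can stay cheap: each trace point carries its
control data and the handler only compares the running thread against the "by"
list. This is a rough sketch modelled on, but not identical to, the existing
capture control data:

  #include <stdbool.h>
  #include <stdint.h>

  #define MAX_BY_TASKS 4

  typedef struct {
    uint32_t by_tasks[MAX_BY_TASKS];  /* thread ids the filter applies to */
    int      by_count;                /* 0 means match any task           */
  } trace_point_control;

  static bool trace_point_passes(const trace_point_control *tp, uint32_t task_id)
  {
    int i;
    if (tp->by_count == 0)
      return true;
    for (i = 0; i < tp->by_count; i++)
      if (tp->by_tasks[i] == task_id)
        return true;
    return false;
  }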

>> Regarding to other parts of the project, I understand as below, please 
>> comment.
>> 1. Transmission of logged data to host
>>
>>           o The first challenge of this part is to make the system and
>>             the log survive from high traffic.
>>           o The second challenge is to implement it in TCP while
>>             providing the flexibility for other transports in future.
>>           o The third challenge is to trade off minimizing the data to
>>             be sent across the network against the over-head to
>>             compress data.
>>
> Right.  This is something to investigate and do the trade-offs in the 
> early phase of the project.

We also need to handle the control data to manage dynamic triggers and filters.

> 
> The "communications transport" code must be pluggable and replaceable
> by the user.  This simplifies your job because a simple TCP stream could
> be the first implementation.  Leaving it as a plugin point, lets it be 
> changed
> by a user or a subsequent developer.  You provide a baseline.
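
A pluggable transport could be as simple as a table of function pointers the
capture engine calls, with the TCP stream as the first implementation. The
interface below is only a sketch of the idea, not an existing API:

  /* Sketch of a replaceable transport as a table of function pointers. */
  #include <stddef.h>

  typedef struct {
    int  (*open)(void);
    int  (*send)(const void *records, size_t size);   /* log data to the host */
    int  (*receive)(void *commands, size_t size);     /* control data back in */
    void (*close)(void);
  } capture_transport;

  /* A TCP implementation fills in the table and registers it, e.g.
   *   capture_transport_register(&tcp_transport);   (hypothetical call) */
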
>> 2. Receipt of logged data on host
>>
>>           o Use Python to write a CLI to receive data into a file.
>>
> That's the first step. You have to receive the raw log data.
> 
> Chris is a Python fan.. I am concerned about the functionality.  If python
> makes it easier, so be it.  It does make it portable to other hosts.
>> 3. Decoding of logged data on host
>>
>>           o From the compressed data received, decode it for
>>             high-level application to display as tracing data.
>>
> Right.  Next step.. an ascii dump.  Not pretty but accurate. 
>> 4. GUI Timeline Visualization tool
>>
>>           o Use GNUplot to integrate the decoded data into
>>             visualization format. What kind of visualization is expected?
>>
>>
> You don't have to use GNUplot and you might not even get to this phase.  
> There
> is likely a lot of work to do.
> 
> The Linux Trace Toolkit, WindView, and Spyker are similar tools for other
> OSes.  Take a look at them.  I know Spyker has a scrolling timeline type
> interface where you are kind of reading an "EKG" type look at system
> events marked on the timeline.
>> Thanks,
>> Alan
>>
>>
>> 2008/3/23, Joel Sherrill <joel.sherrill at oarcorp.com 
>> <mailto:joel.sherrill at oarcorp.com>>:
>>
>>     > [Alan] What is the event type indicator here? Current capture engine
>>     > provide specific handler for each event. Now to integrate to capture
>>     > engine from ALL the wrapper function, it would have SOLE handler in
>>     > capture engine, is that right? Then besides all the arguments
>>     could be
>>     > passed into
>>
>>     That is still TBD.  It could be a "class" indicator byte followed
>>     by a
>>     class specific
>>     event.  Something like:
>>
>>     + class
>>     + class specific message indicator
>>     + size of data following
>>     + data
>>
>>     Then the SOLE capture engine is just moving data which is decoded
>>     on the
>>     host side.  Each "class" could provide a decode routine.
>>
>>     > capture engine, what kind of event type indicator could be passed
>>     > along here?
>>     >
>>     >     Chris was sure he could take the same list of function
>>     prototypes
>>     >     and auto
>>     >     generate python code to decode the trace log.
>>
>>
> 
> 


