<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Mar 11, 2014 at 3:46 PM, Philipp Eppelt <span dir="ltr"><<a href="mailto:philipp.eppelt@mailbox.tu-dresden.de" target="_blank">philipp.eppelt@mailbox.tu-dresden.de</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 03/11/2014 01:28 AM, Gedare Bloom wrote:<br>

> On Mon, Mar 10, 2014 at 6:48 PM, Philipp Eppelt<br>

> <<a href="mailto:philipp.eppelt@mailbox.tu-dresden.de">philipp.eppelt@mailbox.tu-dresden.de</a>> wrote:<br>

>> On 03/10/2014 04:24 PM, Youren Shen wrote:<br>

>>> What confuses me is the relation<br>

>>> between pok_arch_event_register and pok_meta_handler_init. It seems you<br>

>>> split the IRQ vectors into two ranges in pok_arch_event_register: below 32<br>

>>> and 32 or above. It looks like you have already designed some hypercall<br>

>>> interface (like pok_irq_prologue_0 for the clock?). But what is<br>

>>> the purpose of pok_meta_handler_init? I still don't understand it very<br>

>>> clearly. Could you give me an outline of the IRQ handling in POK that<br>

>>> invokes these two functions?<br>

>>><br>

>>> If you could give me a brief overview of how you approached<br>

>>> these issues and a brief description of your design, it would be<br>

>>> really helpful to me.<br>

>><br>

>> There are 16 (0 - 15) interrupt lines for hardware interrupts on x86.<br>

>> If a line is triggered, the PIC sends an interrupt to the CPU.<br>

>> If interrupts are enabled, the CPU asks for the interrupt number and<br>

>> looks this number up in the Interrupt Descriptor Table (IDT).<br>

>> The IDT for HW interrupts looks like this:<br>

>> 32 | clock  ISR (Interrupt Service Routine)<br>

>> 33 | keyboard ISR<br>

>> 34 | ...<br>

>> ...<br>

>> 47 | ...<br>

>><br>

>> Intel reserves the first 32 (0-31) IDT vectors for CPU exceptions, so the<br>

>> hardware interrupts start at 32 and go to 47. Vector 32 corresponds to IRQ<br>

>> line 0, which is the clock interrupt. Vector 33 is IRQ line 1, the<br>

>> keyboard (if I can trust my memory).<br>

>><br>
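A minimal C sketch of the vector-to-line mapping described above. The constant and function names are illustrative assumptions, not identifiers taken from the POK sources.<br>

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout from the mail: IRQ line 0 is mapped to IDT vector 32. */
#define HW_IRQ_BASE_VECTOR 32u
#define HW_IRQ_LINES       16u

/* True if the vector lies in the hardware interrupt range [32, 47]. */
static bool is_hw_irq_vector(unsigned vector)
{
    return vector >= HW_IRQ_BASE_VECTOR &&
           vector < HW_IRQ_BASE_VECTOR + HW_IRQ_LINES;
}

/* Translate an IDT vector into its IRQ line (0-15).
 * The caller must pass a vector from the hardware range. */
static unsigned vector_to_irq_line(unsigned vector)
{
    assert(is_hw_irq_vector(vector));
    return vector - HW_IRQ_BASE_VECTOR;
}
```

With this mapping, vector 32 yields line 0 (clock) and vector 33 yields line 1 (keyboard).<br>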

>> Now the CPU never tells you which IRQ line fired. Therefore, we register<br>

>> one prologue function per line with the IDT. Each prologue knows its line<br>

>> number, pushes it on the stack and calls a general ISR handler.<br>

>> This general ISR handler checks the line number and calls the handler<br>

>> registered for this line. To do so, the general ISR handler maintains<br>

>> its own IDT, a software IDT.<br>

>> This enables us to register more than one ISR handler function for one<br>

>> interrupt line. For example, one handler processes the clock tick in the<br>

>> kernel and a second one tells the guest system (RTEMS) running in a<br>

>> partition that a clock tick occurred.<br>

>><br>
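The prologue/general-handler split can be sketched in plain C as follows. This is only an illustration under assumptions: the real prologues are registered with the hardware IDT and the first stage is assembly, and every name below (soft_idt, soft_idt_register, general_isr_handler, irq_prologue_0, the two clock handlers) is hypothetical.<br>

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_LINES             16
#define MAX_HANDLERS_PER_LINE 2   /* assumed: kernel plus one guest */

typedef void (*irq_handler_t)(unsigned line);

/* Software IDT: several handlers may be registered per IRQ line. */
static irq_handler_t soft_idt[IRQ_LINES][MAX_HANDLERS_PER_LINE];

/* Register a handler in the first free slot; returns 0 on success, -1 otherwise. */
static int soft_idt_register(unsigned line, irq_handler_t handler)
{
    if (line >= IRQ_LINES || handler == NULL)
        return -1;
    for (int i = 0; i < MAX_HANDLERS_PER_LINE; i++) {
        if (soft_idt[line][i] == NULL) {
            soft_idt[line][i] = handler;
            return 0;
        }
    }
    return -1; /* all slots for this line are taken */
}

/* General handler: calls every handler registered for the line and
 * returns how many were called. */
static int general_isr_handler(unsigned line)
{
    int called = 0;
    assert(line < IRQ_LINES); /* prologues only pass valid line numbers */
    for (int i = 0; i < MAX_HANDLERS_PER_LINE; i++) {
        if (soft_idt[line][i] != NULL) {
            soft_idt[line][i](line);
            called++;
        }
    }
    return called;
}

/* Prologue for line 0: it "knows" its line number and hands it on. */
static int irq_prologue_0(void)
{
    return general_isr_handler(0);
}

/* Two example handlers for the clock line. */
static unsigned kernel_ticks, guest_ticks;
static void kernel_clock_handler(unsigned line) { (void)line; kernel_ticks++; }
static void guest_clock_handler(unsigned line)  { (void)line; guest_ticks++; }
```

Registering both clock handlers on line 0 and invoking irq_prologue_0() once runs each of them once.<br>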

>> But we don't want the POK kernel to wait until the partition has handled<br>

>> the interrupt. So we acknowledge the interrupt with the PIC and then<br>

>> send the partition the soft-interrupt. Here we go from kernel to<br>

>> user space, and this is the point where I left off.<br>

>><br>

>> To be more specific in terms of source code:<br>

>> 'pok_arch_event_register' is called whenever you want to register any kind of<br>

>> interrupt with the IDT. If the vector is in the hardware interrupt<br>

>> range [32-47], it registers a prologue handler with the IDT.<br>

>><br>

>> All pok_irq_prologue functions call _ISR_Handler, which in turn calls<br>

>> _C_isr_handler. Together they form the general handler: first the asm part,<br>

>> then the C part.<br>

>> The _C_isr_handler checks if the kernel has registered a handler for<br>

>> this IRQ number and calls it.<br>

>> Then it checks three things for the current partition: that it has<br>

>> interrupts enabled, that it has a handler registered and that it isn't<br>

>> already servicing an earlier interrupt.<br>

>> If all three hold, the registered handler is invoked.<br>

>><br>

>> When I talk about a 'registered handler', I mean an entry in the<br>

>> software IDT the kernel is maintaining.<br>

>> The software IDT for hardware interrupts is a static table consisting of<br>

>> 16 entries of the type 'meta_handler'.<br>

>> 'meta_handler' is a struct consisting of a vector number and two tables<br>

>> of the size "kernel + configured number of partitions".<br>

>> The first table holds function pointers to the<br>

>> partition's/kernel's handler function, the<br>

>> what-to-do-if-IRQ-occurs function.<br>

>> The second table flags if the partition is ready for an interrupt.<br>

>><br>

>> So for each interrupt entry in our software IDT, we get a 'meta_handler'<br>

>> encapsulating a line number, a table with up to one handler per<br>

>> partition and a table recording if the partition is ready for interrupts.<br>

>><br>

>> Next to this software IDT, there is a table 'partition_irq_enabled',<br>

>> which has one flag per partition and is the software replacement for<br>

>> CLI/STI.<br>

>><br>

>> 'pok_meta_handler_init' sets up the software IDT and fills all fields<br>

>> with start values (magic unused vector number, no handler present, but<br>

>> waiting).<br>

>> 'pok_partition_irq_init' fills the partition_irq_enabled table with the<br>

>> value for disabled (0), so initially no partition gets interrupts until<br>

>> it asks for them.<br>

>><br>
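To make the data structures concrete, here is a hedged C sketch of a 'meta_handler' table, the two init routines and the three-part delivery check. Field names, the slot layout (slot 0 for the kernel), the value of the "magic" vector and the partition count are all assumptions for illustration. The actual definitions live in the POK sources and may differ.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HW_IRQ_LINES   16
#define NUM_PARTITIONS 2                     /* assumed configuration value */
#define NUM_SLOTS      (1 + NUM_PARTITIONS)  /* assumed: slot 0 is the kernel */
#define UNUSED_VECTOR  0xFFu                 /* assumed "magic" start value */

typedef void (*irq_handler_t)(unsigned vector);

struct meta_handler {
    unsigned      vector;               /* vector this entry serves */
    irq_handler_t handler[NUM_SLOTS];   /* one handler per kernel/partition */
    bool          waiting[NUM_SLOTS];   /* true: ready for the next interrupt */
};

static struct meta_handler soft_idt[HW_IRQ_LINES];

/* Software replacement for CLI/STI: one flag per partition. */
static bool partition_irq_enabled[NUM_PARTITIONS];

/* Start values: unused vector, no handler present, but waiting. */
static void meta_handler_init(void)
{
    for (size_t line = 0; line < HW_IRQ_LINES; line++) {
        soft_idt[line].vector = UNUSED_VECTOR;
        for (size_t s = 0; s < NUM_SLOTS; s++) {
            soft_idt[line].handler[s] = NULL;
            soft_idt[line].waiting[s] = true;
        }
    }
}

/* No partition receives interrupts until it asks for them. */
static void partition_irq_init(void)
{
    for (size_t p = 0; p < NUM_PARTITIONS; p++)
        partition_irq_enabled[p] = false;
}

/* Deliver only if interrupts are enabled, a handler is registered and
 * the partition is not still servicing an earlier interrupt. */
static bool should_deliver(unsigned line, unsigned partition)
{
    assert(line < HW_IRQ_LINES && partition < NUM_PARTITIONS);
    unsigned slot = 1 + partition;
    return partition_irq_enabled[partition] &&
           soft_idt[line].handler[slot] != NULL &&
           soft_idt[line].waiting[slot];
}

/* Placeholder handler used only to exercise the check. */
static void noop_handler(unsigned vector) { (void)vector; }
```

After both init routines run, should_deliver() is false for every line and partition until a handler is stored and the partition's flag is set.<br>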

>><br>

>> How can partitions talk to the software-IDT?<br>

>> POK consists of kernel and partitions. Each partition has a libpok part.<br>

>> Libpok is the library that enables the partition to talk to other<br>

>> partitions and the kernel.<br>

>> An RTEMS guest has a POK partition part (libpart) and the RTEMS part.<br>

>> Libpart implements the communication with the POK kernel. So when RTEMS<br>

>> calls some virtualization layer function, the implementation present in<br>

>> libpart emits a syscall to the POK kernel, either passing along the IRQ<br>

>> callback function to register or asking the kernel to unregister a handler or to<br>

>> enable/disable/acknowledge interrupts.<br>

>> Have a look at the virtualization layer functions in RTEMS's virtualpok<br>

>> BSP and examples/rtems-guest/ in POK.<br>

>> The syscall handling then forwards the request to, e.g.,<br>

>> 'pok_bsp_irq_register_hw'.<br>

>><br>

>><br>

>><br>

>> I hope that fits into your definition of 'briefly explain'. But it<br>

>> should give you enough background and explanation to follow the code and<br>

>> understand the design.<br>

>><br></div></div></blockquote><div><br></div><div>Yes, it's more detailed than I expected. Thank you very much. I will work through the code as soon as I can.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5">

>> The really nasty bit happens in the '_C_isr_handler' function in<br>

>> x86-qemu/bsp.c.<br>

>> This is explained in my RTLWS'13 paper.<br>

> Link to paper please.<br>

</div></div><a href="https://wwwpub.zih.tu-dresden.de/~s8940405/rtlws13_rtems_in_pok_partitions.pdf" target="_blank">https://wwwpub.zih.tu-dresden.de/~s8940405/rtlws13_rtems_in_pok_partitions.pdf</a><br>

<div class="">><br>

>> In short: Each IRQ entry builds a stack frame, which saves the register<br>

>> values on the stack when the interrupt occurs, so we can continue<br>

>> execution at the same point.<br>

>> To handle the IRQ in user space and to return to the point of<br>

>> interruption, the user space handler needs this data. So the interrupt<br>

>> frame is copied from the kernel stack to the user stack. Then 'iret'<br>

>> makes the kernel-space to user-space transition. And that's where we get<br>

>> a General Protection Fault.<br>

>><br>
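The frame copy can be pictured with the following C sketch. It models only the five words a 32-bit x86 CPU pushes when an interrupt crosses privilege levels. The frame POK builds also saves further registers, and the struct and function names here are hypothetical.<br>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Words pushed by a 32-bit x86 CPU on an interrupt with a privilege
 * change, lowest address first. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Copy the frame onto a downward-growing user stack and return the new
 * stack top. A real kernel must also verify that the destination lies
 * inside the partition's memory; that check is omitted here. */
static uint8_t *push_frame_to_user_stack(uint8_t *user_sp,
                                         const struct iret_frame *frame)
{
    assert(user_sp != NULL && frame != NULL);
    user_sp -= sizeof *frame;
    memcpy(user_sp, frame, sizeof *frame);
    return user_sp;
}
```

The user-space handler can then read the saved values starting at the returned stack top to resume at the point of interruption.<br>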

> Can we just not use iret from the paravirtualized guest (RTEMS)?<br>

</div>By kernel-space and kernel stack, I mean the POK kernel-space and<br>

stack. Sorry, I should have made that clear.<br>

<div class=""><br></div></blockquote><div>In fact, iret is a sensitive instruction in x86 paravirtualization. We have to replace it with a hypercall (syscall) in RTEMS. </div><div>I will start from the virtualization holes of x86 to decide which hypercalls are necessary.</div>

<div><br></div><div>As for the iret General Protection Fault in POK, I need some time to review the code.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="">

This<br>

> problem reminds me of <a href="https://lkml.org/lkml/2011/12/16/460" target="_blank">https://lkml.org/lkml/2011/12/16/460</a><br>

</div>Interesting, I'll have a look.<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

>> Also have a look at the interrupt_middleman function in<br>

>> rtems-guest/hello.c. This is the user-space code that recovers the stack<br>

>> frame.<br>

>><br>

>><br>

>> Cheers,<br>

>> Philipp<br>

>><br>

>><br>

>><br>

>> p.s.<br>

>> This page has a couple of good tutorials for low level OS programming:<br>

>> <a href="http://www.brokenthorn.com/Resources/OSDev15.html" target="_blank">http://www.brokenthorn.com/Resources/OSDev15.html</a><br>

>> _______________________________________________<br>

>> rtems-devel mailing list<br>

>> <a href="mailto:rtems-devel@rtems.org">rtems-devel@rtems.org</a><br>

>> <a href="http://www.rtems.org/mailman/listinfo/rtems-devel" target="_blank">http://www.rtems.org/mailman/listinfo/rtems-devel</a><br>

<br>

</div></div></blockquote></div><br></div></div>