Bug with clock nanoseconds

Manuel Coutinho manuel.coutinho at edisoft.pt
Mon Mar 30 16:41:09 UTC 2009


Hi

We have discovered a bug in RTEMS relating to the clock. Although we have
only tested it on the ERC32 board (for now), we believe it is common to
all targets.

The test consists of checking whether the time returned by
rtems_clock_get_uptime is always increasing. The code is as follows:

rtems_task Init(rtems_task_argument argument) {
    struct timespec old;
    struct timespec new;

    rtems_clock_get_uptime(&old);

    while (1) {
        rtems_clock_get_uptime(&new);

        if (new.tv_sec < old.tv_sec)
            failedTestExit(TEST_CASE, 10, "seconds decrease");

        if (new.tv_sec == old.tv_sec) {
            if (new.tv_nsec < old.tv_nsec) {
                /* this is the error given (casts because tv_sec is a
                   time_t and tv_nsec a long) */
                printk("            new = %d %d\n",
                       (int) new.tv_sec, (int) new.tv_nsec);
                printk("            old = %d %d\n",
                       (int) old.tv_sec, (int) old.tv_nsec);

                /* commented out just so the test continues */
                /*failedTestExit(TEST_CASE, 10,
                                 "seconds equal but nanoseconds decreased");*/
            }
        }

        old = new;

        /* stop after 10 seconds */
        if (new.tv_sec > 10)
            sucessfullTestExit(TEST_CASE);
    }
}
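
For reference, failedTestExit, sucessfullTestExit, and TEST_CASE come from
our internal test harness. If you want to build the test standalone,
minimal stand-ins (hypothetical, and to be declared before Init) could be:

#include <rtems.h>
#include <rtems/bspIo.h>   /* printk() */
#include <stdlib.h>        /* exit()   */

#define TEST_CASE "clock nanoseconds"

/* hypothetical stand-ins for our internal test harness */
void failedTestExit(const char *test, int id, const char *msg) {
    printk("FAILED %s (%d): %s\n", test, id, msg);
    exit(1);
}

void sucessfullTestExit(const char *test) {
    printk("PASSED %s\n", test);
    exit(0);
}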

This is the output we got using the SIS simulator:

            new = 0 50002000
            old = 0 59990000

            new = 0 90002000
            old = 0 99990000

            new = 0 100003000
            old = 0 109991000

            new = 0 120003000
            old = 0 129991000

            new = 0 160002000
            old = 0 169989000

            new = 0 210002000
            old = 0 219989000

            new = 0 260002000
            old = 0 269990000

            new = 0 300001000
            old = 0 309989000

            new = 0 320003000
            old = 0 329991000

            new = 0 360002000
            old = 0 369990000

            new = 0 370002000
            old = 0 379989000

            new = 0 380001000
            old = 0 389989000

            new = 0 420002000
            old = 0 429990000

            new = 0 470002000
            old = 0 479989000

            new = 0 500004000
            old = 0 509991000

            new = 0 520001000
            old = 0 529989000

            new = 0 550003000
            old = 0 559991000

            new = 0 600002000
            old = 0 609990000

            new = 0 650002000
            old = 0 659989000

            new = 0 670001000
            old = 0 679988000

            new = 0 690002000
            old = 0 699990000

            new = 0 730002000
            old = 0 739989000

            new = 0 770001000
            old = 0 779989000

            new = 0 790004000
            old = 0 799992000

            new = 0 800001000
            old = 0 809989000

            new = 0 840001000
            old = 0 849989000

            new = 0 860004000
            old = 0 869991000

            new = 0 870001000
            old = 0 879989000

            new = 0 890004000
            old = 0 899991000

            new = 0 900001000
            old = 0 909989000

            new = 0 920004000
            old = 0 929991000

            new = 0 930001000
            old = 0 939989000

            new = 0 950004000
            old = 0 959992000

            new = 0 960001000
            old = 0 969989000

            new = 1 40002000
            old = 1 49990000

            new = 1 60002000

As you can see, the problem shows up when a clock tick occurs: the time
read into "new" is older than "old". This should be impossible, since
"new" is read AFTER "old"! It happens when a clock interrupt is triggered
after the number of clock ticks has been read but BEFORE the nanoseconds
field has been read: the resulting number of clock ticks is too small (it
should be one greater). In the first sample above, for instance, "new" is
exactly one 10 ms clock tick behind where it should be (50002000 ns
instead of roughly 60002000 ns).
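
To make the failure mode concrete, here is a small host-side simulation
(hypothetical names, not the actual RTEMS internals) of how the uptime is
composed from the accumulated tick count plus a nanoseconds-since-tick
hardware counter, reproducing the first bad sample above:

#include <stdint.h>
#include <stdio.h>

#define TICK_NS 10000000u   /* 10 ms clock tick */

static uint32_t ticks_accumulated = 5;    /* incremented by the tick ISR */
static uint32_t hw_counter_ns = 9990000;  /* wraps to ~0 at every tick   */

static void clock_tick_isr(void) {        /* what the tick ISR would do  */
    ticks_accumulated++;
    hw_counter_ns = 0;
}

int main(void) {
    /* The reader samples the tick count first ... */
    uint32_t ticks = ticks_accumulated;                /* reads 5 */

    /* ... the tick fires in hardware: the counter wraps, but with
       interrupts disabled the ISR cannot run, so ticks_accumulated
       stays at 5 when it should already be 6 ... */
    hw_counter_ns = 2000;

    /* ... and only then samples the nanoseconds since the last tick. */
    uint32_t ns = hw_counter_ns;                       /* reads 2000 */

    printf("composed uptime: %u ns (one tick too small)\n",
           (unsigned) (ticks * TICK_NS + ns));         /* 50002000 */

    clock_tick_isr();  /* the ISR finally runs once interrupts re-enable */
    printf("corrected:       %u ns\n",
           (unsigned) (ticks_accumulated * TICK_NS + hw_counter_ns));
    return 0;                                          /* 60000000 */
}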

We can use the CPU's interrupt pending register to determine whether a
clock interrupt became pending while we were reading the number of
elapsed clock ticks and the nanoseconds.
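
On the ERC32, for example, such a check could look like the sketch below.
It assumes the ERC32_Is_interrupt_pending() macro and the
ERC32_INTERRUPT_REAL_TIME_CLOCK interrupt source from the BSP's erc32.h
(please check the exact names against your BSP):

#include <rtems.h>
#include <bsp.h>   /* erc32.h: MEC registers and interrupt sources */

/* ERC32-only sketch: reads the MEC interrupt pending register to see
   whether the real-time clock interrupt is waiting to be serviced */
static inline uint32_t isClockInterruptPending(void) {
    return ERC32_Is_interrupt_pending(ERC32_INTERRUPT_REAL_TIME_CLOCK)
           ? 1 : 0;
}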

We do not have enough knowledge to say whether this solution (determining
whether an interrupt is pending) can be applied to all boards. On the
SPARC architecture, at least, it is possible. If it is possible on all
architectures, then a solution for RTEMS would be to change the code of
the _TOD_Get_uptime function to:

void _TOD_Get_uptime(struct timespec *uptime) {
    ISR_Level level;
    struct timespec offset;
    volatile uint32_t pending;

    /* assume uptime checked by caller */

    offset.tv_sec = 0;
    offset.tv_nsec = 0;

    _ISR_Disable(level);
    *uptime = _TOD_Uptime;
    if (_Watchdog_Nanoseconds_since_tick_handler)
        offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();

    /* code added: */
    pending = isClockInterruptPending();

    _ISR_Enable(level);

    /* code added: if a clock interrupt became pending while interrupts
       were disabled AND the nanoseconds offset is small (less than half
       a tick, converted to nanoseconds), the clock interrupt fired
       BEFORE the nanoseconds were read, so the tick count in *uptime is
       one tick behind: add one clock tick to the offset to compensate */
    if (pending && offset.tv_nsec <
            (rtems_configuration_get_microseconds_per_tick() * 1000) / 2) {
        struct timespec clockTick = {
            0, rtems_configuration_get_microseconds_per_tick() * 1000
        };
        _Timespec_Add_to(&offset, &clockTick);
    }
    /* else the clock tick occurred AFTER the nanoseconds were read, so
       the tick count and the nanoseconds are consistent: no problem */

    _Timespec_Add_to(uptime, &offset);
}
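
With a 10 ms clock tick the threshold works out to 5000000 ns: when the
clock interrupt fired before the nanoseconds were read, the offset has
wrapped and is close to zero (2000-4000 ns in our output), well below the
threshold; when it fired after the read, the offset is close to the full
10000000 ns, well above it. The half-tick margin therefore separates the
two cases cleanly, provided the code between _ISR_Disable and _ISR_Enable
runs in much less than half a tick.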

At least, with these modifications, the test passes :)!

Kind regards

Manuel Coutinho
