RFC: Bdbuf transfer error handling
Sebastian Huber
sebastian.huber at embedded-brains.de
Sat Nov 21 10:09:28 UTC 2009
Chris Johns wrote:
> Sebastian Huber wrote:
>> Write transfer errors happen in the context of the swapout task. We have
>> four variants.
>>
>> W1. Disk Delete with User
>>
>> We set the buffer state to CACHED (this prevents transmission retries) and
>> the error field to an error status. This is a TODO currently.
>>
>> W2. Disk Delete without User
>>
>> We discard the buffer. The modified data is lost. This is a TODO currently.
>>
>> W3. Write Error with User
>>
>> a) We set the buffer state to MODIFIED and the error field to an error
>> status. This is the current approach. This may lead to an infinite loop
>> of transmission retries. The buffer cannot be recycled until it has been
>> successfully transmitted.
>>
>> b) We set the buffer state to CACHED (this prevents transmission retries)
>> and the error field to an error status. A read request will return the
>> cached buffer unless someone recycled the buffer in the meantime. So the
>> behaviour is not deterministic and depends on the future cache usage. The
>> user cannot determine if the error field indicates a read or write error.
>>
>> c) We set the buffer state to PURGED. A read request will lead to a read
>> transfer. This will return the block data on the media or an error
>> status. It is independent of the future cache usage. The error field
>> is used only for read transfer errors. The modified data is lost.
>>
>> W4. Write Error without User
>>
>> a) We set the buffer state to MODIFIED and the error field to an error
>> status. This is the current approach. This may lead to an infinite loop
>> of transmission retries. The buffer cannot be recycled until it has been
>> successfully transmitted.
>>
>> b) We discard the buffer. The modified data is lost.
>>
>> *******************************************************************************
>>
>>
>> I propose that we change the error handling in W3 to strategy c) and in
>> W4 to strategy b). This prevents possible infinite transmission loops.
>> The error field in the buffer is then used only for read transfer errors.
>> After a write error we have deterministic behaviour.
>
> I agree the loop needs to be broken. I am confused about the term
> 'user' and have a few questions.
>
> What do you mean by a 'user'?

A user is someone who waits for this buffer (that is, bd->waiters > 0).

> Who would be looking at these errors in the cache?

Since the write transfer is decoupled from the buffer release via the
swapout task we have no error indication at release time. Maybe we can
introduce a statistic which registers error conditions.
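
Such a statistic could look roughly like the following. This is only a sketch of the idea, not existing bdbuf code: the type sketch_bdbuf_stats, its fields, and the function sketch_record_write_error() are hypothetical names. In the real cache the update would presumably have to happen under the cache lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error statistic; none of these names exist in bdbuf. */
typedef struct {
  uint32_t write_errors; /* number of failed write transfers */
  uint32_t lost_blocks;  /* number of blocks whose modified data was lost */
} sketch_bdbuf_stats;

/*
 * Would be called by the swapout task after a failed write transfer,
 * since the releasing task gets no error indication at release time.
 */
static void sketch_record_write_error(
  sketch_bdbuf_stats *stats,
  uint32_t            block_count
)
{
  assert(stats != NULL);
  ++stats->write_errors;
  stats->lost_blocks += block_count;
}
```

An application could then poll these counters to notice that modified data did not reach the media.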

> What is the difference between PURGED and discarded?

Discard means that the buffer will be removed from the AVL tree and
prepended to the LRU chain. This is only possible when nobody waits for it.

The PURGED state indicates that this buffer is in the AVL tree. It is
not in the LRU chain, since it has a user. This user may be a sync
waiter or a read/get waiter. A read request will trigger a read transfer.
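
To illustrate the difference, here is a minimal C sketch of the proposed write error handling (W3 c and W4 b). It is a simplified model and not the bdbuf implementation: sketch_buffer, the sketch_state values (including the FREE state), the in_avl and in_lru flags, and all function names are hypothetical. Only the waiters count corresponds to bd->waiters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified buffer states for this sketch only. */
typedef enum {
  SKETCH_STATE_FREE,     /* discarded: not in the AVL tree, in the LRU chain */
  SKETCH_STATE_CACHED,
  SKETCH_STATE_MODIFIED,
  SKETCH_STATE_TRANSFER,
  SKETCH_STATE_PURGED    /* in the AVL tree, not in the LRU chain */
} sketch_state;

typedef struct {
  sketch_state state;
  unsigned     waiters;  /* models bd->waiters */
  bool         in_avl;   /* buffer is reachable via the AVL tree */
  bool         in_lru;   /* buffer is available for recycling */
} sketch_buffer;

/* Discard: only possible when nobody waits for the buffer. */
static void sketch_discard(sketch_buffer *bd)
{
  assert(bd->waiters == 0);
  bd->in_avl = false;
  bd->in_lru = true;
  bd->state = SKETCH_STATE_FREE;
}

/* Write error in the swapout task: W3 c) with a user, W4 b) without. */
static void sketch_write_error(sketch_buffer *bd)
{
  if (bd->waiters > 0) {
    /* A user waits: keep the buffer in the AVL tree, but mark it PURGED. */
    bd->state = SKETCH_STATE_PURGED;
    bd->in_avl = true;
    bd->in_lru = false;
  } else {
    sketch_discard(bd);
  }
}

/* A read request needs a read transfer if the block is missing or PURGED. */
static bool sketch_read_needs_transfer(const sketch_buffer *bd)
{
  return !bd->in_avl || bd->state == SKETCH_STATE_PURGED;
}
```

In both cases the modified data is lost and there is no retry, so the next read request goes to the media regardless of the future cache usage.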
>
> Chris
>
--
Sebastian Huber, Embedded Brains GmbH
Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail : sebastian.huber at embedded-brains.de
PGP : Public key available on request
This message is not a business communication within the meaning of the EHUG.