[GSOC] Bdbuf improvements

Thu Mar 29 10:59:58 UTC 2012

Hi

I'm interested in Bdbuf improvements as my GSOC project this year. The
main goal of the project is to change existing replacement strategy
(LRU) to more advanced cache algorithms(for example LIRS). The IO
performance will benefit from this change. It will also bring some API
change in Disk and Driver module. Here are some details and comparison
about the cache algorithms:

LIRS[1] means Low Inter-reference Recency Set, which is a cache
algorithm introduced by Song Jiang and Xiaodong Zhang in
SIGMETRICS'02. The reason for LRU to behave poorly is that LRU makes a
bold assumption a block that has not been accessed the longest would
wait for relatively longest time to be accessed again. This assumption
cannot capture the access patterns exhibited in these workloads with
weak locality.The algorithm takes into consideration the
Inter-Reference Recency of pages as the dominant factor for eviction.
The Inter-Reference Recency (IRR) of a page refers to the number of
other pages accessed between two consecutive references to that page.
It is assumed that if current IRR of a page is large, then the next
IRR of the block is likely to be large again and hence the page is
suitable for eviction as per Belady's MIN. It needs to be noted that
the page with high IRR selected for eviction may also have been
recently used[2].

Song Jiang concludes the LIRS in his presentation as :
Effectively use deeper access history without explicit regularity
detection and high cost operations.
Outperform exiting replacement policies.
Its implementation as simple as LRU.
Applicable to virtual memory and database buffer management.

LIRS is used by MySQL.

The Adaptive Replacement Cache (ARC) is an adaptive page replacement
algorithm developed at the IBM Almaden Research Center [3]. It seems
that the performance of ARC is better than LIRS, but the algorithm is
patented (by IBM).

There also may other advanced cache algorithms like CLOCK-Pro [4],CAR
(CLOCK with Adaptive Replacement )algorithms, 2Q algorithm and etc. In
CLOCK-PRO paper, it is indicated that CLOCK-PRO is superior than LIRS
and other buffer algorithms (including ARC)[5].

I am not certain whether the above algorithms are all appropriate for
Block Buffer. I also take a look at Linux file cache. It seems that
Linux does not use a such complex strategy[6]. I believe the advanced
cache algorithms will gain the performance in IO heavy application
with adding some complexity and overload as a trade off. Since the
LIRS is widely used in database system, I think it's appropriate for
the Block Device Buffer Management in RTEMS. To apply these
algorithms, the existing API also need some change.

As the direction of the wiki[7], here is my draft plan:
The very first thing is to create a set of benchmarks and the test
cases for Block Device Buffer Management.
Then I will implement the algorithms in C language as a demo. To apply
the algorithms to RTEMS, some API will be changed. So I will carefully
discuss the change in the community.
The next thing should be running the benchmarks and test cases. Memory
footprint and other overload will be compared between two algorithms
if possible.
If everything goes well the new cache algorithm will be added as an
experimental option. And more optimization will be added.

Last year I helped to add the file system POSIX compatibility test
suite to RTEMS. I have more free time than last summer, so I think
this project is OK for me. I just start to read the Block Device
Buffer Management source code. Any advice are welcomed. After
gathering the feedback, I will complete my proposal as soon as
possible.

[1]LIRS
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-02-6.pdf
[2]A Comparison of Page Replacement Algorithms	
http://www.ijetch.org/papers/218-T697.pdf
[3]ARC: A Self-Tuning, Low overhead Replacement Cache
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.5210
[4] Clock-Pro
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/abs05-3.html
[5]A discuss about buffer cache in  postgresql DB maill list
http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg173951.html
[6]Linux cache swap
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob_plain;f=mm/swap.c;hb=refs/heads/master
[7]Bdbuf improvements
http://www.rtems.org/wiki/index.php/Bdbuf_improvements