Memtest: Finding holes in the VM system
***************************************

WIP

Juan Quintela
Department of Computer Science
Universidade da Coruña
quintela@dc.fi.udc.es

Abstract: This paper describes the development of a test suite for the VM
subsystem, together with several of the resulting programs in detail. A
proposal for dealing with the bottlenecks found is also made. The suite is
called memtest. It is composed of several programs that generate different
kinds of I/O and memory load: writing big files (mmap*), using a lot of
shared memory (ipc*), and programs that do a lot of memory allocations and
frees. The suite is not used for benchmarking; it is used to find
bottlenecks. In the presentation we discuss the goals of the individual
tests, the parts of the kernel they affect, and good ways to handle the
problems they show. The tests that form the suite have been contributed by
several people. The suite is intended to be the place to put a test whenever
somebody finds a bottleneck and writes a program that shows it; that makes it
easy to check that future kernel versions do not have the same problem again.

1  Introduction
*=*=*=*=*=*=*=*=

This paper describes the development of a test suite for the VM subsystem.
The test suite gives people a single place from which to get programs that
test the system for errors, and a place to put code that exposed bugs in
previous implementations, so we can verify that we do not run into the same
problems again. Section 2 describes the birth of the memtest suite. The
following sections describe several of the tests: what they do, which
problems they found, and what solutions were used to fix those problems.

2  Previous life
*=*=*=*=*=*=*=*=*

In the beginning, the author was a happy Ph.D. student working on his
thesis. The programs related to his thesis stressed the VM layer heavily and
made his Linux machine die hard (die in the sense of an Oops). He used the
standard procedure in such cases: he wrote a nice bug report to the
linux-kernel mailing list and waited for the nice kernel hackers to fix his
problem. That was in the 2.2 kernel era. But nothing happened.

He continued working on his thesis, thinking that the problems would be
solved in 2.4 with the new memory management/page cache layer. Each time
Linus released a new kernel version he tested it; it normally solved some
problems, and others appeared. At the end (end???) of the 2.3 era (i.e.
2.3.99-preX), he found that his problems had still not been solved. He then
thought it would be a good idea to try to help the kernel hackers fix them.

At the same time, it happened that Rik van Riel came to his university to
give a couple of talks. He was the right person to show the problems to. The
author showed him the problems, and Rik asked for a small program that
reproduced the hangs. Memtest was born. Once the first programs were
written, other people asked for more programs to be included in the suite.

3  The tests
*=*=*=*=*=*=*

One important thing about almost all the tests in memtest is that they check
that the VM layer behaves well; they are not benchmarks in the speed/space
sense. It is good if these programs run fast, but the important thing is
that they should not run too badly. In the future, I will try to add some
speed tests as well, or at least give pointers to other benchmarks and
explain how to use them to search for specific bottlenecks.
4  mmap001
*=*=*=*=*=*

The mmap tests are examples based on the work I was doing for my Ph.D.
mmap001 is a test that creates a file the size of the machine's physical
memory and then writes it sequentially through an mmap() mapping (see the
sketch after section 6). In the 2.3.99-preX kernel series, a machine with
128 MB of RAM would stall for as long as 30 seconds, because the kernel
waited to start writing anything to disk until there was no free space
left. At that point it began writing the dirty pages asynchronously, but by
then every page was dirty: it was asynchronously writing out the whole
memory of the machine, and that took a long time to finish.

This test is one of the clearest examples of why memtest is not a
benchmark. The assumption behind it is that, from the system's point of
view, mmap()ing a big file and writing to it sequentially should behave much
like doing normal write()s to a file of the same size. The test does not
need to run very fast, but a normal user does not expect the whole system to
stall for minutes. Once the biggest stalls had been solved, there was still
the problem that the load average reached around 15 while running this test,
which is also not expected on a normal system.

5  mmap002
*=*=*=*=*=*

This test is a continuation of the previous one. It mmap()s a file twice the
size of memory, then writes the first half of the file sequentially. Then it
opens an anonymous mapping and copies all the data from the file into the
anonymous mapping. After that, it copies the data back into the shared file
mapping, this time into the other half of the file.

This test showed that the kernel of that time began to swap when it was
already too late. It postponed swapping, writing to disk, and so on until
there were no clean pages left; at that point the whole system behaved
really badly (read: thrashing). The problem here is that the system tried
too hard to cache pages (and especially dirty pages). Another problem was
that every process had trouble making allocations when there was only one
memory hog, which was supposedly easy to detect. In the end it turned out
that detecting the memory hog was not so easy even when there was only one,
and the problem became really nasty when there were several memory hogs.

This is another test where all the kernel's heuristics make it perform
worse: we never reuse a single page of the file, and we walk sequentially
through files bigger than physical memory. That means a system that did no
caching at all would run this test faster. But for normal use we prefer a
system that does caching, so the real problem is to detect when a file is
being accessed only sequentially and to cache less aggressively in that
case. A user cannot expect full-speed writes from mmap002, but neither does
he expect the system to start thrashing when only one process is doing
linear copies of files.

6  ipc001
*=*=*=*=*=

This test was created by Christoph Rohland to test System V shared memory.
It creates several shared memory segments and tries to exercise all the
operations on them (writes, attach, detach, ...).
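To make the discussion of sections 4 and 5 more concrete, here is a minimal
sketch of the mmap001 pattern: create a file the size of physical RAM, map
it shared, and dirty it sequentially. This is not the actual memtest source;
the file name and fill value are invented for illustration, and the sketch
assumes the file size fits in off_t on the target machine.

  /*
   * Sketch of the mmap001 pattern: create a file the size of
   * physical memory, map it, and write it sequentially.  Not the
   * real test, just the shape of the load it generates.
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
      /* Size of physical memory: number of pages times page size. */
      size_t size = (size_t)sysconf(_SC_PHYS_PAGES) *
                    (size_t)sysconf(_SC_PAGESIZE);
      int fd = open("mmap001.dat", O_RDWR | O_CREAT | O_TRUNC, 0600);

      if (fd < 0 || ftruncate(fd, size) < 0) {
          perror("mmap001");
          return 1;
      }

      char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
      if (p == MAP_FAILED) {
          perror("mmap");
          return 1;
      }

      /* Sequential write: every page of the mapping becomes dirty,
       * which is exactly the situation that caused the stalls. */
      memset(p, 0xaa, size);

      munmap(p, size);
      close(fd);
      unlink("mmap001.dat");
      return 0;
  }

mmap002 follows the same skeleton, with a file twice this size plus an
anonymous MAP_PRIVATE | MAP_ANONYMOUS mapping in between.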
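In the same spirit, the following sketch shows the shape of the System V
shared memory operations that ipc001 (section 6) exercises: create segments,
attach, write, detach, remove. The segment size and count here are
arbitrary; the real test covers more operations and error cases than this.

  /*
   * Sketch of a System V shared memory exercise in the style of
   * ipc001: several segments, each attached, dirtied and detached,
   * then removed.  Illustration only, not the real test.
   */
  #include <stdio.h>
  #include <string.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>

  #define SEG_SIZE (4 * 1024 * 1024)  /* 4 MB per segment, arbitrary */
  #define NSEGS    8                  /* "several" segments, arbitrary */

  int main(void)
  {
      int i, id[NSEGS];

      for (i = 0; i < NSEGS; i++) {
          /* Create a private segment. */
          id[i] = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
          if (id[i] < 0) {
              perror("shmget");
              return 1;
          }

          /* Attach it, write every page, detach it. */
          char *p = shmat(id[i], NULL, 0);
          if (p == (void *)-1) {
              perror("shmat");
              return 1;
          }
          memset(p, 0x55, SEG_SIZE);
          shmdt(p);
      }

      /* Remove the segments so they do not outlive the test. */
      for (i = 0; i < NSEGS; i++)
          shmctl(id[i], IPC_RMID, NULL);
      return 0;
  }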
7  Future Work
*=*=*=*=*=*=*=*

There are several things that I plan to do to make the memtest suite more
useful:

- Put more documentation into the suite, to make it easy for other people to
  use it and, more importantly, to understand the results, whether they are
  right or wrong.

- Get more tests into the suite. Folks, I am accepting submissions: send me
  the tests/scripts/knowledge that you use for testing the system.

- Write more documentation.

- Make the tests more modular. The idea is to have an easy way to simulate
  real loads. I want to be able to define, in an easy way, tests like
  misc001, where one process does malloc()/free() over a third of the
  memory, uses mmap() for another third, and uses fwrite() on another
  file. That would be easy if the tests were very simple and took their
  parameters from the command line.

- Have I already mentioned documentation? It is also what would let other
  people do the previous kind of thing.

- Include benchmarks (or pointers to them) and document how to use them for
  measuring specific things. Right now everybody uses benchmarks for this,
  but each person uses their own small subset, and there is no way to know
  which benchmark is used for measuring what, nor what the correct way to
  configure it is. One example: people normally use dbench 48 to measure the
  performance of the file system and the page cache. An explanation of what
  that parameter means, what results you should expect, and the causes of
  previous pitfalls would be very useful.

- Consider the possible integration of memtest with the Linux Test Project
  (LTP, http://oss.sgi.com/projects/ltp/). Theirs is a bigger and more
  ambitious project, but it is also a bit more difficult to write a test for
  it. I am thinking about integrating both efforts (i.e., letting them do
  all the difficult work), or at least making it easy to share code.

- Create a way to run the tests non-interactively. By non-interactively I
  mean that it should be possible to detect whether the system stays
  responsive under the high load created by the tests. Right now you have to
  guess whether the system is more or less responsive after a change, and
  there is no way to run all the tests overnight and know the next morning
  whether some of them hurt interactive response. Playing music and
  listening for the skips does not work when you are out of the office.
  There is a program that does something similar for the low-latency tests;
  I could use it as a start.

8  Acknowledgements
*=*=*=*=*=*=*=*=*=*=

I want to thank several people and institutions for the help they have given
me with memtest:

- Rik van Riel: he explained a lot of things to me and helped me a lot when
  I began working on the Linux kernel.

- `#kernelnewbies' at irc.openprojects.net: there are a lot of cool people
  connected there who helped me with discussions and explanations of all the
  questions I had about the Linux kernel.

- All the people who contributed code and ideas to the suite.

- Conectiva and the Universidade da Coruña for funding an SMP test machine
  that helped me find holes, test bugs, and develop the suite (as in real
  life, almost all bugs show up much faster when you are working with SMP).