12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576 |
- .\" $Id$
- .TH LAT_CTX 8 "$Date$" "(c)1994 Larry McVoy" "LMBENCH"
- .SH NAME
- lat_ctx \- context switching benchmark
- .SH SYNOPSIS
- .B lat_ctx
- .I [-s size_in_kbytes]
- .I #procs [#procs ...]
- .SH DESCRIPTION
- .B lat_ctx
- measures context switching time for any reasonable
- number of processes of any reasonable size.
- The processes are connected in a ring of Unix pipes. Each process
- reads a token from its pipe, possibly does some work, and then writes
- the token to the next process.
- .PP
- Processes may vary in number. Smaller numbers of processes result in
- faster context switches. More than 20 processes is not supported.
- .PP
- Processes may vary in size. A size of zero is the baseline process that
- does nothing except pass the token on to the next process. A process size
- of greater than zero means that the process does some work before passing
- on the token. The work is simulated as the summing up of an array of the
- specified size. The summing is an unrolled loop of about a 2.7 thousand
- instructions.
- .PP
- The effect is that both the data and the instruction cache
- get polluted by some amount before the token is passed on. The data
- cache gets polluted by approximately the process ``size''. The instruction
- cache gets polluted by a constant amount, approximately 2.7
- thousand instructions.
- .PP
- The pollution of the caches results in larger context switching times for
- the larger processes. This may be confusing because the benchmark takes
- pains to measure only the context switch time, not including the overhead
- of doing the work. The subtle point is that the overhead is measured using
- hot caches. As the number and size of the processes increases, the caches
- are more and more polluted until the set of processes do not fit. The
- context switch times go up because a context switch is defined as the switch
- time
- plus the time it takes to restore all of the process state, including
- cache state. This means that the switch includes the time for the cache
- misses on larger processes.
- .SH OUTPUT
- Output format is intended as input to \fBxgraph\fP or some similar program.
- The format is multi line, the first line is a title that specifies the
- size and non-context switching overhead of the test. Each subsequent
- line is a pair of numbers that indicates the number of processes and
- the cost of a context switch. The overhead and the context switch times are
- in micro second units. The numbers below are for a SPARCstation 2.
- .sp
- .ft CB
- .nf
- "size=0 ovr=179
- 2 71
- 4 104
- 8 134
- 16 333
- 20 438
- .br
- .fi
- .ft
- .SH BUGS
- The numbers produced by this benchmark are somewhat inaccurate; they vary
- by about 10 to 15% from run to run. A series of runs may be done and the
- lowest numbers reported. The lower the number the more accurate the results.
- .PP
- The reasons for the inaccuracies are possibly interaction between the
- VM system and the processor caches. It is possible that sometimes the
- benchmark processes are laid out in memory such that there are fewer
- TLB/cache conflicts than other times. This is pure speculation on my part.
- .SH ACKNOWLEDGEMENT
- Funding for the development of
- this tool was provided by Sun Microsystems Computer Corporation.
- .SH "SEE ALSO"
- lmbench(8).
|