Andi's occasional blog.
Recent presentations and papers
- Mental Models for modern program tuning at ACM Applicative 2016. How to think of modern program performance, in particular caches, and how to calibrate performance using hardware monitoring techniques.
- Introduction to last branch records and Advanced uses of last branch records at Linux Weekly News. Theory and practice of last branch record (LBR) profiling with Linux perf and other uses.
Intel Processor Trace on Linux, Aug 2015, at Linux Trace Summit 2015. Overview of Processor Trace on Linux.
A section on Linux perf in Energy Efficient Servers. Blueprints for Data Center Optimization, Apress, 2015, a book describing modern power management.
Adding Processor Trace to Linux, LWN, Jul 2015, describes adding the Processor Trace mechanism to Linux perf, which allows tracing program control flow in the CPU.
TSX anti-patterns in lock elision code,
Blog post. Common mistakes in TSX lock elision implementation.
Scaling Existing Lock-based Applications with Lock Elision,
CACM/ACM Queue, Feb 2014. Introductory article on lock elision with Intel TSX.
Improving Linux development with better tools, Oct 2013, at CLK 2013, Shanghai.
Existing tools and challenges for improving Linux kernel development with better tools, including
static checkers, dynamic checkers, tracers.
TSX Linux update, Sep 2013, at Linux Plumbers. Current state of lock elision on Linux.
Modern Locking, Apr 2013, at Linux FS/MM/Storage summit. How to get good locking performance.
This is an extract from a recent Intel lock performance paper I contributed to.
gcc link time optimization and the Linux kernel, Apr 2013, at Linux collab summit 2013. Compiler bottlenecks with the gcc 4.7+ LTO implementation.
Lock elision in the GNU C library, Feb 2013, article in LWN. Introduces lock elision and describes the glibc lock elision implementation.
Adding Lock elision to Linux at Linux Plumbers 2012. How to add lock elision to glibc and the Linux kernel. The glibc elision implementation is also available.
- Linux kernel scaling at Linux Plumbers 2011 (with Tim Chen)
Short overview of current scalability problems in the Linux kernel,
and some improvements that have been made.
- Problems in fork and locking primitives (with Tim Chen)
Scalability problems in fork and some early data on better spinlocks
for the Linux kernel.
- Modern CPU Performance Analysis on Linux at
LinuxCon Japan 2011. Overview of simple and advanced profiling features on Intel CPUs
using free Linux tools. Introduces simple-pmu.
- TSX enabled glibc. Use lock elision with unchanged Linux binaries. Experimental elision for Linux kernel locks is in linux-misc (branch names hle*/combined).
- tsx-tools to use TSX with older compilers.
- PCMPSTR calculator. Interactive exploration of the SSE4.2 string instructions.
- Linux kernel Link Time Optimization at github (branch lto*)
- Advanced profiling with pmu-tools on Intel systems. Includes tools to do cycle decomposition, use all Intel events and profile the uncore counters. See the pmu-tools introduction and a description of the toplevel tool.
- ftracer to trace function calls for user space programs
- Random hacks at github
Old inactive projects
The 2.6.35 long term linux kernel tree
is a simple, reliable way to measure cycles in a program on an Intel x86 CPU.
mcelog: the machine check handling daemon
User space backend to process machine check events (hardware errors)
on x86-64 Linux. Should run on every x86-64 Linux box, and now
on every 32bit x86 Linux machine too.
More information on mcelog.org.
Test suite for Linux machine checks, see git trees at
The Linux kernel machine check update tree is in various branches
This implements support for MCA recovery on x86 CPUs among other things.
- mcelog - memory error handling in user space
at Linux Kongress 2010.
Introduction to memory errors on modern systems and a description how
the mcelog daemon handles and avoids them. Includes an overview of
- Linux multi-core scalability at Linux Kongress 2009
(paper, slides). Introduction to multi-core scalability tuning, with a focus on the Linux kernel and some application workarounds.
- On submitting kernel patches
for Ottawa Linux Symposium 2008 (
slides). Describes strategies for successfully
submitting new technology to the Linux kernel. While this was written for the Linux
kernel, many of the approaches described in the paper should apply to other large free
software projects too.
- The Linux Kernel hacker generations for
Linux.conf.au 2007. Attempts to characterize different groups of Linux kernel developers.
- Where is the memory going? Memory waste under Linux for Linux Kongress 2006 (
paper, slides). Overview of where memory is consumed in the Linux kernel.
- Machine check handling on Linux (
paper, slides) for Linux Kongress 2004. The slides from the conference cover some areas in more detail. Describes the design of a rewrite of the Linux x86-64 machine
check code. Machine checks are used by the CPU to report data corruption.
- Another presentation on Linux x86 machine checks aimed at their analysis in the field.
- Porting Linux to x86-64
unsearchable version of the paper, but including graphics,
slides) from Ottawa Linux Symposium 2001.
This was the original paper describing the Linux x86-64 kernel port back when x86-64 was only available on simulators. These days the architecture
is also known as x86_64, AMD64, Intel 64, EM64T, IA32e, x64 (in general
it was not lucky with its name choices)
The x86-64 port has progressed significantly since then and some details in the paper are outdated by now.
Still, the paper gives a reasonable introduction to the kernel port and the rationale for many of the design decisions.
I apologize for the poor font quality. The original TeX source is lost and there seems to be something wrong
with the fonts in the remaining pdfs.
- Overview of the x86-64 kernel for Linux Bangalore 2004 (
slides). Introduction to the x86-64 Linux kernel. Note some APIs
described in there like the 32bit compat emulation are obsolete by now.
- A NUMA API for Linux. Original,
This paper describes the Linux interfaces that can be used to optimize Linux applications for NUMA (Non Uniform
Memory Architecture) systems.
Note the paper describes the state as of 2004. There has been further Linux NUMA development since then, but all
the basic interfaces and concepts described in the paper are still valid.
- Porting to Hammer: Some pitfalls to avoid
when porting 32bit Linux software to x86-64. Very old -- dates back to the early
days of the Linux/x86-64 port -- but should still be useful. All issues
described in there are still valid.
- Scalability of modern Linux kernels
at LinuxCon Japan 2010. Provides an introduction to scalability concepts
on modern multi-core systems. Followed by an overview of scalability on modern
Linux kernels, with some tips on how applications can help the kernel.
- Unified error handling -- A worthy goal?
at Linux Plumbers 2009. Towards an improved
error reporting infrastructure for Linux.
- Ongoing evolution of Linux x86 machine check handling at LinuxCon 2009.
Introduction to platform hardware errors on modern x86 machines
(including detailed flows) and recent improvements to the Linux x86 machine
check handling, with a focus on memory errors. Includes an overview of
MCA recovery and a description of the Linux 2.6.32+ application memory
error handling interface.
- Experiences of a x86 maintainer, Tokyo Linux Users Group, Feb 2009. Some thoughts
on the job of a Linux subsystem maintainer with a bit of history of
the x86-64 project thrown in. It covers how a project like this changes
from initial obscurity to mass market.
- Predictive bitmaps, Linux Session in Wroclaw, 2008. An experiment in
improving the performance of Linux demand paging. Contains an introduction to Linux demand paging.
- How to do nothing efficiently or better laziness: No idle tick on x86-64 (slides), 2006.
Describes an outdated experimental implementation of noidletick for x86-64. The current mainline kernels support this too (CONFIG_NO_HZ),
but the implementation is significantly different.
- How to find bug fixes in the mainline Linux kernel for backporting
- Hardware timer requirements in x86 Linux.
- autoboot. autoboot allows automatic compiling and testing of Linux kernels on a test machine pool with automatic recovery from any hangs. The scripts are available on
- Linux networking topics slides
from Ottawa Linux Symposium 2000 (HTML version). Covers advanced, not widely known
Linux network stack user interfaces including
error reporting for unconnected sockets,
path MTU discovery for UDP/RAW datagram sockets, the abstract name space for Unix sockets, queued SIGIO, netlink, fd/credentials passing over Unix sockets and others.
Aimed at advanced Linux user space network application programmers. More details,
originally from this presentation,
are also available in the Linux networking man pages, originally
in the netman project,
but now shipped as standard Linux
man pages with Linux
distributions. For the topics in the presentation please see the
manpages. In-kernel netlink and queued SIGIO are currently not covered well by the man pages.
Really old projects
- linux32: Change architecture of processes on x86-64 to ix86.
- Various patchkits for different kernel projects are available at
halobates.de/pub/ak. Usually in quilt format. Includes the mask allocator and pbitmaps (see also the presentation). Also has various other older kernel projects now being worked on by other people (e.g. PAT, gbpages).
Access Linux kernel logs using IEEE1394/firewire. Originally from
Ben Herrenschmidt. See also
Bernd Kaindl's fireproxy for GDB access over
firewire and some other related patches.
- Perl hacks
- Shell hacks
- elisp hacks
- C hacks
Automatic booting and testing of Linux kernels. See the presentation
- More projects available at the kernel.org
git server in /pub/scm/linux/kernel/git/ak.
a tree with the
SCSI ISA rework part of the mask allocator, the
machine check tree for various machine check patches and a
misc tree for various stuff.
- numactl: User space tools for Linux NUMA policy control. Includes
the libnuma programmer API, the numactl process control
tool, numastat and other tools. See the
paper for an overview.
In early 2008 I passed on maintenance
of the numactl package to Cliff Wickman at SGI.
The new numactl homepage is
located at SGI
and a new mailing list at email@example.com. Please send any
queries regarding numactl to the mailing list.
- Various kernel projects in quilt format:
merged into kernel.org mainline),
dma mask allocator,
- x86-64 Linux tree
My old x86-64 Linux kernel tree in quilt or patch format. Mostly x86 patches, but some other
(broken out patches against
This is outdated by now and not kept up to date. The i386 and
x86-64 architectures are now maintained as a new x86
separate x86-64 architecture before the "merged" 32bit/64bit x86 architecture is also still available. I put merged in quotation marks because most code is not merged
yet (as of 2.6.27), but just moved.
- Perl hacks
- Shell hacks
- elisp hacks
- C hacks (with some newer programs)
- Updated network manpages for Linux 2.1. Now part of standard Linux
- A collection of scripts to generate colorful
xterms. This is useful if you're bored of always having the same
color in your xterm. Including
blackterm (shades of grey with dark background),
whiteterm (shades of grey with light background) and
with random colors (put
into your $HOME/lib).
These all require the excellent sgi-screen fonts installed (on openSUSE
in the sgi-fonts package).
What is halobates?
Halobates is a family of water striders. Five species of them live on the surface of the
high ocean. They are the only insects to inhabit the high ocean. They never dive.
How they survive storms I have no idea. They are one of the few species that benefit from the Great Pacific Garbage Patch.
I find that impressive, so I named a domain after them. More information at Wikipedia.