ramfs, rootfs and initramfs
October 17, 2005
Rob Landley <[email protected]>
=============================

What is ramfs?
--------------

Ramfs is a very simple filesystem that exports Linux's disk caching
mechanisms (the page cache and dentry cache) as a dynamically resizable
RAM-based filesystem.

Normally all files are cached in memory by Linux.  Pages of data read from
backing store (usually the block device the filesystem is mounted on) are kept
around in case they're needed again, but marked as clean (freeable) in case the
Virtual Memory system needs the memory for something else.  Similarly, data
written to files is marked clean as soon as it has been written to backing
store, but kept around for caching purposes until the VM reallocates the
memory.  A similar mechanism (the dentry cache) greatly speeds up access to
directories.

With ramfs, there is no backing store.  Files written into ramfs allocate
dentries and page cache as usual, but there's nowhere to write them to.
This means the pages are never marked clean, so they can't be freed by the
VM when it's looking to recycle memory.

The amount of code required to implement ramfs is tiny, because all the
work is done by the existing Linux caching infrastructure.  Basically,
you're mounting the disk cache as a filesystem.  Because of this, ramfs is not
an optional component removable via menuconfig, since there would be negligible
space savings.

ramfs and ramdisk:
------------------

The older "ram disk" mechanism created a synthetic block device out of
an area of RAM and used it as backing store for a filesystem.  This block
device was of fixed size, so the filesystem mounted on it was of fixed
size.  Using a ram disk also required unnecessarily copying memory from the
fake block device into the page cache (and copying changes back out), as well
as creating and destroying dentries.  Plus it needed a filesystem driver
(such as ext2) to format and interpret this data.

Compared to ramfs, this wastes memory (and memory bus bandwidth), creates
unnecessary work for the CPU, and pollutes the CPU caches.  (There are tricks
to avoid this copying by playing with the page tables, but they're unpleasantly
complicated and turn out to be about as expensive as the copying anyway.)
More to the point, all the work ramfs is doing has to happen _anyway_,
since all file access goes through the page and dentry caches.  The RAM
disk is simply unnecessary; ramfs is internally much simpler.

Another reason ramdisks are semi-obsolete is that the introduction of
loopback devices offered a more flexible and convenient way to create
synthetic block devices, now from files instead of from chunks of memory.
See losetup(8) for details.

ramfs and tmpfs:
----------------

One downside of ramfs is you can keep writing data into it until you fill
up all memory, and the VM can't free it because the VM thinks that files
should get written to backing store (rather than swap space), but ramfs hasn't
got any backing store.  Because of this, only root (or a trusted user) should
be allowed write access to a ramfs mount.

A ramfs derivative called tmpfs was created to add size limits, and the ability
to write the data to swap space.  Normal users can be allowed write access to
tmpfs mounts.  See Documentation/filesystems/tmpfs.txt for more information.

What is rootfs?
---------------

Rootfs is a special instance of ramfs (or tmpfs, if that's enabled), which is
always present in 2.6 systems.  You can't unmount rootfs for approximately the
same reason you can't kill the init process; rather than having special code
to check for and handle an empty list, it's smaller and simpler for the kernel
to just make sure certain lists can't become empty.

Most systems just mount another filesystem over rootfs and ignore it.  The
amount of space an empty instance of ramfs takes up is tiny.

What is initramfs?
------------------

All 2.6 Linux kernels contain a gzipped "cpio" format archive, which is
extracted into rootfs when the kernel boots up.  After extracting, the kernel
checks to see if rootfs contains a file "init", and if so it executes it as PID
1.  If found, this init process is responsible for bringing the system the
rest of the way up, including locating and mounting the real root device (if
any).  If rootfs does not contain an init program after the embedded cpio
archive is extracted into it, the kernel will fall through to the older code
to locate and mount a root partition, then exec some variant of /sbin/init
out of that.

All this differs from the old initrd in several ways:

  - The old initrd was always a separate file, while the initramfs archive is
    linked into the linux kernel image.  (The directory linux-*/usr is devoted
    to generating this archive during the build.)

  - The old initrd file was a gzipped filesystem image (in some file format,
    such as ext2, that needed a driver built into the kernel), while the new
    initramfs archive is a gzipped cpio archive (like tar only simpler,
    see cpio(1) and Documentation/early-userspace/buffer-format.txt).  The
    kernel's cpio extraction code is not only extremely small, it's also
    __init text and data that can be discarded during the boot process.

  - The program run by the old initrd (which was called /linuxrc, not /init) did
    some setup and then returned to the kernel, while the init program from
    initramfs is not expected to return to the kernel.  (If /init needs to hand
    off control it can overmount / with a new root device and exec another init
    program.  See the switch_root utility, below.)

  - When switching to another root device, initrd would pivot_root and then
    umount the ramdisk.  But initramfs is rootfs: you can neither pivot_root
    rootfs, nor unmount it.  Instead delete everything out of rootfs to
    free up the space (find / -xdev -exec rm '{}' ';'), overmount rootfs
    with the new root (cd /newmount; mount --move . /; chroot .), attach
    stdin/stdout/stderr to the new /dev/console, and exec the new init.

    Since this is a remarkably persnickety process (and involves deleting
    commands before you can run them), the klibc package introduced a helper
    program (utils/run_init.c) to do all this for you.  Most other packages
    (such as busybox) have named this command "switch_root".
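
As a rough illustration of the hand-off described above, the following
commands write out a minimal /init script that mounts a real root device and
passes control to it.  This is only a sketch: the root device (/dev/sda1), the
mount point (/mnt), the fallback shell, and the new init path (/sbin/init) are
placeholders chosen for the example, and it assumes busybox provides "mount"
and "switch_root" inside the initramfs.

```shell
cat > init.sh << 'EOF'
#!/bin/sh
# Hypothetical /init for an initramfs.  /dev/sda1 and /mnt are placeholders.

mount -t proc proc /proc
mount -t sysfs sysfs /sys

# Locate and mount the real root device (here simply hard-coded).
mount -o ro /dev/sda1 /mnt || exec /bin/sh   # drop to a shell on failure

# Unmount what the new root is expected to mount again for itself.
umount /sys
umount /proc

# switch_root empties rootfs, moves /mnt over /, and execs the new init.
# It is run via exec so that it inherits PID 1 from this script.
exec switch_root /mnt /sbin/init
EOF
chmod 755 init.sh
```

A script like this could be installed as /init through a "file" line in the
initramfs specification.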

Populating initramfs:
---------------------

The 2.6 kernel build process always creates a gzipped cpio format initramfs
archive and links it into the resulting kernel binary.  By default, this
archive is empty (consuming 134 bytes on x86).

The config option CONFIG_INITRAMFS_SOURCE (in General Setup in menuconfig,
and living in usr/Kconfig) can be used to specify a source for the
initramfs archive, which will automatically be incorporated into the
resulting binary.  This option can point to an existing gzipped cpio
archive, a directory containing files to be archived, or a text file
specification such as the following example:

  dir /dev 755 0 0
  nod /dev/console 644 0 0 c 5 1
  nod /dev/loop0 644 0 0 b 7 0
  dir /bin 755 1000 1000
  slink /bin/sh busybox 777 0 0
  file /bin/busybox initramfs/busybox 755 0 0
  dir /proc 755 0 0
  dir /sys 755 0 0
  dir /mnt 755 0 0
  file /init initramfs/init.sh 755 0 0

Run "usr/gen_init_cpio" (after the kernel build) to get a usage message
documenting the above file format.

One advantage of the configuration file is that root access is not required to
set permissions or create device nodes in the new archive.  (Note that those
two example "file" entries expect to find files named "init.sh" and "busybox" in
a directory called "initramfs", under the linux-2.6.* directory.  See
Documentation/early-userspace/README for more details.)

The kernel does not depend on external cpio tools.  If you specify a
directory instead of a configuration file, the kernel's build infrastructure
creates a configuration file from that directory (usr/Makefile calls
scripts/gen_initramfs_list.sh), and proceeds to package up that directory
using the config file (by feeding it to usr/gen_init_cpio, which is created
from usr/gen_init_cpio.c).  The kernel's build-time cpio creation code is
entirely self-contained, and the kernel's boot-time extractor is also
(obviously) self-contained.
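
To give a feel for what that directory-to-configuration step involves, here is
a much-simplified stand-in written for this document.  It is not the kernel's
script: the function name "spec_from_dir" is made up, it emits only "dir" and
"file" lines, forces uid and gid to 0, skips symlinks and device nodes, and
assumes GNU find and GNU stat are available.

```shell
# spec_from_dir DIR: print gen_init_cpio-style "dir" and "file" lines for
# the tree under DIR.  Simplified sketch: symlinks, device nodes and hard
# links are skipped, ownership is forced to 0:0, and file names containing
# whitespace are not handled.  Assumes GNU stat (for "-c %a").
spec_from_dir() {
  root=$1
  # Sorting in the C locale lists every directory before its contents.
  ( cd "$root" && find . -mindepth 1 | LC_ALL=C sort ) | while read -r path
  do
    name=${path#.}                       # "./bin/sh" becomes "/bin/sh"
    src=$root$name                       # where the file lives on the host
    [ -L "$src" ] && continue            # symlinks are ignored in this sketch
    mode=$(stat -c %a "$src")
    if [ -d "$src" ]
    then
      echo "dir $name $mode 0 0"
    elif [ -f "$src" ]
    then
      echo "file $name $src $mode 0 0"
    fi
  done
}
```

Redirecting the output to a file (for example "spec_from_dir initramfs >
initramfs.list") gives something that can be edited by hand, for instance to
add "nod" lines for device nodes, before being used as the initramfs source.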

The one thing you might need external cpio utilities installed for is creating
or extracting your own preprepared cpio files to feed to the kernel build
(instead of a config file or directory).

The following command line can extract a cpio image (created either by the
script below or by the kernel build) back into its component files:

  cpio -i -d -H newc -F initramfs_data.cpio --no-absolute-filenames

The following shell script can create a prebuilt cpio archive you can
use in place of the above config file:

  #!/bin/sh

  # Copyright 2006 Rob Landley <[email protected]> and TimeSys Corporation.
  # Licensed under GPL version 2

  if [ $# -ne 2 ]
  then
    echo "usage: mkinitramfs directory imagename.cpio.gz"
    exit 1
  fi

  if [ -d "$1" ]
  then
    echo "creating $2 from $1"
    (cd "$1"; find . | cpio -o -H newc | gzip) > "$2"
  else
    echo "First argument must be a directory"
    exit 1
  fi

Note: The cpio man page contains some bad advice that will break your initramfs
archive if you follow it.  It says "A typical way to generate the list
of filenames is with the find command; you should give find the -depth option
to minimize problems with permissions on directories that are unwritable or not
searchable."  Don't do this when creating initramfs.cpio.gz images; it won't
work.  The Linux kernel cpio extractor won't create files in a directory that
doesn't exist, so the directory entries must go before the files that go in
those directories.  The above script gets them in the right order.
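
One way to check the ordering of an existing archive is to list its members
and confirm that every entry's parent directory appears earlier in the list.
The helper below is a sketch written for this document, not a kernel tool; the
name "check_cpio_order" is made up.  It reads member names on standard input,
one per line, such as the output of "cpio -t".

```shell
# check_cpio_order: read archive member names on stdin (one per line, in
# archive order).  Exit 0 if every entry's parent directory was listed
# before it; otherwise print the first offending entry and exit 1.
#
# Typical use:  zcat test.cpio.gz | cpio -t | check_cpio_order
check_cpio_order() {
  awk '
    {
      name = $0
      sub("^\\.?/+", "", name)           # drop a leading "./" or "/"
      if (name == "." || name == "") next
      parent = name
      # Strip the last path component; top-level entries have no parent.
      if (sub("/[^/]*$", "", parent) && !(parent in seen)) {
        print "parent missing before: " name
        bad = 1
        exit
      }
      seen[name] = 1
    }
    END { exit bad }
  '
}
```

A list produced by "find ." passes this check, while one produced by
"find . -depth" fails on the first file inside a subdirectory.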

External initramfs images:
--------------------------

If the kernel has initrd support enabled, an external cpio.gz archive can also
be passed into a 2.6 kernel in place of an initrd.  In this case, the kernel
will autodetect the type (initramfs, not initrd) and extract the external cpio
archive into rootfs before trying to run /init.

This has the memory efficiency advantages of initramfs (no ramdisk block
device) but the separate packaging of initrd (which is nice if you have
non-GPL code you'd like to run from initramfs, without conflating it with
the GPL licensed Linux kernel binary).

It can also be used to supplement the kernel's built-in initramfs image.  The
files in the external archive will overwrite any conflicting files in
the built-in initramfs archive.  Some distributors also prefer to customize
a single kernel image with task-specific initramfs images, without recompiling.

Contents of initramfs:
----------------------

An initramfs archive is a complete self-contained root filesystem for Linux.
If you don't already understand what shared libraries, devices, and paths
you need to get a minimal root filesystem up and running, here are some
references:
http://www.tldp.org/HOWTO/Bootdisk-HOWTO/
http://www.tldp.org/HOWTO/From-PowerUp-To-Bash-Prompt-HOWTO.html
http://www.linuxfromscratch.org/lfs/view/stable/

The "klibc" package (http://www.kernel.org/pub/linux/libs/klibc) is
designed to be a tiny C library to statically link early userspace
code against, along with some related utilities.  It is BSD licensed.

I use uClibc (http://www.uclibc.org) and busybox (http://www.busybox.net)
myself.  These are LGPL and GPL, respectively.  (A self-contained initramfs
package is planned for the busybox 1.3 release.)

In theory you could use glibc, but that's not well suited for small embedded
uses like this.  (A "hello world" program statically linked against glibc is
over 400k.  With uClibc it's 7k.  Also note that glibc dlopens libnss to do
name lookups, even when otherwise statically linked.)

A good first step is to get initramfs to run a statically linked "hello world"
program as init, and test it under an emulator like qemu (www.qemu.org) or
User Mode Linux, like so:

  cat > hello.c << EOF
  #include <stdio.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
    printf("Hello world!\n");
    sleep(999999999);
  }
  EOF
  gcc -static hello.c -o init
  echo init | cpio -o -H newc | gzip > test.cpio.gz
  # Testing external initramfs using the initrd loading mechanism.
  qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero

When debugging a normal root filesystem, it's nice to be able to boot with
"init=/bin/sh".  The initramfs equivalent is "rdinit=/bin/sh", and it's
just as useful.

Why cpio rather than tar?
-------------------------

This decision was made back in December, 2001.  The discussion started here:

  http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1538.html

And spawned a second thread (specifically on tar vs cpio), starting here:

  http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1587.html

The quick and dirty summary version (which is no substitute for reading
the above threads) is:

1) cpio is a standard.  It's decades old (from the AT&T days), and already
   widely used on Linux (inside RPM, Red Hat's device driver disks).  Here's
   a Linux Journal article about it from 1996:

      http://www.linuxjournal.com/article/1213

   It's not as popular as tar because the traditional cpio command line tools
   require _truly_hideous_ command line arguments.  But that says nothing
   either way about the archive format, and there are alternative tools,
   such as:

     http://freecode.com/projects/afio

2) The cpio archive format chosen by the kernel is simpler and cleaner (and
   thus easier to create and parse) than any of the (literally dozens of)
   various tar archive formats.  The complete initramfs archive format is
   explained in buffer-format.txt, created in usr/gen_init_cpio.c, and
   extracted in init/initramfs.c.  All three together come to less than 26k
   total of human-readable text.

3) The GNU project standardizing on tar is approximately as relevant as
   Windows standardizing on zip.  Linux is not part of either, and is free
   to make its own technical decisions.

4) Since this is a kernel internal format, it could easily have been
   something brand new.  The kernel provides its own tools to create and
   extract this format anyway.  Using an existing standard was preferable,
   but not essential.

5) Al Viro made the decision (quote: "tar is ugly as hell and not going to be
   supported on the kernel side"):

      http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1540.html

   explained his reasoning:

      http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1550.html
      http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html

   and, most importantly, designed and implemented the initramfs code.

Future directions:
------------------

Today (2.6.16), initramfs is always compiled in, but not always used.  The
kernel falls back to legacy boot code only if initramfs does not contain an
/init program.  The fallback is there to ensure a smooth transition and to
allow early boot functionality to move gradually to "early userspace"
(i.e. initramfs).

The move to early userspace is necessary because finding and mounting the real
root device is complex.  Root partitions can span multiple devices (raid or
separate journal).  They can be out on the network (requiring dhcp, setting a
specific MAC address, logging into a server, etc).  They can live on removable
media, with dynamically allocated major/minor numbers and persistent naming
issues requiring a full udev implementation to sort out.  They can be
compressed, encrypted, copy-on-write, loopback mounted, strangely partitioned,
and so on.

This kind of complexity (which inevitably includes policy) is rightly handled
in userspace.  Both klibc and busybox/uClibc are working on simple initramfs
packages to drop into a kernel build.

The klibc package has now been accepted into Andrew Morton's 2.6.17-mm tree.
The kernel's current early boot code (partition detection, etc) will probably
be migrated into a default initramfs, automatically created and used by the
kernel build.