Re: LINUX Filesystem standards



Ok, here goes.

I have split it into two parts for security:

------------------------- linux fs std. part 1 -------------------------
Filesystem Standard Group                                 Daniel Quinlan
date submitted: 93/11/17                            quinlan@bucknell.edu


              Advance Draft on Linux Filesystem Structure

Status of this draft

   This draft is being distributed to members of the Linux community in
   order to solicit their reactions to the series of ideas, concepts,
   and proposals included within it.  While the entire content of this
   draft may not conform to what every individual desires, it should
   prove to be a good start to solving many problems.

   The draft is a product of the Filesystem Standard (FSSTND) channel
   of the linux-activists@Niksula.hut.fi mailing list.  This draft is a
   working document of the Filesystem Standard channel, the author, and
   all other groups collaborating to help create this draft.  The
   distribution of this draft is limited at this time to those directly
   involved in its development or implementation.

________________________________________________________________________

                                ABSTRACT

Introduction

   This document is an extensive undertaking to correct outstanding
   problems with the filesystem structures in use by developers,
   programmers, administrators, and users.  Our purpose and goal is to
   produce a draft of exceptional quality that developers and others
   will voluntarily adopt to solve well-acknowledged problems.

   The FSSTND group hopes that this draft will be eventually adopted as
   a better standard than the de-facto standard produced by the current
   disarray of ideas.

   We felt that it was desirable to first call attention to some of the
   fundamental problems with the current filesystem situation:

   (1) There is no single well-accepted Linux directory structure.
       Instead, there are many different ones, each incompatible with
       the others.  This problem alone justifies our effort and should
       overshadow any differences in opinion, the same differences in
       opinion that make Linux filesystems an utter mess.

   (2) In the most widely used filesystem hierarchies, the directories
       are not well structured and differ gratuitously from more modern
       standards (such as POSIX, System V, BSD, and others).

   (3) The filesystem is disturbing to UNIX users and administrators
       who have experience on more mainstream UNIX systems.

   (4) The current layout is confusing for new users, especially new
       users coming from a non-UNIX background.

   (5) The incompatibilities between primary installation packages and
       other software packages are typically solved by methods of a less
       than appealing nature.

   (6) Overall, symbolic links are used too often within the filesystem
       to fix problems.  (However, there are times when symbolic links
       need to be used to ensure backward compatibility or to allow
       specific systems to have an individual filesystem structure.)

   The FSSTND group seeks to correct these problems by proposing a good
   filesystem structure that the Linux community may voluntarily follow.
   While developing this draft, approval and input were received from a
   number of Linux developers, noted Linux programmers, many system
   administrators, and both experienced and novice users.  For this
   reason, I feel that following our recommendations is a good thing.
   If you feel that there is a problem with this effort or the substance
   of the draft, feel free to first contact the draft coordinator,
   Daniel Quinlan <quinlan@bucknell.edu>, with your comments.

------------------------------------------------------------------------

Specific Problems

   Naturally, while defining a Linux filesystem structure, there were
   some specific problems that we sought to correct.  Here are some of
   the major and well-accepted ones:

   o  The primary binary directories, /bin and /usr/bin, do not have
      well-defined divisions between them.  As a result, the binaries
      found in each directory differ greatly between the various
      Linux distributions.

   o  Including both binaries and configuration files in /etc makes it
      more confusing and harder to maintain for inexperienced users or
      system administrators with especially large systems.

   o  Often, configuration files for items which are not related to
      machine startup and are not machine-local are located in /etc.
      (Configuration files for items in /usr that are not specific to
      a single machine should be found in /usr/etc.)

   o  The current implementation of /usr cannot be mounted read-only
      because it contains variable files and directories that need to
      be written to.

   o  In a networked environment it is desirable to serve software to
      workstations via an NFS-mounted filesystem.  Such filesystems need
      to be mounted read-only so that accidents or malice on one
      workstation cannot damage the files on the server.  This requires
      identifying and separating both the files that a machine must
      write to and the files that are specific to a single machine.

   o  Linux is not well prepared for a network installation including
      the possibility of a read-only /usr partition and diskless (or
      small local disk) workstations.

   While these are some of the major problems we addressed, there were
   numerous additional problems that needed to be solved.  This draft
   attempts to address many of those other problems, but there may be
   something we missed.  If you wish to bring something to our
   attention, please note there are some things we have discussed at
   length, but did not include in this draft (for good reasons).

------------------------------------------------------------------------

Objectives

   In trying to solve the above problems, we saw several objectives that
   needed to be accomplished in addition to the more technical matters.
   These goals comprise the correction of outstanding problems as well
   as the validation of our discussion and work.

   o  Solve the above problems while also limiting the possible
      transition difficulties resulting from moving away from the former
      de-facto standards.

   o  Gain approval of distributors, developers, and other important
      people in the Linux movement, as well as their suggestions.

   o  Provide a standard that all of the Linux community would choose
      to follow because it will solve the above problems as well
      as provide the most sensible structure for Linux's filesystem.

   o  While conformance to this or any other standard in Linux is
      obviously completely voluntary, we wanted to impress upon
      developers that this organization is a very sensible way to
      lay out a Linux filesystem.  If you, as a developer, wish
      to suggest any improvements, we are quite willing to listen.

------------------------------------------------------------------------

History and Progress

   The original post that motivated this effort to restructure the Linux
   filesystem was written by Olaf Kirch <okir@monad.swb.de> on August 2,
   1993 in the NORMAL channel of the Linux activists mailing list.

   Soon thereafter, it was decided that the best way to accomplish such
   a restructuring of the Linux filesystem would be to create a mailing
   list for the purpose of trying to develop a consensus standard.

   After a comprehensive discussion, with surprisingly few flames, a
   preliminary draft was written.  Then, with the help of several
   dedicated people, the draft was finished and submitted to the FSSTND
   channel for more discussion.  The first draft was submitted to the
   channel on September 18, 1993.

;; more to be added

________________________________________________________________________

                        THE FILESYSTEM STRUCTURE


The directory hierarchy is separated into unsharable data, "/" (root),
and sharable data, "/usr".  These two filesystems are divided according
to the following two data types: static data and variable data.  Static
data includes binaries, libraries, documentation, and anything that does
not change without system administrator intervention.  Variable data is
just about anything that does change in an unpredictable way.

(Please note that "filesystem" can refer to either a single formatted
partition for data, or it can mean the entire directory hierarchy... or
at least in our usage it does.)

We have now defined four categories of file data along two orthogonal
axes: sharable or unsharable, and static or variable.  We have defined
/usr as sharable data and / as unsharable data.  Each hierarchy, / and
/usr, divides data into static and variable types.  Throughout this
document, and in any well planned filesystem, keeping these distinctions
in mind will help guide the structure and lend it additional
consistency.
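
As a rough illustration of the two distinctions above, the following
Python sketch enumerates the four combinations and applies the one rule
the draft states by path: /usr is sharable, the rest of / is not.  The
function names are hypothetical, and whether a given file is static or
variable cannot be derived from its path, so the sketch does not attempt
it.

```python
from itertools import product

SHARING = ("sharable", "unsharable")
CHANGE = ("static", "variable")


def categories():
    """Return the four (sharing, change) combinations of file data."""
    return list(product(SHARING, CHANGE))


def sharing_of(path):
    """Apply the draft's rule: /usr is sharable, everything else in / is not."""
    if path == "/usr" or path.startswith("/usr/"):
        return "sharable"
    return "unsharable"


if __name__ == "__main__":
    for sharing, change in categories():
        print(sharing, change)
    print("/usr/bin ->", sharing_of("/usr/bin"))
    print("/etc     ->", sharing_of("/etc"))
```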

The distinction between sharable and unsharable data needed to be made
for several reasons...

 (1) In a networked environment, certain filesystems contain information
     specific to a single machine.  Therefore these filesystems cannot
     be shared (with NFS).

 (2) The current implementation of /usr cannot be mounted read-only
     because it contains variable files and directories that need to be
     written to.  This is a factor that must be addressed when /usr is
     shared on a network or mounted read-only because of other
     considerations (safety).

The "sharable" factor can be extended in two directions.

 (1) A /usr mounted through the network (using NFS).

 (2) A /usr mounted from read-only media.  (You can think of a CD-ROM
     as a filesystem that is "networked" by postal mail and shared with
     other Linux systems.)  While this is not immensely practical or
     even used, it is something that we did not want to rule out
     entirely.

The "static" vs. "variable" factor dramatically affects the filesystem
in 2 major ways:

 (1) Since / contains both variable and static data, it needs to be
     mounted read-write.

 (2) Since /usr traditionally contains both variable and static data,
     yet we want to mount it read-only (see above), the variable data
     must be moved out of /usr.  This is done through the creation of a
     /var hierarchy which is mounted read-write, taking over much of
     the /usr partition's traditional functionality.
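
The intended mount modes can be checked mechanically.  The Python sketch
below parses fstab-style text and reports mount points whose mode
differs from what this section asks for (/ and /var read-write, /usr
read-only).  The function names, the device names, and the server name
in the sample are all assumptions made for illustration; this is not a
description of any existing tool.

```python
# Mount modes this section asks for.
EXPECTED = {"/": "rw", "/usr": "ro", "/var": "rw"}

# Hypothetical fstab contents; device and server names are invented.
SAMPLE_FSTAB = """\
# device      mount point  type  options   dump pass
/dev/hda1     /            ext2  defaults  1 1
server:/usr   /usr         nfs   ro        0 0
/dev/hda2     /var         ext2  rw        1 2
"""


def mount_modes(fstab_text):
    """Map each mount point to "ro" or "rw" (read-write is the default)."""
    modes = {}
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or incomplete line
        mount_point, options = fields[1], fields[3].split(",")
        modes[mount_point] = "ro" if "ro" in options else "rw"
    return modes


def violations(modes):
    """Return mount points that are present but mounted in the wrong mode."""
    return sorted(
        point for point, wanted in EXPECTED.items()
        if point in modes and modes[point] != wanted
    )


if __name__ == "__main__":
    print(violations(mount_modes(SAMPLE_FSTAB)))
```

A mount point that is absent from the table (for example, a system
without a separate /var partition) is simply skipped rather than
reported.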

------------------------------------------------------------------------

ROOT

This is the root directory structure.  In general, enough data should be
contained in the root partition to boot, restore, recover, and/or repair
the system:

 (1) To boot a system, enough must be present to mount /usr.  This
     includes utilities, configuration, boot loader information, and
     other essential start-up data.

 (2) To recover and/or repair a system, those utilities needed by an
     experienced user to diagnose and reconstruct a damaged system
     should be present on root.

 (3) To restore a system, those utilities needed to restore a system
     from backups (on floppy, tape, etc.) should be present on root.

The primary concern used to balance these desires (placing many things
in root) is the goal of keeping root as small as reasonably possible.
It is desirable to keep root small in terms of number of directories,
files, and disk space for several reasons:

 (1) The root is often mounted from very small media.  For example, most
     people using Linux install and do recovery by mounting root off of
     a RAM disk which is copied from a single 1.44M or 1.2M floppy disk.

 (2) Root has many system-specific configuration files in it, a kernel
     that is specific to the system, a different hostname, etc.  This
     means that root isn't always sharable between networked systems.
     Keeping root small on networked systems minimizes the amount of
     space lost on servers to unsharable files.  It also allows
     workstations with smaller local hard drives.

     However, with diskless clients, this does not have to be entirely
     the case, unless each client has a different root image.

 (3) While you may have a large root partition, and may be able to fill
     it to your heart's content, there will be people with smaller
     partitions.  If you have more files installed, you may find
     incompatibilities with other systems using limited root partitions.
     If you are a developer then you may be sharing this problem with a
     large number of users.

 (4) Disk errors on the root partition are a greater problem than errors
     on any other partition.  A small root partition is less prone to
     corruption as the result of a system crash.

Since root is small and host-specific (due to the division between / and
/usr), this scheme necessitates a writable root.  However, this does
not necessitate a locally stored root.  The root partition does not have
to be stored locally just to be system-specific (e.g., root can be
mounted from an NFS root server).

No single package should have its own specific root directory.  This
structure provides more than enough flexibility for any package.  Any
package which does occupy a directory under root suffers from sheer
arrogance.

/ : the root directory
  |
  |- bin        : essential command binaries
  |- boot       : static files of the boot loader
  |- dev        : device files
  |- etc        : essential system configuration
  |- home       : user home directories
  |- lib        : shared libraries (libc.so.*, libm.so.*, and ld.so)
  |- lost+found : files and directories found by `fsck' (Ext2 specific)
  |- mnt        : mount point of temporary partitions
  |- proc       : process information pseudo-filesystem
  |- root       : home directory for root
  |- sbin       : essential system binaries
  |- tmp        : temporary files
  |- usr        : second major permanent mount point
  |- var        : files that tend to grow or vary in size
  \- {kernel image}

Following this section, each directory is explained in full.

The root directory normally contains the current kernel image.  The
kernel image name is locally configurable, but the name we suggest (that
has been used in recent Linux kernel sources) is `vmlinux' which may or
may not be a (symbolic-)link to the actual file, possibly depending on
the system distribution used.  More information on kernel placement is
located in the /boot section of the draft.
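
The root listing above lends itself to a simple audit.  The following
Python sketch reports top-level names that the listing does not account
for.  The function name is hypothetical, and since the kernel image name
is locally configurable, it is passed in as a parameter with `vmlinux'
as the default.

```python
import os

# Top-level names taken from the root directory listing in this draft.
ALLOWED_ROOT_ENTRIES = {
    "bin", "boot", "dev", "etc", "home", "lib", "lost+found",
    "mnt", "proc", "root", "sbin", "tmp", "usr", "var",
}


def unexpected_root_entries(names, kernel_images=("vmlinux",)):
    """Return the names that the draft's root listing does not account for."""
    allowed = ALLOWED_ROOT_ENTRIES | set(kernel_images)
    return sorted(name for name in names if name not in allowed)


if __name__ == "__main__":
    # The result depends entirely on the system this is run on.
    print(unexpected_root_entries(os.listdir("/")))
```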

------------------------------------------------------------------------

/bin : essential command binaries (for use by all users)

There should be no subdirectories within /bin.

The commands (static data) that are needed in single user mode by the
super user (root) are stored in /bin.  However, the commands in /bin are
for use by *both* root and other users.  On the same note, the /bin
directory should not contain anything that is explicitly to be used by
root.

All root-only binaries such as standard daemons, `init', `getty',
`mkfs', et al. (previously found in /etc), shall now be placed in /sbin
or /usr/sbin depending on the necessity of the command.  For discussion
and our definition of essential (necessity and related concepts) please
read the issues and rationale section towards the end of this draft.

Command binaries that are not essential enough to place into /bin should
be placed into /usr/bin, instead.  Items which are only used by non-root
users are not essential (interactive shells, pagers, `passwd', et al.)
and should be placed elsewhere.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

REQUIRED files for /bin:

general commands:

        The following commands have been added because of their
        essential nature in the system.  A few have been added
        because of their traditional placement in /bin.

 { [, arch, cat, chgrp, chmod, chown, cp, date, dd, df, echo, ed, false,
   free, kill, ln, login, ls, mkdir, mknod, mv, ps, pwd, rm, rmdir,
   sh, stty, su, sync, test, true, uname }

If /bin/sh is BASH, then /bin/sh should be a symbolic or hard link to
/bin/bash since bash behaves differently when called as `sh' or `bash'.
The same goes for `pdksh', which is often the /bin/sh on install disks
and such (link from /bin/sh to /bin/ksh).  I personally prefer the use
of a symbolic link in these cases, because it allows users to easily see
that /bin/sh is not a true Bourne shell.
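
The practical difference is easy to demonstrate: a symbolic link names
its target, while a hard link is indistinguishable from an ordinary
file.  A minimal Python sketch (the helper name is hypothetical):

```python
import os


def link_target(path):
    """Return the symlink target of path, or None if it is not a symlink."""
    return os.readlink(path) if os.path.islink(path) else None


if __name__ == "__main__":
    # Prints the target if /bin/sh is a symbolic link, otherwise None.
    print(link_target("/bin/sh"))
```

For a hard link the function returns None, which is exactly why a
symbolic link makes the situation easier for users to see.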

Also, `[' and `test' are built into BASH.

/bin/arch should produce the same output as "uname -m", that is, "i386"
or "i486".

;; is `arch' all that we need?

restoration commands:

        These commands have been added to make restoration of a system
        possible (provided that / is intact).

 { tar, gzip, gunzip, zcat }

If system backups are made with a package other than `gzip' and `tar',
then that administrator should include the minimal necessary restoration
components in the root partition.  For instance, many systems should
include `cpio' as it is the next most commonly used backup utility after
tar.  Conversely, if no restoration from the root partition is ever
expected, then these binaries may be omitted (e.g., a ROM chip root,
mounting /usr through NFS).

networking:

        These are deemed the only necessary networking binaries that
        both root and users will want or need to execute other than
        the ones in /usr/bin or /usr/local/bin.

 { domainname (link to hostname), hostname, ftp, netstat, ping }

;; should `ping' be in /usr/bin?

;; should `ftp' be put into the restoration class?  This would make
;; sense and satisfy all those concerned.

------------------------------------------------------------------------

/boot : static files of the boot loader

This directory contains everything for boot except configuration files
and the map installer.  This includes saved master boot sectors, sector
map files, and anything else that is not directly edited by hand.  The
boot loader program should be placed into /sbin and configuration files
for boot loaders into /etc.

For LILO:

  Old location                  New location
  ------------------------      -----------------
  /etc/lilo/config.defines      /etc/lilo.defines
  /etc/lilo/config              /etc/lilo.conf
  /etc/lilo/disktab             /etc/disktab
  /etc/lilo/lilo                /sbin/lilo
  /etc/lilo/boot.NNNN           /boot/boot.NNNN
  /etc/lilo/part.NNNN           /boot/part.NNNN
  /etc/lilo/map                 /boot/map
  /etc/lilo/*.b                 /boot/*.b

*.b are the first and second stage boot loader, plus all those chain
loaders.  `QuickInst' (if used at all) should be placed into /usr/sbin.
(The `activate' command is left out of this scheme because its future is
uncertain at this time.)
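
The LILO relocation table can be expressed as a small mapping.  The
Python sketch below is only an illustration: it treats `NNNN' as any
suffix, and the function name and the example file names used with it
are assumptions, not part of LILO.

```python
import fnmatch
import posixpath

# Files that are renamed as well as moved.
EXACT = {
    "/etc/lilo/config.defines": "/etc/lilo.defines",
    "/etc/lilo/config": "/etc/lilo.conf",
    "/etc/lilo/disktab": "/etc/disktab",
    "/etc/lilo/lilo": "/sbin/lilo",
    "/etc/lilo/map": "/boot/map",
}

# Files that keep their name and move from /etc/lilo to /boot.
MOVED_TO_BOOT = ("boot.*", "part.*", "*.b")


def new_location(old_path):
    """Return the new path for an old LILO file, or None if not in the table."""
    if old_path in EXACT:
        return EXACT[old_path]
    directory, name = posixpath.split(old_path)
    if directory == "/etc/lilo" and any(
        fnmatch.fnmatchcase(name, pattern) for pattern in MOVED_TO_BOOT
    ):
        return posixpath.join("/boot", name)
    return None


if __name__ == "__main__":
    for old in ("/etc/lilo/config", "/etc/lilo/boot.0300", "/etc/lilo/chain.b"):
        print(old, "->", new_location(old))
```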

Extra kernel images may be stored in /boot.  The main kernel can be
placed either in / or in /boot according to the preference of the
administrator.  If placed in /, the kernel may also be a symlink to a
kernel image in /boot.  Note that the standard location for the kernel
is still /.

------------------------------------------------------------------------

/dev : Device files

/dev usually also contains a file, MAKEDEV, a shell script designed to
create devices as needed.  It also often contains a MAKEDEV.local for
any local-only devices.

Symbolic links within /dev "to make it easier to understand" are
dangerous and not a good idea. The largest problem with symlinks in /dev
is that they are often not updated along with other devices.
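
One symptom of links that were not updated is a symbolic link whose
target no longer exists.  The Python sketch below lists such dangling
links in a directory.  The function name is hypothetical, and a link
that still resolves but points at the wrong device cannot be detected
this way.

```python
import os


def dangling_symlinks(directory):
    """Return the names of symlinks in directory whose target is missing."""
    dangling = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # os.path.exists() follows the link, so it is False when broken.
            if entry.is_symlink() and not os.path.exists(entry.path):
                dangling.append(entry.name)
    return sorted(dangling)


if __name__ == "__main__":
    print(dangling_symlinks("/dev"))
```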

A good standard already exists for Linux devices.  We believe that the
current standard should be followed.  The device list is maintained by
Rick Miller <rick@ee.uwm.edu>, the Linux Device Registrar.

------------------------------------------------------------------------

/etc : Essential system configuration files

No binaries should go directly into /etc.  Binaries which would have in
the past been found in /etc should now be placed in /sbin.  This
includes such files as init, getty, and update.  Binaries such as
hostname which are used by users as well as root should not be placed in
/sbin, but in /bin.

;; this listing is not quite complete

REQUIRED files for /etc:

 { adjtime, csh.login, fdprm, fstab, gettydefs, group, inittab, issue,
   motd, mtab, mtools, passwd, profile, securetty, shells, termcap,
   ttytype, utmp }

networking REQUIRED files (if networking is installed):

 { ftpusers, hosts, host.conf, hosts.equiv, networks, printcap,
   protocols, resolv.conf, services }

;; are resolv.conf and host.conf the same thing?

Regarding the rc.* (BSD model) vs. rc.d/* (System V model) "debate":

  Officially: Either system is acceptable at this time for Linux systems
  although a gradual transition to the SysV system is anticipated on
  most Linux systems.  In the end, this is most affected by sysadmin and
  developer preference.

;; There are problems with allowing both the System V and BSD methods
;; here.  We should decide whether or not to endorse one of the methods
;; on a *technical* basis.

There will be more configuration files than just these, but some that
are not essential should be placed in /usr/etc rather than /etc.

The `magic' file belongs in /usr/etc rather than /etc since the `file'
utility is not stored on the root partition and the magic file can tend
to get rather large.  The `wtmp' logfile belongs in /var/adm because it
can grow in size without bound.

Systems which are using the shadow password suite will have additional
files in /etc (/etc/shadow and whatever else) and /usr/sbin (useradd,
usermod, et cetera).

;; shadow vs. old password method.  We should make up our mind.

------------------------------------------------------------------------

/home : User home directories

/home is a fairly standard concept, but it is clearly a site-specific
filesystem.  The setup will differ from machine to machine.

On small systems, each user's directory is typically one of the many
subdirectories of /home such as /home/smith, /home/linus,
/home/operator, etc.

On large systems (especially when the /home directories are mounted
across a number of machines using NFS) it is a good idea to subdivide
user home directories.  Subdivision can be accomplished by using
subdirectories such as /home/staff, /home/guests, /home/students, etc.

Different people prefer to place user accounts in a variety of places.
For this reason, no program should rely on this location.  If you want
to find out a user's home directory, you should use the home directory
field in /etc/passwd or another reliable method (I know of no other
reliable methods).
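
As a minimal sketch of the lookup described above, the following Python
code reads the home directory from the sixth colon-separated field of a
passwd entry instead of assuming /home/<user>.  The function names and
the sample entries are illustrative; `pwd.getpwnam' is the Python
standard library interface to the same database.

```python
import pwd

# Hypothetical passwd entries, using directory styles mentioned above.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/sh
smith:x:1000:100:J. Smith:/home/staff/smith:/bin/sh
"""


def home_from_passwd_text(passwd_text, user):
    """Return the home directory field for user, or None if not listed."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # name:password:uid:gid:comment:home:shell
        if len(fields) >= 7 and fields[0] == user:
            return fields[5]
    return None


def home_of(user):
    """Ask the system's password database; raises KeyError if unknown."""
    return pwd.getpwnam(user).pw_dir


if __name__ == "__main__":
    print(home_from_passwd_text(SAMPLE_PASSWD, "smith"))
```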

------------------------------------------------------------------------

/lib : Shared libraries (needed to run dynamically linked binaries)

Only the shared library images necessary to boot the system should be
placed in /lib.  The shared library images are "/lib/libc.so.*",
"/lib/libm.so.*", and "/lib/ld.so" (and not the actual ".a" files).

XFree86 and other libraries do not belong in /lib.  Essentially, only
the dynamic shared libraries needed to run programs in /bin and /sbin
should be here.

A single symbolic link for the C preprocessor currently exists in /lib
pointing /lib/cpp to either /usr/lib/gcc-lib/i-?86-linux/2.4.?/cpp or
/usr/bin/cpp.  No binaries should be added to /lib in addition to cpp.

;; There are rumors that cpp is no longer needed

------------------------------------------------------------------------

/mnt : Mount point for temporarily mounted filesystems

This is the location where the system administrator may temporarily
mount filesystems as needed.  The setup of this directory is a local
issue and should not affect the manner in which any program is run.

------------------------------------------------------------------------

/proc : Process information pseudo-filesystem

The proc filesystem is becoming the standard Linux method for handling
process information rather than /dev/kmem and other nasty methods.  This
is only recommended, but should in time become the standard for the
storage and retrieval of process information as well as other kernel and
memory information.
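
To give a feel for retrieving process information through /proc, the
Python sketch below lists process IDs by reading directory names, since
each running process appears under /proc as a directory named after its
PID.  The function name is hypothetical, and the mount point is a
parameter so the code does not assume more than that.

```python
import os


def process_ids(proc_root="/proc"):
    """Return the numeric entries under proc_root as a sorted list of PIDs."""
    return sorted(
        int(name) for name in os.listdir(proc_root) if name.isdigit()
    )


if __name__ == "__main__":
    # Only meaningful on a system with /proc mounted.
    if os.path.isdir("/proc"):
        print(process_ids())
```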

------------------------------------------------------------------------

/root : home directory for root

/ is traditionally the home directory of the root account, although on
most Linux systems this is found in /root.  One thing that is certain is
that the root account's home directory *must* be stored on the root
partition.

With sensible usage, the root account is not used for mundane things
such as mail and news, but solely for system administration purposes.
For this reason, subdirectories such as "Mail" and "News" should not
appear in the root account's home directory.  (Mail is usually forwarded
to a more appropriate account.)

------------------------------------------------------------------------