CS604 - Operating Systems - Lecture Handout 42


  • File Concept
  • File Types
  • File Operations
  • Access Methods
  • Directories
  • Directory Operations
  • Directory Structure

The File Concept

Computers can store information on several different storage media, such as magnetic disks, magnetic tapes and optical disks. The operating system abstracts from the physical properties of its storage devices to define a logical storage unit, the file. Files are mapped by the OS onto physical devices. These storage devices are usually non-volatile, so their contents persist through power failures and system restarts. A file is a named collection of related information that is recorded on secondary storage. Data cannot be written to secondary storage unless they are within a file. Commonly, files represent programs (in source and object form) and data. Data files may be numeric, alphabetic, alphanumeric or binary. In essence, a file is a contiguous logical address space.

File Structure

A file has certain defined structure characteristics according to its type. A few common types of file structures are:

  • None: the file is a sequence of words or bytes
  • Simple record structure
    • Fixed length
    • Variable length
  • Complex structures
    • Formatted document
    • Relocatable load file

UNIX considers each file to be a sequence of bytes; no interpretation of these bytes is made by the OS. This scheme provides maximum flexibility but little support. Each application program must include its own code to interpret an input file into the appropriate structure. However, all operating systems must support at least one structure, that of an executable file, so that the system is able to load and run programs.

File Attributes

Every file has certain attributes, which vary from one OS to another, but typically consist of these:

Name: The symbolic file name is the only information kept in human-readable form.
Type: This information is needed for those systems that support different types.
Location: This location is a pointer to a device and to the location of the file on that device.
Size: The current size of the file (in bytes, words or blocks) and possibly the maximum allowed size are included in this attribute.
Protection: Access control information determines who can read, write, execute, etc.
Time and date created: useful for security, protection and usage monitoring.
Time and date last updated: useful for security, protection and usage monitoring.
Read/write pointer value

Where are Attributes Stored?

File attributes are stored either in the directory structure, as part of the file's directory entry (e.g., in DOS/Windows), or in a separate data structure. In UNIX/Linux this separate structure is known as the file's inode.
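As a rough illustration, the attributes kept in a UNIX/Linux inode can be inspected from a program. The sketch below uses Python's `os.stat`, which wraps the `stat` system call; the helper name `describe` and the file name `notes.txt` are made up for this example. Note that the file name itself is not among the returned attributes, since it lives in the directory entry rather than in the inode.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Return a few attributes read from the file's inode (hypothetical helper)."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                          # inode number
        "size_bytes": info.st_size,                    # current size in bytes
        "protection": stat.filemode(info.st_mode),     # type and permission bits as text
        "last_modified": time.ctime(info.st_mtime),    # time of last update
    }

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")
    attributes = describe(path)
    for key, value in attributes.items():
        print(f"{key}: {value}")
```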

Directory Entry

A file is represented in a directory by its directory entry. The contents of a directory entry vary from system to system. For example, in DOS/Windows a directory entry consists of the file name and its attributes. In UNIX/Linux, a directory entry consists of the file name and an inode number. A name can be up to 255 characters long in BSD-compliant UNIX systems. The inode number is used to access the file's inode. The following diagrams show directory entries for DOS/Windows and UNIX/Linux systems.

Directory Entry

File Operations

Various operations can be performed on files. Here are some of the commonly supported operations, with the corresponding UNIX/Linux system call given in parentheses.

  • Create (creat): Two steps are necessary to create a file. First, space must be found for the file in the file system. Second, an entry for the new file must be made in the directory.
  • Open (open): The open operation takes a file name and searches the directory, copying the directory entry into the open-file table. The open system call can also accept access-mode information (read-only, read-write, etc.). It typically returns a pointer to the entry in the open-file table; in UNIX this is a small integer called a file descriptor.
  • Write (write): To write to a file, we make a system call, specifying both the name of the file and the information to be written to the file. Given the name of the file, the system searches the directory to find the location of the file. The system must keep a write pointer to the location in the file where the next write is to take place. The write pointer must be updated whenever a write occurs.
  • Read (read): To read from a file, we use a system call that specifies the name of the file and where (in memory) the next block of the file should be put. The system needs to keep a read pointer to the location in the file where the next read is to take place. Once the read has taken place, the read pointer needs to be updated. Because a process is usually either reading from or writing to a file, the current location is kept as a per-process current-file-position pointer, and both read and write use this same pointer.
  • Reposition within file (lseek): The directory is searched for the appropriate entry and the current-file-position is set to a given value. This is often known as a file seek.
  • Delete (unlink): Search the directory for the named file, then release the file space and erase the directory entry. In UNIX a file is deleted using the unlink system call.
  • Truncate (creat): A user may want to erase the contents of a file but keep its attributes. This operation leaves all attributes unchanged except the file length, which is set to zero, and the file space is released. In UNIX this can be achieved by calling creat on an existing file, or by calling open with the O_TRUNC flag.
  • Close (close): When a file is closed, the OS removes its entry from the open-file table.
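The operations above can be sketched with Python's `os` module, whose functions are thin wrappers around the UNIX system calls of the same names. This is only an illustration on a temporary file; the file name `demo.txt` is arbitrary. It also shows that read and write share one current-file-position pointer.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

# Create and open: the call returns a file descriptor, not the file name.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

# Write: the current-file-position pointer advances by the bytes written.
os.write(fd, b"hello world")
position_after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query the pointer

# Reposition (lseek), then read from the new position with the same pointer.
os.lseek(fd, 6, os.SEEK_SET)
tail = os.read(fd, 5)

# Close: the descriptor is released.
os.close(fd)

# Truncate: reopening with O_TRUNC sets the length to zero but keeps the file.
fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
os.close(fd)
size_after_truncate = os.path.getsize(path)

# Delete: unlink removes the directory entry.
os.unlink(path)
still_exists = os.path.exists(path)
os.rmdir(workdir)

print(position_after_write, tail, size_after_truncate, still_exists)
```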

File Types: Extensions

A common technique for implementing file types is to include the type of the file as part of the file name. The name is split into two parts, a name and an extension, usually separated by a period character. In this way, the user and the OS can tell from the name alone what the type of a file is.
The operating system uses the extension to indicate the type of the file and the operations that can be performed on it. In DOS/Windows, for example, only files with a .exe, .com or .bat extension can be executed.
The UNIX system uses a crude magic number stored at the beginning of some files to indicate roughly the type of the file (executable program, batch file/shell script, etc.). Not all files have magic numbers, so system features cannot be based solely on this type of information. UNIX does allow file name extension hints, but these extensions are not enforced or depended on by the OS; they are mostly there to help users determine the type of a file's contents. An extension can be used or ignored by a given application.
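To make the contrast concrete, here is a minimal sketch of magic-number detection that ignores the file name entirely. It checks only two well-known leading byte sequences: the ELF signature (`0x7f` followed by `ELF`) and the `#!` marker that begins scripts. The function name `guess_type` and the sample file names are invented for the example, and a real tool such as the `file` command knows far more signatures.

```python
import os
import tempfile

def guess_type(path):
    """Guess a file's type from its leading bytes, ignoring its name."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == b"\x7fELF":
        return "ELF binary"
    if head[:2] == b"#!":
        return "script"
    return "unknown"  # many files carry no magic number at all

with tempfile.TemporaryDirectory() as workdir:
    # Deliberately misleading extensions: only the contents matter here.
    samples = {
        "report.txt": b"\x7fELF\x02\x01\x01",
        "image.jpg": b"#!/bin/sh\necho hi\n",
        "tool.exe": b"plain text\n",
    }
    guesses = {}
    for name, content in samples.items():
        path = os.path.join(workdir, name)
        with open(path, "wb") as f:
            f.write(content)
        guesses[name] = guess_type(path)

for name, kind in guesses.items():
    print(f"{name}: {kind}")
```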

The following table shows some of the commonly supported file extensions on different operating systems.

Common file types

File Types in UNIX

UNIX supports seven types of file:

  • Ordinary file: used to store data on a secondary storage device, e.g., a source program (in C) or an executable program. Every file is a sequence of bytes.
  • Directory: contains the names of other files and/or directories.
  • Block-special file: corresponds to a block-oriented device such as a disk. These files are used to access such hardware devices.
  • Character-special file: corresponds to a character-oriented device, such as a keyboard.
  • Link file (created with the ln -s command): created by the system when a symbolic link is made to an existing file, allowing you to give the existing file another name and share it without duplicating its contents.
  • FIFO (created with the mkfifo or mknod commands or system calls): enables processes to communicate with each other. A FIFO (named pipe) is an area in the kernel that allows two processes to communicate, provided they are running on the same system; the processes do not have to be related to each other.
  • Socket (created with the socket system call in BSD-compliant systems): can be used by processes on the same computer or on different computers to communicate with each other.
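The type of a file is recorded in the mode field of its inode, so a program can tell these seven types apart. The following sketch, which assumes a UNIX-like system, uses `os.lstat` and the predicates in Python's `stat` module; the function name `file_type` and the sample names are hypothetical.

```python
import os
import stat
import tempfile

def file_type(path):
    """Classify a path into one of the seven UNIX file types."""
    mode = os.lstat(path).st_mode  # lstat does not follow symbolic links
    if stat.S_ISREG(mode):
        return "ordinary file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISBLK(mode):
        return "block-special file"
    if stat.S_ISCHR(mode):
        return "character-special file"
    if stat.S_ISLNK(mode):
        return "link file"
    if stat.S_ISFIFO(mode):
        return "FIFO"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "unknown"

with tempfile.TemporaryDirectory() as workdir:
    regular = os.path.join(workdir, "data.txt")
    link = os.path.join(workdir, "data-link")
    fifo = os.path.join(workdir, "pipe")
    open(regular, "w").close()
    os.symlink(regular, link)   # same effect as: ln -s data.txt data-link
    os.mkfifo(fifo)             # same effect as: mkfifo pipe
    types = {
        "dir": file_type(workdir),
        "regular": file_type(regular),
        "link": file_type(link),
        "fifo": file_type(fifo),
    }

print(types)
```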

File Access

Files store information that can be accessed in several ways:

Sequential Access

Information in the file is processed in order, one record after the other. A read operation reads the next portion of the file and automatically advances a file pointer, which tracks the I/O location. Similarly, a write operation appends to the end of the file and advances to the end of the newly written material. Such a file can be reset to the beginning, and on some systems a program may be able to skip forward or backward n records.

Sequential Access File

Direct Access

A file is made up of fixed-length logical records that allow programs to read and write records in no particular order. For the direct-access method, the file operations must be modified to include the block number as a parameter (for instance, read n and write n, where n is the relative block number). An alternative approach is to retain read next and write next and to add an operation, position file to n, where n is the block number. The block number provided by the user to the OS is normally a relative block number, that is, an index relative to the beginning of the file.

Sequential Access on a Direct Access File
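A minimal sketch of both ideas follows: `read_n` and `write_n` take a relative block number, and sequential access is simulated on top of them by keeping a current-position counter `cp`. The class name, the 16-byte record size and the space padding are assumptions made for this example, not part of any particular system.

```python
import os
import tempfile

RECORD_SIZE = 16  # bytes per fixed-length logical record (assumed)

class DirectAccessFile:
    """Fixed-length records with direct and simulated sequential access."""

    def __init__(self, path):
        self.f = open(path, "w+b")
        self.cp = 0  # current position, counted in records

    def read_n(self, n):
        self.f.seek(n * RECORD_SIZE)   # relative block number -> byte offset
        return self.f.read(RECORD_SIZE)

    def write_n(self, n, data):
        self.f.seek(n * RECORD_SIZE)
        self.f.write(data.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])

    # Sequential operations expressed in terms of the direct ones.
    def reset(self):
        self.cp = 0

    def read_next(self):
        record = self.read_n(self.cp)
        self.cp += 1
        return record

    def write_next(self, data):
        self.write_n(self.cp, data)
        self.cp += 1

    def close(self):
        self.f.close()

with tempfile.TemporaryDirectory() as workdir:
    daf = DirectAccessFile(os.path.join(workdir, "records.dat"))
    for name in (b"alpha", b"beta", b"gamma"):
        daf.write_next(name)
    third = daf.read_n(2).rstrip()       # direct access, no particular order
    daf.reset()
    first = daf.read_next().rstrip()     # sequential access from the start
    second = daf.read_next().rstrip()
    daf.close()

print(third, first, second)
```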

Directory Structure

A directory structure is a collection of directory entries. To manage all the data, disks are first split into one or more partitions. Each partition contains information about the files within it. This information is kept in a device directory or volume table of contents.

Directory Structure

Directory Operations

The following directory operations are commonly supported in contemporary operating systems. Next to each operation are UNIX system calls or commands for the corresponding operation.

  • Create — mkdir
  • Open — opendir
  • Read — readdir
  • Rewind — rewinddir
  • Close — closedir
  • Delete — rmdir
  • Change Directory — cd
  • List — ls
  • Search
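Several of these operations can be tried from Python. In the sketch below, `os.mkdir` and `os.rmdir` wrap the system calls of the same names, and `os.scandir` plays the role of the open/read/close sequence (on UNIX-like systems it is typically built on opendir, readdir and closedir). The directory and file names are arbitrary. Each entry yields a name and an inode number, which matches the UNIX directory entry layout.

```python
import os
import tempfile

base = tempfile.mkdtemp()
project = os.path.join(base, "project")

os.mkdir(project)  # Create
for name in ("a.txt", "b.txt"):
    open(os.path.join(project, name), "w").close()

# Open, Read and Close: iterate over the directory entries.
with os.scandir(project) as entries:
    listing = {entry.name: entry.inode() for entry in entries}

# Search: look for a name among the entries that were read.
found = "b.txt" in listing

# Delete: rmdir only succeeds on an empty directory, so unlink the files first.
for name in listing:
    os.unlink(os.path.join(project, name))
os.rmdir(project)
removed = not os.path.exists(project)
os.rmdir(base)

print(sorted(listing), found, removed)
```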

Directory Structure

When considering a particular directory structure we need to consider the following issues:

  1. Efficient Searching
  2. Naming – should be convenient to users
    • Two users can have same name for different files
    • The same file can have several different names
  3. Grouping – logical grouping of files by properties (e.g., all Java programs, all games, ...)

Single-Level Directory

All files are contained in the same directory, which is easy to support and understand.
However, this structure has limitations when the number of files increases or the system has more than one user. Since all the files are in the same directory, they must have unique names.

Single-level directory structure

Two-Level Directory

There is a separate directory for each user.

Two-level directory structure

When a user refers to a particular file, only his own user file directory (UFD) is searched. Thus different users can have the same file name as long as the file names within each UFD are unique. This directory structure allows efficient searching.
However, this structure effectively isolates one user from another, hence provides no grouping capability.

Tree Directory

Here is the tree directory structure. Each user has his/her own directory (known as user’s home directory) under which he/she can create a complete directory tree of his/her own.

Tree directory structure

The tree has a root directory. Every file in the system has a unique pathname. A pathname is the path from the root, through all the subdirectories, to a specified file. A directory or subdirectory contains a set of files or subdirectories. In normal use, each user has a current directory. The current directory should contain most of the files that are of current interest to the user. When a reference to a file is made, the current directory is searched. If a file is needed that is not in the current directory, then the user must either specify a pathname or change the current directory to the directory holding the file (using the cd command, which invokes the chdir system call). This structure hence supports efficient searching. Allowing users to define their own subdirectories permits them to impose a structure on their files. Users can also access the files of other users.

UNIX / Linux Notations and Concepts

  • Root directory (/)
  • Home directory
    • ~, $HOME, $home
    • cd ~
    • cd
  • Current/working directory (.)
    • pwd
  • Parent of Current Directory (..)
  • Absolute Pathname
    • Starts with the root directory
    • For example, /etc, /bin, /usr/bin, /etc/passwd, /home/students/ibraheem
  • Relative Pathname
    • Starts with the current directory or a user’s home directory
    • For example, ~/courses/cs604, ./a.out
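The notations above can be tied together with a small sketch that turns any of these pathname forms into an absolute pathname. The function `resolve` is a hypothetical helper written for this example; it works purely on the text of the path (using Python's `posixpath` module) and does not consult the file system, so it ignores symbolic links.

```python
import posixpath

def resolve(path, cwd, home):
    """Turn a pathname into an absolute one, given working and home directories."""
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]            # ~ stands for the home directory
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)  # relative paths start at the current directory
    return posixpath.normpath(path)       # collapse "." and ".." components

home = "/home/students/ibraheem"
cwd = "/home/students/ibraheem/courses/cs604"

absolute = resolve("/etc/passwd", cwd, home)
from_home = resolve("~/courses/cs604", cwd, home)
from_cwd = resolve("./a.out", cwd, home)
from_parent = resolve("../cs601", cwd, home)

print(absolute)
print(from_home)
print(from_cwd)
print(from_parent)
```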