Notes on Distributed File Systems

A DFS should look to a client like a conventional, centralized file system.

Naming

Naming is a mapping between logical and physical objects

location transparency - the name does not reveal where the file is located

location independence - the name does not change if the file is moved (file mobility or migration). Supported by Andrew but not by NFS.
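As a rough illustration of location independence, the sketch below keeps a table from logical names to physical locations. The class NameService and the server names and paths are hypothetical, invented only for this example; they are not taken from Andrew or NFS.

```python
# Minimal sketch (hypothetical names throughout): a name service that maps
# logical file names to physical locations.
class NameService:
    def __init__(self):
        # logical name -> (server, physical path)
        self._locations = {}

    def register(self, name, server, path):
        self._locations[name] = (server, path)

    def resolve(self, name):
        # Clients only ever present the logical name.
        return self._locations[name]

    def migrate(self, name, new_server, new_path):
        # Only the mapping changes; the logical name stays the same.
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = (new_server, new_path)


ns = NameService()
ns.register("/projects/report.txt", "serverA", "/disk1/f0042")
before = ns.resolve("/projects/report.txt")

# The file is moved to another server; clients keep using the same name.
ns.migrate("/projects/report.txt", "serverB", "/disk7/f0913")
after = ns.resolve("/projects/report.txt")
```

The logical name contains no server name (location transparency) and stays valid after the move (location independence).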

Naming Schemes

host:local_name - not location transparent or independent

mount remote system on a local directory - forest of trees; a file's name is not always identical on all systems because it depends on where the remote system is mounted.

single tree on all systems - special filenames (such as /dev) complicate things; requires full system coordination
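The mount-based scheme can be sketched as a per-client mount table with longest-prefix matching. This is only an illustration; MountTable, the host fs1, and all paths are assumptions made up for the example. It shows why the same remote file can have different names on different clients.

```python
# Minimal sketch (hypothetical names): resolving a local path through a
# per-client mount table.
class MountTable:
    def __init__(self):
        # local mount point -> (host, remote directory)
        self._mounts = {}

    def mount(self, mount_point, host, remote_dir):
        self._mounts[mount_point.rstrip("/")] = (host, remote_dir.rstrip("/"))

    def resolve(self, path):
        # The longest mount point that is a prefix of the path wins.
        best = None
        for mp in self._mounts:
            if path == mp or path.startswith(mp + "/"):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            return ("localhost", path)  # not under any mount: a local file
        host, remote_dir = self._mounts[best]
        return (host, remote_dir + path[len(best):])


# Two clients mount the same remote directory at different places.
client_a = MountTable()
client_a.mount("/mnt/proj", "fs1", "/export/proj")
client_b = MountTable()
client_b.mount("/remote/work", "fs1", "/export/proj")

a = client_a.resolve("/mnt/proj/notes.txt")
b = client_b.resolve("/remote/work/notes.txt")
c = client_a.resolve("/etc/hosts")
```

Both a and b refer to the same file on fs1, yet the two clients use different names for it.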

Remote Access methods

Block based access

File caching

Fault tolerance

Failure transparency - operation continues even if a client, the server, or the network temporarily fails

Stateless or stateful servers

replication

Semantics of Sharing

Unix semantics - every read of a file sees the effects of all previous writes. It is possible for clients to share the pointer into a file.

session semantics - writes to an open file are immediately visible to local clients but invisible to remote clients. Once a file is closed, all changes are visible to new opens.

Immutable shared file semantics - Files cannot be changed

transaction-like semantics
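Session semantics, as described above, can be sketched with a toy in-memory model. The classes Server and Session are hypothetical, and the sketch simplifies by treating each session as one client with a private copy; sharing among local processes on the same machine is not modeled.

```python
# Minimal sketch (hypothetical names): session semantics with private copies.
class Server:
    def __init__(self):
        self.files = {}  # file name -> bytes


class Session:
    def __init__(self, server, name):
        self._server = server
        self._name = name
        # Opening takes a snapshot of the file as it is on the server now.
        self._data = server.files.get(name, b"")

    def read(self):
        return self._data

    def write(self, data):
        # The change stays inside this session until close().
        self._data = data

    def close(self):
        # Closing publishes the session's copy to the server.
        self._server.files[self._name] = self._data


server = Server()
server.files["f"] = b"v1"

s1 = Session(server, "f")
s2 = Session(server, "f")
s1.write(b"v2")
seen_by_s2_before_close = s2.read()  # s1's write is invisible to s2

s1.close()
seen_by_s2_after_close = s2.read()   # s2 opened earlier, keeps its snapshot

s3 = Session(server, "f")
seen_by_new_open = s3.read()         # a new open sees the closed changes
```

Under Unix semantics, by contrast, s2 would have seen the write immediately.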

Scalability

multi-threaded or single-threaded servers

Security

Andrew File System

The design of AFS assumes

Most files are small (<10K)

Most files are used only by one user.

Reads occur about 6 times more often than writes

Most writes are by the file owner. Few files have shared writes.

Files are referenced in bursts (temporal locality)

AFS operates by:

When a file is opened, the entire file is copied from the server to the local disk.

The local copy is opened for the user.

Changes are made to the local copy.

When the file is closed, updates are written back to the server.
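The four steps above can be sketched as a toy whole-file cache. This is not the AFS implementation: FileServer, CachingClient, and their methods are hypothetical names, dictionaries stand in for the server's storage and the client's local disk, and cache validation and failure handling are left out.

```python
# Minimal sketch (hypothetical names): open copies the whole file to the
# client, reads and writes are local, close writes the file back.
class FileServer:
    def __init__(self):
        self.files = {}    # file name -> bytes
        self.fetches = 0   # number of whole-file transfers to clients

    def fetch(self, name):
        self.fetches += 1
        return self.files.get(name, b"")

    def store(self, name, data):
        self.files[name] = data


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.local_disk = {}  # stand-in for the client's local disk

    def open(self, name):
        # Step 1 and 2: copy the entire file, then open the local copy.
        self.local_disk[name] = bytearray(self.server.fetch(name))
        return name  # the name doubles as a handle in this sketch

    def read(self, handle):
        return bytes(self.local_disk[handle])

    def append(self, handle, data):
        # Step 3: changes touch only the local copy.
        self.local_disk[handle] += data

    def close(self, handle):
        # Step 4: write the updated file back to the server.
        self.server.store(handle, bytes(self.local_disk[handle]))


server = FileServer()
server.files["notes.txt"] = b"hello"
client = CachingClient(server)

h = client.open("notes.txt")
client.append(h, b" world")
local_copy = client.read(h)
server_copy_while_open = server.files["notes.txt"]
client.close(h)
server_copy_after_close = server.files["notes.txt"]
```

All reads and writes between open and close are served locally, which suits the assumed workload of small files, mostly used by one user, referenced in bursts.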