7.Programmer’s View of Files§

Java File Functions

RandomAccessFile(String name, String mode)

close()

read(byte[] b)

write(byte[] b)

seek(long pos)

Primary vs. Secondary Storage

Comparisons

\[\begin{split}\begin{array}{l|r|r|r|r|r|r|r} \hline \textbf{Medium}& 1996 & 1997 & 2000 & 2004 & 2006 & 2008 & 2011\\ \hline \textbf{RAM}& \$45.00 & 7.00 & 1.500 & 0.3500 & 0.1500 & 0.0339 & 0.0138\\ \textbf{Disk}& 0.25 & 0.10 & 0.010 & 0.0010 & 0.0005 & 0.0001 & 0.0001\\ \textbf{USB drive}& -- & -- & -- & 0.1000 & 0.0900 & 0.0029 & 0.0018\\ \textbf{Floppy}& 0.50 & 0.36 & 0.250 & 0.2500 & -- & -- & --\\ \textbf{Tape}& 0.03 & 0.01 & 0.001 & 0.0003 & -- & -- & --\\ \textbf{Solid State}& -- & -- & -- & -- & -- & -- & 0.0021\\ \hline \end{array}\end{split}\]

Golden Rule of File Processing

Disk Drives

Disk drive platters

Sectors

The organization of a disk platter

Terms

Seek Time

Other Factors

(Old) Disk Spec Example

Disk Access Cost Example (1)

Disk Access Cost Example (2)

How Much to Read?

Newer Disk Spec Example

Buffers

Buffer Pools

Buffer Pools

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Organizing Buffer Pools

LRU

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Dirty Bit

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Bufferpool ADT: Message Passing

// ADT for buffer pools using the message-passing style
public interface BufferPoolADT {
  // Copy "sz" bytes from "space" to position "pos" in the buffered storage
  public void insert(byte[] space, int sz, int pos);

  // Copy "sz" bytes from position "pos" of the buffered storage to "space"
  public void getbytes(byte[] space, int sz, int pos);
}

Bufferpool ADT: Buffer Passing

// ADT for buffer pools using the buffer-passing style
public interface BufferPoolADT {
  // Return pointer to the requested block
  public byte[] getblock(int block);

  // Set the dirty bit for the buffer holding "block"
  public void dirtyblock(int block);

  // Tell the size of a buffer
  public int blocksize();
};

Design Issues

Some Goals

Improved Interface

// Improved ADT for buffer pools using the buffer-passing style.
// Most user functionality is in the buffer class, not the buffer pool itself.

// A single buffer in the buffer pool
public interface BufferADT {
  // Read the associated block from disk (if necessary) and return a
  // pointer to the data
  public byte[] readBlock();

  // Return a pointer to the buffer's data array (without reading from disk)
  public byte[] getDataPointer();

  // Flag buffer's contents as having changed, so that flushing the
  // block will write it back to disk
  public void markDirty();

  // Release the block's access to this buffer. Further accesses to
  // this buffer are illegal
  public void releaseBuffer();
}

Improved Interface (2)

public interface BufferPoolADT {

  // Relate a block to a buffer, returning a pointer to a buffer object
  Buffer acquireBuffer(int block);
}

External Sorting

Model of External Computation

Key Sorting

Simple External Mergesort (1)

Simple External Mergesort (2)

1. Split the file into two files.
2. Read in a block from each file.
3. Take first record from each block, output them in sorted order.
4. Take next record from each block, output them to a second file in sorted order.
5. Repeat until finished, alternating between output files. Read new input blocks as needed.
6. Repeat steps 2-5, except this time input files have runs of two sorted records that are merged together.
7. Each pass through the files provides larger runs.

Simple External Mergesort (3)

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Problems with Simple Mergesort

A Better Process

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Breaking a File into Runs

Replacement Selection (1)

Replacement Selection (2)

RS Example

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Snowplow Analogy (1)

Snowplow Analogy (2)

Problems with Simple Merge

Multiway Merge (1)

Multiway Merge (2)

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

Limits to Multiway Merge (1)

Limits to Multiway Merge (2)

General Principles