Processing math: 100%

7.Programmer’s View of Files§

Java File Functions

RandomAccessFile(String name, String mode)

close()

read(byte[] b)

write(byte[] b)

seek(long pos)

Primary vs. Secondary Storage

Comparisons

Medium1996199720002004200620082011RAM$45.007.001.5000.35000.15000.03390.0138Disk0.250.100.0100.00100.00050.00010.0001USB drive0.10000.09000.00290.0018Floppy0.500.360.2500.2500Tape0.030.010.0010.0003Solid State0.0021

Golden Rule of File Processing

Disk Drives

Disk drive platters

Sectors

The organization of a disk platter

Terms

Seek Time

Other Factors

(Old) Disk Spec Example

Disk Access Cost Example (1)

Disk Access Cost Example (2)

How Much to Read?

Newer Disk Spec Example

Buffers

Buffer Pools

Buffer Pools

1 / 9 Settings
<<<>>>

Here is a basic example of using a buffer pool. It also illustrates a basic problem that a buffer pool implementation needs to solve. The following series of memory requests will be processed: 9 0 1 7 6 6 0 8 1

Created with Raphaël 2.1.2
  1. AAAAAA0
  2. BBBBBB1
  3. CCCCCC2
  4. DDDDDD3
  5. EEEEEE4
  6. FFFFFF5
  7. GGGGGG6
  8. HHHHHH7
  9. IIIIII8
  10. JJJJJJ9
  1. 0
  2. 1
  3. 2
  4. 3
  5. 4
Disk File
Buffer Pool
Buffers
Proficient Saving... Error Saving
Server Error
Resubmit

Organizing Buffer Pools

LRU

1 / 11 Settings
<<<>>>

Here is an example of buffer pool replacement using the LRU replacement heuristic. The following series of memory requests will be processed: 9 0 1 7 6 6 0 8 1

Created with Raphaël 2.1.2
  1. AAAAAA0
  2. BBBBBB1
  3. CCCCCC2
  4. DDDDDD3
  5. EEEEEE4
  6. FFFFFF5
  7. GGGGGG6
  8. HHHHHH7
  9. IIIIII8
  10. JJJJJJ9
  1. 0
  2. 1
  3. 2
  4. 3
  5. 4
Disk File
Buffer Pool
Buffers
Proficient Saving... Error Saving
Server Error
Resubmit

Dirty Bit

1 / 11 Settings
<<<>>>

Let's see an example of buffer pool replacement using the LRU replacement heuristic and a dirty bit. The following series of memory requests will be processed: 9 0 1 7 6 6 8 1 4

Created with Raphaël 2.1.2
  1. AAAAAA0
  2. BBBBBB1
  3. CCCCCC2
  4. DDDDDD3
  5. EEEEEE4
  6. FFFFFF5
  7. GGGGGG6
  8. HHHHHH7
  9. IIIIII8
  10. JJJJJJ9
  1. 0
  1. 0
  1. 0
  1. 0
  1. 0
Disk File
Buffer Pool
Buffers
Proficient Saving... Error Saving
Server Error
Resubmit

Bufferpool ADT: Message Passing

// ADT for buffer pools using the message-passing style
public interface BufferPoolADT {
  // Copy "sz" bytes from "space" to position "pos" in the buffered storage
  public void insert(byte[] space, int sz, int pos);

  // Copy "sz" bytes from position "pos" of the buffered storage to "space"
  public void getbytes(byte[] space, int sz, int pos);
}

Bufferpool ADT: Buffer Passing

// ADT for buffer pools using the buffer-passing style
public interface BufferPoolADT {
  // Return pointer to the requested block
  public byte[] getblock(int block);

  // Set the dirty bit for the buffer holding "block"
  public void dirtyblock(int block);

  // Tell the size of a buffer
  public int blocksize();
};

Design Issues

Some Goals

Improved Interface

// Improved ADT for buffer pools using the buffer-passing style.
// Most user functionality is in the buffer class, not the buffer pool itself.

// A single buffer in the buffer pool
public interface BufferADT {
  // Read the associated block from disk (if necessary) and return a
  // pointer to the data
  public byte[] readBlock();

  // Return a pointer to the buffer's data array (without reading from disk)
  public byte[] getDataPointer();

  // Flag buffer's contents as having changed, so that flushing the
  // block will write it back to disk
  public void markDirty();

  // Release the block's access to this buffer. Further accesses to
  // this buffer are illegal
  public void releaseBuffer();
}

Improved Interface (2)

public interface BufferPoolADT {

  // Relate a block to a buffer, returning a pointer to a buffer object
  Buffer acquireBuffer(int block);
}

External Sorting

Model of External Computation

Key Sorting

Simple External Mergesort (1)

Simple External Mergesort (2)

  1. Split the file into two files.
  2. Read in a block from each file.
  3. Take first record from each block, output them in sorted order.
  4. Take next record from each block, output them to a second file in sorted order.
  5. Repeat until finished, alternating between output files. Read new input blocks as needed.
  6. Repeat steps 2-5, except this time input files have runs of two sorted records that are merged together.
  7. Each pass through the files provides larger runs.

Simple External Mergesort (3)

1 / 16 Settings
<<<>>>

Our approach to external sorting is derived from the Mergesort algorithm. The simplest form of external Mergesort performs a series of sequential passes over the records, merging larger and larger sublists on each pass.

Created with Raphaël 2.1.2
  1. 36
  2. 17
  3. 28
  4. 23
  1. 20
  2. 13
  3. 14
  4. 15
Input
Output
Runs of length 1
Runs of length 2
Proficient Saving... Error Saving
Server Error
Resubmit

Problems with Simple Mergesort

A Better Process

1 / 9 Settings
<<<>>>

Assume that each record has four bytes of data and a 4-byte key, for a total of eight bytes per record:

Created with Raphaël 2.1.2
8-byte record
4-byte key
4-bytes of data
Proficient Saving... Error Saving
Server Error
Resubmit

Breaking a File into Runs

Replacement Selection (1)

Replacement Selection (2)

Created with Raphaël 2.1.2
Input File
Input Buffer
RAM
Output Buffer
Output Run File

RS Example

1 / 27 Settings
<<<>>>

This is our starting heap:

  1. 120
  2. 191
  3. 312
  4. 253
  5. 214
  6. 565
  7. 406
Created with Raphaël 2.1.2
12
19
31
25
21
56
40
Input:
Output:
  1. 16
  2. 29
  3. 14
  4. 35
  5. 23
Proficient Saving... Error Saving
Server Error
Resubmit

Snowplow Analogy (1)

Snowplow Analogy (2)

Created with Raphaël 2.1.2
Falling Snow
Existing snow
Future snow
Start time T
Snowplow Movement

Problems with Simple Merge

Multiway Merge (1)

Multiway Merge (2)

1 / 31 Settings
<<<>>>

Here are our starting input runs for the multiway merge.

  1. 5
  2. 10
  3. 15
  1. 6
  2. 7
  3. 23
  1. 12
  2. 18
  3. 20
Input Runs
Output Buffer
Proficient Saving... Error Saving
Server Error
Resubmit

Limits to Multiway Merge (1)

Limits to Multiway Merge (2)

General Principles