
Algorithm Efficiency

There are often many approaches (algorithms) to solve a problem. How do we choose between them?

At the heart of computer program design are two (sometimes conflicting) goals.

  1. To design an algorithm that is easy to understand, code, and debug.
  2. To design an algorithm that makes efficient use of the computer’s resources.

Goal (1) is the concern of Software Engineering

Goal (2) is the concern of data structures and algorithm analysis

How to Measure Efficiency?

  1. Empirical comparison (run programs)
  2. Asymptotic Algorithm Analysis
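Approach (1), empirical comparison, means timing actual runs. A minimal sketch (the class and method names are mine) using System.nanoTime to measure one operation at two input sizes:

```java
import java.util.Arrays;
import java.util.Random;

public class TimeIt {
  // Time how long sorting an array of the given size takes, in nanoseconds.
  static long timeSort(int n) {
    int[] a = new Random(42).ints(n).toArray(); // fixed seed for repeatability
    long start = System.nanoTime();
    Arrays.sort(a);
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    // Empirical comparison: measure the same operation at two input sizes.
    System.out.println("n=1000:   " + timeSort(1000) + " ns");
    System.out.println("n=100000: " + timeSort(100000) + " ns");
  }
}
```

Empirical results depend on the machine, the compiler, JIT warm-up, and the particular inputs chosen; this is exactly why approach (2), asymptotic analysis, is also needed.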

Problems, Algorithms, Programs


Here is a visual summary showing how to differentiate between a problem, a problem instance, an algorithm, and a program.


Growth Rate Example (1)

Example 1: Find largest value

// Return position of largest value in integer array A
static int largest(int[] A) {
  int currlarge = 0;             // Position of largest element seen
  for (int i=1; i<A.length; i++) { // For each element
    if (A[currlarge] < A[i]) {    //   if A[i] is larger
       currlarge = i;            //     remember its position
    }
  }
  return currlarge;              // Return position of largest
}
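A quick check of largest in context (the wrapper class name Largest is mine). The loop examines each of the n array elements once, which is the basis of its growth rate:

```java
public class Largest {
  // Return position of largest value in integer array A
  static int largest(int[] A) {
    int currlarge = 0;                 // Position of largest element seen
    for (int i = 1; i < A.length; i++) { // For each element
      if (A[currlarge] < A[i]) {       //   if A[i] is larger
        currlarge = i;                 //     remember its position
      }
    }
    return currlarge;                  // Return position of largest
  }

  public static void main(String[] args) {
    int[] A = {7, 2, 9, 4};
    System.out.println(largest(A)); // prints 2 (the position of 9)
  }
}
```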

Growth Rate Example (2)

Example 2: Assignment statement

Example 3: Double loop

sum = 0;
for (j=1; j<=n; j++) {     // First for loop
  for (i=1; i<=j; i++) {  //   is a double loop
    sum++;
  }
}
for (k=0; k<n; k++) {      // Second for loop
   A[k] = k;
}

Growth Rate Graph

The growth rates for five equations

Best, Worst, Average Cases

Not all inputs of a given size take the same time to run.

Sequential search for K in an array of n integers:

Best case: K is in the first position. 1 comparison.

Worst case: K is in the last position, or not in the array at all. n comparisons.

Average case: If K is equally likely to be in any position, about n/2 comparisons.
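A minimal sequential search sketch (the class name and the not-found convention of returning A.length are mine), annotated with where the best and worst cases arise:

```java
public class SeqSearch {
  // Return first position of K in A, or A.length if K is not found.
  static int find(int[] A, int K) {
    for (int i = 0; i < A.length; i++) {
      if (A[i] == K) {
        return i;        // Best case: K in first position, 1 comparison
      }
    }
    return A.length;     // Worst case: n comparisons
  }

  public static void main(String[] args) {
    int[] A = {4, 8, 15, 16, 23, 42};
    System.out.println(find(A, 15)); // prints 2
    System.out.println(find(A, 99)); // prints 6 (not found)
  }
}
```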

Which Analysis to Use?

Average case is often the fairest measure, but it requires knowing the distribution of inputs and can be hard to compute. When in doubt, use worst-case analysis: it gives a guarantee.

Faster Computer or Algorithm?

Suppose we buy a computer 10 times faster.

Faster Computer or Algorithm? 2

(Here n is the largest problem size solvable in a fixed time on the old machine, and n' is the size solvable in the same time on the machine that is 10 times faster.)

f(n)       n      n'      Change                 n'/n
10n        1000   10,000  n' = 10n               10
20n        500    5000    n' = 10n               10
5n log n   250    1842    sqrt(10)n < n' < 10n   7.37
2n^2       70     223     n' = sqrt(10)n         3.16
2^n        13     16      n' = n + 3             --

Asymptotic Analysis: Big-oh

Definition: For T(n) a non-negatively valued function, T(n) is in the set O(f(n)) if there exist two positive constants c and n_0 such that T(n) \le cf(n) for all n > n_0.

Use: The algorithm is in O(n^2) in [best, average, worst] case.

Meaning: For all data sets big enough (i.e., n > n_0), the algorithm always executes in no more than cf(n) steps in the [best, average, worst] case.
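The definition can be checked mechanically for a concrete case. Suppose T(n) = 3n^2 + 20 (a made-up cost function). Picking f(n) = n^2, c = 4, and n_0 = 4 works, since 3n^2 + 20 \le 4n^2 whenever n^2 \ge 20, i.e., n \ge 5. A sketch that tests T(n) \le c f(n) over a finite range (the real claim, for all n > n_0, is what the algebra above proves; the loop is only a sanity check):

```java
public class BigOhCheck {
  static long T(long n) { return 3 * n * n + 20; } // hypothetical cost function
  static long f(long n) { return n * n; }          // candidate bound f(n) = n^2

  // Check T(n) <= c * f(n) for all n in (n0, limit].
  static boolean satisfiesBigOh(long c, long n0, long limit) {
    for (long n = n0 + 1; n <= limit; n++) {
      if (T(n) > c * f(n)) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(satisfiesBigOh(4, 4, 1_000_000)); // prints true
    System.out.println(satisfiesBigOh(3, 4, 1_000_000)); // prints false: 3n^2 + 20 > 3n^2
  }
}
```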

Big-oh Notation (cont)

Big-oh notation indicates an upper bound.

Example: If T(n) = 3n^2 then T(n) is in O(n^2).

Look for the tightest upper bound: T(n) = 3n^2 is also in O(n^3), but O(n^2) is the tighter, more informative bound.

Big-Oh Examples

Example 1: Finding value X in an array (average cost).

If c_s is the cost to examine one element, then T(n) = c_s n/2.

For all values of n > 1, c_s n/2 \le c_s n.

Therefore, the definition is satisfied for f(n) = n, n_0 = 1, and c = c_s. Hence, T(n) is in O(n).

Big-Oh Examples (2)

Example 2: Suppose T(n) = c_1 n^2 + c_2 n, where c_1 and c_2 are positive.

c_1 n^2 + c_2 n \le c_1 n^2 + c_2 n^2 = (c_1 + c_2) n^2 for all n > 1.

Then T(n) \le c n^2 whenever n > n_0, for c = c_1 + c_2 and n_0 = 1.

Therefore, T(n) is in O(n^2) by definition.

Example 3: T(n)=c. Then T(n) is in O(1).

A Common Misunderstanding

“The best case for my algorithm is n=1 because that is the fastest.”

WRONG!

Big-oh refers to a growth rate as n grows to \infty.

Best case is defined for the input of size n that is cheapest among all inputs of size n.

Big-Omega Ω

Definition: For T(n) a non-negatively valued function, T(n) is in the set \Omega(g(n)) if there exist two positive constants c and n_0 such that T(n) \ge cg(n) for all n > n_0.

Meaning: For all data sets big enough (i.e., n > n_0), the algorithm always requires at least cg(n) steps.

Lower bound.

Big-Omega Example

T(n) = c_1 n^2 + c_2 n.

c_1 n^2 + c_2 n \ge c_1 n^2 for all n > 1.

T(n) \ge c n^2 for c = c_1 and n_0 = 1.

Therefore, T(n) is in \Omega(n^2) by the definition.

We want the greatest lower bound.

Theta Notation Θ

When big-Oh and Ω coincide, we indicate this by using Θ (big-Theta) notation.

Definition: An algorithm is said to be in Θ(h(n)) if it is in O(h(n)) and it is in Ω(h(n)).

A Common Misunderstanding

Confusing worst case with upper bound.

Upper bound refers to a growth rate.

Worst case refers to the worst input from among the choices for possible inputs of a given size.

Simplifying Rules

  1. If f(n) is in O(g(n)) and g(n) is in O(h(n)), then f(n) is in O(h(n)).
  2. If f(n) is in O(kg(n)) for some constant k > 0, then f(n) is in O(g(n)).
  3. If f_1(n) is in O(g_1(n)) and f_2(n) is in O(g_2(n)), then (f_1 + f_2)(n) is in O(max(g_1(n), g_2(n))).
  4. If f_1(n) is in O(g_1(n)) and f_2(n) is in O(g_2(n)), then f_1(n) f_2(n) is in O(g_1(n) g_2(n)).

Summary

[Figure: Costs for all inputs of an arbitrary (but fixed) size n for three representative algorithms. Towers of Hanoi: every input of size n costs 2^n - 1. FindMax: every input of size n costs n. Find: inputs of size n range in cost from 1 to n.]

[Figure: Costs, as n grows, for some representative algorithms. Towers of Hanoi: T(n) = 2^n - 1. FindMax: T(n) = n. Find (best case): 1. Find (average case): between 1 and n. Find (worst case): n.]


Time Complexity Examples (1)

Example: a = b;

This assignment takes constant time, so it is \Theta(1).

Example:

sum = 0;
for (i=1; i<=n; i++) {
   sum += n;
}

The loop body takes constant time and executes n times, so this fragment is \Theta(n).

Time Complexity Examples (2)

Example:

sum = 0;
for (j=1; j<=n; j++) {     // First for loop
  for (i=1; i<=j; i++) {  //   is a double loop
    sum++;
  }
}
for (k=0; k<n; k++) {      // Second for loop
   A[k] = k;
}

The first (double) loop executes sum++ exactly 1 + 2 + ... + n = n(n+1)/2 times, so it is \Theta(n^2); the second loop is \Theta(n). The whole fragment is \Theta(n^2).

Time Complexity Examples (3)

Example: Compare these two code fragments:

sum1 = 0;
for (i=1; i<=n; i++) {     // First double loop
  for (j=1; j<=n; j++) {  //   do n times
    sum1++;
  }
}

sum2 = 0;
for (i=1; i<=n; i++) {     // Second double loop
  for (j=1; j<=i; j++) {  //   do i times
    sum2++;
  }
}
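Counting the executions of sum1++ and sum2++ directly shows the two fragments differ only by a constant factor: sum1 = n^2 while sum2 = n(n+1)/2, so both are \Theta(n^2). A sketch (the class and method names are mine):

```java
public class LoopCounts {
  static long count1(int n) {      // first double loop: inner body runs n times per i
    long sum1 = 0;
    for (int i = 1; i <= n; i++)
      for (int j = 1; j <= n; j++)
        sum1++;
    return sum1;
  }

  static long count2(int n) {      // second double loop: inner body runs i times per i
    long sum2 = 0;
    for (int i = 1; i <= n; i++)
      for (int j = 1; j <= i; j++)
        sum2++;
    return sum2;
  }

  public static void main(String[] args) {
    int n = 1000;
    System.out.println(count1(n)); // prints 1000000 (= n^2)
    System.out.println(count2(n)); // prints 500500  (= n(n+1)/2)
  }
}
```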

Time Complexity Examples (4)

Not all double loops are \Theta(n^2).

sum1 = 0;
for (k=1; k<=n; k*=2) {    // Do log n times
  for (j=1; j<=n; j++) {  // Do n times
    sum1++;
  }
}

sum2 = 0;
for (k=1; k<=n; k*=2) {    // Do log n times
  for (j=1; j<=k; j++) {  // Do k times
    sum2++;
  }
}
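Counting iterations confirms the difference between the two fragments above: when n is a power of 2, the first gives sum1 = n(log_2 n + 1), which is \Theta(n \log n), while the second gives sum2 = 1 + 2 + 4 + ... + n = 2n - 1, which is \Theta(n). A sketch (class and method names are mine):

```java
public class LogLoops {
  static long count1(int n) {       // inner loop does n work, log n + 1 times
    long sum1 = 0;
    for (int k = 1; k <= n; k *= 2)
      for (int j = 1; j <= n; j++)
        sum1++;
    return sum1;
  }

  static long count2(int n) {       // inner loop does k work: 1 + 2 + 4 + ... + n
    long sum2 = 0;
    for (int k = 1; k <= n; k *= 2)
      for (int j = 1; j <= k; j++)
        sum2++;
    return sum2;
  }

  public static void main(String[] args) {
    int n = 1024;                   // a power of 2, so log2(n) = 10
    System.out.println(count1(n));  // prints 11264 (= n * (log2(n) + 1))
    System.out.println(count2(n));  // prints 2047  (= 2n - 1)
  }
}
```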

Other Control Statements

while loop: Analyze like a for loop.

if statement: Take greater complexity of then/else clauses.

switch statement: Take complexity of most expensive case.

Subroutine call: Complexity of the subroutine.

Analyzing Problems

Upper bound: Upper bound of best known algorithm.

Lower bound: Lower bound for every possible algorithm.

Analyzing Problems: Example

May or may not be able to obtain matching upper and lower bounds.

Example of imperfect knowledge: Sorting

  1. Cost of I/O: \Omega(n).
  2. Bubble or insertion sort: O(n^2).
  3. A better sort (Quicksort, Mergesort, Heapsort, etc.): O(n \log n).
  4. We prove later that sorting is in \Omega(n \log n).

Space/Time Tradeoff Principle

One can often reduce time if one is willing to sacrifice space, or vice versa.

Disk-based Space/Time Tradeoff Principle: The smaller you make the disk storage requirements, the faster your program will run.
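A classic instance of the space/time tradeoff is a lookup table: spend extra space once so that each later query is \Theta(1) instead of being recomputed. A sketch (the bit-counting example is my illustration, not from the source) that precomputes the number of 1 bits in every byte value:

```java
public class BitCountTable {
  // Space: a 256-entry table, filled once.
  static final int[] ONES = new int[256];
  static {
    for (int v = 1; v < 256; v++)
      ONES[v] = (v & 1) + ONES[v >> 1]; // ones(v) = low bit + ones(v/2)
  }

  // Time: counting the 1 bits of a 32-bit int is now four table lookups.
  static int ones(int x) {
    return ONES[x & 0xFF] + ONES[(x >>> 8) & 0xFF]
         + ONES[(x >>> 16) & 0xFF] + ONES[(x >>> 24) & 0xFF];
  }

  public static void main(String[] args) {
    System.out.println(ones(7));   // prints 3
    System.out.println(ones(255)); // prints 8
  }
}
```

Dropping the table saves the 256 ints of space but forces a bit-by-bit loop on every call: less space, more time.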

Multiple Parameters

Compute the rank ordering for all C pixel values in a picture of P pixels.

  for (i=0; i<C; i++) {   // Initialize count
     count[i] = 0;
  }
  for (i=0; i<P; i++) {   // Look at all of the pixels
     count[value(i)]++; // Increment a pixel value count
  }
  sort(count);          // Sort pixel value counts

If we use P as the measure, then time is \Theta(P \log P).

More accurate is \Theta(P + C \log C).
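A runnable version of the fragment above (the pixel data is mine; value(i) is simulated by an array of P pixel values, each in the range 0..C-1):

```java
import java.util.Arrays;

public class PixelRank {
  // Count how often each of C possible values occurs among the P pixels,
  // then sort the counts: Theta(P + C log C) total.
  static int[] rankCounts(int[] pixels, int C) {
    int[] count = new int[C];           // Initialize count: Theta(C)
    for (int i = 0; i < pixels.length; i++) {
      count[pixels[i]]++;               // Look at all P pixels: Theta(P)
    }
    Arrays.sort(count);                 // Sort pixel value counts: Theta(C log C)
    return count;
  }

  public static void main(String[] args) {
    int[] pixels = {0, 2, 2, 1, 2, 0};  // P = 6 pixels, C = 3 possible values
    System.out.println(Arrays.toString(rankCounts(pixels, 3))); // prints [1, 2, 3]
  }
}
```

Because P and C can grow independently, neither \Theta(P) nor \Theta(C \log C) alone describes the running time; both parameters must appear in the bound.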

Space Complexity

Space complexity can also be analyzed with asymptotic complexity analysis.

Time: Algorithm

Space: Data Structure