Close
Register
Close Window

Chapter 27 Searching

Show Source |    | About   «  27.1. Chapter Introduction: Search   ::   Contents   ::   27.3. Search in Sorted Arrays  »

27.2. Analyzing Search in Unsorted Arrays

You already know the simplest form of search: the sequential search algorithm. Sequential search on an unsorted list requires \(\Theta(n)\) time in the worst case.

How many comparisons does linear search do on average? A major consideration is whether \(K\) is in list L at all. We can simplify our analysis by ignoring everything about the input except the position of \(K\) if it is found in L. Thus, we have \(n+1\) distinct possible events: That \(K\) is in one of positions 0 to \(n-1\) in L (each position having its own probability), or that it is not in \(L\) at all. We can express the probability that \(K\) is not in L as

\[\mathbf{P}(K \notin \mathbf{L}) = 1 - \sum_{i=1}^n \mathbf{P}(K = \mathbf{L}[i])\]

where \(\mathbf{P}(x)\) is the probability of event \(x\).

Let \(p_i\) be the probability that \(K\) is in position \(i\) of L (indexed from 0 to \(n-1\). For any position \(i\) in the list, we must look at \(i+1\) records to reach it. So we say that the cost when \(K\) is in position \(i\) is \(i+1\). When \(K\) is not in L, sequential search will require \(n\) comparisons. Let \(p_n\) be the probability that \(K\) is not in L. Then the average cost \(\mathbf{T}(n)\) will be

\[\mathbf{T}(n) = n p_n + \sum_{i=0}^{n-1} (i+1) p_i.\]

What happens to the equation if we assume all the \(p_i\) 's are equal (except \(p_n\))?

\[\begin{split}\mathbf{T}(n) &=& p_n n + \sum_{i=0}^{n-1} (i+1) p\\ &=& p_n n + p\sum_{i=1}^n i\\ &=& p_n n + p\frac{n(n+1)}{2}\\ &=& p_n n + \frac{1 - p_n}{n}\frac{n(n+1)}{2}\\ &=& \frac{n + 1 + p_n(n-1)}{2}\end{split}\]

Depending on the value of \(p_n\), \(\frac{n+1}{2} \leq \mathbf{T}(n) \leq n\).

   «  27.1. Chapter Introduction: Search   ::   Contents   ::   27.3. Search in Sorted Arrays  »

nsf
Close Window