Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Data-Structures-And-Algorithms-Alfred-V-Aho

.pdf
Скачиваний:
122
Добавлен:
09.04.2015
Размер:
6.91 Mб
Скачать

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

all n.

When we say T(n) is O(f(n)), we know that f(n) is an upper bound on the growth rate of T(n). To specify a lower bound on the growth rate of T(n) we can use the notation T(n) is W(g(n)), read "big omega of g(n)" or just "omega of g(n)," to mean that there exists a positive constant c such that T(n) ³ cg(n) infinitely often (for an infinite number of values of n).

Example 1.6. To verify that the function T(n)= n3 + 2n2 is W(n3), let c = 1. Then T(n) ³ cn3 for n = 0, 1, . . ..

For another example, let T(n) = n for odd n ³ 1 and T(n) = n2/100 for even n ³ 0. To verify that T(n) is W (n2), let c = 1/100 and consider the infinite set of n's: n = 0, 2, 4, 6, . . ..

The Tyranny of Growth Rate

We shall assume that programs can be evaluated by comparing their running-time functions, with constants of proportionality neglected. Under this assumption a program with running time O(n2) is better than one with running time O(n3), for example. Besides constant factors due to the compiler and machine, however, there is a constant factor due to the nature of the program itself. It is possible, for example, that with a particular compilermachine combination, the first program takes 100n2 milliseconds, while the second takes 5n3 milliseconds. Might not the 5n3 program be better than the 100n2 program?

The answer to this question depends on the sizes of inputs the programs are expected to process. For inputs of size n < 20, the program with running time 5n3 will be faster than the one with running time 100n2. Therefore, if the program is to be run mainly on inputs of small size, we would indeed prefer the program whose running time was O(n3). However, as n gets large, the ratio of the running times, which is 5n3/100n2 = n/20, gets arbitrarily large. Thus, as the size of the input increases, the O(n3) program will take significantly more time than the O(n2) program. If there are even a few large inputs in the mix of problems these two programs are designed to solve, we can be much better off with the program whose running time has the lower growth rate.

Another reason for at least considering programs whose growth rates are as low as possible is that the growth rate ultimately determines how big a problem we can solve on a computer. Put another way, as computers get faster, our desire to solve larger problems on them continues to increase. However, unless a program has a low growth rate such as O(n) or O(nlogn), a modest increase in computer speed makes very little difference in the size of the largest problem we can solve in a fixed amount of time.

Example 1.7. In Fig. 1.11 we see the running times of four programs with different time complexities, measured in seconds, for a particular compiler-machine combination.

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (17 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

Suppose we can afford 1000 seconds, or about 17 minutes, to solve a given problem. How large a problem can we solve? In 103 seconds, each of the four algorithms can solve roughly the same size problem, as shown in the second column of Fig. 1.12.

Fig. 1.11. Running times of four programs.

Suppose that we now buy a machine that runs ten times faster at no additional cost. Then for the same cost we can spend 104 seconds on a problem where we spent 103 seconds before. The maximum size problem we can now solve using each of the four programs is shown in the third column of Fig. 1.12, and the ratio of the third and second columns is shown in the fourth column. We observe that a 1000% improvement in computer speed yields only a 30% increase in the size of problem we can solve if we use the O(2n) program. Additional factors of ten speedup in the computer yield an even smaller percentage increase in problem size. In effect, the O(2n) program can solve only small problems no matter how fast the underlying computer.

Fig. 1.12. Effect of a ten-fold speedup in computation time.

In the third column of Fig. 1.12 we see the clear superiority of the O(n) program; it returns a 1000% increase in problem size for a 1000% increase in computer speed. We see that the O(n3) and O(n2) programs return, respectively, 230% and 320% increases in problem size for 1000% increases in speed. These ratios will be maintained for additional increases in speed.

As long as the need for solving progressively larger problems exists, we are led to an almost paradoxical conclusion. As computation becomes cheaper and machines become faster, as will most surely continue to happen, our desire to solve larger and more complex problems will continue to increase. Thus, the discovery and use of efficient algorithms, those whose growth rates are low, becomes more rather than less important.

A Few Grains of Salt

We wish to re-emphasize that the growth rate of the worst case running time is not the sole, or necessarily even the most important, criterion for evaluating an algorithm or program. Let us review some conditions under which the running time of a program can be overlooked in favor of other issues.

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (18 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

1.If a program is to be used only a few times, then the cost of writing and debugging dominate the overall cost, so the actual running time rarely affects the total cost. In this case, choose the algorithm that is easiest to implement correctly.

2.If a program is to be run only on "small" inputs, the growth rate of the running time may be less important than the constant factor in the formula for running time. What is a "small" input depends on the exact running times of the competing algorithms.

There are some algorithms, such as the integer multiplication algorithm due to Schonhage and Strassen [1971], that are asymptotically the most efficient known for their problem, but have never been used in practice even on the largest problems, because the constant of proportionality is so large in comparison to other simpler, less "efficient" algorithms.

3.A complicated but efficient algorithm may not be desirable because a person other than the writer may have to maintain the program later. It is hoped that by making the principal techniques of efficient algorithm design widely known, more complex algorithms may be used freely, but we must consider the possibility of an entire program becoming useless because no one can understand its subtle but efficient algorithms.

4.There are a few examples where efficient algorithms use too much space to be implemented without using slow secondary storage, which may more than negate the efficiency.

5.In numerical algorithms, accuracy and stability are just as important as efficiency.

1.5 Calculating the Running Time of a

Program

Determining, even to within a constant factor, the running time of an arbitrary program can be a complex mathematical problem. In practice, however, determining the running time of a program to within a constant factor is usually not that difficult; a few basic principles suffice. Before presenting these principles, it is important that we learn how to add and multiply in "big oh" notation.

Suppose that T1(n) and T2(n) are the running times of two program fragments P1 and P2,

and that T1(n) is O(f(n)) and T2(n) is O(g(n)). Then T1(n)+T2(n), the running time of P1 followed by P2, is O(max(f(n),g(n))). To see why, observe that for some constants c1, c2,

n1, and n2, if n ³ n1 then T1(n) £ c1f(n), and if n ³ n2 then T2(n) £ c2g(n). Let n0 = max(n1, n2). If n ³ n0, then T1(n) + T2(n) £ c1f(n) + c2g(n). From this we conclude that if n ³ n0, then T1(n) + T2(n) £ (c1 + c2)max(f(n), g(n)). Therefore, the combined running time T1(n) + T2(n) is O (max(f (n), g (n))).

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (19 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

Example 1.8. The rule for sums given above can be used to calculate the running time of a sequence of program steps, where each step may be an arbitrary program fragment with loops and branches. Suppose that we have three steps whose running times are, respectively, O(n2), O(n3) and O(n log n). Then the running time of the first two steps executed sequentially is O(max(n2, n3)) which is O(n3). The running time of all three together is O(max(n3, n log n)) which is O(n3).

In general, the running time of a fixed sequence of steps is, to within a constant factor, the running time of the step with the largest running time. In rare circumstances there will be two or more steps whose running times are incommensurate (neither is larger than the other, nor are they equal). For example, we could have steps of running times O(f (n)) and O(g (n)), where

In such cases the sum rule must be applied directly; the running time is O(max(f(n), g(n))),

that is, n4 if n is even and n3 if n is odd.

Another useful observation about the sum rule is that if g(n) ≤ f(n) for all n above some constant n0, then O(f(n) + g(n)) is the same as O(f(n)). For example, O(n2+n) is the same as O(n2).

The rule for products is the following. If T1(n) and T2(n) are O(f(n)) and O(g(n)), respectively, then T1(n)T2(n) is O(f(n)g(n)). The reader should prove this fact using the same ideas as in the proof of the sum rule. It follows from the product rule that O(cf(n)) means the same thing as O(f(n)) if c is any positive constant. For example, O(n2/2) is the same as O(n2).

Before proceeding to the general rules for analyzing the running times of programs, let us take a simple example to get an overview of the process.

Example 1.9. Consider the sorting program bubble of Fig. 1.13, which sorts an array of integers into increasing order. The net effect of each pass of the inner loop of statements (3)-

(6) is to "bubble" the smallest element toward the front of the array.

procedure bubble ( var A: array [1..n] of integer ); { bubble sorts array A into increasing order }

var

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (20 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

i, j, temp: integer; begin

(1)for i := 1 to n-1 do

(2)for j := n downto i+1 do

(3)

if A[j-1] > A[j] then begin

 

{ swap A[j - 1] and A[j] }

(4)

temp := A[j-1];

(5)

A[j-1] := A[j];

(6)

AI> [j] := temp

end end; { bubble }

Fig. 1.13. Bubble sort.

The number n of elements to be sorted is the appropriate measure of input size. The first observation we make is that each assignment statement takes some constant amount of time, independent of the input size. That is to say, statements (4), (5) and (6) each take O(1) time. Note that O(1) is "big oh" notation for "some constant amount." By the sum rule, the combined running time of this group of statements is O(max(1, 1, 1)) = O(1).

Now we must take into account the conditional and looping statements. The ifand forstatements are nested within one another, so we may work from the inside out to get the running time of the conditional group and each loop. For the if-statement, testing the condition requires O(1) time. We don't know whether the body of the if-statement (lines (4)- (6)) will be executed. Since we are looking for the worst-case running time, we assume the worst and suppose that it will. Thus, the if-group of statements (3)-(6) takes O(1) time.

Proceeding outward, we come to the for-loop of lines (2)-(6). The general rule for a loop is that the running time is the sum, over each iteration of the loop, of the time spent executing the loop body for that iteration. We must, however, charge at least O(1) for each iteration to account for incrementing the index, for testing to see whether the limit has been reached, and for jumping back to the beginning of the loop. For lines (2)-(6) the loop body takes O(1) time for each iteration. The number of iterations of the loop is n-i, so by the product rule, the time spent in the loop of lines (2)-(6) is O((n-i) X 1) which is O(n-i).

Now let us progress to the outer loop, which contains all the executable statements of the program. Statement (1) is executed n - 1 times, so the total running time of the program is bounded above by some constant times

which is O(n2). The program of Fig. 1.13, therefore, takes time proportional to the square of the number of items to be sorted. In Chapter 8, we shall give sorting programs whose

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (21 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

running time is O(nlogn), which is considerably smaller, since for large n, lognis very much smaller than n.

Before proceeding to some general analysis rules, let us remember that determining a precise upper bound on the running time of programs is sometimes simple, but at other times it can be a deep intellectual challenge. There are no complete sets of rules for analyzing programs. We can only give the reader some hints and illustrate some of the subtler points by examples throughout this book.

Now let us enumerate some general rules for the analysis of programs. In general, the running time of a statement or group of statements may be parameterized by the input size and/or by one or more variables. The only permissible parameter for the running time of the whole program is n, the input size.

1.The running time of each assignment, read, and write statement can usually be taken to be O(1). There are a few exceptions, such as in PL/I, where assignments can involve arbitrarily large arrays, and in any language that allows function calls in assignment statements.

2.The running time of a sequence of statements is determined by the sum rule. That is, the running time of the sequence is, to within a constant factor, the largest running time of any statement in the sequence.

3.The running time of an if-statement is the cost of the conditionally executed statements, plus the time for evaluating the condition. The time to evaluate the condition is normally O(1). The time for an if-then-else construct is the time to evaluate the condition plus the larger of the time needed for the statements executed when the condition is true and the time for the statements executed when the condition is false.

4.The time to execute a loop is the sum, over all times around the loop, of the time to execute the body and the time to evaluate the condition for termination (usually the latter is O(1)). Often this time is, neglecting constant factors, the product of the number of times around the loop and the largest possible time for one execution of the body, but we must consider each loop separately to make sure. The number of iterations around a loop is usually clear, but there are times when the number of iterations cannot be computed precisely. It could even be that the program is not an algorithm, and there is no limit to the number of times we go around certain loops.

Procedure Calls

If we have a program with procedures, none of which is recursive, then we can compute the running time of the various procedures one at a time, starting with those procedures that make no calls on other procedures. (Remember to count a function invocation as a "call.") There must be at least one such procedure, else at least one procedure is recursive. We can

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (22 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

then evaluate the running time of procedures that call only procedures that make no calls, using the already-evaluated running times of the called procedures. We continue this process, evaluating the running time of each procedure after the running times of all procedures it calls have been evaluated.

If there are recursive procedures, then we cannot find an ordering of all the procedures so that each calls only previously evaluated procedures. What we must now do is associate with each recursive procedure an unknown time function T(n), where n measures the size of the arguments to the procedure. We can then get a recurrence for T(n), that is, an equation for T(n) in terms of T(k) for various values of k.

Techniques for solving many different kinds of recurrences exist; we shall present some of these in Chapter 9. Here we shall show how to analyze a simple recursive program.

Example 1.10. Figure 1.14 gives a recursive program to compute n!, the product of all the integers from 1 to n inclusive.

An appropriate size measure for this function is the value of n. Let T(n) be the running time for fact(n). The running time for lines (1) and (2) is O(1), and for line (3) it is O(1) + T(n-1). Thus, for some constants c and d,

function fact ( n: integer ): integer; { fact(n) computes n! }

begin

(1)if n <= 1 then

(2)

fact := 1

 

else

(3)

fact := n * fact(n-1)

 

end; { fact }

Fig. 1.14. Receursive program to compute factorials.

Assuming n > 2, we can expand T(n-1) in (1.1) to obtain

T(n) = 2c + T(n-2) if n > 2

That is, T(n-1) = c + T(n-2), as can be seen by substituting n-1 for n in (1.1). Thus, we may substitute c + T(n-2) for T(n-1) in the equation T(n) = c + T(n-1). We can then use (1.1) to expand T(n-2) to obtain

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (23 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

T(n) = 3c + T(n-3) if n > 3

and so on. In general,

T(n) = ic + T(n-i) if n > i

Finally, when i = n-1 we get

T(n) = c(n-1) + T(1) = c(n-1) + d

(1.2)

From (1.2) we can conclude that T(n) is O(n). We should note that in this analysis we have assumed that the multiplication of two integers is an O(1) operation. In practice, however, we cannot use the program in Fig. 1.14 to compute n! for large values of n, because the size of the integers being computed will exceed the word length of the underlying machine.

The general method for solving recurrence equations, as typified by Example 1.10, is repeatedly to replace terms T(k) on the right side of the equation by the entire right side with k substituted for n, until we obtain a formula in which T does not appear on the right as in (1.2). Often we must then sum a series or, if we cannot sum it exactly, get a close upper bound on the sum to obtain an upper bound on T(n).

Programs with GOTO's

In analyzing the running time of a program we have tacitly assumed that all flow of control within a procedure was determined by branching and 1ooping constructs. We relied on this fact as we determined the running time of progressively larger groups of statements by assuming that we needed only the sum rule to group sequences of statements together. Goto statments, however, make the logical grouping of statements more complex. For this reason, goto statements should be avoided, but Pascal lacks breakand continue-statements to jump out of loops. The goto-statement is often used as a substitute for statements of this nature in Pascal.

We suggest the following approach to handling goto's that jump from a loop to code that is guaranteed to follow the loop, which is generally the only kind of goto that is justified. As the goto is presumably executed conditionally within the loop, we may pretend that it is never taken. Because the goto takes us to a statement that will be executed after the loop completes, this assumption is conservative; we can never underestimate the worst case running time of the program if we assume the loop runs to completion. However, it is a rare program in which ignoring the goto is so conservative that it causes us to overestimate the growth rate of the worst case running time for the program. Notice that if we were faced with a goto that jumped back to previously executed code we could not ignore it safely, since that goto may create a loop that accounts for the bulk of the running time.

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (24 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

We should not leave the impression that the use of backwards goto's by themselves make running times unanalyzable. As long as the loops of a program have a reasonable structure, that is, each pair of loops are either disjoint or nested one within the other, then the approach to running time analysis described in this section will work. (However, it becomes the responsibility of the analyzer to ascertain what the loop structure is.) Thus, we should not hesitate to apply these methods of program analysis to a language like Fortran, where goto's are essential, but where programs written in the language tend to have a reasonable loop structure.

Analyzing a Pseudo-Program

If we know the growth rate of the time needed to execute informal English statements, we can analyze pseudo-programs just as we do real ones. Often, however, we do not know the time to be spent on not-fully-implemented parts of a pseudo-program. For example, if we have a pseudo-program in which the only unimplemented parts are operations on ADT's, one of several implementations for an ADT may be chosen, and the overall running time may depend heavily on the implementation. Indeed, one of the reasons for writing programs in terms of ADT's is so we can consider the trade-offs among the running times of the various operations that we obtain by different implementations.

To analyze pseudo-programs consisting of programming language statements and calls to unimplemented procedures, such as operations on ADT's, we compute the running time as a function of unspecified running times for each procedure. The running time for a procedure will be parameterized by the "size" of the argument or arguments for that procedure. Just as for "input size," the appropriate measure of size for an argument is a matter for the analyzer to decide. If the procedure is an operation on an ADT, then the underlying mathematical model for the ADT often indicates the logical notion of size. For example, if the ADT is based on sets, the number of elements in a set is often the right notion of size. In the remaining chapters we shall see many examples of analyzing the running time of pseudo-programs.

1.6 Good Programming Practice

There are a substantial number of ideas we should bear in mind when designing an algorithm and implementing it as a program. These ideas often appear platitudinous, because by-and-large they can be appreciated only through their successful use in real problems, rather than by development of a theory. They are sufficiently important, however, that they are worth repeating here. The reader should watch for the application of these ideas in the programs designed in this book, as well as looking for opportunities to put them into practice in his own programs.

1.Plan the design of a program. We mentioned in Section 1.1 how a program can be designed by first sketching the algorithm informally, then as a pseudo-program, and gradually refining the pseudo-program until it becomes executable code. This

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (25 of 37) [1.7.2001 18:58:22]

Data Structures and Algorithms: CHAPTER 1: Design and Analysis of Algorithms

strategy of sketch-then-detail tends to produce a more organized final program that is easier to debug and maintain.

2.Encapsulate. Use procedures and ADT's to place the code for each principal operation and type of data in one place in the program listing. Then, if changes become necessary, the section of code requiring change will be localized.

3.Use or modify an existing program. One of the chief inefficiencies in the programming process is that usually a project is tackled as if it were the first program ever written. One should first look for an existing program that does all or a part of the task. Conversely, when writing a program, one should consider making it available to others for possibly unanticipated uses.

4.Be a toolsmith. In programming parlance, a tool is a program with a variety of uses. When writing a program, consider whether it could be written in a somewhat more general way with little extra effort. For example, suppose one is assigned the task of writing a program to schedule final examinations. Instead, write a tool that takes an arbitrary graph and colors the vertices with as few colors as possible, so that no two vertices connected by an edge have the same color. In the context of examination scheduling, the vertices are classes, the colors are examination periods, and an edge between two classes means that the classes have a student in common. The coloring program, together with routines that translate class lists into graphs and colors into specific times and days, is the examination scheduler. However, the coloring program can be used for problems totally unrelated to examination scheduling, such as the traffic light problem of Section 1.1.

5.Program at the command level. Often we cannot find in a library the one program needed to do a job, nor can we adapt one tool to do the job. A well-designed operating system will allow us to connect a network of available programs together without writing any programs at all, except for one list of operating system commands. To make commands composable, it is generally necessary that each behave as a filter, a program with one input file and one output file. Notice that any number of filters can be composed, and if the command language of the operating system is intelligently designed, merely listing the commands in the order in which they are to be performed will suffice as a program.

Example 1.11. As an example, let us consider the program spell, as it was originally written by S.C. Johnson from UNIXcommands. This program takes as input a file f1consisting of English text and produces as output all those words in f1that are not found in a small dictionary.spell tends to list proper names as misspellings and may also list real words not in its dictionary, but the typical output of spell is short enough that it can be scanned by eye, and human intelligence can be used to determine whether a word in the output of spell is a misspelling. (This book was checked using spell.)

The first filter used by spell is a command called translate that, given appropriate parameters, replaces capital letters by lower case letters and blanks by newlines, leaving

http://www.ourstillwaters.org/stillwaters/csteaching/DataStructuresAndAlgorithms/mf1201.htm (26 of 37) [1.7.2001 18:58:22]

Соседние файлы в предмете [НЕСОРТИРОВАННОЕ]