ALGORITHMS

The Concept of an Algorithm

Algorithm - informal definition:

A set of steps that defines how a task is performed.

	An algorithm is abstract and distinct from its representation.
	A program is a formal representation of an algorithm designed for computer applications.
	A process is the activity of executing the program, or equivalently, the activity of executing an algorithm.
	Parallel algorithms contain more than one sequence of steps, each designed to be executed by different processors in a multiprocessor machine.
	Algorithm - formal definition:
	An algorithm is an ordered set of unambiguous, executable steps, defining a terminating process.
	Although, in general, an algorithm must lead to an end there are applications in real life for non-terminating processes, for example:

monitoring the vital signs of a patient

maintaining an aircraft's in-flight altitude

Algorithm Representation

Primitives

	The representation of an algorithm requires some form of language.
	Problems encountered:

	Terminology used may have more than one meaning.
	Misunderstandings over the level of detail required.

Primitives are building blocks for algorithmic representations.

Remove problems of ambiguity by:

	Assignment of precise definitions.
	Establishing a uniform level of detail

Programming Language: A collection of primitives along with a collection of rules stating how the primitives can be combined to represent more complex ideas.

2 Parts of a primitive:

Syntax - the primitive's symbolic representation

Semantics - the concept represented, or the meaning of the primitive

Formal Programming Languages: use a collection of "higher level" primitives, each being an abstract tool constructed form the lower-level primitives provided in the machine's language.

Pseudocode

	A pseudocode is a notational system in which ideas can be expressed informally during the algorithm development process.
	The text's approach to pseudocode is to develop a consistent, concise notation for representing recurring semantic structures. Some of these structures are:

if (condition) then (activity)

else (activity)

	if (condition) then (activity)
	while (condition) do (activity)
	assign name the value expression
	procedure name

	When a task performed by a procedure is needed elsewhere, we only have to request it by name.
	Procedures should be as generic as possible.
	The data that the procedure performs a task on is also assigned a generic name and is listed in parentheses on the same line as the procedure name.

example: procedure Sort (List)

Indentation: often enhances the readability of a program, so it is also used in pseudocode.

Pseudocode provides a rough outline of an algorithm, not a finished, formal program.

Algorithm Discovery

The development of a program consists of two activities:

	discovering the underlying algorithm
	representing that algorithm as a program

The Theory of problem Solving

	There is a close association between the process of algorithm discovery and that of general problem solving.
	Problem Solving Phases (by mathematician C. Polya):

	Understand the problem.
	Devise a plan for solving the problem.
	Carry out the plan.
	Evaluate the solution for accuracy and for its potential as a tool for solving other problems.

Phases of program development:

	Understand the problem.
	Get an idea as to how an algorithmic procedure might solve the problem.
	Formulate the algorithm and represent it as a program.
	Evaluate the program for accuracy and for its potential as a tool for solving other problems.

Polya's 4 phases are not necessarily completed in sequence.

The period between conscious work on a problem and a sudden inspiration is known as an incubation period.

Getting a Foot in the Door

Several general approaches to going about getting a foot in the door.

	Try working the problem backwards.
	Look for a related problem that is either easier to solve or has been solved before and then trying to apply its solution to the current problem.
	Apply stepwise refinement - breaking the original problem into subproblems.

	Stepwise refinement is a top-down methodology in that it progresses from the general to the specific.
	Solutions produced by stepwise refinement possess a natural modular structure.

	If an algorithm has a natural modular structure, it is easily adapted to a modular representation, which is conducive to the development of a manageable program.
	Modules produced are compatible with the concept of team programming.

Iterative Structures

Iterative structures -- a collection of instructions is repeated in a looping manner.

The Sequential Search Algorithm

Problem stated: To search a list for the occurrence of a particular target value.

We assume the list is in alphabetical order or in order of increasing magnitude.

Solution: To continue searching down the list as long as there are more names to be investigated and the target name is less than the name currently being considered.

The Sequential Search algorithm in Pseudocode: procedure Search (List, TargetValue)

                                  if (List empty)
                                                then
                                                    (Declare search a failure)
                                                    else
                                                            [Select the first entry in List as the test entry.
                                                             while (TargetValue > test entry and there remain entries to be
                                                             considered)
                                                                  do (Select the next entry in List as the test entry.);
                                                             if (TargetValue = test entry)
                                                                then (Declare search a success.)
                                                                else (Declare search a failure.)]

In the case of long lists, sequential searches are not as efficient as other techniques.

Loop Control

	Loop is an iterative structure for implementing the repetitive use of an instruction or sequence of instructions.
	The control of the loop is the more error prone part of the structure.
	Loop control consists of three activities:

	initialize - Establish an initial state that will be modified toward the termination condition.
	test - Compare the current state to the termination condition and terminate the repetition if

equal.

modify - Change the state in such a way that it moves toward the termination condition.

There are 2 popular loop structures that differ only in the order in which the loop control components are executed:

while (condition) do (activity)

test for termination occurs before the loop's body is executed.

repeat (activity) until (condition)

	test occurs after the loop's body is executed
	loop's body is always performed at least once.

Flowcharts - use various shapes to represent individual steps and arrows to indicate the sequence of steps.

	Diamond indicates a decision.
	Rectangle indicates an arbitrary statement or sequence of statements.

The Insertion Sort Algorithm

Goal: to sort the list within itself.

This is a typical restriction because we want to use the storage space available in an efficient manner.

Generalize a process to obtain an algorithm:

Pick the first name in the unsorted portion of the list, slide the names greater than the extracted name down, and insert the extracted name back in the list where the hole appears.

Identify the extracted name as the pivot entry.

Note that the process should be executed repeatedly, so we need to use an iterative structure.

Formulate the pseudocode for the process:

Move the pivot entry to a temporary location leaving a hole in List;

      while (there is a name above the hole and that name is greater than the pivot) do
                (move the name above the hole down into the hole leaving a hole above the name)
          Move the pivot entry into the hole in List.

The insertion sort expressed in pseudocode: procedure Sort (List)

                                        if (there are two or more entries in List) then
                                                [Shade the portion of List from the second entry through the last entry;
                                            repeat
                                                (Remove the shading from the first name in the shaded portion of List
                                                    and identify this name as the pivot entry;
                                                Move the pivot entry to a temporary location leaving a hole in List;
                                                    while (there is a name above the hole and that name is greater than
                                                    the pivot) do
                                                        (Move the name above the hole down into the hole leaving a hole above
                                                            the name)
                                                        Move the pivot entry into the hole in List.)
                                            until (entire List is unshaded)]

The components of repetitive control in Insertion Sort:

Outer Loop:

initialize -- shading the portion of List from the second entry to the last

test -- when shaded portion of list becomes empty

modify -- Remove shading from first name and identify as pivot entry

Inner Loop:

initialize -- remove pivot entry from List and create a hole

test -- hole is below a name that is not greater than the pivot or hole reaches top of List.

modify -- Moving entries above the hole down, thus moving the hole up.

Recursive Structures

Recursive structures provide an alternative to the loop paradigm for repetitive structures.

The Binary Search Algorithm

	Binary Search Algorithm applies a divide-and-conquer methodology to the search process.
	Divides the list in half, then in half again, etc. until the Target Value is found.
	Assumes a generic sorted list.
	Apply the same Search technique to the smaller lists that we apply to the original one.
	Progress in the original copy of the list is temporarily suspended while we apply the second copy to the task of searching the smaller list. It reports its findings to the original copy and progress is continued in the original.
	The Binary Search Algorithm in Pseudocode:

procedure Search (List, TargetValue)

if (list empty)
                    then
                            (Declare the search a failure)
                    else
                            [Select the middle entry in List as the test entry;
                              Execute one of the following blocks of instructions depending on whether TargetValue is
                               equal to, less than, or greater than the test entry.
                            TargetValue = test entry;
                                    (Declare the search a success.)
                                TargetValue < test entry;
                                    [Apply the procedure Search to see whether TargetValue is in the portion of List
                                    preceding the test entry, and
                                if (that search is successful)
                                    then (Declare this search a success.)
                                    else (Declare this search a failure.)]
                            TargetValue > test entry;
                                    [Apply the procedure Search to see whether TargetValue is in the portion of List
                                    following the test entry, and
                                if (that search is successful)
                                    then (Declare this search a success.)
                                    else (Declare this search a failure.)]]

Recursive Control

	Recursion - Each stage of the repetition is executed as a subtask of the previous stage.
	The execution of a recursive algorithm creates multiple copies of the algorithm called activations.
	Just as in loop control, recursive systems are dependent on testing for a termination condition and on a design that assures this condition will be reached.

	Testing for the condition is done before requesting further activations.
	All activations that are generated ultimately terminate, leaving the original task completed.
	Equivalence of iterative and recursive structures is proven in Appendix E.

The Quick Sort Algorithm

	Goal of Quick Sort Algorithm - to sort a list within itself.
	Approach: Select a name, called the pivot entry, find its correct position, and place the name in that position.
	Choose the top entry as the pivot.
	There is a pointer to the pivot and to the last name.
	The algorithm directs the movement of the pointers toward each other in such a way that the following assertion is always satisfied:

Assertion 1: The names above the top pointer are less than (alphabetically) or equal to the pivot entry, and those below the bottom pointer are greater than the pivot.

When the two pointers coincide, we have found the correct location for the pivot.

We interchange the pivot with the entry identified by the common pointers.

As the bottom pointer is moved up the list, as long as the name is greater than the pivot, the pointer is moved up. When the name is less than or equal to the pivot entry, the movement of the pointer is stopped. The top pointer is moved down the list as long as the name compared to the pivot is less than or equal to the pivot.

	If the pointers do not coincide, a deadlock is produced, which we break by interchanging the names designated by the pointers.
	Assertion 2: At the beginning of the sort process and after each interchange for breaking a deadlock, the top pointer points to a name less than or equal to the pivot.
	Assertion 3: If the pointers coincide, the entry they identify is less than or equal to the pivot.

The list is now 2 lists: the list above the pivot and the list below it.

We consistently apply the algorithm to each list, which in turn create smaller and smaller lists, until the lists contain no more than one name. When no more applications of the process are required, the list is sorted.

Pseudocode for the Quick Sort Algorithm: procedure Sort(List)

                             if (List contains fewer than two entries)
                                 then (Declare List to be sorted)
                                 else
                                        [Select the first entry in List as the pivot:
                                          Place pointers at the first and last entries of List;
                                        while the pointers do not coincide) do
                                             (Move the bottom pointer up to the nearest entry less
                                                  than or equal to the pivot but not beyond the top
                                                  pointer;
                                                 Move the top pointer down to the nearest entry greater
                                                    than the pivot but not beyond the bottom pointer;
                                              if (the pointers do not coincide)
                                                 then (interchange the entries indicated by
                                                        the pointers.))
                                            Interchange the pivot entry with the entry indicated by
                                                the common pointers;
                                            Apply the procedure Sort to the portion of List above
                                                the pivot.
                                            Apply the procedure Sort to the portion of List below
                                                the pivot.

The last two statements of the pseudocode create further activations of the algorithm, which is recursion.

Efficiency and Correctness

Algorithm Efficiency

	The ultimate efficiency of an algorithm is closely associated with the details of its implementation.
	Typically, a major concern regarding an algorithm's implementation is data organization.

Software Verification

	There is a distinction between a program that is believed to be correct and a program that is correct.
	A formal proof of a program's correctness is based on the specifications under which the program was designed.

	Proof of correctness begins with the assumption that certain conditions, called preconditions, are satisfied at the beginning of the program execution.
	Proof of correctness proceeds by identifying statements, called assertions, that can be established at various points in the program.
	The result is a collection of assertions, each being a consequence of the program's preconditions and the sequence of instructions that lead to the point in the program at which the assertion is established.
	If the assertion so established at the end of the program corresponds to the desired output specifications, we can conclude that the program is correct.
	In the structure of a loop, a certain condition is true each time the loop is executed. Such an assertion within a loop is called a loop invariant.

Example: In the Insertion Sort, the loop invariant is stated as:

Each time the loop body is completed the names in the unshaded portion of the
list are sorted.

Assertions obtained when proving a program correct are often essentially the insights that led tot he program in the first place.

In most cases today, software is "verified" by applying it to test data, which only proves that the program runs correctly for the test data.

Test data should not be designed by the person(s) writing the program.

Programming Languages

Languages have been developed that allow humans to avoid the intricacies of registers, memory addresses, and machine cycles during the program development process and, instead, to concentrate on the properties of the problem being solved.

Historical Perspective

Early Generations

	Originally all algorithms were expressed in the machine's language. -- first generation language
	Later, mnemonics were assigned to the various op-codes and used in place of hexadecimal representation.

	Example: names R0, R1, etc. were assigned to the registers in the CPU.
	This use of mnemonics was formalized into a programming language, assembly language.
	Second generation language
	Assembler == a program to translate assembly language into machine language.
	Machine dependent
	Programmer still had to think in small, incremental steps instead of concentrating on the solution to the problem.
	Not easily transported to another machine.

Third Generation Languages - use collections of high level primitives to develop software.

	Each high level primitive is implemented as a collection of lower level primitives used by the machine.
	Translator - a program to translate high level primitives into machine language.

Translator is now called a compiler.

Machine Independence and Beyond

Since statements in a 3gl do not refer to the attributes of any particular machine, they are compiled as easily for one machine as for another.

However, some differences among machines, such as the size of registers and memory cells still affect portability.

ANSI and IOS (American National Standards Institute and the International Organization for Standardization) have adopted and published standards for many of the popular languages.

3gl's are close enough to being machine independent that software can be transported from one machine to another with relative ease.

Fourth generation language is used in reference to software packages that allow users to customize computer software to their applications without needing technical expertise.

	Usually involves a selection from choices presented on the monitor's screen in sentence or icon form.
	Packages include spreadsheet systems, database systems, and graphics packages, and powerful word processors.
	Packages are often bundled (as integrated software) to form one coherent system.

Fifth generation language refers to declarative programming, with an emphasis on logic programming.

	User concentrates on the problem's characteristics instead of on how the problem is solved.
	User describes a problem and the machine solves it.

Programming Paradigms

	A paradigm is an approach to the programming process.
	Four Programming Paradigms:

Functional (LISP, ML, Scheme)

	Process of program development is the construction of "black boxes", each of which accepts input at the top and produces output at the bottom.
	"Boxes" = Functions

construction of functions as nested complexes of simpler functions.

Elaborate functions are constructed from elementary functions (the primitives of the language).

Problem is approached by considering the input that is given, the output that is desired, and the transformation that is required to produce that output from the given input.

Object Oriented (SIMULA, C++, Ada '95, Smalltalk, Java)

	leads to Object-Oriented Programming (OOP)
	Units of data are viewed as active "objects"
	Object is the data together with a collection of routines for manipulating the data.
	another example == GUI (graphical user interface.)

	Icons are implemented as objects.
	Routines describe how objects should respond to various events (i.e., clicking or dragging).
	Result is an event-driven system.
	examples are Visual Basic (by Microsoft) and Delphi (by Borland)

Imperative (FORTRAN, BASIC, C, COBOL, ALGOL, APL, Ada, Pascal)

	Also known as procedural paradigm
	represents the traditional approach to programming
	approach a problem by trying to find a method for solving it.

Declarative (GPSS, Prolog)

	Asks "What is the problem" instead of how to solve it.
	Developing a precise statement of the problem rather than of discovering an algorithm for solving the problem.
	Early declarative languages tended to be special-purpose in nature.
	Used in simulations -- task is to describe the relationships among the parameters to be simulated.

Traditional Programming Concepts

Three categories of statements in programming languages:

	declarative statements - define customized terminology that is used later in the program , such as the names used to reference data items.
	imperative statements - describe the steps in the underlying algorithms
	comments - enhance readability

Variables, Constants, and Literals

variables - identification of memory locations by means of descriptive names.

value associated with the identifier changes as the program executes.

literal - a reference to a fixed, predetermined value.

Use of literals is not good programing practice.

constants - assignment of descriptive names to specific, nonchangeable values.

better portability

Data Type

	Data Type refers to both the interpretation given a bit pattern and the operations that can be performed on that pattern.
	Common data types:

Integer - numeric data consisting of whole numbers, probably stored in two's complement notation.

Operations include arithmetic and relational

Real - numeric data that may contain values other than whole numbers, probably stored in floating point notation.

Operations include arithmetic and relational

Character - data consisting of symbols, probably stored using ASCII.

Operations include comparison, concatenation, and testing to see if one string is inside another.

Boolean - can only take on the values true or false.

A declarative statement that introduces a variable usually must specify the type of data that will be referenced by that variable.

Information on data types can be used to identify errors.

Data Structure

	Data structure relates to the conceptual shape of the data.
	A string has both type and length.
	homogeneous array - a block of values of the same type such as a string or an array of one or more dimensions.

components are identified via indices that specify the row, column, and so on.

heterogeneous array - a block of data in which different elements can have different types.

components are referenced by identifying the array, followed by the component name separated by a period.

Once an array has been declared, it can be referenced in its entirety by name or an individual component can be referenced by its position.

Assignment Statements

	perhaps the most basic imperative statement
	requests that a value be assigned to a variable.
	Example:

	Total = Price + Tax; (in C and C++)
	Total := Price + Tax; (in Ada and Pascal)

Ambiguities in interpretation are resolved by the rules of operator precedence, meaning that certain operations are given precedence over others.

overloading -- multiple use of a symbol; many languages allow the use of one symbol to represent more than one type of operation, and the operation performed depends on the data types involved.

Control Statements

	Control statements are imperative statements that alter the execution sequence of the program.
	goto statement is bad programming practice
	Control statements of modern languages allow a certain branching pattern to be expressed within a single syntactic structure.
	Structured programming - encompasses an organized design methodology combined with the appropriate use of the language's control statements.

produces a program that is easily understood and meets specifications

Common branching structures:

	if-then-else
	while
	case - allows a selection between many options
	for - similar to while, but all the initialization, modification, and termination are incorporated into one statement.

Comments

	Comments are internal documentation ignored by the compiler or translator.
	Comments should be meaningful, adding clarity rather than length to a program.
	A good approach is to collect comments that relate to a single program unit into one place, perhaps at the beginning of the unit.

Program Units

Should group steps in an algorithm to form short, simple portions of the algorithm and to use these portions as abstract tools to express the final product.

Procedures

	A procedure is a program unit written independently of the main program yet associated with it through a transfer/return process.
	Transferring control to a procedure is also called calling or invoking the procedure.
	The program unit that requests the execution of the procedure is the calling unit.
	The definition begins with a statement, known as the procedure's header, that identifies, among other things, the name of the procedure.
	As a general rule, the variables declared within a procedure are Local variables, meaning that they can be referenced only within that procedure.
	An item of data shared by all units within the program is represented by a variable called a global variable.

not a recommended technique

Parameters

Explicit identification of data being shared is done by listing the data items in the procedure call and in the procedure's header.

	The items in both lists are called parameters.
	In Ada, the keywords in, out, and in out are used.

Once a procedure has been defined, it can be used from within another program unit.

Names used for the parameters within the procedure are called formal parameters.

Data values supplied from the other program unit are called actual parameters.

When the amount of data associated with the parameters being passed is large, a more efficient method is to transfer only the address of the memory cells that contain the data.

	This is called passed by reference.
	If the information is actually copied, it is called passed by value.

Functions

	Function refers to a program unit similar to a procedure, except that a value is transferred back to the main program as "the value of the function" rather than via the parameter list.
	Value is associated with the function's name.

I/O Statements

	Implementations of I/O operations: procedures and functions being called are actually routines within the machine's operating system.
	Examples:

	Pascal -- readln (Value);
	C -- printf("%d %d\n", Value1, Value2);
	C++ -- provides ready made objects known as cin and cout

cin >> Value; and cout << Value;

Program Examples

Sample programs in Pascal and C are shown in Figure 5.12 on page 210 of the text.

Language Implementation

Process of converting a program written in a high-level programming language into a machine-executable form.

The Translation Process

	Translation -- the process of converting a program from one language to another
	source program -- program in its original form
	object program -- the translated version
	Three activities of the translation process:

lexical analysis- process of recognizing which strings of symbols from the source program represent a single entity.

generates a bit pattern known as a token indicating the unit's class (i.e. a value, a word, etc.)

parsing- process of identifying the grammatical structure of the program and recognizing the role of each component.

	fixed-format languages -- insisted each program statement be positioned in a particular manner on the printed page.
	free-format languages -- the positioning of the statements is not critical.

the language's syntax must be designed so the structure of the program can be identified regardless of spacing.

this is done through the use of punctuation marks, key words, and reserved words.

parsing process is based on a set of syntax rules

syntax diagrams -- pictorial representations of a program's grammatical structure.

	terms that require further description -- (in rectangles) -- nonterminals.
	terminals appear in ovals.
	a string can be represented by a parse tree.

Parser analyzes declarative and imperative statements and ignores the comments

symbol table -- record of information from declaration statements

implicit conversion between types is called coercion.

most modern day languages are strongly typed -- all activities must involve data of agreeable types without coercion.

code generation- constructing the machine language instructions to simulate the statements recognized by the parser.

The code generator is responsible for code optimization

Linking and Loading

	The job of the linker is to link several object programs, operating system routines, and utility software to produce a complete, executable program (sometimes called a load module.
	To execute a translated program, the load module must be placed in memory by a program called a loader.

	Loader is usually part of the operating system's scheduler.
	A relocatable module executes correctly regardless of where it executes in memory.

In summary, the complete task of preparing a high-level language program for execution consists of the three-step sequence of translate, link, and load.

Software Development Packages

	Grouping of a translator along with an editor, debugging tools, etc.
	System would be classified as application software.
	Future programming process -- might be construction of software from large prefabricated blocks.

Software Engineering

Sorry, this section hasn't been entered yet, but will be soon, I hope.
Or, if you would like to go to my pages on my Software Engineering Class, click on Software Engineering above.

	initialize -- shading the portion of List from the second entry to the last
	test -- when shaded portion of list becomes empty
	modify -- Remove shading from first name and identify as pivot entry

	initialize -- remove pivot entry from List and create a hole
	test -- hole is below a name that is not greater than the pivot or hole reaches top of List.
	modify -- Moving entries above the hole down, thus moving the hole up.