Abstract List

ECE 250 Algorithms and Data Structures

Lists

Douglas Wilhelm Harder, M.Math. LEL
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
ece.uwaterloo.ca
dwharder@alumni.uwaterloo.ca

© 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.

Outline

We will now look at our first abstract data structure
– Relation: explicit linear ordering
– Operations
– Implementations of an abstract list with:
  • Linked lists
  • Arrays
– Memory requirements
– Strings as a special case
– The STL vector class

3.1 Definition

An Abstract List (or List ADT) is linearly ordered data where the programmer explicitly defines the ordering

We will look at the most common operations
– The most obvious implementation is to use either an array or a linked list
– These are, however, not always optimal

3.1.1 Operations

Operations at the kth entry of the list include:
– Access to the object
– Erasing an object
– Insertion of a new object
– Replacement of the object

Given access to the kth object, gain access to either the previous or next object

Given two abstract lists, we may want to
– Concatenate the two lists
– Determine if one is a sublist of the other

3.1.2 Locations and run times

The most obvious data structures for implementing an abstract list are arrays and linked lists
– We will review the run times of operations on these structures

We will consider the amount of time required to perform actions such as finding, inserting new entries before or after, or erasing entries at
– the first location (the front)
– an arbitrary (kth) location
– the last location (the back or nth)

The run times will be Θ(1), O(n) or Θ(n)

3.1.3 Linked lists

We will consider these for
– Singly linked lists
– Doubly linked lists

3.1.3.1 Singly linked list

                 Front/1st node   kth node   Back/nth node
  Find           Θ(1)             O(n)       Θ(1)
  Insert Before  Θ(1)             O(n)       Θ(n)
  Insert After   Θ(1)             Θ(1)       Θ(1)
  Replace        Θ(1)             Θ(1)       Θ(1)
  Erase          Θ(1)             O(n)       Θ(n)
  Next           Θ(1)             Θ(1)       n/a
  Previous       n/a              O(n)       Θ(n)

The Θ(1) entries for the kth node assume we have already accessed the kth entry, which is itself an O(n) operation

By replacing the value in the node in question, we can speed things up (useful for interviews):

                 Front/1st node   kth node   Back/nth node
  Find           Θ(1)             O(n)       Θ(1)
  Insert Before  Θ(1)             Θ(1)       Θ(1)
  Insert After   Θ(1)             Θ(1)       Θ(1)
  Replace        Θ(1)             Θ(1)       Θ(1)
  Erase          Θ(1)             Θ(1)       Θ(n)
  Next           Θ(1)             Θ(1)       n/a
  Previous       n/a              O(n)       Θ(n)

3.1.3.2 Doubly linked lists

                 Front/1st node   kth node   Back/nth node
  Find           Θ(1)             O(n)       Θ(1)
  Insert Before  Θ(1)             Θ(1)       Θ(1)
  Insert After   Θ(1)             Θ(1)       Θ(1)
  Replace        Θ(1)             Θ(1)       Θ(1)
  Erase          Θ(1)             Θ(1)       Θ(1)
  Next           Θ(1)             Θ(1)       n/a
  Previous       n/a              Θ(1)       Θ(1)

The Θ(1) entries for the kth node assume we have already accessed the kth entry, which is itself an O(n) operation

Accessing the kth entry is O(n); once we have it:

                 kth node
  Insert Before  Θ(1)
  Insert After   Θ(1)
  Replace        Θ(1)
  Erase          Θ(1)
  Next           Θ(1)
  Previous       Θ(1)

3.1.3.3 Other operations on linked lists

Other operations on linked lists include:
– Allocating and deallocating the memory requires Θ(n) time
– Concatenating two linked lists can be done in Θ(1)
  • This requires a tail pointer

3.1.4 Arrays

We will consider these operations for arrays, including:
– Standard or one-ended arrays
– Two-ended arrays

Run times

                       Accessing       Insert or erase at the
                       the kth entry   Front   kth entry   Back
  Singly linked lists  O(n)            Θ(1)    Θ(1)*       Θ(1) or Θ(n)
  Doubly linked lists  O(n)            Θ(1)    Θ(1)*       Θ(1)
  Arrays               Θ(1)            Θ(n)    O(n)        Θ(1)
  Two-ended arrays     Θ(1)            Θ(1)    O(n)        Θ(1)

  * Assume we have a pointer to this node

Data Structures

In general, we will only use these basic data structures if we can restrict ourselves to operations that execute in Θ(1) time, as the only alternative is O(n) or Θ(n)

Interview question: in a singly linked list, can you speed up the two O(n) operations of
– Inserting before an arbitrary node
– Erasing any node that is not the last node

If you can replace the contents of a node, the answer is "yes"
– Replace the contents of the current node with the new entry and insert the old contents after the current node
– Copy the contents of the next node into the current node and erase the next node

Memory usage versus run times

All of these data structures require Θ(n) memory
– Using a two-ended array requires one more member variable, Θ(1), in order to significantly speed up certain operations
– Using a doubly linked list, however, requires Θ(n) additional memory to speed up other operations

As well as determining run times, we are also interested in memory usage

In general, there is an interesting relationship between memory and time efficiency

For a data structure/algorithm:
– Improving the run time usually requires more memory
– Reducing the required memory usually requires more run time

Warning: programmers often mistake this to suggest that, given any solution to a problem, any faster solution must require more memory

This guideline is not true in general: there may be different data structures and/or algorithms which are both faster and require less memory
– This requires thought and research

The sizeof Operator

In order to determine memory usage, we must know the memory usage of the various built-in data types and classes
– The sizeof operator in C++ returns the number of bytes occupied by a data type
– This value is determined at compile time
  • It is not a function

    #include <iostream>
    using namespace std;

    int main() {
        cout << "bool    " << sizeof( bool )    << endl;
        cout << "char    " << sizeof( char )    << endl;
        cout << "short   " << sizeof( short )   << endl;
        cout << "int     " << sizeof( int )     << endl;
        cout << "char *  " << sizeof( char * )  << endl;
        cout << "int *   " << sizeof( int * )   << endl;
        cout << "double  " << sizeof( double )  << endl;
        cout << "int[10] " << sizeof( int[10] ) << endl;

        return 0;
    }

Output:

    {eceunix:1} ./a.out
    bool    1
    char    1
    short   2
    int     4
    char *  4
    int *   4
    double  8
    int[10] 40
    {eceunix:2}

Abstract Strings

A specialization of an Abstract List is an Abstract String:
– The entries are restricted to characters from a finite alphabet
– This includes regular strings such as "Hello world"

The restriction to an alphabet emphasizes specific operations that would seldom be used otherwise
– Substrings, matching substrings, string concatenations

It also allows more efficient implementations
– String searching/matching algorithms
– Regular expressions

Strings also include DNA
– The alphabet has 4 characters: A, C, G, and T
– These are the nucleobases: adenine, cytosine, guanine, and thymine

Bioinformatics today uses many of the algorithms traditionally restricted to computer science:
– Dan Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge, 1997
  http://books.google.ca/books?id=STGlsyqtjYMC
– References:
  http://en.wikipedia.org/wiki/DNA
  http://en.wikipedia.org/wiki/Bioinformatics

Standard Template Library

In this course, you must understand each data structure and its associated algorithms
– In industry, you will use other implementations of these structures

The C++ Standard Template Library (STL) has an implementation of the vector data structure
– Excellent reference: http://www.cplusplus.com/reference/stl/vector/

    #include <iostream>
    #include <vector>
    using namespace std;

    int main() {
        vector<int> v( 10, 0 );

        cout << "Is the vector empty? " << v.empty() << endl;
        cout << "Size of vector: " << v.size() << endl;

        v[0] = 42;
        v[9] = 91;

        for ( int k = 0; k < 10; ++k ) {
            cout << "v[" << k << "] = " << v[k] << endl;
        }

        return 0;
    }

Output:

    $ g++ vec.cpp
    $ ./a.out
    Is the vector empty? 0
    Size of vector: 10
    v[0] = 42
    v[1] = 0
    v[2] = 0
    v[3] = 0
    v[4] = 0
    v[5] = 0
    v[6] = 0
    v[7] = 0
    v[8] = 0
    v[9] = 91

Summary

In this topic, we have introduced Abstract Lists
– Explicit linear orderings
– Implementable with arrays or linked lists
  • Each has its limitations
  • Introduced modifications to reduce run times down to Θ(1)
– Discussed memory usage and the sizeof operator
– Looked at the String ADT
– Looked at the vector class in the STL

References

[1] Donald E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms, 3rd Ed., Addison Wesley, 1997, §2.2.1, p.238.
[2] Weiss, Data Structures and Algorithm Analysis in C++, 3rd Ed., Addison Wesley, §3.3.1, p.75.

Usage Notes

• These slides are made publicly available on the web for anyone to use
• If you choose to use them, or a part thereof, for a course at another institution, I ask only three things:
  – that you inform me that you are using the slides,
  – that you acknowledge my work, and
  – that you alert me of any mistakes which I made or changes which you make, and allow me the option of incorporating such changes (with an acknowledgment) in my set of slides

Sincerely,
Douglas Wilhelm Harder, MMath
dwharder@alumni.uwaterloo.ca
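
Worked example: the singly linked list interview question

The interview question under Data Structures (insert before a node, or erase a node that is not the last, without walking the list) can be sketched as below. This is only an illustration under stated assumptions: the type Node and the functions insert_before and erase_not_last are hypothetical names chosen here, not the course's own classes, and the entries are assumed to be cheap to copy.

```cpp
#include <cassert>

// Hypothetical minimal node type, for illustration only.
struct Node {
    int   value;
    Node *next;
};

// Insert new_value before *node in Θ(1), without knowing the previous node:
// the old value moves into a new node placed after the current one, and the
// current node is overwritten with the new value.
void insert_before( Node *node, int new_value ) {
    node->next  = new Node{ node->value, node->next };
    node->value = new_value;
}

// Erase *node in Θ(1), provided it is not the last node: copy the next
// node's contents into the current node, then unlink and free the next node.
void erase_not_last( Node *node ) {
    Node *victim = node->next;
    assert( victim != nullptr );   // the trick does not work on the last node

    node->value = victim->value;
    node->next  = victim->next;
    delete victim;
}
```

For instance, starting from the list 1, 3, calling insert_before on the node holding 3 with the value 2 gives 1, 2, 3; calling erase_not_last on the first node then gives 2, 3. Note that both functions change which node holds which value, so any outside pointers to the affected nodes no longer refer to the same entries; that is the price of avoiding the O(n) walk.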