Question? Leave a message!

Arrays and collections

Arrays and collections
Object Oriented Software Development 6. Arrays and collections www.ThesisScientist.comOnetomany relationships  A department can have many employees in it  Onetomany “hasa” relationship Employee employeeId Department employees name username phoneNumber 1 +Move() +Email() HourlyPaidEmployee SalariedEmployee payGrade +RecordTime() +PayIncrement() +Email()  “hasa” relationship is implemented with instance variable  This variable must be able to hold a collection of Employee objects www.ThesisScientist.comCollections of data  Most software applications need to work with collections of similar objects, for example:  List of transactions to display in a bank application  List of sprites in a game  Often need to perform actions such as:  Listing all the objects in a collection  Adding and removing objects  Finding a specific object in a collection www.ThesisScientist.comArrays  Simple way of storing collection of objects of the same type  Contents can be any type, including reference types and value types  Arrays are stored on the heap  If content type is a reference type, the array stores references to objects www.ThesisScientist.comLooping through an array  for and foreach loops:  foreach loop works with any collection which implements an interface called IEnumerable  We will look at this again later www.ThesisScientist.comArrays and polymorphism  An array can store instances of its declared type or subclasses of that type  May need to cast to get object back out as its original type www.ThesisScientist.comMore than one dimension  Arrays can have two dimensions  Can have more than two dimensions  Can be rectangular or jagged  Jagged array = array of arrays  Arrays may be different sizes www.ThesisScientist.comLimitations of arrays  Once created, the size of an array is fixed  Array does not provide ways of adding, removing or finding elements  To use an array you need to write code to access elements  Often better to use classes which are designed to store data and provide methods for accessing the data  Data structures, or collections www.ThesisScientist.comAbstract data types  An ADT is a specification for the set of operations which can be performed on the data stored in a data structure  “Abstract” because the specification is independent of the concrete implementation  Could have many implementations of the same ADT which work differently “under the hood”  We will look first at the list ADT and two ways of implementing it www.ThesisScientist.comList  A list is a sequential data structure  Addition and removals can be made at any position in the list  Lists are normally in the form of a ,a ,a .....a . 1 2 3 n The size of this list is n.  The first element of the list is a ,and the last 1 element is a n  The position of element a in a list is i. i www.ThesisScientist.comList operations  A list data structure should be able to provide the following operations:  Create a list  Find an item in the list  Insert an item in the list  Modify an item in the list  Delete an item from the list  Check whether list is full/empty  Traverse the list www.ThesisScientist.comLists example  CollectionsDemo solution  CollectionsDemo and SimpleCollections projects  Data structures in separate project from program – can easily be reused by other programs  SimpleCollections is a class library project  CollectionsDemo – Program.cs  SimpleCollections – ISimpleList.cs, SimpleArrayList.cs, SimpleLinkedList.cs www.ThesisScientist.comImplementing a list  We can express these requirements as an interface  Lists will implement this interface www.ThesisScientist.comData type of items  Note that in interface code, the item type is Object  Polymorphism – each list item can be:  An instance of Object  An instance of any type which derives from object...  ...which is essentially any type at all as all types in .NET derive from Object  Reusable – can be used to store any type of items in any application www.ThesisScientist.comImplementation details  There are two common ways of actually storing the data in a list  ArrayList  Items are stored in an array  LinkedList  Items are stored as a chain of connected nodes  Each node has a data item and a reference to the next node www.ThesisScientist.comDoes it matter which one  Both versions can be written to implement the interface, and both will work perfectly well  In fact, we don’t even need to write our own implementations as the .NET Framework library already has a wide selection of tried and test collection classes, including lists  But, different implementations have different performance in particular situations  Need to understand characteristics of structures to make the best choice www.ThesisScientist.comSimple implementations  We will look at the code for simplified implementations of ArrayList and LinkedList  This will help us to understand how the two versions work and identify where there might be performance differences  Wouldn’t use these in real programs – Framework versions are more complex, but more capable and robust  Use Framework classes unless you need specific custom functionality www.ThesisScientist.comSimpleArrayList  Items stored in array of type Object  Capacity is the number of items which can be stored  Size is the number of items currently stored constructor instance variables www.ThesisScientist.comAdding items to SimpleArrayList  Adding a new item at the end of the list is simple  Number of items stored so far = size  Index of last stored = size 1  Store the new item at index = size www.ThesisScientist.comAdding items to SimpleArrayList  Adding a new item at a specified index involves more work  Need to shuffle items up to make space G l s g o w G l s g o w G l s g o w a www.ThesisScientist.comAdding items to SimpleArrayList  Start at first empty position (size)  Copy previous entry into current one  Move down one position  Copy previous entry into current one  Repeat until target position reached  Put new item in target position www.ThesisScientist.comSimpleLinkedList  Each item is stored in an object of type Node  Contains item and reference to next Node  List only needs reference to first Node, can find others from there by following references  No capacity limit, list can grow constructor Node class reference to head node, initially set to null by constructor www.ThesisScientist.comAdding items to SimpleLinkedList  To add a new item at a specified index break chain at target position and insert new node new item to be probe follows added at index 2 chain to find target node new item inserted www.ThesisScientist.comAdding items to SimpleLinkedList  Special case when inserting first item in empty list  Simply set head to point to new node head of list, initially null  Adding at end is not a particularly special case  Need to probe all the way along the list to find last item then insert new node www.ThesisScientist.comAdding items to SimpleLinkedList  Special case when inserting at start of list www.ThesisScientist.comAdding items to SimpleLinkedList  Implementation for general case www.ThesisScientist.comOther operations  How would deleting be implemented in each case  How would searching be implemented  How would modifying be implemented  Look at sample code to see how other operations are implemented www.ThesisScientist.comWhich list should we use  Need to judge likely effect of list characteristics on performance  Real applications may store a large number of data  Consider the way the data will be added and accessed  Will new items be added frequently  Where in the list will they be added  Will we need to access items frequently  Where in the list will we need to access items www.ThesisScientist.comArrayList  Time to access does not depend on the size of the list  To add an element at the end of the list, the time taken does not depend on the size  Time taken to add an element at any other point in the list does depend on the size  Additions near the start of the list take longer than additions near the middle or end  Deletes near the start of the list take longer www.ThesisScientist.comLinkedList  Time to access an element depends on the index because each element of the list must be traversed until the required index is found  Time to add an element does not depend on the size of the list, as no shifts are required  Time to add does depend on the index  Additions near the end of the list take longer than additions near the middle or start  Deletes near the end of the list take longer www.ThesisScientist.comVariations on LinkedList  Circularly linked list  The tail of the list always points to the head  Can start anywhere and reach whole list  Doubly linked list  Permits searching in both directions www.ThesisScientist.comCapacity of lists  LinkedList is dynamic  No limit on capacity  ArrayList is static  Capacity defined by length of underlying array  Could be a big disadvantage, but...  .NET Framework implementation of ArrayList can resize itself dynamically  When array gets close to being full the ArrayList creates a new, larger array and copies items into it www.ThesisScientist.comTraversing a collection  Traversing means passing through the collection visiting each item in turn  The details of how this works depend on the way the collection is implemented  Traversing an ArrayList simply involves passing along the array  Traversing a LinkedList involves following references to reach each node  Collection should provide a way of traversing www.ThesisScientist.comforeach loop and IEnumerable  In C the foreachin loop is the usual way to traverse a collection  Collection needs to provide help for the foreach loop to work  Collection must implement IEnumerable interface www.ThesisScientist.comIEnumerable interface  Defines a contract  If a collection type implements IEnumerable interface then it agrees to provide the mechanism which can enumerate, or loop though, the collection  Defines one method – GetEnumerator, which returns an instance of a class which in turn implements IEnumerator  Most .NET collection types and arrays implement this interface www.ThesisScientist.comEnumerators  The IEnumerator interface defines an enumerator object which traverses a collection in a way specific to that collection  Enumerator object can be used by foreach loop or can be used directly in your code  Defines:  MoveNext method  Reset method  Current property  See example in lab exercises www.ThesisScientist.comMore data structures  Lists are suitable for many applications  However, there are other data structures which have characteristics which make them more suited to specific applications, including:  Stack  Queue  Dictionary www.ThesisScientist.comStack ADT  A stack is a special case of a list  Addition and removals can only be made at one end  Lastin, firstout (LIFO)  Push adds a new item onto the stack  Pop – removes and returns last item added  Can be implemented with array or linked list www.ThesisScientist.comStack applications  Page visited history in a Web browser  Undo sequence in a text editor  Reversing a string  Method calls – memory stack  Balancing symbols – e.g. check for matching brackets in a C code file www.ThesisScientist.comQueue ADT  A queue is also special case of a list  Addition can only be made at the tail and removals made at the head  Firstin, firstout (FIFO)  Enqueue adds a new item onto the queue  Dequeue – removes and returns first item added  Can be implemented with array or linked list www.ThesisScientist.comQueue applications  Waiting lists  Message queues  Access to shared resources, e.g. printer  Buffers for transferring data asynchronously e.g., file IO, sockets www.ThesisScientist.comDictionary ADT  A dictionary is a searchable collection of keyitem pairs  Also known as map, associative list  Keys are used to locate items  Collection can be sorted or unsorted  May or may not allow more than one item with the same key – depends on implementation www.ThesisScientist.comDictionary operations  Put(key, item) add a new key/item pair  Get(key) – get the item with the specified key  Set(key, item) – set the item for the specified key  Remove(key) – remove the item with the specified key  ContainsKey(key) – check whether the specified key exists www.ThesisScientist.comDictionary data types  Most reusable case is when key and item data types are both Object  Can store any type and use any type as key because of polymorphism  Sometimes it is convenient to have a customised implementation restricted to specific data types  e.g. .NET Framework includes a StringDictionary class www.ThesisScientist.comBoxing/unboxing with collections  Collections often have item type Object  Don’t need to create a separate collection class for each data type which might be stored  A consequence of this is that value types are converted to object references (boxed) when stored in such a collection boxing unboxing – cast to value type www.ThesisScientist.comDisadvantages of boxing  Overhead of creating object references affects performance  Need to write code to explicitly cast to value type  Not typesafe – can add the “wrong” type of objects into the collection, compiler will not pick this up www.ThesisScientist.comGenerics  Generic collection classes give the best of both worlds  One class allows storage of any data type  Can specify what type it is to hold  Typesafe – compiler will not allow code which stores wrong type same collection class used for different types no boxing needed no unboxing, no cast needed www.ThesisScientist.comGeneric list example  CollectionsDemo solution  SimpleCollections project  SimpleGenericArrayList.cs www.ThesisScientist.comCreating a generic collection class  Declare placeholder for generic type – T  Placeholder will be replaced with specific type when application is compiled  References to T instead of Object generic type becomes Employee entries when compiled specific type www.ThesisScientist.comGeneric collection class members references to T instead of Object www.ThesisScientist.comGeneric collection class members  Can’t generally set a reference of generic type to null  Type at compile time might be a value type which can’t be null  Use default instead – null for reference type, default value for value type www.ThesisScientist.comFramework collections  The .NET framework provides a range of collections in three namespaces  System.Collections  System.Collections.Generic  generic versions – names not always consistent with nongeneric variants, e.g. ArrayList is equivalent to ListT  System.Collections.Specialized  a range of collections with specialized characteristics www.ThesisScientist.comFramework collections  Use framework collection where possible instead of writing your own  Consider characteristics and likely impact on performance when choosing  Use generic collections where possible  Only create custom collections to match very specific, unusual requirements www.ThesisScientist.comFramework collections example  FrameworkCollectionsDemo solution  ListGenericDemo project  Program.cs  DictionaryGenericDemo project  Program.cs www.ThesisScientist.comUsing ListT  Create  Add  Get value  Iterate www.ThesisScientist.comUsing DictionaryT,U  Create Note that .NET implementation  Add uses Add rather than Put, and arraylike index notation for lookup rather than Get  Get value www.ThesisScientist.comUsing DictionaryT,U  Iterate  Iterate through Dictionary as KeyValuePair collection could also use implicit type: var entry in dict  Get dictionary values from Values property(can get Keys similarly) www.ThesisScientist.comFurther reading  There are other collection types  Many sources of information online  Recommended:  An Extensive Examination of Data Structures using C 2.0 (MSDN)  us/library/ms36409128v=VS.8029.aspx www.ThesisScientist.comFurther reading  The code in the CollectionsDemo download contains example implementations of the following collection types:  Stack  Queue  Dictionary – see extra slides for information on ways to implement a Dictionary  Implementation details of these will not be in the exam www.ThesisScientist.comWhat’s next  We will go on to look in detail at how to implement a UML model using C classes, and how collections can help with this
Document Information
User Name:
User Type:
Uploaded Date: