Lecture Notes in Human-Computer Interaction

Human Computer Interaction – Lecture Notes
Cambridge Computer Science Tripos, Part II
Alan Blackwell

Overview of content:

Lecture 1: The scope and challenges of HCI and Interaction Design.
Lecture 2: Visual representation. Segmentation and variables of the display plane. Modes of correspondence.
Lecture 3: Text and gesture interaction. Evolution of interaction hardware. Measurement and assessment of novel methods.
Lecture 4: Inference-based approaches. Bayesian strategies for data entry, and programming by example.
Lecture 5: Augmented reality and tangible user interfaces. Machine vision, fiducial markers, paper interfaces, mixed reality.
Lecture 6: Usability of programming languages. End-user programming, programming for children, cognitive dimensions of notations.
Lecture 7: User-centred design research. Contextual observation, prototyping, think-aloud protocols, qualitative data in the design cycle.
Lecture 8: Usability evaluation methods. Formative and summative methods. Empirical measures. Evaluation of Part II projects.

Lecture 1: What is HCI / Interaction Design?

With the exception of some embedded software and operating system code, the success of a software product is determined by the humans who use the product. These notes present theoretical and practical approaches to making successful and usable software. A user-centred design process, as taught in earlier years of the Tripos and experienced in many group design projects, provides a professional resource for creating software with functionality that users need. However, the availability of technical functionality does not guarantee that software will be practically usable. Software that is usable for its purpose is sometimes described by programmers as “intuitive” (easy to learn, easy to remember, easy to apply to new problems) or “powerful” (efficient, effective). These terms are vague and unscientific, but they point in the right direction. 
This course presents scientific approaches to making software that is “intuitive” and “powerful”. HCI helps us to understand why some software products are good and other software is bad. But sadly it is not a guaranteed formula for creating a successful product. In this sense it is like architecture or product design. Architects and product designers need a thorough technical grasp of the materials they work with, but the success of their work depends on the creative application of this technical knowledge. This creativity is a craft skill that is normally learned by working with a master designer in a studio, or from case studies of successful designs. A computer science course does not provide sufficient time for this kind of training in creative design, but it can provide the essential elements: an understanding of the user’s needs, and an understanding of potential solutions.

There are many different approaches to the study and design of user interfaces. This course attempts, so far as possible within 8 lectures, to discuss the important aspects of fields including: Interaction Design, User Experience Design (UX), Interactive Systems Design, Cognitive Ergonomics, Man-Machine Interface (MMI), User Interface Design (UI), Human Factors, Cognitive Task Design, Information Architecture (IA), Software Product Design, Usability Engineering, User-Centred Design (UCD) and Computer Supported Collaborative Work (CSCW). These investigations require a wide range of academic styles, extending across all the parts of the University. Lack of familiarity with other kinds of investigation and analysis can make it hard to absorb or collaborate with other perspectives. 
The advantages of different disciplines can range from those that are interpretive to those that are scientific (both are necessary), the first criticized as soft and the second as reductionist, each relying on different kinds of knowledge, with suspicion of those seen as relativist at one extreme or positivist at the other. In professional work, the most important attributes for HCI experts are to be both creative and practical, placing design at the centre of the field.

Notes on recommended reading

The recommended reading for this course is as follows:

Interaction Design: Beyond human-computer interaction by Helen Sharp, Yvonne Rogers & Jenny Preece (2nd edition, 2007) describes both theoretical approaches and practical professional design methods, at the forefront of current practice.

HCI Models, Theories and Frameworks: Toward a multidisciplinary science, edited by John Carroll (2003), provides an in-depth introduction to the most influential theoretical approaches across the HCI field. Unfortunately the publisher has let this book go out of print, but there are still many copies around Cambridge.

Research methods for human-computer interaction is a new text edited by Paul Cairns and Anna Cox (Cambridge University Press, 2008) that explains the nature of HCI research, and the range of methods used, within the context of academic HCI from a UK perspective.

These notes refer to specific chapters in those books for more detail on specific topics.

Lecture 2: Visual representation

How can you design computer displays that are as meaningful as possible to human viewers? Answering this question requires understanding of visual representation – the principles by which markings on a surface are made and interpreted. Note: many illustrations referred to in this section are easily available online, though with a variety of copyright restrictions. 
I will show as many as possible in the lecture itself – if you want to investigate further, Google should find most of those mentioned.

Typography and text

For many years, computer displays resembled paper documents. This does not mean that they were simplistic or unreasonably constrained. On the contrary, most aspects of modern industrial society have been successfully achieved using the representational conventions of paper, so those conventions seem to be powerful ones. Information on paper can be structured using tabulated columns, alignment, indentation and emphasis, borders and shading. All of those were incorporated into computer text displays. Interaction conventions, however, were restricted to operations of the typewriter rather than the pencil. Each character typed would appear at a specific location. Locations could be constrained, like filling boxes on a paper form. And shortcut command keys could be defined using onscreen labels or paper overlays. It is not text itself, but keyboard interaction with text that is limited and frustrating compared to what we can do with paper (Sellen & Harper 2002).

But despite the constraints on keyboard interaction, most information on computer screens is still represented as text. Conventions of typography and graphic design help us to interpret that text as if it were on a page, and human readers benefit from many centuries of refinement in text document design. Text itself, including many writing systems as well as specialised notations such as algebra, is a visual representation that has its own research and educational literature. Documents that contain a mix of bordered or coloured regions containing pictures, text and diagrammatic elements can be interpreted according to the conventions of magazine design, poster advertising, form design, textbooks and encyclopaedias. Designers of screen representations should take care to properly apply the specialist knowledge of those graphic and typographic professions. 
Position on the page, use of typographic grids, and genre-specific illustrative conventions should all be taken into account.

Summary: most screen-based information is interpreted according to textual and typographic conventions, in which graphical elements are arranged within a visual grid, occasionally divided or contained with ruled and coloured borders.

Maps and graphs

The computer has, however, also acquired a specialised visual vocabulary and conventions. Before the text-based ‘glass teletype’ became ubiquitous, cathode ray tube displays were already used to display oscilloscope waves and radar echoes. Both could be easily interpreted because of their correspondence to existing paper conventions. An oscilloscope uses a horizontal time axis to trace variation of a quantity over time, as pioneered by William Playfair in his 1786 charts of the British economy. A radar screen shows direction and distance of objects from a central reference point, just as the Hereford Mappa Mundi of 1300 organised places according to their approximate direction and distance from Jerusalem. Many visual displays on computers continue to use these ancient but powerful inventions – the map and the graph. In particular, the first truly large software project, the SAGE air defense system, set out to present data in the form of an augmented radar screen – an abstract map, on which symbols and text could be overlaid. The first graphics computer, the Lincoln Laboratory Whirlwind, was created to show maps, not text.

Summary: basic diagrammatic conventions rely on quantitative correspondence between a direction on the surface and a continuous quantity such as time or distance. These should follow established conventions of maps and graphs.

Schematic drawings

Ivan Sutherland’s groundbreaking PhD research with Whirlwind’s successor TX-2 introduced several more sophisticated alternatives (1963). 
The use of a light pen allowed users to draw arbitrary lines, rather than relying on control keys to select predefined options. An obvious application, in the engineering context of MIT, was to make engineering drawings such as a girder bridge (see figure). Lines on the screen are scaled versions of the actual girders, and text information can be overlaid to give details of force calculations. Plans of this kind, as a visual representation, are closely related to maps. However, where the plane of a map corresponds to a continuous surface, engineering drawings need not be continuous. Each set of connected components must share the same scale, but white space indicates an interpretive break, so that independent representations can potentially share the same divided surface – a convention introduced in Diderot’s encyclopedia of 1772, which showed pictures of multiple objects on a page, but cut them loose from any shared pictorial context.

Summary: engineering drawing conventions allow schematic views of connected components to be shown in relative scale, and with text annotations labelling the parts. White space in the representation plane can be used to help the reader distinguish elements from each other rather than directly representing physical space.

Pictures

Sutherland also suggested the potential value that computer screens might offer as artistic tools. His Sketchpad system was used to create a simple animated cartoon of a winking girl. This is the first computer visual representation that might suffer from the ‘resemblance fallacy’, i.e. that drawings are able to depict real objects or scenes because the visual perception of the flat image simulates the visual perception of the real scene. Sutherland’s cartoon could only be called an approximate simulation, but many flat images (photographs, photorealistic ray-traced renderings, ‘old master’ oil paintings) have been described as though perceiving the representation is equivalent to perceiving a real object. 
In reality, new perspective rendering conventions are invented and esteemed for their accuracy by critical consensus, and only more slowly adopted by untrained readers. The consensus on preferred perspective shifts across cultures and historical periods, as is obvious from comparison of prehistoric, classical, medieval and renaissance artworks. It would be naïve to assume that the conventions of today are the final and perfect product of technical evolution. As with text, we become so accustomed to interpreting these representations that we are blind to the artifice. When even psychological object-recognition experiments employ line drawings as though they were objects, it can be hard to insist on the true nature of the representation. But professional artists are fully aware of the conventions they use – the way that a photograph is framed changes its meaning, and a skilled pencil drawing is completely unlike visual edge-detection thresholds. A good pictorial representation need not simulate visual experience any more than a good painting of a unicorn need resemble an actual unicorn.

Summary: pictorial representations, including line drawings, paintings, perspective renderings and photographs, rely on shared interpretive conventions for their meaning. It is naïve to treat screen representations as though they were simulations of experience in the physical world.

Node-and-link diagrams

The first impulse of a computer scientist, when given a pencil, seems to be to draw boxes and connect them with lines. These node and link diagrams can be analysed in terms of the graph structures that are fundamental to the study of algorithms (but unrelated to the visual representations known as graphs or charts). A predecessor of these connectivity diagrams can be found in electrical circuit schematics, where the exact location of components, and the lengths of the wires, can be arranged anywhere, because they are irrelevant to the circuit function. 
Another early program created for the TX-2, this time by Ivan Sutherland’s brother Bert, allowed users to create circuit diagrams of this kind. The distinctive feature of a node-and-link connectivity diagram is that, since the position of each node is irrelevant to the operation of the circuit, it can be used to carry other information. Marian Petre’s research into the work of electronics engineers (1995) catalogued the ways in which they positioned components in ways that were meaningful to human readers, but not to the computer – like the blank space between Diderot’s objects, a form of ‘secondary notation’: use of the plane to assist the reader in ways not related to the technical content.

Circuit connectivity diagrams have been most widely popularised through the London Underground diagram, an invention of electrical engineer Henry Beck. The diagram has been clarified by exploiting the fact that most underground travellers are only interested in order and connectivity, not location, of the stations on the line. However, popular resistance to reading ‘diagrams’ means that this one is more often described as the London Underground ‘map’, despite Beck’s complaints.

Summary: node and link diagrams are still widely perceived as being too technical for broad acceptance. Nevertheless, they can present information about ordering and relationships clearly, especially if consideration is given to the value of allowing human users to specify positions.

Icons and symbols

Maps frequently use symbols to indicate specific kinds of landmark. Sometimes these are recognisably pictorial (the standard symbols for tree and church), but others are fairly arbitrary conventions (the symbol for a railway station). As the resolution of computer displays increased in the 1970s, a greater variety of symbols could be differentiated, by making them more detailed, as in the MIT SDMS system that mapped a naval battle scenario with symbols for different kinds of ship. 
However, the dividing line between pictures and symbols is ambiguous. Children’s drawings of houses often use conventional symbols (door, four windows, triangle roof and chimney) whether or not their own house has two storeys, or a fireplace. Letters of the Latin alphabet are shapes with a completely arbitrary relationship to their phonetic meaning, but the Korean phonetic alphabet is easier to learn because the forms mimic the shape of the mouth when pronouncing those sounds. The field of semiotics offers sophisticated ways of analysing the basis on which marks correspond to meanings. In most cases, the best approach for an interaction designer is simply to adopt familiar conventions. When these do not exist, the design task is more challenging.

It is unclear which of the designers working on the Xerox Star coined the term ‘icon’ for the small pictures symbolising different kinds of system object. David Canfield Smith winningly described them as being like religious icons, which he said were pictures standing for (abstract) spiritual concepts. But ‘icon’ is also used as a technical term in semiotics. Unfortunately, few of the Xerox team had a sophisticated understanding of semiotics. It was fine art PhD Susan Kare’s design work on the Apple Macintosh that established a visual vocabulary which has informed the genre ever since. Some general advice principles are offered by authors such as Horton (1994), but the successful design of icons is still sporadic. Many software publishers simply opt for a memorable brand logo, while others seriously misjudge the kinds of correspondence that are appropriate (my favourite blooper was a software engineering tool in which a pile of coins was used to access the ‘change’ command).

It has been suggested that icons, being pictorial, are easier to understand than text, and that pre-literate children, or speakers of different languages, might thereby be able to use computers without being able to read. 
In practice, most icons simply add decoration to text labels, and those that are intended to be self-explanatory must be supported with textual tooltips. The early Macintosh icons, despite their elegance, were surprisingly open to misinterpretation. One PhD graduate of my acquaintance believed that the Macintosh folder symbol was a briefcase (the folder tag looked like a handle), into which files could be placed to carry them from place to place. Although mistaken, this belief never caused her any trouble – any correspondence can work, so long as it is applied consistently.

Summary: the design of simple and memorable visual symbols is a sophisticated graphic design skill. Following established conventions is the easiest option, but new symbols must be designed with an awareness of what sort of correspondence is intended – pictorial, symbolic, metonymic (e.g. a key to represent locking), bizarrely mnemonic, but probably not monolingual puns.

Visual metaphor

The ambitious graphic designs of the Xerox Star/Alto and Apple Lisa/Macintosh were the first mass-market visual interfaces. They were marketed to office professionals, making the ‘cover story’ that they resembled an office desktop a convenient explanatory device. Of course, as was frequently noted at the time, these interfaces behaved nothing like a real desktop. The mnemonic symbol for file deletion (a wastebasket) was ridiculous if interpreted as an object placed on a desk. And nobody could explain why the desk had windows in it (the name was derived from the ‘clipping window’ of the graphics architecture used to implement them – it was at some later point that they began to be explained as resembling sheets of paper on a desk). There were immediate complaints from luminaries such as Alan Kay and Ted Nelson that strict analogical correspondence to physical objects would become obstructive rather than instructive. 
Nevertheless, for many years the marketing story behind the desktop metaphor was taken seriously, despite the fact that all attempts to improve the Macintosh design with more elaborate visual analogies, as in General Magic and Microsoft Bob, subsequently failed. The ‘desktop’ can be far more profitably analysed (and extended) by understanding the representational conventions that it uses. The size and position of icons and windows on the desktop have no meaning, they are not connected, and there is no visual perspective, so it is neither a map, graph nor picture. The real value is the extent to which it allows secondary notation, with the user creating her own meaning by arranging items as she wishes. Window borders separate areas of the screen into different pictorial, text or symbolic contexts, as in the typographic page design of a textbook or magazine. Icons use a large variety of conventions to indicate symbolic correspondence to software operations and/or company brands, but they are only occasionally or incidentally organised into more complex semiotic structures.

Summary: theories of visual representation, rather than theories of visual metaphor, are the best approach to explaining the conventional Macintosh/Windows ‘desktop’. There is huge room for improvement.

Unified theories of visual representation

The analysis in this lecture has addressed the most important principles of visual representation for screen design, introduced with examples from the early history of graphical user interfaces. In most cases, these principles have been developed and elaborated within whole fields of study and professional skill – typography, cartography, engineering and architectural draughting, art criticism and semiotics. Improving on the current conventions requires serious skill and understanding. Nevertheless, interaction designers should be able, when necessary, to invent new visual representations. 
One approach is to take a holistic perspective on visual language, information design, notations, or diagrams. Specialist research communities in these fields address many relevant factors from low-level visual perception to critique of visual culture. Across all of them, it can be necessary to ignore (or not be distracted by) technical and marketing claims, and to remember that all visual representations simply comprise marks on a surface that are intended to correspond to things understood by the reader. The two dimensions of the surface can be made to correspond to physical space (in a map), to dimensions of an object, to a pictorial perspective, or to continuous abstract scales (time or quantity). The surface can also be partitioned into regions that should be interpreted differently. Within any region, elements can be aligned, grouped, connected or contained in order to express their relationships. In each case, the correspondence between that arrangement, and the intended interpretation, must be understood by convention or explained. Finally, any individual element might be assigned meaning according to many different semiotic principles of correspondence. The following table summarises holistic views, as introduced above, drawing principally on the work of Bertin, Richards, MacEachren, Blackwell & Engelhardt and Engelhardt. 
Marks
  Graphic resources: shape, orientation, size, texture, saturation, colour, line.
  Correspondence: literal (visual imitation of physical features); mapping (quantity, relative scale); conventional (arbitrary).
  Design uses: mark position; identify category (shape, texture, colour); indicate direction (orientation, line); express magnitude (saturation, size, length); simple symbols and colour codes.

Symbols
  Graphic resources: geometric elements, letter forms, logos and icons, picture elements, connective elements.
  Correspondence: topological (linking); depictive (pictorial conventions); figurative (metonym, visual puns); connotative (professional and cultural association); acquired (specialist literacies).
  Design uses: texts and symbolic calculi; diagram elements; branding; visual rhetoric; definition of regions.

Regions
  Graphic resources: alignment grids, borders and frames, area fills, white space, gestalt integration.
  Correspondence: containment; separation; framing (composition, photography); layering.
  Design uses: identifying shared membership; segregating or nesting multiple surface conventions in panels; accommodating labels, captions or legends.

Surfaces
  Graphic resources: the plane; the material object on which marks are imposed (paper, stone); mounting, orientation and display context; display medium.
  Correspondence: literal (map); Euclidean (scale and angle); metrical (quantitative axes); juxtaposed or ordered (regions, catalogues); image-schematic; embodied/situated.
  Design uses: typographic layouts; graphs and charts; relational diagrams; visual interfaces; secondary notations; signs and displays.

As an example of how one might analyse (or, working backwards, design) a complex visual representation, consider the case of musical scores. These consist of marks on a paper surface, bound into a multi-page book, that is placed on a stand at arm’s length in front of a performer. Each page is vertically divided into a number of regions, visually separated by white space and grid alignment cues. The regions are ordered, with that at the top of the page coming first. 
Each region contains two quantitative axes, with the horizontal axis representing time duration, and the vertical axis pitch. The vertical axis is segmented by lines to categorise pitch class. Symbols placed at a given x-y location indicate a specific pitched sound to be initiated at a specific time. A conventional symbol set indicates the duration of the sound. None of the elements use any variation in colour, saturation or texture. A wide variety of text labels and annotation symbols are used to elaborate these basic elements. Music can be, and is, also expressed using many other visual representations (see e.g. Duignan 2010 for a survey of representations used in digital music processing).

Sources and Further reading

The historical examples of early computer representations used in this lecture are mainly drawn from Sutherland (Ed. Blackwell & Rodden 2003), Garland (1994), and Blackwell (2006). Historical reviews of visual representation in other fields include Ferguson (1992), Pérez-Gómez & Pelletier (1997), McCloud (1993) and Tufte (1983). Reviews of human perceptual principles can be found in Gregory (1970), Ittelson (1996), Ware (2004) and Blackwell (2002). Advice on principles of interaction with visual representation is distributed throughout the HCI literature, but classics include Norman (1988), Horton (1994), Shneiderman (Shneiderman & Plaisant 2010, Card et al. 1999, Bederson & Shneiderman 2003) and Spence (2001). Green’s Cognitive Dimensions of Notations framework has for many years provided a systematic classification of the design parameters in interactive visual representations. A brief introduction is provided in Blackwell & Green (2003).

References related to visual representation

Bederson, B.B. and Shneiderman, B. (2003). The Craft of Information Visualization: Readings and Reflections. Morgan Kaufmann.
Bertin, J. (1967). Sémiologie graphique. Paris: Editions Gauthier-Villars. English translation by W.J. Berg (1983) as Semiology of Graphics. Madison, WI: University of Wisconsin Press.
Blackwell, A.F. and Engelhardt, Y. (2002). A meta-taxonomy for diagram research. In M. Anderson, B. Meyer & P. Olivier (Eds.), Diagrammatic Representation and Reasoning. London: Springer-Verlag, pp. 47-64.
Blackwell, A.F. (2002). Psychological perspectives on diagrams and their users. In M. Anderson, B. Meyer & P. Olivier (Eds.), Diagrammatic Representation and Reasoning. London: Springer-Verlag, pp. 109-123.
Blackwell, A.F. and Green, T.R.G. (2003). Notational systems – the Cognitive Dimensions of Notations framework. In J.M. Carroll (Ed.), HCI Models, Theories and Frameworks: Toward a multidisciplinary science. San Francisco: Morgan Kaufmann, pp. 103-134.
Blackwell, A.F. (2006). The reification of metaphor as a design tool. ACM Transactions on Computer-Human Interaction (TOCHI), 13(4), 490-530.
Duignan, M., Noble, J. & Biddle, R. (2010). Abstraction and activity in computer-mediated music production. Computer Music Journal, 34.
Engelhardt, Y. (2002). The Language of Graphics: A framework for the analysis of syntax and meaning in maps, charts and diagrams. PhD thesis, University of Amsterdam.
Ferguson, E.S. (1992). Engineering and the Mind's Eye. MIT Press.
Garland, K. (1994). Mr Beck's Underground Map. Capital Transport Publishing.
Goodman, N. (1976). Languages of Art. Indianapolis: Hackett.
Pérez-Gómez, A. and Pelletier, L. (1997). Architectural Representation and the Perspective Hinge. MIT Press.
Gregory, R. (1970). The Intelligent Eye. Weidenfeld and Nicolson.
Horton, W.K. (1994). The Icon Book: Visual symbols for computer systems and documentation. Wiley.
Ittelson, W.H. (1996). Visual perception of markings. Psychonomic Bulletin & Review, 3(2), 171-187.
MacEachren, A.M. (1995). How Maps Work: Representation, visualization, and design. Guilford.
McCloud, S. (1993). Understanding Comics: The invisible art. Northampton, MA: Kitchen Sink Press.
Norman, D.A. (1988/2002). The Design of Everyday Things (originally published under the title ‘The Psychology of Everyday Things’). Basic Books.
Petre, M. (1995). Why looking isn’t always seeing: readership skills and graphical programming. Communications of the ACM, 38(6), 33-44.
Richards, C.J. (1984). Diagrammatics: an investigation aimed at providing a theoretical framework for studying diagrams and for establishing a taxonomy of their fundamental modes of graphic organization. PhD thesis, Royal College of Art, London.
Sellen, A.J. & Harper, R.H.R. (2002). The Myth of the Paperless Office. MIT Press.
Shneiderman, B. & Plaisant, C. (2010). Designing the User Interface: Strategies for Effective Human-Computer Interaction, 5th edition. Addison-Wesley.
Card, S.K., Mackinlay, J.D. & Shneiderman, B. (1999). Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann.
Spence, R. (2001). Information Visualization. Addison-Wesley.
Sutherland, I.E. (1963/2003). Sketchpad: A Man-Machine Graphical Communication System. PhD thesis, Massachusetts Institute of Technology; online version and editors' introduction by A.F. Blackwell & K. Rodden, Technical Report 574, Cambridge University Computer Laboratory.
Tufte, E.R. (1983). The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
Ware, C. (2004). Information Visualization: Perception for Design. Morgan Kaufmann.

Visual representation design exercise

Lecture 3: Text and gesture interaction

Guest lecturer: Per Ola Kristensson will present these ideas, using a case study based on his own recent research, leading to a successful product, recent buyout and extensive press coverage.

When technical people are commenting on, or even creating, user interfaces, they often get distracted or hung up on the hardware used for input and output. This is a sign that they haven’t thought very hard about what is going on underneath, and also that they will never keep up with new technical advances. 
There have always been good and bad examples of interface designs using control panels, punch cards, teletypes, text terminals, bitmap displays, light pens, tablets, mice, touch screens, and so on. With every generation, you can hear people debating whether, for example, ‘the mouse is better than a touch screen’ or ‘voice input is better than a keyboard’. Debates like this demonstrate only that those involved haven’t been able to see past the surface appearance (and the marketing spiel of the device manufacturers). And opinions or expertise on these matters quickly get out of date.

Within the past few weeks, I’ve heard a leading researcher tell his sponsors that ‘we have added a GUI to our prototype’, as if that was an important thing. 20 years ago, it was something of an achievement to get some output on a bitmap display rather than a command line text application. Nowadays, it is more challenging to work with projection surfaces or augmented reality (more of that in a later lecture). But sensing and display technologies change fast, and it’s more important to understand the principles of interaction than the details of a specific interaction device. The lecture on visual representation was based on display principles that are independent of any particular display hardware. If we consider the interaction principles that are independent of any particular hardware, these are:

- How does the user get content (both data and structure) into digital form?
- How does the user navigate around the content?
- How does the user manipulate the content (restructuring, revising, replacing)?

These are often inter-dependent. The Dasher system for text entry presents an interface in which the user ‘navigates’ through a space of possible texts as predicted by a probabilistic language model, so it can be considered both as content creation and navigation. 
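The kind of prediction that drives Dasher can be illustrated with a much-simplified sketch. Dasher itself uses a PPM (prediction by partial matching) language model; the toy character n-gram model below is only a stand-in to show the core idea of predicting the next character from a few characters of context, and the class and variable names are invented for this example.

```python
from collections import Counter, defaultdict

class CharNgramModel:
    """Toy character-level n-gram model: predicts the next character
    from the preceding (n-1) characters of context."""

    def __init__(self, n=5):
        self.n = n
        self.counts = defaultdict(Counter)  # context -> next-char counts

    def train(self, text):
        # Pad the start so even the first character has a full context.
        text = " " * (self.n - 1) + text
        for i in range(len(text) - self.n + 1):
            context = text[i:i + self.n - 1]
            self.counts[context][text[i + self.n - 1]] += 1

    def predict(self, context):
        """Probability distribution over the next character, given the
        last (n-1) characters of `context`; empty if context is unseen."""
        context = (" " * (self.n - 1) + context)[-(self.n - 1):]
        seen = self.counts[context]
        total = sum(seen.values())
        return {c: k / total for c, k in seen.items()} if total else {}

model = CharNgramModel(n=5)
model.train("the cat sat on the mat. the cat ran.")
probs = model.predict("the c")
# With this training text, 'a' is the only continuation ever observed
# after the four-character context "he c".
```

A model of this kind sees only a few characters of context, which is why predictive interfaces built on it handle local character sequences far better than large-scale document structure.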
It is relatively hard to structure and revise text using Dasher, because the language model only uses a 5-character context, and many text documents have structure on a larger scale than that. However, Dasher provides an excellent example of an interaction ‘paradigm’ that is independent of any particular hardware – it can be controlled using mouse, keys, voice, breath, eye-tracking, and many other devices.

General principles: direct manipulation, and interface modes

At the point where the GUI was about to become popular, HCI researcher Ben Shneiderman summarized the important opportunities it provided, under the name Direct Manipulation. In fact, some of these things were already possible with text interfaces (for example, after the advent of full-screen text editors), and they remain relevant in more recent generations of hardware. It is also possible to use GUI libraries to create bad user interfaces that do not support these principles – just being graphical doesn’t make it good.

The principles of Direct Manipulation as described by Shneiderman are:

• An object that is of interest to the user should be continuously visible in the form of a graphical representation on the screen.
• Operations on objects should involve physical actions (using a pointing device to manipulate the graphical representation) instead of commands with complex syntax.
• The actions that the user makes should be rapid, should offer incremental changes over the previous situation, and should be reversible.
• The effect of actions should immediately be visible, so that the user knows what has happened.
• There should be a modest set of commands doing everything that a novice might need, but it should be possible to expand these, gaining access to more functions as the user develops expertise.

We should also note an additional principle, defined around the same time by Larry Tesler at Apple: the same action should always have the same effect.
It’s hard to believe that this wouldn’t be done, but he was campaigning against editors like vi, which many people found unusable because hitting a key on the keyboard could have different consequences at different times. Tesler campaigned against ‘modes’ in the user interface, based on his studies of non-technical users (search for ‘nomodes’ to learn more). The greatest achievement of the ‘windows’ style of interface is that the frames around each application give the user a clue about the different modes – but, as Tesler said, removing modes altogether is a great ambition.

Content creation

Text content: the guest lecturer will discuss this, including how we can assess and measure the efficiency of alternative text entry mechanisms.

Non-text content: this course won’t say very much about non-text content creation. ‘Content’ can refer to music, visual arts, film, games, novels and many other genres. To understand any of them well, you would have to take a degree in the relevant discipline (some available in Cambridge). All of those fields develop their own professional tools, and there is a constant stream of ‘amateur’ tools modelled on the professional ones. Cultural tastes don’t change that fast (the rate of change is generational, not annual), so digital content creation tools are usually derived from and imitate the artistic tools of previous generations (cameras, microphones, mixing desks, typewriters etc). Innovative content creation tools appear first in the ‘avant garde’ contemporary arts, and take a generation to reach popular audiences, get taken up by mainstream professional artists, and become subject to consumer demand for amateur tools – for example, sampling and mashups were first explored in the mid-20th century by ‘musique concrète’ composers using tape recorders. The Computer Lab Rainbow Group has always had an active programme of research engagement with contemporary artists, developing new digital media tools.
That research continues actively at present, but is outside the scope of an introduction to HCI.

Content manipulation and navigation via deixis

In order to manipulate content, the user has to be able to refer to specific parts of the product (whether text, diagram, video, audio etc) that he or she is working on. In early text interfaces, references were made by numbering the lines of a text file (e.g. substitute ‘fred’ for ‘frrd’ on line 27 – ‘27;s/frrd/fred/’). As in programming languages, line numbers could be replaced by labels, but it is irritating to give everything names. Imagine a shop where everything for sale was given its own unique name, or had to be referred to by index position of aisle, shelf, and item. It’s much easier just to point and say ‘I want that one’. In language, this is called deixis – sentences in which the object is identified by pointing at it, rather than by naming it. For the same reason, deixis has become universal in computer interfaces, and this is why devices for pointing are so important in user interfaces. In early GUIs, the combination of mouse and pointer to achieve deixis was a significant invention (hence the WIMP interface – Windows, Icons, Mouse, Pointer). Another invention from around the same time was the placement of a text cursor between characters, rather than identifying a single character (Larry Tesler had a hand in this invention too). But new hardware suggests new approaches to deixis – touch screens, augmented reality and so on will all require new inventions. It’s reasonable to assume that deixis in different media can be achieved in different ways too – audio interfaces, cameras, and other devices don’t necessarily need to have a cursor. In many cases, what is required is a deictic method that relates user ‘gestures’ (detected via any kind of sensing device) to a media ‘location’.
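As a concrete illustration of relating a gesture to a location, here is a minimal pointer-to-cursor hit-test for a monospaced text display. This is an illustrative Python sketch with assumed character-cell dimensions, not any particular toolkit’s API; following Tesler’s design, the cursor lands between characters rather than on one:

```python
def point_to_cursor(lines, x, y, char_w=8, line_h=16):
    """Map a pointer position (x, y) in pixels to a text-cursor
    location (row, col). The column indexes an inter-character gap,
    so it ranges from 0 to len(line) inclusive."""
    row = min(max(y // line_h, 0), len(lines) - 1)   # clamp to the text
    col = round(x / char_w)                          # nearest gap between cells
    col = min(max(col, 0), len(lines[row]))          # clamp to the line
    return row, col

lines = ["substitute fred", "for frrd"]
print(point_to_cursor(lines, 33, 20))   # (1, 4): second line, after 'for '
```

The same deictic idea generalises to other media: an audio editor maps a click on a waveform to a sample index, and a video editor maps it to a frame number.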
Navigation is then a matter of supporting user strategies to vary that location, including techniques to show local detail within a larger context (via scroll bars, zooming, thumbnails, fisheye views, overview maps, structure navigation and so on).

Simple content manipulations include adding more content (perhaps inserted within a particular context), or removing content that isn’t required. Anything more complex involves modifying the structure of the content. This is an area in which user interface design can build on insights from the usability of programming languages (in a later lecture).

Evaluation of pointing devices and WIMP interfaces

As with text entry, modern user interfaces involve so much pointing that it is worth optimizing the efficiency of the interaction. Early HCI models based their optimization on Fitts’ law – an experimental observation that the time it takes to point at a given location is related to the size of the target and to the distance from the current hand position to the target. Fitts’ original experiment involved two targets of variable size, separated by a variable distance. Experimental subjects were required to touch first one target, then the other, as quickly as they could. The time that it takes to do this increased with the Amplitude of the movement (i.e. the distance between the targets) and decreased with the Width of the target being pointed to:

T = K log₂(A / W + 1), where A = amplitude and W = width

When evaluating new pointing devices, it can be useful experimentally to measure performance over a range of target sizes and motion distances, in order to establish the value of the constant K for that device (the logarithmic term, log₂(A / W + 1), is known as ID: the Index of Difficulty).
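The Fitts’ law prediction is straightforward to compute. In the Python sketch below, the value of K is purely illustrative, since in practice it must be fitted to measured times for each device; the example shows how the Index of Difficulty falls as the target gets wider:

```python
import math

def fitts_time(A, W, K=0.15):
    """Predicted movement time in seconds, T = K * log2(A/W + 1).
    K = 0.15 s/bit is an illustrative value, not a measured one."""
    ID = math.log2(A / W + 1)   # Index of Difficulty, in bits
    return K * ID

# Same movement distance, wider target -> lower difficulty, faster pointing:
print(fitts_time(A=512, W=16))   # ID = log2(33), about 5.0 bits
print(fitts_time(A=512, W=32))   # ID = log2(17), about 4.1 bits
```

In experimental analyses the regression usually includes an intercept as well, T = a + b·ID, with a and b both estimated from the data.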
In user interfaces that require the user to make many sequences of repetitive actions (for example, people working in telephone call centres or in data entry), it can be useful to compare alternative designs by counting the individual actions needed to carry out a particular task, including the number and extent of mouse motions, as well as all the keys pressed on the keyboard. This Keystroke-Level Model can be used to provide a quantitative estimate of user performance, and to optimize the design and layout of the interaction sequence.

It is more difficult to make numerical comparisons of user interfaces in cases where the user’s actions are less predictable – the GOMS model (Goals, Operators, Methods, Selection rules) combines keystroke-level estimates of user actions with an AI planning model derived from the 1969 work of Ernst and Newell on the General Problem Solver. The GPS operated in a search space characterised by possible intermediate states between some initial state and a goal state. Problem solving consisted of finding a series of operations that would eventually reach the goal state. This involved recursive application of two heuristics: a) select an intermediate goal that will reduce the difference between the current state and the desired state, and b) if there is no operation to achieve that goal directly, decompose it into sub-goals until the leaves of the sub-goal hierarchy can be achieved by individual keystrokes or mouse movements. For further reading on KLM and GOMS, see chapter 4 in Carroll, by Bonnie John.

Once we can measure interaction efficiency, whether text entry speed or time to point at a target, it is possible to compare alternative designs through controlled experiments with human participants. These are described in a later lecture.
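The Keystroke-Level Model counting procedure described above can be sketched directly. The operator times below are the commonly cited values from Card, Moran and Newell’s work, though the keystroke time in particular varies widely with typing skill, and the two task encodings are invented examples of my own:

```python
# Standard KLM operators, with commonly cited times in seconds
KLM_TIMES = {
    "K": 0.20,  # press a key or button (skilled typist)
    "P": 1.10,  # point with a mouse to a target on screen
    "H": 0.40,  # 'home' the hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}

def klm_estimate(operators):
    """Total predicted execution time for a sequence of KLM operators."""
    return sum(KLM_TIMES[op] for op in operators)

# Comparing two hypothetical designs for the same data-entry task:
menu_design     = "MHPK" * 3       # three think-reach-point-click selections
keyboard_design = "M" + "K" * 9    # one decision, then nine keystrokes
print(klm_estimate(menu_design))      # about 9.15 seconds
print(klm_estimate(keyboard_design))  # about 3.15 seconds
```

Even this crude count makes the design trade-off visible: each trip to the mouse costs a homing and a pointing operation, which can outweigh several extra keystrokes.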
Lecture 4: Inference

Mental models – what the user infers about the system

Don Norman, one of the first generation of cognitive scientists investigating HCI, also wrote the first popular book on the topic – The Design of Everyday Things.¹ What most people remember about this book is the example of door handles that are so badly designed that they need labels telling you to pull them. But his key message was to draw attention to the gulf of evaluation and the gulf of execution – how does the user know what the system is doing, and how do they know what they need to do, in order to achieve their goals? For a review of Norman’s model, see section 3.3.2 in Sharp, Rogers & Preece.

Computer systems are so complex that nobody really knows what is happening inside (except, possibly, the designer). In the face of incomplete information, the gulf of evaluation is unavoidable. The user has to make inferences about (or guess) what is happening inside. The user’s conclusions form a mental model of the system. One way of thinking about the design problem is that the designer must give sufficient clues to the user to support that inference process, and help the user form an accurate (or at least adequate) mental model.

The idea of a visual metaphor is that the screen display simulates some more familiar real-world object, and that the user’s mental model will then be understood by analogy to the real world. The metaphor/analogy approach can potentially help with the gulf of execution too. If the system behaved exactly like the real-world objects depicted, then users would know exactly what to do with them. In practice, computer systems never behave exactly like real-world objects, and the differences can make the system even more confusing. (Why do you have windows on your desktop? Why do I have to put my USB drive in the rubbish before unplugging it?)
Furthermore, designers inadvertently create metaphors that correspond very well to their own understanding of the internal behaviour of the system, but users should not be expected to know as much as designers. User studies can help to identify what users actually know, what they need to know, and how they interpret prototype displays.

Mental models research

Mental models research attempts to describe the structure of the mental representations that people use for everyday reasoning and problem solving. Common mental models of everyday situations are often quite different from scientific descriptions of the same phenomena. They may be adequate for basic problem solving, but break down in unusual situations. For example, many people imagine electricity as being like a fluid flowing through the circuit. When electrical wiring was first installed in houses, it appeared very similar to gas or water reticulation, including valves to turn the flow on and off, and hoses to direct the flow into an appliance. Many people extended this analogy and believed that the electricity would leak out of the light sockets if they were left without a lightbulb. This mental model did not cause any serious problems – people simply made sure that there were lightbulbs in the sockets, and they had no trouble operating electrical devices on the basis of their model. The psychological nature of unofficial but useful mental models was described in the 1970s, and these ideas have been widely applied to computer systems. Young’s study of calculator users in 1981 found that users generally had some cover story which explained to their satisfaction what happened inside the device.

¹ Originally called The Psychology of Everyday Things – he wrote much of it while on sabbatical leave at the Applied Psychology Unit in Cambridge, and among other examples described the idiosyncratic voicemail system at the APU.
Payne carried out a more recent study of ATM users, demonstrating that even though they have never been given explicit instruction about the operation of the ATM network, they do have a definite mental model of data flow through the network, as well as clear beliefs about information such as the location of their account details. The basic claim of mental models theory is that if you know the users' beliefs about the system they are using, you can predict their behaviour. The users' mental models allow them to make inferences about the results of their actions by a process of mental simulation. The user imagines the effect of his or her actions before committing to a physical action on the device. This mental simulation process is used to predict the effect of an action in accordance with a mental model, and it supports planning of future actions through inference on the mental model. Where the model is incomplete, and the user encounters a situation that cannot be explained by the mental model, this inference will usually rely on analogy to other devices that the user already knows. Think aloud studies A great deal of cognitive psychology research, including some basic research on mental models, has been based on think-aloud studies, in which subjects are asked to carry out some task while talking as continuously as possible. The data are collected in the form of a verbal protocol, normally transcribed from a tape recording so that subtle points are not missed. Use of this technique requires some care. It can be difficult to get subjects to think aloud, and some methods of doing so can bias the experimental data. A detailed discussion of this kind of study is provided by Ericsson & Simon (1985). For a description of think-aloud techniques, see section 7.6.2 in Sharp, Rogers & Preece. 
Performance models of users

Early HCI research was largely concerned with the performance of the user, measured in engineering terms as a system component (‘cognitive psychology’ is closely associated with ‘artificial intelligence’, investigating human performance by simulating it with machines). One of the most famous findings in cognitive psychology research, and the one most often known to user interface developers, is an observation by George Miller in 1956. Miller generalised from a number of studies finding that people can recall somewhere between 5 and 9 things at one time – usually referred to as “seven plus or minus two”. Surprisingly, this number always seems to be about the same, regardless of what the “things” are. It applies to individual digits and letters, meaning that it would be very difficult to remember 25 letters. However, if the letters are arranged into five 5-letter words (apple, grape …), we have no trouble remembering them. We can even remember 5 simple sentences reasonably easily. Miller called these units of short-term memory chunks. It is rather more difficult to define a chunk than to make the observation – but it clearly has something to do with how we can interpret the information. This is often relevant in user interfaces – a user may be able to remember a sequence of seven meaningful operations, but will be unable to remember them if they seem to be arbitrary combinations of smaller elements.

Short-term memory is also very different from long-term memory – everything we know. Learning is the process of encoding information from short-term memory into long-term memory, where it appears to be stored by association with the other things we already know. Current models of long-term memory are largely based on connectionist theories – we recall things as a result of activation from related nodes in a network. According to this model, we can improve learning and retrieval by providing rich associations – many related connections.
This is exploited in user interfaces that mimic either real-world situations or other familiar applications. A further subtlety of human memory is that the information stored is not always verbal. Short-term memory experiments involving recall of lists failed to investigate the way that we remember visual scenes. Visual working memory is in fact independent of verbal short-term memory, and this can be exploited in mnemonic techniques which associate images with items to be remembered – display icons combined with associated labels provide this kind of dual coding.

Intelligent interfaces – what the system infers about the user

A further inference problem is that, in addition to the user not knowing what is happening inside the system, the system doesn’t ‘know’ what is happening inside the user. Advanced systems can be designed to record and observe user interactions and, on the basis of that data, make inferences about what the user intends to do next, and present short-cuts, usability cues or other aids. These kinds of ‘intelligent user interface’ are becoming more common, but they can also introduce severe usability problems. A notorious early example was the Microsoft Word ‘Clippy’, which analysed features of the document, and offered to help with automatic formatting (“You appear to be writing a letter …”). Although some users found it useful, a far larger number found the tone patronizing and the automated
