Lecture Notes in Artificial Intelligence

MICHAEL NEGNEVITSKY, Artificial Intelligence: A Guide to Intelligent Systems, Second Edition (Artificial Intelligence/Soft Computing)

Artificial Intelligence is often perceived as being a highly complicated, even frightening subject in Computer Science. This view is compounded by books in this area being crowded with complex matrix algebra and differential equations – until now. This book, evolving from lectures given to students with little knowledge of calculus, assumes no prior programming experience and demonstrates that most of the underlying ideas in intelligent systems are, in reality, simple and straightforward. Are you looking for a genuinely lucid, introductory text for a course in AI or Intelligent Systems Design? Perhaps you are a non-computer-science professional looking for a self-study guide to the state of the art in knowledge-based systems? Either way, you can't afford to ignore this book.

Covers: ✦ Rule-based expert systems ✦ Fuzzy expert systems ✦ Frame-based expert systems ✦ Artificial neural networks ✦ Evolutionary computation ✦ Hybrid intelligent systems ✦ Knowledge engineering ✦ Data mining

New to this edition: ✦ New demonstration rule-based system, MEDIA ADVISOR ✦ New section on genetic algorithms ✦ Four new case studies ✦ Completely updated to incorporate the latest developments in this fast-paced field

Dr Michael Negnevitsky is a Professor in Electrical Engineering and Computer Science at the University of Tasmania, Australia. The book has developed from lectures to undergraduates. Its material has also been extensively tested through short courses introduced at Otto-von-Guericke-Universität Magdeburg, Institut Elektroantriebstechnik, Magdeburg, Germany, Hiroshima University, Japan, and Boston University and Rochester Institute of Technology, USA. Educated as an electrical engineer, Dr Negnevitsky's many interests include artificial intelligence and soft computing. His research involves the development and application of intelligent systems in electrical engineering, process control and environmental engineering. He has authored and co-authored over 250 research publications, including numerous journal articles, four patents for inventions and two books.

Cover image by Anthony Rule. www.pearson-books.com

1 Introduction to knowledge-based intelligent systems

In which we consider what it means to be intelligent and whether machines could be such a thing.

1.1 Intelligent machines, or what machines can do

Philosophers have been trying for over two thousand years to understand and resolve two big questions of the universe: how does a human mind work, and can non-humans have minds? However, these questions are still unanswered. Some philosophers have picked up the computational approach originated by computer scientists and accepted the idea that machines can do everything that humans can do. Others have openly opposed this idea, claiming that such highly sophisticated behaviour as love, creative discovery and moral choice will always be beyond the scope of any machine.

The nature of philosophy allows for disagreements to remain unresolved. In fact, engineers and scientists have already built machines that we can call 'intelligent'. So what does the word 'intelligence' mean? Let us look at a dictionary definition.

1 Someone's intelligence is their ability to understand and learn things.
2 Intelligence is the ability to think and understand instead of doing things by instinct or automatically. (Essential English Dictionary, Collins, London, 1990)

Thus, according to the first definition, intelligence is the quality possessed by humans. But the second definition suggests a completely different approach and gives some flexibility; it does not specify whether it is someone or something that has the ability to think and understand. Now we should discover what thinking means. Let us consult our dictionary again.

Thinking is the activity of using your brain to consider a problem or to create an idea. (Essential English Dictionary, Collins, London, 1990)

So, in order to think, someone or something has to have a brain, or in other words, an organ that enables someone or something to learn and understand things, to solve problems and to make decisions. So we can define intelligence as 'the ability to learn and understand, to solve problems and to make decisions'.

The very question that asks whether computers can be intelligent, or whether machines can think, came to us from the 'dark ages' of artificial intelligence (from the late 1940s). The goal of artificial intelligence (AI) as a science is to make machines do things that would require intelligence if done by humans (Boden, 1977). Therefore, the answer to the question 'Can machines think?' was vitally important to the discipline. However, the answer is not a simple 'Yes' or 'No', but rather a vague or fuzzy one. Your everyday experience and common sense would have told you that. Some people are smarter in some ways than others. Sometimes we make very intelligent decisions but sometimes we also make very silly mistakes. Some of us deal with complex mathematical and engineering problems but are moronic in philosophy and history. Some people are good at making money, while others are better at spending it. As humans, we all have the ability to learn and understand, to solve problems and to make decisions; however, our abilities are not equal and lie in different areas. Therefore, we should expect that if machines can think, some of them might be smarter than others in some ways.

One of the earliest and most significant papers on machine intelligence, 'Computing machinery and intelligence', was written by the British mathematician Alan Turing over fifty years ago (Turing, 1950). However, it has stood up well to the test of time, and Turing's approach remains universal.

Alan Turing began his scientific career in the early 1930s by rediscovering the Central Limit Theorem. In 1937 he wrote a paper on computable numbers, in which he proposed the concept of a universal machine. Later, during the Second World War, he was a key player in deciphering Enigma, the German military encoding machine. After the war, Turing designed the 'Automatic Computing Engine'. He also wrote the first program capable of playing a complete chess game; it was later implemented on the Manchester University computer. Turing's theoretical concept of the universal computer and his practical experience in building code-breaking systems equipped him to approach the key fundamental question of artificial intelligence. He asked: Is there thought without experience? Is there mind without communication? Is there language without living? Is there intelligence without life? All these questions, as you can see, are just variations on the fundamental question of artificial intelligence, Can machines think?
Turing did not provide definitions of machines and thinking; he just avoided semantic arguments by inventing a game, the Turing imitation game. Instead of asking, 'Can machines think?', Turing said we should ask, 'Can machines pass a behaviour test for intelligence?' He predicted that by the year 2000, a computer could be programmed to have a conversation with a human interrogator for five minutes and would have a 30 per cent chance of deceiving the interrogator that it was a human. Turing defined the intelligent behaviour of a computer as the ability to achieve human-level performance in cognitive tasks. In other words, a computer passes the test if interrogators cannot distinguish the machine from a human on the basis of the answers to their questions.

The imitation game proposed by Turing originally included two phases. In the first phase, shown in Figure 1.1, the interrogator, a man and a woman are each placed in separate rooms and can communicate only via a neutral medium such as a remote terminal. The interrogator's objective is to work out who is the man and who is the woman by questioning them. The rules of the game are that the man should attempt to deceive the interrogator that he is the woman, while the woman has to convince the interrogator that she is the woman.

[Figure 1.1 Turing imitation game: phase 1]

In the second phase of the game, shown in Figure 1.2, the man is replaced by a computer programmed to deceive the interrogator as the man did. It would even be programmed to make mistakes and provide fuzzy answers in the way a human would. If the computer can fool the interrogator as often as the man did, we may say this computer has passed the intelligent behaviour test.

[Figure 1.2 Turing imitation game: phase 2]

Physical simulation of a human is not important for intelligence. Hence, in the Turing test the interrogator does not see, touch or hear the computer and is therefore not influenced by its appearance or voice. However, the interrogator is allowed to ask any questions, even provocative ones, in order to identify the machine. The interrogator may, for example, ask both the human and the machine to perform complex mathematical calculations, expecting that the computer will provide a correct solution and will do it faster than the human. Thus, the computer will need to know when to make a mistake and when to delay its answer. The interrogator also may attempt to discover the emotional nature of the human, and thus might ask both subjects to examine a short novel or poem, or even a painting. Obviously, the computer will be required here to simulate a human's emotional understanding of the work.

The Turing test has two remarkable qualities that make it really universal.

• By maintaining communication between the human and the machine via terminals, the test gives us an objective standard view on intelligence. It avoids debates over the human nature of intelligence and eliminates any bias in favour of humans.

• The test itself is quite independent from the details of the experiment. It can be conducted either as a two-phase game as just described, or even as a single-phase game in which the interrogator needs to choose between the human and the machine from the beginning of the test. The interrogator is also free to ask any question in any field and can concentrate solely on the content of the answers provided (a minimal sketch of such a single-phase game follows below).
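To make the single-phase protocol concrete, here is a minimal sketch in Python. Everything in it is invented for illustration – the respondent interface, the random room assignment, and the toy interrogator with its one canned arithmetic question are not from the text.

```python
import random

class Respondent:
    """One of the two hidden parties: a human or a machine."""
    def __init__(self, name, answer_fn):
        self.name = name          # hidden from the interrogator
        self.answer = answer_fn   # maps a question string to an answer string

def single_phase_game(human, machine, interrogate):
    """Run one single-phase imitation game: both parties sit behind
    neutral labels 'A' and 'B', assigned at random, and the interrogator
    sees only their typed answers. Returns True if the interrogator
    fails to identify the machine (i.e. the machine passes)."""
    rooms = {"A": human, "B": machine}
    if random.random() < 0.5:
        rooms = {"A": machine, "B": human}
    guess = interrogate(lambda label, question: rooms[label].answer(question))
    return rooms[guess].name != "machine"

def naive_interrogator(ask):
    """Asks one arithmetic question per room and guesses that the room
    answering instantly and exactly is the machine."""
    for label in ("A", "B"):
        if ask(label, "What is 152 x 37?") == "5624":
            return label
    return random.choice(("A", "B"))

human = Respondent("human", lambda q: "no idea, give me a minute")
machine = Respondent("machine", lambda q: "5624")
print("Machine passed:", single_phase_game(human, machine, naive_interrogator))
```

The toy interrogator catches the machine precisely because it answers arithmetic instantly and exactly, echoing the point above that a machine playing the game must know when to make a mistake and when to delay its answer.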
Turing believed that by the end of the 20th century it would be possible to program a digital computer to play the imitation game. Although modern computers still cannot pass the Turing test, it provides a basis for the verification and validation of knowledge-based systems. A program thought intelligent in some narrow area of expertise is evaluated by comparing its performance with the performance of a human expert.

Our brain stores the equivalent of over 10^18 bits and can process information at the equivalent of about 10^15 bits per second. By 2020, the brain will probably be modelled by a chip the size of a sugar cube – and perhaps by then there will be a computer that can play – even win – the Turing imitation game. However, do we really want the machine to perform mathematical calculations as slowly and inaccurately as humans do? From a practical point of view, an intelligent machine should help humans to make decisions, to search for information, to control complex objects, and finally to understand the meaning of words. There is probably no point in trying to achieve the abstract and elusive goal of developing machines with human-like intelligence. To build an intelligent computer system, we have to capture, organise and use human expert knowledge in some narrow area of expertise.

1.2 The history of artificial intelligence, or from the 'Dark Ages' to knowledge-based systems

Artificial intelligence as a science was founded by three generations of researchers. Some of the most important events and contributors from each generation are described next.

1.2.1 The 'Dark Ages', or the birth of artificial intelligence (1943–56)

The first work recognised in the field of artificial intelligence (AI) was presented by Warren McCulloch and Walter Pitts in 1943. McCulloch had degrees in philosophy and medicine from Columbia University and became the Director of the Basic Research Laboratory in the Department of Psychiatry at the University of Illinois. His research on the central nervous system resulted in the first major contribution to AI: a model of neurons of the brain.

McCulloch and his co-author Walter Pitts, a young mathematician, proposed a model of artificial neural networks in which each neuron was postulated as being in a binary state, that is, in either an on or off condition (McCulloch and Pitts, 1943). They demonstrated that their neural network model was, in fact, equivalent to the Turing machine, and proved that any computable function could be computed by some network of connected neurons. McCulloch and Pitts also showed that simple network structures could learn.

The neural network model stimulated both theoretical and experimental work to model the brain in the laboratory. However, experiments clearly demonstrated that the binary model of neurons was not correct. In fact, a neuron has highly non-linear characteristics and cannot be considered as a simple two-state device. Nonetheless, McCulloch, the second 'founding father' of AI after Alan Turing, had created the cornerstone of neural computing and artificial neural networks (ANN). After a decline in the 1970s, the field of ANN was revived in the late 1980s.
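Although it predates programming languages, the McCulloch–Pitts binary neuron just described is simple enough to write down in a few lines. The sketch below is illustrative only; the threshold values and the AND/OR weight settings are standard textbook choices, not taken from this text.

```python
def mp_neuron(inputs, weights, threshold):
    """A McCulloch-Pitts neuron: fires (returns 1) when the weighted sum
    of its binary inputs reaches the threshold; otherwise stays off (0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With suitable weights and thresholds the binary neuron realises
# elementary logic gates - the building blocks behind the claim above
# that networks of such neurons can compute any computable function.
for a in (0, 1):
    for b in (0, 1):
        print(a, b,
              "AND:", mp_neuron((a, b), (1, 1), threshold=2),
              "OR:",  mp_neuron((a, b), (1, 1), threshold=1))
```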
The third founder of AI was John von Neumann, the brilliant Hungarian-born mathematician. In 1930, he joined Princeton University, lecturing in mathematical physics. He was a colleague and friend of Alan Turing. During the Second World War, von Neumann played a key role in the Manhattan Project that built the nuclear bomb. He also became an adviser for the Electronic Numerical Integrator and Calculator (ENIAC) project at the University of Pennsylvania and helped to design the Electronic Discrete Variable Automatic Computer (EDVAC), a stored-program machine. He was influenced by McCulloch and Pitts's neural network model. When Marvin Minsky and Dean Edmonds, two graduate students in the Princeton mathematics department, built the first neural network computer in 1951, von Neumann encouraged and supported them.

Another of the first-generation researchers was Claude Shannon. He graduated from the Massachusetts Institute of Technology (MIT) and joined Bell Telephone Laboratories in 1941. Shannon shared Alan Turing's ideas on the possibility of machine intelligence. In 1950, he published a paper on chess-playing machines, which pointed out that a typical chess game involved about 10^120 possible moves (Shannon, 1950). Even if the new von Neumann-type computer could examine one move per microsecond, it would take 3 × 10^106 years to make its first move. Thus Shannon demonstrated the need to use heuristics in the search for the solution.
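The arithmetic behind Shannon's estimate, reconstructed here for clarity (one move per microsecond is 10^6 moves per second, and a year is roughly 3.15 × 10^7 seconds):

$$
\frac{10^{120}\ \text{moves}}{10^{6}\ \text{moves/s}} = 10^{114}\ \text{s},
\qquad
\frac{10^{114}\ \text{s}}{3.15\times 10^{7}\ \text{s/year}} \approx 3\times 10^{106}\ \text{years}.
$$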
Princeton University was also home to John McCarthy, another founder of AI. He convinced Marvin Minsky and Claude Shannon to organise a summer workshop at Dartmouth College, where McCarthy worked after graduating from Princeton. In 1956, they brought together researchers interested in the study of machine intelligence, artificial neural nets and automata theory. The workshop was sponsored by IBM. Although there were just ten researchers, this workshop gave birth to a new science called artificial intelligence. For the next twenty years the field of AI would be dominated by the participants at the Dartmouth workshop and their students.

1.2.2 The rise of artificial intelligence, or the era of great expectations (1956–late 1960s)

The early years of AI are characterised by tremendous enthusiasm, great ideas and very limited success. Only a few years before, computers had been introduced to perform routine mathematical calculations, but now AI researchers were demonstrating that computers could do more than that. It was an era of great expectations.

John McCarthy, one of the organisers of the Dartmouth workshop and the inventor of the term 'artificial intelligence', moved from Dartmouth to MIT. He defined the high-level language LISP – one of the oldest programming languages (FORTRAN is just two years older), which is still in current use. In 1958, McCarthy presented a paper, 'Programs with Common Sense', in which he proposed a program called the Advice Taker to search for solutions to general problems of the world (McCarthy, 1958). McCarthy demonstrated how his program could generate, for example, a plan to drive to the airport, based on some simple axioms. Most importantly, the program was designed to accept new axioms, or in other words new knowledge, in different areas of expertise without being reprogrammed. Thus the Advice Taker was the first complete knowledge-based system incorporating the central principles of knowledge representation and reasoning.

Another organiser of the Dartmouth workshop, Marvin Minsky, also moved to MIT. However, unlike McCarthy with his focus on formal logic, Minsky developed an anti-logical outlook on knowledge representation and reasoning. His theory of frames (Minsky, 1975) was a major contribution to knowledge engineering.

The early work on neural computing and artificial neural networks started by McCulloch and Pitts was continued. Learning methods were improved and Frank Rosenblatt proved the perceptron convergence theorem, demonstrating that his learning algorithm could adjust the connection strengths of a perceptron (Rosenblatt, 1962).

One of the most ambitious projects of the era of great expectations was the General Problem Solver (GPS) (Newell and Simon, 1961, 1972). Allen Newell and Herbert Simon from Carnegie Mellon University developed a general-purpose program to simulate human problem-solving methods. GPS was probably the first attempt to separate the problem-solving technique from the data. It was based on the technique now referred to as means-ends analysis.

Newell and Simon postulated that a problem to be solved could be defined in terms of states. The means-ends analysis was used to determine a difference between the current state and the desirable state or the goal state of the problem, and to choose and apply operators to reach the goal state. If the goal state could not be immediately reached from the current state, a new state closer to the goal would be established and the procedure repeated until the goal state was reached. The set of operators determined the solution plan.

However, GPS failed to solve complicated problems. The program was based on formal logic and therefore could generate an infinite number of possible operators, which is inherently inefficient. The amount of computer time and memory that GPS required to solve real-world problems led to the project being abandoned.
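To make the means-ends idea concrete, here is a minimal sketch on an invented toy problem (an integer 'state' moved towards a goal value using three made-up operators). GPS itself was of course far more elaborate; this only illustrates the difference-reduction loop described above.

```python
def means_ends(state, goal, operators, max_steps=100):
    """Minimal means-ends analysis: repeatedly measure the difference
    between the current state and the goal state, then apply the
    operator whose result most reduces that difference."""
    plan = []
    for _ in range(max_steps):
        if state == goal:
            return plan
        # Choose the operator whose resulting state is closest to the goal.
        name, state = min(
            ((name, op(state)) for name, op in operators.items()),
            key=lambda pair: abs(goal - pair[1]),
        )
        plan.append(name)
    return None  # no plan found within the step budget

# Toy problem: move an integer position towards a goal position.
operators = {
    "step+1": lambda s: s + 1,
    "step-1": lambda s: s - 1,
    "jump+5": lambda s: s + 5,
}
print(means_ends(state=0, goal=12, operators=operators))
# -> ['jump+5', 'jump+5', 'step+1', 'step+1']  (the plan of operators)
```

Like GPS, this greedy loop can stall: when no operator reduces the difference it wanders until the step budget runs out, a small taste of why GPS foundered on complicated problems.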
In summary, we can say that in the 1960s, AI researchers attempted to simulate the complex thinking process by inventing general methods for solving broad classes of problems. They used the general-purpose search mechanism to find a solution to the problem. Such approaches, now referred to as weak methods, applied weak information about the problem domain; this resulted in weak performance of the programs developed. However, it was also a time when the field of AI attracted great scientists who introduced fundamental new ideas in such areas as knowledge representation, learning algorithms, neural computing and computing with words. These ideas could not be implemented then because of the limited capabilities of computers, but two decades later they led to the development of real-life practical applications.

It is interesting to note that Lotfi Zadeh, a professor from the University of California at Berkeley, published his famous paper 'Fuzzy sets' also in the 1960s (Zadeh, 1965). This paper is now considered the foundation of fuzzy set theory. Two decades later, fuzzy researchers have built hundreds of smart machines and intelligent systems.

By 1970, the euphoria about AI was gone, and most government funding for AI projects was cancelled. AI was still a relatively new field, academic in nature, with few practical applications apart from playing games (Samuel, 1959, 1967; Greenblatt et al., 1967). So, to the outsider, the achievements would be seen as toys, as no AI system at that time could manage real-world problems.

1.2.3 Unfulfilled promises, or the impact of reality (late 1960s–early 1970s)

From the mid-1950s, AI researchers were making promises to build all-purpose intelligent machines on a human-scale knowledge base by the 1980s, and to exceed human intelligence by the year 2000. By 1970, however, they realised that such claims were too optimistic. Although a few AI programs could demonstrate some level of machine intelligence in one or two toy problems, almost no AI projects could deal with a wider selection of tasks or more difficult real-world problems.

The main difficulties for AI in the late 1960s were:

• Because AI researchers were developing general methods for broad classes of problems, early programs contained little or even no knowledge about a problem domain. To solve problems, programs applied a search strategy by trying out different combinations of small steps, until the right one was found. This method worked for 'toy' problems, so it seemed reasonable that, if the programs could be 'scaled up' to solve large problems, they would finally succeed. However, this approach was wrong. Easy, or tractable, problems can be solved in polynomial time, i.e. for a problem of size n, the time or number of steps needed to find the solution is a polynomial function of n. On the other hand, hard or intractable problems require times that are exponential functions of the problem size. While a polynomial-time algorithm is considered to be efficient, an exponential-time algorithm is inefficient, because its execution time increases rapidly with the problem size. The theory of NP-completeness (Cook, 1971; Karp, 1972), developed in the early 1970s, showed the existence of a large class of non-deterministic polynomial problems (NP problems) that are NP-complete. A problem is called NP if its solution (if one exists) can be guessed and verified in polynomial time; non-deterministic means that no particular algorithm is followed to make the guess. The hardest problems in this class are NP-complete. Even with faster computers and larger memories, these problems are hard to solve.

• Many of the problems that AI attempted to solve were too broad and too difficult. A typical task for early AI was machine translation. For example, the National Research Council, USA, funded the translation of Russian scientific papers after the launch of the first artificial satellite (Sputnik) in 1957. Initially, the project team tried simply replacing Russian words with English, using an electronic dictionary. However, it was soon found that translation requires a general understanding of the subject to choose the correct words. This task was too difficult. In 1966, all translation projects funded by the US government were cancelled.

• In 1971, the British government also suspended support for AI research. Sir James Lighthill had been commissioned by the Science Research Council of Great Britain to review the current state of AI (Lighthill, 1973). He did not find any major or even significant results from AI research, and therefore saw no need to have a separate science called 'artificial intelligence'.

1.2.4 The technology of expert systems, or the key to success (early 1970s–mid-1980s)

Probably the most important development in the 1970s was the realisation that the problem domain for intelligent machines had to be sufficiently restricted. Previously, AI researchers had believed that clever search algorithms and reasoning techniques could be invented to emulate general, human-like, problem-solving methods. A general-purpose search mechanism could rely on elementary reasoning steps to find complete solutions and could use weak knowledge about the domain.
However, when weak methods failed, researchers finally realised that the only way to deliver practical results was to solve typical cases in narrow areas of expertise by making large reasoning steps.

The DENDRAL program is a typical example of the emerging technology (Buchanan et al., 1969). DENDRAL was developed at Stanford University to analyse chemicals. The project was supported by NASA, because an unmanned spacecraft was to be launched to Mars and a program was required to determine the molecular structure of Martian soil, based on the mass spectral data provided by a mass spectrometer. Edward Feigenbaum (a former student of Herbert Simon), Bruce Buchanan (a computer scientist) and Joshua Lederberg (a Nobel prize winner in genetics) formed a team to solve this challenging problem.

The traditional method of solving such problems relies on a generate-and-test technique: all possible molecular structures consistent with the mass spectrogram are generated first, and then the mass spectrum is determined or predicted for each structure and tested against the actual spectrum (the sketch below illustrates the idea). However, this method failed because millions of possible structures could be generated – the problem rapidly became intractable even for decent-sized molecules.
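A minimal sketch of generate-and-test on a stand-in problem, since the real chemistry is far beyond a few lines: candidates are generated exhaustively and each is tested against an observed 'spectrum'. The toy scoring (total atomic mass) and all names are invented for illustration.

```python
from itertools import product

def generate_and_test(elements, size, predict, observed):
    """Exhaustively generate candidate structures of a given size and
    keep those whose predicted spectrum matches the observed one."""
    matches = []
    for candidate in product(elements, repeat=size):  # combinatorial blow-up
        if predict(candidate) == observed:
            matches.append(candidate)
    return matches

# Toy stand-in: a 'structure' is a tuple of atoms, and its 'spectrum'
# is just the total mass - a deliberately crude model.
masses = {"C": 12, "H": 1, "O": 16}
predict = lambda structure: sum(masses[a] for a in structure)

print(generate_and_test(["C", "H", "O"], size=3, predict=predict, observed=29))
# Finds the six orderings of one C, one H and one O. The search space is
# 3**size; at size 30 it already exceeds 2 * 10**14 candidates, which is
# why DENDRAL needed expert heuristics to prune it.
```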
To add to the difficulties of the challenge, there was no scientific algorithm for mapping the mass spectrum into its molecular structure. However, analytical chemists, such as Lederberg, could solve this problem by using their skills, experience and expertise. They could enormously reduce the number of possible structures by looking for well-known patterns of peaks in the spectrum, and thus provide just a few feasible solutions for further examination. Therefore, Feigenbaum's job became to incorporate the expertise of Lederberg into a computer program to make it perform at a human expert level. Such programs were later called expert systems. To understand and adopt Lederberg's knowledge and operate with his terminology, Feigenbaum had to learn basic ideas in chemistry and spectral analysis. However, it became apparent that Lederberg used not only rules of chemistry but also his own heuristics, or rules-of-thumb, based on his experience, and even guesswork. Soon Feigenbaum identified one of the major difficulties in the project, which he called the 'knowledge acquisition bottleneck' – how to extract knowledge from human experts to apply to computers. To articulate his knowledge, Lederberg even needed to study basics in computing.

Working as a team, Feigenbaum, Buchanan and Lederberg developed DENDRAL, the first successful knowledge-based system. The key to their success was mapping all the relevant theoretical knowledge from its general form to highly specific rules ('cookbook recipes') (Feigenbaum et al., 1971). The significance of DENDRAL can be summarised as follows:

• DENDRAL marked a major 'paradigm shift' in AI: a shift from general-purpose, knowledge-sparse, weak methods to domain-specific, knowledge-intensive techniques.

• The aim of the project was to develop a computer program to attain the level of performance of an experienced human chemist. Using heuristics in the form of high-quality specific rules – rules-of-thumb – elicited from human experts, the DENDRAL team proved that computers could equal an expert in narrow, defined, problem areas.

• The DENDRAL project originated the fundamental idea of the new methodology of expert systems – knowledge engineering, which encompassed techniques of capturing, analysing and expressing in rules an expert's 'know-how'.

DENDRAL proved to be a useful analytical tool for chemists and was marketed commercially in the United States.

The next major project undertaken by Feigenbaum and others at Stanford University was in the area of medical diagnosis. The project, called MYCIN, started in 1972. It later became the Ph.D. thesis of Edward Shortliffe (Shortliffe, 1976). MYCIN was a rule-based expert system for the diagnosis of infectious blood diseases. It also provided a doctor with therapeutic advice in a convenient, user-friendly manner. MYCIN had a number of characteristics common to early expert systems, including:

• MYCIN could perform at a level equivalent to human experts in the field and considerably better than junior doctors.

• MYCIN's knowledge consisted of about 450 independent rules of IF-THEN form derived from human knowledge in a narrow domain through extensive interviewing of experts.

• The knowledge incorporated in the form of rules was clearly separated from the reasoning mechanism. The system developer could easily manipulate knowledge in the system by inserting or deleting some rules. For example, a domain-independent version of MYCIN called EMYCIN (Empty MYCIN) was later produced at Stanford University (van Melle, 1979; van Melle et al., 1981). It had all the features of the MYCIN system except the knowledge of infectious blood diseases. EMYCIN facilitated the development of a variety of diagnostic applications. System developers just had to add new knowledge in the form of rules to obtain a new application.

MYCIN also introduced a few new features. Rules incorporated in MYCIN reflected the uncertainty associated with knowledge, in this case with medical diagnosis. It tested rule conditions (the IF part) against available data or data requested from the physician. When appropriate, MYCIN inferred the truth of a condition through a calculus of uncertainty called certainty factors. Reasoning in the face of uncertainty was the most important part of the system.
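To give a flavour of that calculus, here is a simplified certainty-factor computation in the MYCIN style. The real MYCIN calculus has more cases (negative evidence, a threshold on the IF part); the combination rule below is the standard one for two positive factors, and all the numbers are invented.

```python
def cf_rule(cf_evidence, cf_rule_strength):
    """Certainty propagated by one IF-THEN rule: the rule's own certainty
    attenuated by the certainty of its IF part (both in [0, 1] here)."""
    return cf_evidence * cf_rule_strength

def cf_combine(cf1, cf2):
    """Combine two non-negative certainty factors that support the same
    hypothesis: agreement raises the overall certainty."""
    return cf1 + cf2 * (1 - cf1)

# Two independent rules both suggest the same diagnosis:
cf_a = cf_rule(cf_evidence=0.8, cf_rule_strength=0.6)   # 0.48
cf_b = cf_rule(cf_evidence=0.9, cf_rule_strength=0.4)   # 0.36
print(round(cf_combine(cf_a, cf_b), 3))  # 0.667 - more than either alone
```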
Another probabilistic system that generated enormous publicity was PROSPECTOR, an expert system for mineral exploration developed by the Stanford Research Institute (Duda et al., 1979). The project ran from 1974 to 1983. Nine experts contributed their knowledge and expertise. To represent their knowledge, PROSPECTOR used a combined structure that incorporated rules and a semantic network. PROSPECTOR had over a thousand rules to represent extensive domain knowledge. It also had a sophisticated support package including a knowledge acquisition system.

PROSPECTOR operates as follows. The user, an exploration geologist, is asked to input the characteristics of a suspected deposit: the geological setting, structures, kinds of rocks and minerals. Then the program compares these characteristics with models of ore deposits and, if necessary, queries the user to obtain additional information. Finally, PROSPECTOR makes an assessment of the suspected mineral deposit and presents its conclusion. It can also explain the steps it used to reach the conclusion.

In exploration geology, important decisions are usually made in the face of uncertainty, with knowledge that is incomplete or fuzzy. To deal with such knowledge, PROSPECTOR incorporated Bayes's rules of evidence to propagate uncertainties through the system. PROSPECTOR performed at the level of an expert geologist and proved itself in practice. In 1980, it identified a molybdenum deposit near Mount Tolman in Washington State. Subsequent drilling by a mining company confirmed the deposit was worth over $100 million. You couldn't hope for a better justification for using expert systems.

The expert systems mentioned above have now become classics. A growing number of successful applications of expert systems in the late 1970s showed that AI technology could move successfully from the research laboratory to the commercial environment. During this period, however, most expert systems were developed with special AI languages, such as LISP, PROLOG and OPS, based on powerful workstations. The need for rather expensive hardware and complicated programming languages meant that the challenge of expert system development was left in the hands of a few research groups at Stanford University, MIT, Stanford Research Institute and Carnegie Mellon University. Only in the 1980s, with the arrival of personal computers (PCs) and easy-to-use expert system development tools – shells – could ordinary researchers and engineers in all disciplines take up the opportunity to develop expert systems.

A 1986 survey reported a remarkable number of successful expert system applications in different areas: chemistry, electronics, engineering, geology, management, medicine, process control and military science (Waterman, 1986). Although Waterman found nearly 200 expert systems, most of the applications were in the field of medical diagnosis. Seven years later a similar survey reported over 2500 developed expert systems (Durkin, 1994). The new growing area was business and manufacturing, which accounted for about 60 per cent of the applications. Expert system technology had clearly matured.

Are expert systems really the key to success in any field? In spite of a great number of successful developments and implementations of expert systems in different areas of human knowledge, it would be a mistake to overestimate the capability of this technology. The difficulties are rather complex and lie in both technical and sociological spheres. They include the following:

• Expert systems are restricted to a very narrow domain of expertise. For example, MYCIN, which was developed for the diagnosis of infectious blood diseases, lacks any real knowledge of human physiology. If a patient has more than one disease, we cannot rely on MYCIN. In fact, therapy prescribed for the blood disease might even be harmful because of the other disease.

• Because of the narrow domain, expert systems are not as robust and flexible as a user might want. Furthermore, expert systems can have difficulty recognising domain boundaries. When given a task different from the typical problems, an expert system might attempt to solve it and fail in rather unpredictable ways.

• Expert systems have limited explanation capabilities. They can show the sequence of the rules they applied to reach a solution, but cannot relate accumulated, heuristic knowledge to any deeper understanding of the problem domain.

• Expert systems are also difficult to verify and validate. No general technique has yet been developed for analysing their completeness and consistency.
Heuristic rules represent knowledge in abstract form and lack even basic understanding of the domain area. This makes the task of identifying incorrect, incomplete or inconsistent knowledge very difficult.

• Expert systems, especially the first generation, have little or no ability to learn from their experience. Expert systems are built individually and cannot be developed fast. It might take from five to ten person-years to build an expert system to solve a moderately difficult problem (Waterman, 1986). Complex systems such as DENDRAL, MYCIN or PROSPECTOR can take over 30 person-years to build. This large effort, however, would be difficult to justify if improvements to the expert system's performance depended on further attention from its developers.

Despite all these difficulties, expert systems have made the breakthrough and proved their value in a number of important applications.

1.2.5 How to make a machine learn, or the rebirth of neural networks (mid-1980s–onwards)

In the mid-1980s, researchers, engineers and experts found that building an expert system required much more than just buying a reasoning system or expert system shell and putting enough rules in it. Disillusion about the applicability of expert system technology even led to people predicting an AI 'winter' with severely squeezed funding for AI projects. AI researchers decided to have a new look at neural networks.

By the late 1960s, most of the basic ideas and concepts necessary for neural computing had already been formulated (Cowan, 1990). However, only in the mid-1980s did the solution emerge. The major reason for the delay was technological: there were no PCs or powerful workstations to model and experiment with artificial neural networks. The other reasons were psychological and financial. For example, in 1969, Minsky and Papert had mathematically demonstrated the fundamental computational limitations of one-layer perceptrons (Minsky and Papert, 1969). They also said there was no reason to expect that more complex multilayer perceptrons would represent much. This certainly would not encourage anyone to work on perceptrons, and as a result, most AI researchers deserted the field of artificial neural networks in the 1970s.

In the 1980s, because of the need for brain-like information processing, as well as the advances in computer technology and progress in neuroscience, the field of neural networks experienced a dramatic resurgence. Major contributions to both theory and design were made on several fronts. Grossberg established a new principle of self-organisation (adaptive resonance theory), which provided the basis for a new class of neural networks (Grossberg, 1980). Hopfield introduced neural networks with feedback – Hopfield networks, which attracted much attention in the 1980s (Hopfield, 1982). Kohonen published a paper on self-organised maps (Kohonen, 1982). Barto, Sutton and Anderson published their work on reinforcement learning and its application in control (Barto et al., 1983). But the real breakthrough came in 1986 when the back-propagation learning algorithm, first introduced by Bryson and Ho in 1969 (Bryson and Ho, 1969), was reinvented by Rumelhart and McClelland in Parallel Distributed Processing: Explorations in the Microstructures of Cognition (Rumelhart and McClelland, 1986).
At the same time, back-propagation learning was also discovered by Parker (Parker, 1987) and LeCun (LeCun, 1988), and since then has become the most popular technique for training multilayer perceptrons. In 1988, Broomhead and Lowe found a procedure to design layered feedforward networks using radial basis functions, an alternative to multilayer perceptrons (Broomhead and Lowe, 1988).

Artificial neural networks have come a long way from the early models of McCulloch and Pitts to an interdisciplinary subject with roots in neuroscience, psychology, mathematics and engineering, and will continue to develop in both theory and practical applications. However, Hopfield's paper (Hopfield, 1982) and Rumelhart and McClelland's book (Rumelhart and McClelland, 1986) were the most significant and influential works responsible for the rebirth of neural networks in the 1980s.

1.2.6 Evolutionary computation, or learning by doing (early 1970s–onwards)

Natural intelligence is a product of evolution. Therefore, by simulating biological evolution, we might expect to discover how living systems are propelled towards high-level intelligence. Nature learns by doing; biological systems are not told how to adapt to a specific environment – they simply compete for survival. The fittest species have a greater chance to reproduce, and thereby to pass their genetic material to the next generation.

The evolutionary approach to artificial intelligence is based on the computational models of natural selection and genetics. Evolutionary computation works by simulating a population of individuals, evaluating their performance, generating a new population, and repeating this process a number of times (a minimal sketch of this cycle follows below). Evolutionary computation combines three main techniques: genetic algorithms, evolutionary strategies, and genetic programming.

The concept of genetic algorithms was introduced by John Holland in the early 1970s (Holland, 1975). He developed an algorithm for manipulating artificial 'chromosomes' (strings of binary digits), using such genetic operations as selection, crossover and mutation. Genetic algorithms are based on the solid theoretical foundation of the Schema Theorem (Holland, 1975; Goldberg, 1989).

In the early 1960s, independently of Holland's genetic algorithms, Ingo Rechenberg and Hans-Paul Schwefel, students of the Technical University of Berlin, proposed a new optimisation method called evolutionary strategies (Rechenberg, 1965). Evolutionary strategies were designed specifically for solving parameter optimisation problems in engineering. Rechenberg and Schwefel suggested using random changes in the parameters, as happens in natural mutation. In fact, an evolutionary strategies approach can be considered as an alternative to the engineer's intuition. Evolutionary strategies use a numerical optimisation procedure, similar to a focused Monte Carlo search.

Both genetic algorithms and evolutionary strategies can solve a wide range of problems. They provide robust and reliable solutions for highly complex, non-linear search and optimisation problems that previously could not be solved at all (Holland, 1995; Schwefel, 1995).
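Here is a minimal genetic-algorithm sketch of the cycle just described – selection, crossover and mutation over bit-string 'chromosomes'. The fitness function (count of 1-bits, the classic 'one-max' toy problem), the population size and the rates are arbitrary illustrative choices, not taken from the text.

```python
import random

def evolve(pop_size=20, length=16, generations=40,
           crossover_rate=0.9, mutation_rate=0.01):
    """Evolve bit-string chromosomes towards maximum fitness,
    here simply the number of 1-bits in the string."""
    fitness = lambda chrom: sum(chrom)
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        new_population = []
        while len(new_population) < pop_size:
            # Selection: fitter chromosomes are more likely to be chosen.
            mum, dad = random.choices(
                population,
                weights=[fitness(c) + 1 for c in population],
                k=2)
            # Crossover: splice the two parents at a random point.
            if random.random() < crossover_rate:
                cut = random.randrange(1, length)
                child = mum[:cut] + dad[cut:]
            else:
                child = mum[:]
            # Mutation: occasionally flip a bit.
            child = [bit ^ (random.random() < mutation_rate) for bit in child]
            new_population.append(child)
        population = new_population
    return max(population, key=fitness)

best = evolve()
print(best, "fitness =", sum(best))  # typically all or nearly all 1s
```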
Genetic programming represents an application of the genetic model of learning to programming. Its goal is to evolve not a coded representation of some problem, but rather computer code that solves the problem. That is, genetic programming generates computer programs as the solution.

The interest in genetic programming was greatly stimulated by John Koza in the 1990s (Koza, 1992, 1994). He used genetic operations to manipulate symbolic code representing LISP programs. Genetic programming offers a solution to the main challenge of computer science – making computers solve problems without being explicitly programmed.

Genetic algorithms, evolutionary strategies and genetic programming represent rapidly growing areas of AI, and have great potential.

1.2.7 The new era of knowledge engineering, or computing with words (late 1980s–onwards)

Neural network technology offers more natural interaction with the real world than do systems based on symbolic reasoning. Neural networks can learn, adapt to changes in a problem's environment, establish patterns in situations where rules are not known, and deal with fuzzy or incomplete information. However, they lack explanation facilities and usually act as a black box. The process of training neural networks with current technologies is slow, and frequent retraining can cause serious difficulties.

Although in some special cases, particularly in knowledge-poor situations, ANNs can solve problems better than expert systems, the two technologies are not in competition now. They rather nicely complement each other. Classic expert systems are especially good for closed-system applications with precise inputs and logical outputs. They use expert knowledge in the form of rules and, if required, can interact with the user to establish a particular fact. A major drawback is that human experts cannot always express their knowledge in terms of rules or explain the line of their reasoning. This can prevent the expert system from accumulating the necessary knowledge, and consequently lead to its failure. To overcome this limitation, neural computing can be used for extracting hidden knowledge in large data sets to obtain rules for expert systems (Medsker and Leibowitz, 1994; Zahedi, 1993). ANNs can also be used for correcting rules in traditional rule-based expert systems (Omlin and Giles, 1996). In other words, where acquired knowledge is incomplete, neural networks can refine the knowledge, and where the knowledge is inconsistent with some given data, neural networks can revise the rules.

Another very important technology dealing with vague, imprecise and uncertain knowledge and data is fuzzy logic. Most methods of handling imprecision in classic expert systems are based on the probability concept. MYCIN, for example, introduced certainty factors, while PROSPECTOR incorporated Bayes' rules to propagate uncertainties. However, experts do not usually think in probability values, but in such terms as often, generally, sometimes, occasionally and rarely. Fuzzy logic is concerned with the use of fuzzy values that capture the meaning of words, human reasoning and decision making. As a method to encode and apply human knowledge in a form that accurately reflects an expert's understanding of difficult, complex problems, fuzzy logic provides the way to break through the computational bottlenecks of traditional expert systems.

At the heart of fuzzy logic lies the concept of a linguistic variable. The values of the linguistic variable are words rather than numbers. Similar to expert systems, fuzzy systems use IF-THEN rules to incorporate human knowledge, but these rules are fuzzy, such as:

IF speed is high THEN stopping_distance is long
IF speed is low THEN stopping_distance is short.
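A minimal sketch of how such fuzzy rules might be evaluated follows. The ramp-shaped membership functions, the speed range and the representative stopping distances are all invented for illustration; real fuzzy systems use properly elicited membership functions and a full defuzzification step.

```python
def ramp_up(x, a, b):
    """Membership rising from 0 at a to 1 at b (e.g. 'speed is high')."""
    return max(0.0, min(1.0, (x - a) / (b - a)))

def ramp_down(x, a, b):
    """Membership falling from 1 at a to 0 at b (e.g. 'speed is low')."""
    return 1.0 - ramp_up(x, a, b)

def stopping_distance(speed_kmh):
    """Fire both fuzzy rules to a degree, then defuzzify by a weighted
    average of representative 'long' and 'short' distances."""
    high = ramp_up(speed_kmh, 40, 120)    # degree to which speed is high
    low  = ramp_down(speed_kmh, 40, 120)  # degree to which speed is low
    LONG, SHORT = 100.0, 10.0             # representative distances, metres
    return (high * LONG + low * SHORT) / (high + low)

for speed in (30, 60, 90, 120):
    print(speed, "km/h ->", round(stopping_distance(speed), 1), "m")
```

Notice that intermediate speeds fire both rules to a degree and the output blends them – the 'computing with words' idea in miniature, with no crisp boundary between high and low.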
Fuzzy logic, or fuzzy set theory, was introduced by Professor Lotfi Zadeh, Berkeley's electrical engineering department chairman, in 1965 (Zadeh, 1965). It provided a means of computing with words. However, acceptance of fuzzy set theory by the technical community was slow and difficult. Part of the problem was the provocative name – 'fuzzy' – which seemed too light-hearted to be taken seriously. Eventually, fuzzy theory, ignored in the West, was taken seriously in the East – by the Japanese. It has been used successfully since 1987 in Japanese-designed dishwashers, washing machines, air conditioners, television sets, copiers and even cars.

The introduction of fuzzy products gave rise to tremendous interest in this apparently 'new' technology first proposed over 30 years ago. Hundreds of books and thousands of technical papers have been written on this topic. Some of the classics are: Fuzzy Sets, Neural Networks and Soft Computing (Yager and Zadeh, eds, 1994); The Fuzzy Systems Handbook (Cox, 1999); Fuzzy Engineering (Kosko, 1997); Expert Systems and Fuzzy Systems (Negoita, 1985); and also the best-selling science book Fuzzy Thinking (Kosko, 1993), which popularised the field of fuzzy logic.

Most fuzzy logic applications have been in the area of control engineering. However, fuzzy control systems use only a small part of fuzzy logic's power of knowledge representation. Benefits derived from the application of fuzzy logic models in knowledge-based and decision-support systems can be summarised as follows (Cox, 1999; Turban and Aronson, 2000):

• Improved computational power: Fuzzy rule-based systems perform faster than conventional expert systems and require fewer rules. A fuzzy expert system merges the rules, making them more powerful. Lotfi Zadeh believes that in a few years most expert systems will use fuzzy logic to solve highly nonlinear and computationally difficult problems.

• Improved cognitive modelling: Fuzzy systems allow the encoding of knowledge in a form that reflects the way experts think about a complex problem. They usually think in such imprecise terms as high and low, fast and slow, heavy and light, and they also use such terms as very often and almost never, usually and hardly ever, frequently and occasionally. In order to build conventional rules, we need to define the crisp boundaries for these terms, thus breaking down the expertise into fragments. However, this fragmentation leads to the poor performance of conventional expert systems when they deal with highly complex problems. In contrast, fuzzy expert systems model imprecise information, capturing expertise much closer to the way it is represented in the expert's mind, and thus improve cognitive modelling of the problem.

• The ability to represent multiple experts: Conventional expert systems are built for a very narrow domain with clearly defined expertise. This makes the system's performance fully dependent on the right choice of experts. Although a common strategy is to find just one expert, when a more complex expert system is being built or when expertise is not well defined, multiple experts might be needed. Multiple experts can expand the domain, synthesise expertise and eliminate the need for a world-class expert, who is likely to be both very expensive and hard to access. However, multiple experts seldom reach close agreements; there are often differences in opinions and even conflicts.
This is especially true in areas such as business and management, where no simple solution exists and conflicting views should be taken into account. Fuzzy expert systems can help to represent the expertise of multiple experts when they have opposing views.

Although fuzzy systems allow the expression of expert knowledge in a more natural way, they still depend on the rules extracted from the experts, and thus might be smart or dumb. Some experts can provide very clever fuzzy rules – but some just guess and may even get them wrong. Therefore, all rules must be tested and tuned, which can be a prolonged and tedious process. For example, it took Hitachi engineers several years to test and tune only 54 fuzzy rules to guide the Sendai Subway System.

Using fuzzy logic development tools, we can easily build a simple fuzzy system, but then we may spend days, weeks and even months trying out new rules and tuning our system. How do we make this process faster or, in other words, how do we generate good fuzzy rules automatically? In recent years, several methods based on neural network technology have been used to search numerical data for fuzzy rules. Adaptive or neural fuzzy systems can find new fuzzy rules, or change and tune existing ones, based on the data provided. In other words, data in – rules out, or experience in – common sense out.

So, where is knowledge engineering heading? Expert, neural and fuzzy systems have now matured and have been applied to a broad range of different problems, mainly in engineering, medicine, finance, business and management. Each technology handles the uncertainty and ambiguity of human knowledge differently, and each technology has found its place in knowledge engineering. They no longer compete; rather they complement each other. A synergy of expert systems with fuzzy logic and neural computing improves the adaptability, robustness, fault-tolerance and speed of knowledge-based systems. Besides, computing with words makes them more 'human'. It is now common practice to build intelligent systems using existing theories rather than to propose new ones, and to apply these systems to real-world problems rather than to 'toy' problems.

1.3 Summary

We live in the era of the knowledge revolution, when the power of a nation is determined not by the number of soldiers in its army but by the knowledge it possesses. Science, medicine, engineering and business propel nations towards a higher quality of life, but they also require highly qualified and skilful people. We are now adopting intelligent machines that can capture the expertise of such knowledgeable people and reason in a manner similar to humans.

The desire for intelligent machines was just an elusive dream until the first computer was developed. The early computers could manipulate large databases effectively by following prescribed algorithms, but could not reason about the information provided. This gave rise to the question of whether computers could ever think. Alan Turing defined the intelligent behaviour of a computer as the ability to achieve human-level performance in a cognitive task. The Turing test provided a basis for the verification and validation of knowledge-based systems.

In 1956, a summer workshop at Dartmouth College brought together ten researchers interested in the study of machine intelligence, and a new science – artificial intelligence – was born.
Since the early 1950s, AI technology has developed from the curiosity of a few researchers into a valuable tool that supports humans making decisions. We have seen historical cycles of AI: from the era of great ideas and great expectations in the 1960s to the disillusionment and funding cutbacks in the early 1970s; from the development of the first expert systems such as DENDRAL, MYCIN and PROSPECTOR in the 1970s to the maturity of expert system technology and its massive application in different areas in the 1980s and 1990s; from a simple binary model of neurons proposed in the 1940s to a dramatic resurgence of the field of artificial neural networks in the 1980s; and from the introduction of fuzzy set theory and its being ignored by the West in the 1960s to numerous 'fuzzy' consumer products offered by the Japanese in the 1980s and the worldwide acceptance of 'soft' computing and computing with words in the 1990s.

The development of expert systems created knowledge engineering, the process of building intelligent systems. Today it deals not only with expert systems but also with neural networks and fuzzy logic. Knowledge engineering is still an art rather than engineering, but attempts have already been made to extract rules automatically from numerical data through neural network technology.

Table 1.1 summarises the key events in the history of AI and knowledge engineering, from the first work on AI by McCulloch and Pitts in 1943 to the recent trends of combining the strengths of expert systems, fuzzy logic and neural computing in modern knowledge-based systems capable of computing with words.

The most important lessons learned in this chapter are:

• Intelligence is the ability to learn and understand, to solve problems and to make decisions.

• Artificial intelligence is a science that has defined its goal as making machines do things that would require intelligence if done by humans.

• A machine is thought intelligent if it can achieve human-level performance in some cognitive task. To build an intelligent machine, we have to capture, organise and use human expert knowledge in some problem area.

• The realisation that the problem domain for intelligent machines had to be sufficiently restricted marked a major 'paradigm shift' in AI from general-purpose, knowledge-sparse, weak methods to domain-specific, knowledge-intensive methods. This led to the development of expert systems – computer programs capable of performing at a human-expert level in a narrow problem area. Expert systems use human knowledge and expertise in the form of specific rules, and are distinguished by the clean separation of the knowledge and the reasoning mechanism.
They can also explain their reasoning procedures.

Table 1.1 A summary of the main events in the history of AI and knowledge engineering

The birth of artificial intelligence (1943–56)
• McCulloch and Pitts, A Logical Calculus of the Ideas Immanent in Nervous Activity, 1943
• Turing, Computing Machinery and Intelligence, 1950
• The Electronic Numerical Integrator and Calculator (ENIAC) project (von Neumann)
• Shannon, Programming a Computer for Playing Chess, 1950
• The Dartmouth College summer workshop on machine intelligence, artificial neural nets and automata theory, 1956

The rise of artificial intelligence (1956–late 1960s)
• LISP (McCarthy)
• The General Problem Solver (GPS) project (Newell and Simon)
• Newell and Simon, Human Problem Solving, 1972
• Minsky, A Framework for Representing Knowledge, 1975

The disillusionment in artificial intelligence (late 1960s–early 1970s)
• Cook, The Complexity of Theorem Proving Procedures, 1971
• Karp, Reducibility Among Combinatorial Problems, 1972
• The Lighthill Report, 1971

The discovery of expert systems (early 1970s–mid-1980s)
• DENDRAL (Feigenbaum, Buchanan and Lederberg, Stanford University)
• MYCIN (Feigenbaum and Shortliffe, Stanford University)
• PROSPECTOR (Stanford Research Institute)
• PROLOG – a logic programming language (Colmerauer, Roussel and Kowalski, France)
• EMYCIN (Stanford University)
• Waterman, A Guide to Expert Systems, 1986

The rebirth of artificial neural networks (1965–onwards)
• Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, 1982
• Kohonen, Self-Organized Formation of Topologically Correct Feature Maps, 1982
• Rumelhart and McClelland, Parallel Distributed Processing, 1986
• The First IEEE International Conference on Neural Networks, 1987
• Haykin, Neural Networks, 1994
• Neural Network MATLAB Application Toolbox (The MathWorks, Inc.)
