Recent Advances of Neural Networks Models and Applications: An Introduction

Anna Esposito (1), Simone Bassis (2), and Francesco Carlo Morabito (3)

(1) Second University of Napoli, Department of Psychology and IIASS, Italy
(2) University of Milano, Department of Computer Science, Italy
(3) University "Mediterranea" of Reggio Calabria, Department of Civil Engineering, Energy, Environment and Materials (DICEAM), Italy
iiass.annaesp@tin.it, bassis@di.unimi.it, morabito@unirc.it

Abstract. Recently, increasing attention has been paid to the development of approximate algorithms for equipping machines with an automaton level of intelligence. The aim is to permit the implementation of intelligently behaving systems able to perform tasks that were once a purely human prerogative. In this context, neural network models have been privileged, thanks to the claim that their intrinsic paradigm can imitate the functioning of the human brain. Nevertheless, three important issues must be accounted for in the implementation of a neural network based autonomous system performing humanlike intelligent behavior. The first is the collection of an appropriate database for training and for evaluating system performance. The second is the adoption of an appropriate machine representation of the data, which implies the selection of data features suitable for the problem at hand. Finally, the choice of the classification scheme can affect the achieved results. This introductory chapter summarizes the efforts made in the field of neural network models along the abovementioned research directions through the contents of the chapters included in this book.

Keywords: Neural network models, behaving systems, feature selection, big data collection.

1 Introduction

Human-machine based applications are increasingly involved in our personal, professional and social life. In this context, human expectations and requirements have become more and more highly structured, up to the desire to exploit such systems in most environments, in order to decrease human workloads and errors, as well as to be able to interact with them in a natural way. Along these directions, neural network models have been privileged because their computational paradigm is based on brain functioning and learning. However, it soon became evident that, for machines to show autonomous behaviors, exploiting human learning and functioning paradigms would not suffice. Issues related to database collection, feature selection and classification schemes must be accounted for in order to obtain computational effectiveness and optimal performance. These issues are briefly discussed in Sections 2 to 4. Section 5 summarizes the contents of this book by grouping the received contributions into five different sections devoted to the use of neural networks for applications, new or improved models, pattern recognition, signal processing, and special topics such as emotional expressions and daily cognitive functions, as well as memristor-based bio-inspired networks.
2 The Data Issue

In training and assessing neural networks as a paradigm allowing complex systems to show autonomous behaviors, the first issue that arises is the appropriateness of the data exploited for it. It has become evident that system performance strongly depends on the database used and on the related complexity of the task. If the database is poor at reproducing the features of the task at hand, inaccurate inferences can be drawn, and the trained neural system cannot perform accurately on other similar data. Therefore, it is necessary to assess the database in order to ascertain whether it reproduces a genuine setting of the real-world environment it aims to describe. The questions that must then be raised in order to define the suitability of the data are:

a) Have the data been collected in a natural or an artificial context? This distinction can be essential if, for example, the system must discriminate genuine emotional speech or real-world seismic signals from acted emotional speech or synthetic signals [3,4,6];

b) Are the data equally balanced among the categories the system must discriminate? Consider, for instance, a speech recognition task: if gender is not an issue, the data must be equally balanced between male and female subjects;

c) Are the data representative of the final application they are devoted to? This last question underlines the importance, when designing the database, of the actual task the system is designed for.

3 Feature Selection

This issue relates to the way the data are processed in order to extract suitable features that efficiently describe the different categories the system must discriminate for the task at hand. The selection of features can be very hard, depending on the task. An interesting example of this problem is a speech emotion recognition task. In this case, feature selection can be simple (as in a speaker-dependent approach [17]) or very complex (if the task is speaker independent [3,4]), and even more so in a noisy environment (as in the case of speech collected through phone calls [1,7]). The feature selection procedure is strongly dependent on the data and the task, and its effectiveness relies on the knowledge the experimenter applies to understand the data and identify features for them, as illustrated by Likforman-Sulem et al. in this volume and explained in depth in [14]. In addition, features from different sources can be combined and fused, as is traditional in the field of speech, where linguistic information (such as language and word models [12]) and/or prosodic information (such as the F0 contour [19]) and visual features (such as action units [13]) are fused with acoustic features [8,20]. Automatic approaches to feature selection can produce a huge number of features [2], making the neural network training process hard. Of course, the relevance of this step is not limited to speech signal processing (see, for example, [21]).

4 Classification Schema

Several classification schemes have been proposed in the literature for detection and classification tasks. The most exploited are Artificial Neural Networks (ANN), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), and Support Vector Machines (SVM) [9,10,18,22]. The advantages and drawbacks of their use have been reviewed recently in [11]. It is not the aim of this short chapter to go deep into the problematics of the different classification schemes.
However, it is important to point out that they can be fused together in more complex models, as reported in [15], or enriched with sophisticated learning algorithms such as those related to deep learning architectures, illustrated by Schuller in this volume and explained in depth in [5].

5 Contents of This Book

For over twenty years, Neural Networks and Machine Learning (NN/ML) have been an area of continued growth. The need for a computational (bio-inspired) intelligence has increased dramatically, for various reasons, in a number of research areas and application fields, spanning from economics and finance, to health and bioengineering, up to the industrial and entrepreneurial world. Besides the practical interest in these approaches, the progress in NN/ML derives from its interdisciplinary nature. This book is a follow-up of the scientific workshop on Neural Networks held in Vietri sul Mare, Italy, on May 15-16, 2014, a continuing tradition since its founder, Professor Eduardo Caianiello, conceived it as a way of exchanging information on worldwide activities in the field. The volume brings together the peer-reviewed contributions of the attendees: each paper is an extended version of the original submission (not published elsewhere), and the whole set of contributions has been collected as chapters of this book. It is worth emphasizing that the book provides a balance between the basics, the evolution, and the applications of NN/ML. To this end, the content of the book is organized in six parts: four general sections are devoted to Neural Network Models, Signal Processing, Pattern Recognition, and Neural Network Applications; two sections focus on more specialized topics, namely "Emotional Expression and Daily Cognitive Functions" and "Memristors and Complex Dynamics in Bio-inspired Networks". This organization aims at reflecting the wide interdisciplinarity of the field, which on the one hand is capable of motivating novel paradigms and relevant improvements on known paradigms, and on the other hand is largely accepted in many applicative fields as an efficient and effective way to solve classification, detection, identification and related tasks.

In Chapter 2, either novel ways to apply old learning paradigms or recent updates to new ones are proposed. To this aim, the chapter includes six contributions, respectively on Belief Propagation in Normal Factor Graphs (proposed by Buonanno et al.), Genetic Embedding and NN Regression (proposed by Panella et al.), Echo-State Networks and Pruning for the Reservoir's Neurons (proposed by Scardapane et al.), Functional Links (proposed by Comminiello et al.), Continuous-Time Spiking Neural Networks (proposed by Cristini et al.), and Online Spectral Clustering (proposed by Rovetta & Masulli).

Chapter 3 presents interesting signal processing procedures and results obtained using either neural networks or machine learning techniques. In this context, the first section (proposed by Labate et al.) describes an Empirical Mode Decomposition (EMD) approach to diagnosing brain diseases. The following section reports on the effects of artifact rejection on the complexity of EEG (Labate et al., 2015b). The third section (proposed by D'Auria et al.) describes the ability of Self-Organizing Maps to de-noise real-world as well as synthetic seismic signals, explaining why a self-learning algorithm would be preferable in this context.
The following two sections in this chapter focus, respectively, on the integration of audio and video cues for source localization (by Parisi et al.) and on an integrated system based on Spiking Neural Networks known as NeuCube (by Capecci et al.) used to model EEG data in Alzheimer's disease.

The main objective of Chapter 4 is to illustrate pattern recognition procedures defined through neural networks and machine learning algorithms. To this aim, Camastra et al. propose semantic graphs for document characterization, while Graph Neural Networks are used for web spam detection by Belahcen et al. Some complex network concepts, like hubs and communities, are proposed (by Mahmoud et al.) for financial applications. The last section of this chapter (proposed by Di Nardo et al.) presents video-based access control by automatic license plate recognition.

Chapter 5 is devoted to various applications of ML/NN. They span different research fields, such as behavioral analysis in maritime environments (by Castaldo et al.), forecasting of domestic water and natural gas demand (by Fagiani et al.), referenceless thermometry (by Agnello et al.), risk assessment (by Cardin and Giove), fingerprint classification (by Vitello et al.), the FEEM sustainable composite indicator (by Farnia and Giove), autonomous physical rehabilitation at home (by Borghese et al.), and building automation systems (by De March et al.).

Chapter 6 illustrates the contributions submitted to the workshop's special session on emotional expressions and daily cognitive functions, organized by Anna Esposito, Vincenzo Capuano and Gennaro Cordasco from the International Institute for Advanced Scientific Studies (IIASS) and the Second University of Napoli (Department of Psychology). The session intended to collect contributions on current research efforts toward developing automatic systems capable of detecting and supporting users' psychological wellbeing. To this aim, the proposed contributions concerned behavioral emotional analysis and perceptual experiments aimed at identifying cues for detecting healthy and/or non-healthy psychological/physical states, such as stress, anxiety, and emotional disturbances, as well as cognitive decline, from a social and psychological perspective. These aspects are covered by the contributions proposed by Esposito et al., as well as Maldonato and Dell'Orco, Matarazzo and Baldassarre, Baldassarre et al., Hristova and Grinberg, Senese et al., and Gnisci et al., included in this volume. In addition, the special session was also devoted to showing possible applications and algorithms, and biometric and ICT technologies, for designing innovative and adaptive systems able to detect such behavioral cues, as a multiple, theoretical, and technological investment. These aspects are covered by the sections proposed by Schuller, as well as Likforman et al. and Faundez-Zanuy et al.

Chapter 7 includes five papers on memristive NN, a fast-developing field for the implementation of NN neurons and synapses based on the original concept invented by Leon Chua in 1971 [16]. They were presented within the related session, organized by Fernando Corinto and Eros Pasero from the Polytechnic of Turin, Italy.
Memristive systems are used for the synchronization of two Rössler oscillators (in Frasca et al.); for realizing an electrostatic loudspeaker (by Troiano et al.); for an analog implementation of nonlinear networks in complex dynamic analysis (by Petrarca et al.); for highly efficient learning with binary-synapse circuitry (by Secco et al.); and for quantum-inspired optimization techniques (by Fiaschè).

The nature of an edited volume like this, containing a collection of contributions from experts that were first presented and discussed at the WIRN 2014 Workshop and then developed into full papers, is quite different from a journal or conference publication. Each work has been given the space needed to present the details of the proposed topic. The chapters of the volume have been organized in such a manner that readers can easily seek additional information from the vast number of cited references. It is our hope that the book can contribute to the progress of NN/ML related methods and to their spread to many different fields, as was the original spirit of the SIREN (Italian Society of Neural Networks ‒ Società Italiana REti Neuroniche) Society.

References

1. Atassi, H., Smékal, Z., Esposito, A.: Emotion recognition from spontaneous Slavic speech. In: Proceedings of 3rd IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2012), Kosice, Slovakia, December 2-5, pp. 389–394 (2012)
2. Atassi, H., Esposito, A., Smekal, Z.: Analysis of high-level features for vocal emotion recognition. In: Proceedings of 34th IEEE International Conference on Telecom. and Signal Processing (TSP), Budapest, Hungary, August 18-20, pp. 361–366 (2011)
3. Atassi, H., Riviello, M.T., Smékal, Z., Hussain, A., Esposito, A.: Emotional vocal expressions recognition using the COST 2102 Italian database of emotional speech. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds.) Second COST 2102. LNCS, vol. 5967, pp. 255–267. Springer, Heidelberg (2010)
4. Atassi, H., Esposito, A.: Speaker independent approach to the classification of emotional vocal expressions. In: Proceedings of IEEE Conference on Tools with Artificial Intelligence (ICTAI 2008), Dayton, OH, USA, November 3-5, vol. 1, pp. 487–494 (2008)
5. Bengio, Y.: Learning Deep Architectures for AI. Foundations and Trends in Machine Learning 2(1), 1–127 (2009)
6. D'Auria, L., Esposito, A.M., Petrillo, Z., Siniscalchi, A.: Denoising magnetotelluric recordings using Self-Organizing Maps. In: Bassis, S., Esposito, A., Morabito, F.C. (eds.) Recent Advances of Neural Networks Models and Applications. SIST, vol. 37, pp. 139–149. Springer, Heidelberg (2015)
7. Galanis, D., Karabetsos, S., Koutsombogera, M., Papageorgiou, H., Esposito, A., Riviello, M.T.: Classification of emotional speech units in call centre interactions. In: Proceedings of 4th IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2013), Budapest, Hungary, December 2-5, pp. 403–406 (2013)
8. Karunaratnea, S., Yanb, H.: Modelling and combining emotions, visual speech and gestures in virtual head models. Signal Processing: Image Comm. 21, 429–449 (2006)
9. Kwon, O., Chan, K., Hao, J., Lee, T.: Emotion recognition by speech signal. In: Proceedings of EUROSPEECH 2003, Geneva, Switzerland, September 1-4, pp. 125–128 (2003)
10. Labate, D., Palamara, I., Mammone, N., Morabito, G., Foresta, F.L., Morabito, F.C.: SVM classification of epileptic EEG recordings through multiscale permutation entropy. In: Proc. of Int. Joint Conf. on Neural Networks (IJCNN), Dallas, TX, USA, August 4-9 (2013)
11. Larochelle, H., Erhan, D., Courville, A., Bergstra, J., Bengio, Y.: An empirical evaluation of deep architectures on problems with many factors of variation. In: Proc. of 24th Int. Conf. on Machine Learning (ICML 2007), Corvallis, OR, USA, June 20-24, pp. 473–480 (2007)
12. Lee, C., Pieraccini, R.: Combining acoustic and language information for emotion recognition. In: Proceedings of the ICSLP 2002, pp. 873–876 (2002)
13. Lien, J., Kanade, T., Li, C.: Detection, tracking and classification of action units in facial expression. J. Robotics Autonomous Syst. 31(3), 131 (2002)
14. Lin, F., Liang, D., Yeh, C.-C., Huang, J.-C.: Novel feature selection methods to financial distress prediction. Expert Systems with Applications 41(5), 2472–2483 (2014)
15. Mohamed, A., Dahl, G.E., Hinton, G.: Acoustic Modeling Using Deep Belief Networks. IEEE Transactions on Audio, Speech, and Language Processing 20(1), 14–22 (2012)
16. Morabito, F.C., Andreou, A.G., Chicca, E.: Neuromorphic engineering: from neural systems to brain-like engineered systems. Neural Networks 45, 1–3 (2013)
17. Navas, E., Luengo, H.I.: An objective and subjective study of the role of semantics and prosodic features in building corpora for emotional TTS. IEEE Transactions on Audio, Speech, and Language Processing 14, 1117–1127 (2006)
18. Ou, G., Murphey, Y.L.: Multi-class pattern classification using neural networks. Pattern Recognition 40, 4–18 (2007)
19. Ishi, C.T., Ishiguro, H., Hagita, N.: Automatic extraction of paralinguistic information using prosodic features related to F0, duration and voice quality. Speech Communication 50(6), 531–543 (2008)
20. Schuller, B., Rigoll, G., Lang, M.: Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief-network architecture. In: Proceedings of the ICASSP 2004, vol. 1, pp. 577–580 (2004)
21. Simone, G., Morabito, F.C., Polikar, R., Ramuhalli, P., Udpa, L., Udpa, S.: Feature extraction techniques for ultrasonic signal classification. International Journal of Applied Electromagnetics and Mechanics 15(1-4), 291–294 (2001)
22. Vlassis, N., Likas, A.: A greedy EM algorithm for Gaussian mixture learning. Neural Process. Lett. 15, 77–87 (2002)

Part II: Models

Simulink Implementation of Belief Propagation in Normal Factor Graphs

Amedeo Buonanno and Francesco A.N. Palmieri

Seconda Università di Napoli (SUN), Dipartimento di Ingegneria Industriale e dell'Informazione, via Roma 29, 81031 Aversa (CE), Italy
{amedeo.buonanno,francesco.palmieri}@unina2.it

Abstract. A Simulink library for rapid prototyping of belief network architectures using Forney-style Factor Graphs is presented. Our approach allows complex architectures to be drawn fairly easily, giving the user the high flexibility of the Matlab-Simulink environment. In this framework the user can perform rapid prototyping because belief propagation is carried out in a bi-directional data flow in the Simulink architecture. Results on learning a latent model for an artificial character recognition task are presented.

Keywords: Belief Propagation, Factor Graph, Pattern Recognition, Machine Learning.

1 Introduction

Graphical models are a "marriage between probability theory and graph theory" [1], as they compactly encode complex distributions over a high-dimensional space.
When a problem can be formulated in the form of a graph, it is very appealing to study the variables involved as part of an interconnected system in which the reached equilibrium point is the solution. The similarity to the workings of the nervous system makes this paradigm even more fascinating [2]. Bayesian inference on graphs, pioneered by Pearl [3], has become a very popular paradigm for approaching many problems in different fields such as communication, signal processing and artificial intelligence [4]. The Factor Graph is a particular type of graphical model and represents an interesting way to model the interaction between stochastic variables. Following the formulation of Forney-style Factor Graphs (FFG) [5] (or normal graphs), Bayesian graphs can be drawn as block diagrams, and probability distributions can easily be transformed and propagated. In this paper we report the results of our work, in which we have designed and implemented a Simulink library for quick prototyping of several network architectures using the FFG paradigm.

In Section 2 we briefly review the Factor Graph paradigm, introducing the building blocks of our proposed Simulink library. In Section 3 the two operating modes are introduced. In Section 4 we present the application of this tool to an artificial character recognition task.

2 Simulink Factor Graph Library

Factor Graphs model the interaction among stochastic variables. In the FFG approach there are blocks, variables and directed edges [5]. Even if edges have a defined direction, probability flows in both directions (forward and backward) [4]. To associate two messages to each stochastic variable, we have used the built-in Two-Way Connection block, which in Simulink allows bidirectional signal flow. In our Simulink implementation all the architectures can be built with just three main functional blocks: Variable, Factor and Diverter (Figure 1), which will be described in the following. In our notation, we avoid the upper arrows of [4] and use explicit letters: b for backward and f for forward.

Fig. 1. Functional Blocks: (a) Variable, (b) Diverter, (c) Factor

2.1 Variable

For a variable X (Figure 1(a)) that takes values in the discrete alphabet $\mathcal{X} = \{x^1, x^2, \ldots, x^{M_X}\}$, forward and backward messages are, in function form,

$b_X(x^i), \quad f_X(x^i), \quad i = 1 : M_X$

and, in vector form,

$\mathbf{b}_X = (b_X(x^1), b_X(x^2), \ldots, b_X(x^{M_X}))^T$
$\mathbf{f}_X = (f_X(x^1), f_X(x^2), \ldots, f_X(x^{M_X}))^T$

All messages are proportional ($\propto$) to discrete distributions and may be normalized to sum to one. Comprehensive knowledge about X is contained in the distribution $p_X$ obtained through the product rule, in function form,

$p_X(x^i) \propto f_X(x^i)\, b_X(x^i), \quad i = 1 : M_X$

or, in vector form, $\mathbf{p}_X \propto \mathbf{f}_X \odot \mathbf{b}_X$, where $\odot$ denotes the element-by-element product.

Each message b, f or p in the data flow is an $n_T \times M$ matrix, with $n_T$ the number of realizations and M the variable cardinality. Two-way connection blocks allow the construction of a bi-directional data flow. The implementation of an Internal Variable block is shown in Figure 2, where the forward message on the up port (f b up) is transmitted on the down port (f b down) and, conversely, the backward message on the down port is transmitted on the up port.
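The product rule above is compact enough to sketch outside Simulink. The following Python/NumPy fragment is our illustrative sketch, not part of the authors' library (all function and variable names are ours); it combines the forward and backward messages of a discrete variable into its posterior via $\mathbf{p}_X \propto \mathbf{f}_X \odot \mathbf{b}_X$:

```python
import numpy as np

def normalize(msg):
    """Scale a nonnegative message so it sums to one (a valid distribution)."""
    s = msg.sum()
    return msg / s if s > 0 else np.full(msg.shape, 1.0 / msg.size)

def posterior(f_X, b_X):
    """Comprehensive knowledge about X: p_X proportional to the
    element-by-element product f_X * b_X."""
    return normalize(f_X * b_X)

# Example with a ternary variable (M_X = 3):
f_X = np.array([0.7, 0.2, 0.1])      # forward message
b_X = np.array([0.2, 0.2, 0.6])      # backward message (soft evidence)
print(posterior(f_X, b_X))           # -> approx. [0.583, 0.167, 0.250]
```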
All distribution flows can be saved to the workspace.

Fig. 2. The implementation of the Internal Variable block. The icon in the library (a) and its detailed scheme (b)

Similarly, Figure 3 shows the detailed schemes of the Source and Sink Variable blocks.

Fig. 3. The implementation of the Source Variable block and of the Sink Variable block. The icons in the library (a,c) and the detailed schemes (b,d), respectively, for the Source and for the Sink

2.2 Diverter Block

The diverter block (Figure 1(b)) represents, in the Bayesian model, the equality constraint with the variable X replicated D + 1 times. Messages for incoming and outgoing branches carry different forward and backward information. Messages that leave the block are obtained as the product of the incoming ones, in function form,

$b_{X(0)}(x^i) \propto \prod_{j=1}^{D} b_{X(j)}(x^i), \quad i = 1 : M$

$f_{X(m)}(x^i) \propto f_{X(0)}(x^i) \prod_{j=1,\, j \neq m}^{D} b_{X(j)}(x^i), \quad m = 1 : D, \; i = 1 : M$

In vector form:

$\mathbf{b}_{X(0)} \propto \bigodot_{j=1}^{D} \mathbf{b}_{X(j)}$

$\mathbf{f}_{X(m)} \propto \mathbf{f}_{X(0)} \odot \bigodot_{j=1,\, j \neq m}^{D} \mathbf{b}_{X(j)}, \quad m = 1 : D$

Figure 4 shows the detailed scheme of our implementation of the Diverter block. Each port is connected to a variable in the network. After the element-wise product among variables, each message is returned after normalization to one (each message is normalized to be a valid distribution).

Fig. 4. Simulink implementation of a Diverter block with three ports. The icon in the library (a) and its detailed scheme (b)

2.3 Factor Block

The factor block (Figure 1(c)) is the main block and represents the conditional probability matrix of Y given X. More specifically, if X takes values in the discrete alphabet $\mathcal{X} = \{x^1, x^2, \ldots, x^{M_X}\}$ and Y in $\mathcal{Y} = \{y^1, y^2, \ldots, y^{M_Y}\}$, then $P(Y|X)$ is the $M_X \times M_Y$ row-stochastic matrix

$P(Y|X) = \left[\Pr\{Y = y^j \mid X = x^i\}\right] = \left[\theta_{ij}\right], \quad i = 1 : M_X, \; j = 1 : M_Y$

Outgoing messages are, in function form,

$f_Y(y^j) \propto \sum_{i=1}^{M_X} \theta_{ij}\, f_X(x^i), \qquad b_X(x^i) \propto \sum_{j=1}^{M_Y} \theta_{ij}\, b_Y(y^j)$

In vector form:

$\mathbf{f}_Y \propto P(Y|X)^T \mathbf{f}_X, \qquad \mathbf{b}_X \propto P(Y|X)\, \mathbf{b}_Y$

The above rules are a rigorous translation of Bayes' theorem and marginalization (a complete review and proofs can be found in the classical papers [4], [6]). Figure 5 shows our implementation of the Factor block with a Level-2 MATLAB S-Function that wraps the Maximum Likelihood (ML) algorithm described in [7]. The system learns locally using $n_T$ realizations of the forward message of variable X, the $n_T$ realizations of the backward message of variable Y, and an initial value of the matrix P. During learning, a new value of P is produced on each epoch, and $n_T$ realizations of the backward message for variable X and of the forward message for Y are sent to the adjacent blocks. If the number of iterations is set to 0, the block simply computes the $n_T$ realizations of the backward message of variable X and the $n_T$ realizations of the forward message of variable Y (using the results in [8]).

Fig. 5. Simulink implementation of the Factor block. The icon in the library (a) and its detailed scheme (b). During the learning phase, given the initial value of the conditional probability matrix (Hin), the backward messages for variable Y, the forward messages for variable X and the learning mask (L), a new value of H is computed by applying Nit iterations of the ML algorithm. If Nit is set to 0, the block works in inference mode.

Using the implemented library, simply by dragging and connecting, the user can define a wide range of architectures that would otherwise have required writing a custom belief propagation algorithm.
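The diverter and factor update rules translate directly into code. The sketch below is again our own illustration of the equations, not the library's implementation (it reuses the normalize helper from the previous sketch; all names are ours):

```python
import numpy as np

def diverter(f0, backs):
    """Equality node with X replicated D+1 times.
    f0    : forward message entering on branch X(0)
    backs : list of backward messages b_{X(1)}, ..., b_{X(D)}
    Returns the backward message sent out on X(0) and the forward
    messages sent out on each branch X(m), per the product rules above."""
    b_out = normalize(np.prod(backs, axis=0))
    f_outs = []
    for m in range(len(backs)):
        others = [b for j, b in enumerate(backs) if j != m]
        prod_others = np.prod(others, axis=0) if others else 1.0
        f_outs.append(normalize(f0 * prod_others))
    return b_out, f_outs

def factor(P, f_X, b_Y):
    """Factor block with row-stochastic P(Y|X) of shape (M_X, M_Y):
    f_Y ∝ P(Y|X)^T f_X  and  b_X ∝ P(Y|X) b_Y."""
    return normalize(P.T @ f_X), normalize(P @ b_Y)
```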
Fig. 6. A complex architecture designed using the proposed library

Figure 6 shows a complex network drawn using the building blocks previously introduced.

3 Flow Control

During the simulation, each block uses messages coming from connected blocks and evolves, producing new messages. The distributions exchanged among blocks are bi-directional and simultaneous, but the network flow is controlled from the top by a MATLAB script that sets parameters, triggers execution and collects results. The network can work in Inference Mode, when the block parameters are fixed, and in Learning Mode, when the block parameters are learned. In the Learning Phase (Figure 7(a)), which is based on epochs, after the Network Initialization (all the variables are set to uniform and the dimensions of the messages are set), the model simulation is started, purposely defining the Simulation Time and the Model Parameters (the values of the Factors). At the end of the simulation, the new Model Parameters are used as initialization values for the next epoch. This is done until the Maximum Number of Epochs is reached. In the Evolution Phase (Figure 7(b)), in the Parameter Initialization, the user has to adopt the values of the parameters learned during the Learning Phase. The Model Simulation step is performed in the Simulink environment, which has to be purposely configured using a Fixed-Step Solver Type and a Fixed Size Time Step.

Fig. 7. Scheme for model simulation in the Inference mode (a) and in the Learning mode (b)

During the updating phase of the simulation, Simulink determines the order in which the block methods must be triggered. The user cannot explicitly change this order, but he can assign priorities to non-virtual blocks to indicate to Simulink their execution order relative to other blocks. Simulink tries to honor block priority settings, unless there is a conflict with data dependencies [9]. We have verified that Simulink automatically assigns the correct execution order, evaluating the From Workspace blocks (in the source blocks) first and then the other blocks. To avoid wrongly assigned variables, each variable in each block is initialized with a uniform distribution. Each block automatically determines the dimension of the variable to which it is connected. During the simulation, each block uses the inputs coming from the other blocks and evolves, producing outputs for the connected blocks using the rules outlined in [8].
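Before turning to the character example, a toy run on a single factor X → Y illustrates the kind of bidirectional exchange the Simulink scheduler orchestrates. This is our own example (values are arbitrary), reusing the factor and posterior helpers sketched earlier:

```python
import numpy as np

P = np.array([[0.9, 0.1],        # P(Y|X): row-stochastic, M_X = M_Y = 2
              [0.3, 0.7]])
f_X = np.array([0.5, 0.5])       # source variable: uniform forward message
b_Y = np.array([1.0, 0.0])       # sink variable: Y observed as y^1 (delta)

f_Y, b_X = factor(P, f_X, b_Y)   # one bidirectional message exchange
p_X = posterior(f_X, b_X)        # belief on X given the evidence on Y
print(p_X)                       # -> [0.75, 0.25]
```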
4 Characters Recognition Example

We have used the proposed library in several applications. In this work we present the results obtained with a simple latent model applied to a recognition task on the Artificial Characters Dataset [10]. This dataset is formed by thousands of 12x8 black and white images representing the characters 'A', 'C', 'D', 'E', 'F', 'G', 'H', 'L', 'P', 'R'. The network we have implemented is composed of 96 factors (a factor for each pixel) and only one hidden variable.

Fig. 8. The designed network for the Artificial Characters recognition task using the implemented library

An image is a matrix of pixels, where each pixel can be considered as a stochastic variable that can assume values in a finite alphabet (2 symbols for black and white images). We have a set of random variables $X_1, X_2, \ldots, X_n$ that belong to a same finite alphabet $\mathcal{X}$. This set of variables is fully characterized by its joint probability mass function $p(X_1, X_2, \ldots, X_n)$. All the mutual interactions among the variables are contained in the structure of p. A variable can be: 1) known (instantiated): the backward message is a delta distribution; 2) completely unknown (erased): the backward message is a uniform distribution; 3) known softly: the backward message is a density. In all cases, after message propagation the system responds with a forward message that is related to the information stored in the system during the learning phase [11]. We use a simple latent model where each variable $X_i$ (pixel) is connected to a latent variable (Figure 8), and there is also a variable that contains the information about the presented character ($X_{101}$). In the Learning Phase the instantiated variables of the training examples are injected into the network and, using the ML algorithm in [7], the matrices $P(Y|X_i)$ are learned.

4.1 A Simulation

Using the Artificial Characters Dataset [10], we have trained our network with 800 training images of 12x8 black and white images representing the characters 'A', 'C', 'D', 'E', 'F', 'G', 'H', 'L', 'P', 'R' (Figure 9). The dimension of the embedding space is set to 150. The number of epochs for the learning phase is set to 20, and each epoch is formed by 10 evolution steps. To store all configurations, the embedding space should have been set to $2^{96}$, but the real configurations are much fewer; we limited the embedding space to 150 because of computational issues. Even though we have used a small dimension of the embedding space, the system stores relevant structures of the presented images and, when presented with 800 test images, it recognizes the presented characters with an accuracy of 76%.

Fig. 9. 25 samples from the Training Set

In Figure 10 the results of the recognition and completion task are presented. An image is retrieved from the Test Set (Figure 10(a)), a large percentage of pixels are erased (gray pixels in Figure 10(b)), and this information is injected into the network as the backward messages of the Source variables. The information about the presented character is set to uniform. The network, after the evolution (Inference Mode), returns the forward messages of the Source variables that, combined with the provided backward messages, give us the reconstructed image (Figure 10(c)). The network also provides the probability distribution over the whole vocabulary (Figure 10(d)).

Fig. 10. Network answer. An image is retrieved from the Test Set (a), a large percentage of pixels are erased (gray pixels in (b)) and this information is injected into the network as backward messages. The network, after evolution, returns the reconstructed image (c) and a probability distribution on the character set (d)
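The three evidence cases above map directly onto how the backward messages for this completion experiment could be built. The sketch below is our hypothetical helper, not the authors' code: it encodes known pixels as delta distributions and erased pixels as uniform ones.

```python
import numpy as np

def pixel_backward_messages(image, erased_mask):
    """One backward message per binary pixel variable:
    known pixel  -> delta distribution on its observed value,
    erased pixel -> uniform distribution (completely unknown)."""
    flat_img = image.ravel()
    flat_erased = erased_mask.ravel()
    msgs = np.empty((flat_img.size, 2))
    for i in range(flat_img.size):
        msgs[i] = [0.5, 0.5] if flat_erased[i] else np.eye(2)[int(flat_img[i])]
    return msgs

# 12x8 image with roughly 70% of the pixels erased (random stand-in data)
rng = np.random.default_rng(0)
img = rng.integers(0, 2, size=(12, 8))
erased = rng.random((12, 8)) < 0.7
evidence = pixel_backward_messages(img, erased)  # injected at Source variables
```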
5 Conclusion

We have implemented a library of Simulink blocks that permits rapid design of a wide range of architectures using the Factor Graph paradigm. This approach allows experimenting with different architectures using Simulink bi-directional connections as probability pipelines. Current efforts are devoted to using this paradigm in various applications and to finding more efficient implementations when the architectures grow in size and complexity.

References

1. Jordan, M. (ed.): Learning in Graphical Models. MIT Press (1998)
2. Hawkins, J.: On Intelligence (with Sandra Blakeslee). Times Books (2004)
3. Pearl, J.: Probabilistic Reasoning in Intelligent Systems - Networks of Plausible Inference. Morgan Kaufmann Series in Representation and Reasoning. Morgan Kaufmann (1989)
4. Loeliger, H.A.: An introduction to factor graphs. IEEE Signal Processing Magazine 21(1), 28–41 (2004)
5. Forney, G.D.: Codes on graphs: Normal realizations. IEEE Transactions on Information Theory 47(2), 520–548 (2001)
6. Kschischang, F., Frey, B.J., Loeliger, H.-A.: Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 47, 498–519 (2001)
7. Palmieri, F.A.N.: A Comparison of Algorithms for Learning Hidden Variables in Normal Graphs. ArXiv e-prints (2013)
8. Palmieri, F.: Notes on factor graphs. In: Apolloni, B., Bassis, S., Marinaro, M. (eds.) WIRN. Frontiers in Artificial Intelligence and Applications, vol. 193, pp. 154–162. IOS Press (2008)
9. MATLAB Documentation Center - R2014A, ch. Control and Display the Sorted Order
10. Guvenir, H.A., Acar, B., Muderrisoglu, H.: Artificial characters dataset. In: Bache, K., Lichman, M. (eds.) UCI Machine Learning Repository (2013), https://archive.ics.uci.edu/ml/datasets/Artificial+Characters
11. Palmieri, F., Ciuonzo, D., Mattera, D., Romano, G., Rossi, P.S.: From examples to Bayesian inference. In: Apolloni, B., Bassis, S., Esposito, A., Morabito, F.C. (eds.) WIRN. Frontiers in Artificial Intelligence and Applications, vol. 234, pp. 97–104. IOS Press (2011)

Time Series Analysis by Genetic Embedding and Neural Network Regression

Massimo Panella, Luca Liparulo, and Andrea Proietti

DIET Department, University of Rome "La Sapienza", via Eudossiana 18, 00184 Rome, Italy
massimo.panella@uniroma1.it
http://massimopanella.site.uniroma1.it

Abstract. In this paper, the time series forecasting problem is approached by using a specific procedure to select the past samples of the sequence to be predicted, which will feed a suited function approximation model represented by a neural network. When the time series to be analysed is characterized by a chaotic behaviour, it is possible to demonstrate that such an approach can avoid an ill-posed data-driven modelling problem. In fact, classical algorithms fail in the estimation of embedding parameters, especially when they are applied to real-world sequences. To this end we will adopt a genetic algorithm, in which each individual represents a possible embedding solution. We will show that the proposed technique is particularly suited to the prediction of environmental data sequences, which are often characterized by a chaotic behaviour.

Keywords: time series prediction, embedding technique, genetic algorithm, environmental data.

1 Introduction

Environmental data sequences often exhibit a chaotic behaviour that is typical of almost all real-world observed systems. In this regard, the performance of a predictor depends on how accurately it models the unknown context delivering the sequence to be predicted. Due to the actual importance of forecasting, the technical literature is full of proposed methods for implementing a predictor, especially in the field of neural and fuzzy neural networks [3, 8, 9, 11, 12]. The general approach to solving a prediction problem is based on the solution of a suitable function approximation problem, that is, on synthesizing the function that links the actual sample to be predicted to a suitable set of past ones. The embedding technique is the way to determine the input vector based on past samples of a sequence S(n), which can be considered as the output of an unknown autonomous system that is observable only through S(n).
Consequently, the sequence S(n) should be embedded in order to reconstruct the state-space evolution of this system, which, in actual applications, is inherently both non-linear and non-stationary. In this regard, the relationship between the reconstructed state and its corresponding output must be a non-linear function [1]. It follows that the implementation of a predictor will coincide with the estimation of a non-linear model by using any data-driven function approximation technique.

As a case study, in this paper we consider the observation of some pollution agents in Rome (Italy), whose prediction is very important in terms of health monitoring and risk prevention for daily activities. In this regard, we suggest using a neural network approach because of its efficacy and flexibility in solving such problems. Classical neural networks (such as the MultiLayer Perceptron - MLP, Radial Basis Function - RBF, Mixture of Gaussians - MoG, etc.) are function approximation models that can easily fail in the case of environmental data sequences. In fact, the complexity of the function to be approximated, caused by the chaotic behaviour, is further increased by the contamination of spurious noise. This inconvenience is evidently due to the lack of an accurate and complete description of the data, which can be provided by means of a full conditional density p(y|x) [2, 7].

In the case of the problem introduced above, the process to be estimated is often represented by a training set of P input-output pairs $\{x_i, y_i\}$, $i = 1 \ldots P$. Several approaches, based on a suitable clustering procedure of the training set, can be found for the synthesis of p(y|x). In fact, in [10] different types of clustering approaches are proposed; one of the described approaches estimates the joint density p(x, y) with no distinction between input and output variables. The joint density is successively conditioned, so that the resulting p(y|x) can be used for obtaining the mapping to be approximated, i.e.:

$p(y|x) = \frac{p(x,y)}{p(x)} = \frac{p(x,y)}{\int_{y} p(x,y)\, dy} \,. \qquad (1)$

In this paper, we will refer to this approach since it ensures the largest robustness with respect to the approximation of non-convex multi-valued mappings [4]. The most popular way of obtaining p(y|x) is therefore based on the prior determination of p(x, y); a useful approach to the modelling of p(x, y) is commonly based on a mixture of Gaussian components [13]. The determination of said mixture directly yields the architecture of neural networks such as RBF or MoG, which are involved in this paper.
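Conditioning a fitted Gaussian mixture as in Eq. (1) has a closed form, since each component conditions like a single Gaussian. The following sketch is our illustration for the bivariate case with scalar x and y (the mixture parameters are assumed given, e.g. from an EM fit; the function name is ours); it returns the weights, means and variances of the mixture representing p(y|x):

```python
import numpy as np

def gmm_conditional(x, priors, means, covs):
    """Condition a bivariate Gaussian mixture p(x, y) on x, per Eq. (1).
    priors: (K,) mixing weights; means: (K, 2); covs: (K, 2, 2).
    Returns weights w_k(x), conditional means and variances, so that
    p(y|x) = sum_k w_k N(y; m_k, v_k)."""
    w, m, v = [], [], []
    for pi_k, mu, C in zip(priors, means, covs):
        # x-marginal of component k, evaluated at the observed x
        px = np.exp(-0.5 * (x - mu[0]) ** 2 / C[0, 0]) / np.sqrt(2 * np.pi * C[0, 0])
        w.append(pi_k * px)
        m.append(mu[1] + C[1, 0] / C[0, 0] * (x - mu[0]))  # conditional mean
        v.append(C[1, 1] - C[1, 0] ** 2 / C[0, 0])         # conditional variance
    w = np.asarray(w)
    return w / w.sum(), np.asarray(m), np.asarray(v)
```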
In Sect. 2 the significance of a chaotic system will be introduced. Unfortunately, the classical embedding approach, which will be briefly summarized in Sect. 3, may lead to unsatisfactory prediction accuracy even when advanced neural network learning paradigms are used. In fact, trying to synthesize directly the unknown mapping between the current sample to be predicted and the past ones can be a difficult task that often corresponds to an ill-posed function approximation problem [6]. For these reasons, we will propose in Sect. 4 a different approach, based on a genetic algorithm as an advanced embedding technique. In this way, each individual in a generation represents a possible solution for the vector of past samples of S(n) to be used in the approximation task. The use of a genetic algorithm allows the automatic determination of past samples without using the classical techniques for estimating the embedding parameters, which are often characterized by critical accuracy when applied to real-world data sequences. Moreover, the choice of the optimal parameters depends upon the use of a specific approximation model (i.e., a neural network), since the fitness of each individual is evaluated through that model, fitted on the basis of the given individual (i.e., the embedded past samples). We will consider in this work some environmental time series relevant to air pollution, whose forecasting is very important in terms of pollution control and resource management. In Sect. 5 we will discuss the chaotic nature of these sequences and demonstrate the suitability of the proposed technique for their prediction, as the performance in terms of accuracy is better than that of other well-known prediction models. The performance is evaluated by using a custom implementation of a 'Master-Slave' distributed genetic algorithm on a cluster of computers connected through the intranet of our laboratories.

2 Time Series Forecasting: Embedding for State Space Reconstruction

As previously said, a chaotic sequence S(n) can be considered as the output of a chaotic system that is observable only through S(n), and it should be embedded in order to reconstruct the state-space evolution of this system. The general embedding technique is based on the determination of the following parameters [1]:

– the embedding dimension D of the reconstructed state-space attractor, obtained by using the False Nearest Neighbors (FNN) method [14];
– the time lag T between the embedded past samples of S(n), obtained by using the Average Mutual Information (AMI) method; i.e.:

$x_n = \left[\, S(n) \;\; S(n-T) \;\; \ldots \;\; S(n-(D-1)T) \,\right], \qquad (2)$

where $x_n$ is a row vector representing the reconstructed state at time n.

The solution of the embedding problem is useful for time series prediction. In a chaotic sequence, the prediction of S(n) can be obtained by using the relationship between the (reconstructed) state and the system output. In fact, the embedding of S(n) is intended to obtain an 'unfolded' version of the actual system attractor, so that the difficulty of the prediction task can be reduced. Therefore, the prediction of a chaotic sequence S(n) can be considered as the determination of the function $f: \mathbb{R}^D \to \mathbb{R}$ that approximates the link between the reconstructed state $x_n$ and the output sample S(n + m) at the prediction distance m, with m > 0. Another technique can be based on the determination of the function $F: \mathbb{R}^D \to \mathbb{R}^D$ that approximates the link between the reconstructed state $x_n$ and the reconstructed state $x_{n+m}$ at the prediction distance m. Both these methods will be described in detail in Sect. 3.
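Equation (2) translates directly into code. The fragment below is our sketch (the logistic map stands in for a real chaotic sequence); it builds the matrix of reconstructed states $x_n$ together with the targets S(n + m) for a predictor $f: \mathbb{R}^D \to \mathbb{R}$:

```python
import numpy as np

def embed(S, D, T, m=1):
    """Delay embedding per Eq. (2): rows x_n = [S(n), S(n-T), ..., S(n-(D-1)T)],
    each paired with the target sample S(n+m) at prediction distance m."""
    start = (D - 1) * T
    X = np.array([[S[n - d * T] for d in range(D)]
                  for n in range(start, len(S) - m)])
    y = S[start + m:]
    return X, y

# Toy chaotic sequence: the logistic map in its chaotic regime
S = np.empty(500)
S[0] = 0.4
for n in range(len(S) - 1):
    S[n + 1] = 4.0 * S[n] * (1.0 - S[n])

X, y = embed(S, D=3, T=2, m=1)
print(X.shape, y.shape)   # -> (495, 3) (495,)
```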
