Integration of Audio and Video Clues for Source Localization by a Robotic Head

Raffaele Parisi, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
DIET Dept., University of Rome "Sapienza", Rome, Italy
raffaele.parisi@uniroma1.it

Abstract. In this work the first step of an integration process between audio and video information for the localization of speakers in closed environments is presented. The proposed method is based on binaural source localization followed by face recognition and tracking, and was realized and implemented in a real environment. Some preliminary results demonstrate the effectiveness of this approach.

Keywords: Binaural source localization, face detection and tracking, audio and video integration.

1 Introduction

Binaural localization consists in estimating the position of a sound source in a generic environment by use of a single pair of microphones. This approach takes inspiration from biological organisms, where the auditory system works by integrating information acquired by the body, the outer ear and the inner ear [1]. Different models of binaural localization are available [2]. A popular approach is based on the combined use of the Interaural Level Difference (ILD) and the Interaural Time Difference (ITD) [3]. These cues can separately give information about the source position in different ranges of frequencies and can be fruitfully combined so as to generate an effective binaural localization algorithm [3].

The exploitation of audio signals is just one side of a localization system based on proper integration of audio and video clues. As a matter of fact, in biology the two senses of hearing and vision cooperate in order to augment the information acquired on the surrounding environment. Of course this is a fundamental task, both for hunting and for escaping from hunters. Some works exist that deal with the fusion of audio and video signals at different levels and for different applications [4,5,6,7,8].
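To make the two binaural cues concrete, below is a minimal, stdlib-only sketch of ILD and ITD estimation in the time domain. This is a toy illustration, not the joint ILD/ITD method of [3] (which operates on STFTs with HRTF lookup tables); the function names, the RMS-based level estimate and the cross-correlation lag search are our own assumptions for illustration.

```python
import math

def rms(x):
    """Root-mean-square level of a signal segment."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def ild_db(left, right):
    """Interaural Level Difference in dB (right level relative to left)."""
    return 20.0 * math.log10(rms(right) / rms(left))

def itd_seconds(left, right, fs, max_lag=32):
    """Interaural Time Difference via the cross-correlation peak.
    The returned value is positive when the right channel is delayed."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for n in range(len(right)):
            m = n - lag
            if 0 <= m < len(left):
                acc += right[n] * left[m]
        if acc > best_val:
            best_val, best_lag = acc, lag
    return best_lag / fs

# Toy stereo pair: the right channel is a delayed, attenuated copy of the
# left one, mimicking a source placed on the left side of the head.
fs = 8000
left = [math.sin(2 * math.pi * 440 * n / fs) for n in range(512)]
delay = 5  # samples
right = [0.5 * left[n - delay] if n >= delay else 0.0 for n in range(512)]
```

On this toy pair the attenuated, delayed copy yields a negative ILD (the right ear is quieter) and an ITD of delay/fs seconds; the method of [3] computes the analogous quantities per frequency bin.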
As far as we know, there are no works explicitly dealing with the topic of integration of binaural audio signals and video signals.

In this paper some preliminary results toward an effective integration of audio and video signals in a robotic head are described. Fig. 1 shows the robotic head that was realized in the ISPAMM Laboratory of the DIET Dept. at the University of Rome "La Sapienza". The device is equipped with two omnidirectional microphones and two cameras. Two stepper motors can rotate the head and move the eyes. These stepper motors are controlled using the Arduino Uno board. The Arduino Uno is a very common microcontroller board: it has 14 digital input/output pins that can be used to control external devices, and a USB connection, used to load the control software from a personal computer.

[In: S. Bassis et al. (eds.), Recent Advances of Neural Networks Models and Applications, Smart Innovation, Systems and Technologies 37, (c) Springer International Publishing Switzerland 2015. DOI: 10.1007/978-3-319-18164-6_15]

In this preliminary setup, two main tasks were implemented:

1. A binaural source localization procedure. The joint ILD/ITD estimation was employed to localize the speaker in terms of angular distance from the center. The main issue of this approach is the presence of reverberation, which actually reduces the accuracy of the estimation.
2. A face detection/tracking procedure. It is possible to find and to track a human face in the images captured by the cameras. In this way it is possible to track the movement of the speaker and to correct the sound localization errors due to reverberation.

Experimental results in a real environment demonstrated the effectiveness of this preliminary idea, as a first step toward a full integration of audio and video information. In the following the main steps of the developed procedure are described.

Fig. 1.
The artificial head described in this paper

2 Description of the Audio System

In this section we briefly recall the main concepts of localization of audio sources by binaural processing. Binaural perception was studied by Lord Rayleigh at the beginning of the 20th century [1]. From that time on, several models of the human binaural system have been proposed. An extensive description was presented in [9].

Binaural localization can be realized by using the Interaural Level Difference (ILD) and the Interaural Time Difference (ITD) in a joint way. ILD is proportional to the difference in the sound levels reaching the left and right ear, while ITD is the measure of the time difference of arrival of a signal to each ear. These cues can be used to obtain information about the source position in different ranges of frequencies. In fact, independent use of ILD and ITD does not yield robust source position estimators [3], since ITD is affected by ambiguity due to an a priori unknown phase unwrapping factor, while ILD estimates display a significant standard deviation. Localization of sources can be realized by properly combining ILD and ITD. In the following we briefly describe a possible approach [3].

The binaural model of received signals is

    x_l[n] = h_l[n] * s[n] + eta_l[n],    (1)
    x_r[n] = h_r[n] * s[n] + eta_r[n],    (2)

where l and r refer to the left and right ear respectively. In these equations h_i[n] (i = l, r) is the impulse response, s[n] is the source signal, while eta_i[n] represents an additive uncorrelated noise term. In the following description noise will be considered negligible, a simplifying assumption which is true in many practical situations. As in [3], the ILD and ITD for the generic n-th time frame are

    ILD^n(w, theta, phi) = 20 log10 |X_r^n(w, theta, phi) / X_l^n(w, theta, phi)|,    (3)
    ITD^n(w, theta, phi) = (1/w) (angle[X_r^n(w, theta, phi) / X_l^n(w, theta, phi)] + 2*pi*p).    (4)

In these equations w is the frequency, theta and phi are the elevation and azimuth angles respectively, X_r^n(w, theta, phi) and X_l^n(w, theta, phi) are the Short Time Fourier Transforms (STFTs) of the right and left ear signals, and p is the phase unwrapping factor, which is unknown a priori and needs to be estimated.

The joint ILD and ITD localization method [3] is based on a comparison between the estimated (ILD, ITD) pair and a reference set of pairs contained in a lookup matrix. This matrix is constructed by exploiting the fact that Head Related Transfer Functions (HRTFs) are stationary and can be used to calculate two different ITD and ILD reference sets that depend on azimuth and frequency alone. Equations (3) and (4) in this case can be written as

    ILD(w, phi) = 20 log10 |HRTF_r(w, phi) / HRTF_l(w, phi)|,    (5)
    ITD(w, phi) = (1/w) (angle[HRTF_r(w, phi) / HRTF_l(w, phi)] + 2*pi*p),    (6)

where HRTF_r and HRTF_l are the HRTFs of the right and left ears respectively. By assumption, the value of the unwrapping factor p does not change dramatically across azimuth [3]. Smoothing across azimuth with a constant-Q filter was performed on the ILD lookup set in order to better represent the limits of human interaural level difference perception. More specifically, a Gaussian filter was employed, as indicated in the CIPIC database [10].

Comparison between the ILD and ITD lookup sets and the estimated ILD and ITD allows one to estimate the azimuth of the sound source. In particular, ILD is exploited to find the correct value of the unwrapping factor p and to select the azimuth value minimizing the difference between the ITD-only and ILD-only estimates. This p-estimation procedure was repeated for each available time frame. A time average across frames was performed and the results graphed. The final azimuth estimates selected were those displaying a minimum of the difference function that was consistent across frequencies. As an example, Fig.
2 shows the results obtained in simulations with the source placed at different azimuth angles. Joint exploitation of ILD and ITD allows one to obtain an azimuth estimate which is correct over the whole frequency band and for different positions of the source.

Fig. 2. Source azimuth estimate in an anechoic room with Gaussian noise: ILD, ITD and joint ILD-ITD methods. Columns from left to right refer to source azimuth angles of 0, 20, 45, 60 and 80 degrees respectively. Darkest pixels are lowest in value.

Fig. 3. Source azimuth estimate in a real room with a female speaker placed at the azimuth angle of 15 degrees: ILD and ITD estimates in different ranges of frequencies.

Fig. 3 shows the results obtained in a real environment with a female speaker, speaking from an azimuth angle of 15 degrees, in terms of the ILD and ITD estimates in different ranges of frequencies. Slight reverberation is present. It is clear that in the presence of reverberation [11], commonly found in closed environments, proper prefiltering techniques should be adopted [12,13,14]. An example is cepstral prefiltering [15].

3 Description of the Video System

In this preliminary study, the video information was used for localizing and tracking the head of a speaker, after she/he has been localized by using the audio information. The main task in this process is the localization of the face of the speaker in the acquired image.

3.1 Face Detection

The face detection task was realized by using the Viola-Jones method [16]. This technique was one of the first methods introduced for detecting the presence of objects in images and it is currently used for the detection of faces. It is based on the classification of specific features rather than on the intensity values of the image pixels. Namely, the steps of the classification process are:

1. Extraction of Haar features.
Haar features are basically determined by computing the sum and/or the differences of the pixels within two rectangular regions of the image.
2. Construction of the integral image. The integral image is an intermediate representation of the original image. Namely, the generic point (x, y) of the integral image is defined as the sum of the pixels above and to the left of (x, y).
3. AdaBoost. AdaBoost (short for Adaptive Boosting) is a machine learning meta-algorithm used to improve the performance of learning algorithms [17]. It is based on the combination of various weak classifiers in order to obtain a final robust classifier, and it is employed in the Viola-Jones method.
4. Cascade classifier. The Viola-Jones method is based on a cascade of AdaBoost classifiers in order to classify portions of images. As a consequence of this processing phase, the performance of the detection task is increased, while reducing the required computation time.

3.2 Face Tracking

Once the region containing the face has been detected, the next step is to move the image of the face to the central position of the video image. This task can be realized by a feedback loop where a pair of proportional controllers is employed to progressively reduce the difference between the position of the detected face and the center of the video image. To this goal, the tilt and pan angles of the head are used. Figure 4 shows the scheme of the head control unit.

Fig. 4. The head control loop: a proportional controller K_p drives the head actuators to bring the input face coordinates to the desired output coordinates.

4 Experiments

The head was equipped with two omnidirectional AKG C562M microphones. Signals were acquired through an Edirol UA-1000 acquisition board. Figure 5 shows the configuration of the testbed, with five possible positions of the source. The control of the servomotors was realized by an Arduino board (www.arduino.cc). The face-tracking algorithm was written in C++ by using the functions available at the OpenCV website (www.opencv.org).
Figure 6 shows in detail the Arduino board used for processing of the video part. The face recognition and tracking algorithm was used to localize the face of the speaker after the audio localization task and to move it to the center of the image.

Fig. 5. Testbed configuration

Fig. 6. The Arduino board and its connections

Fig. 7. A single frame of the localization process

As an example of the experimental results, Fig. 7 shows a single frame of the output video. The artificial head can be seen in the upper part of the image, together with the speaker. In the lower part, the image acquired by the camera mounted on the head is shown, after the face of the speaker has been moved to the center.

5 Conclusion

In this paper a possible cooperation between binaural audio and video signals was described. The objective was the localization and tracking of an audio source moving in a closed environment. Some preliminary experiments demonstrated the quality of the proposed solution, also in a real environment. Further research will be devoted to pursuing the full integration of audio and visual information.

References

1. Rayleigh, L.: On our perception of sound direction. Phil. Mag. 13, 214-232 (1907)
2. Blauert, J.: Spatial Hearing - The Psychophysics of Human Sound Localization. MIT Press (1996)
3. Raspaud, M., Viste, H., Evangelista, G.: Binaural source localization by joint estimation of ILD and ITD. IEEE Trans. on Audio, Speech and Language Processing 18(1), 68-77 (2010)
4. Monaci, G., Jost, P., Vandergheynst, P., Mailhé, B., Lesage, S., Gribonval, R.: Learning multimodal dictionaries. IEEE Trans. on Image Processing 16(9), 2272-2283 (2007)
5. Zhang, C., Yin, P., Rui, Y., Cutler, R., Viola, P., Sun, X., Pinto, N., Zhang, Z.: Boosting-based multimodal speaker detection for distributed meeting videos.
IEEE Trans. on Multimedia 10(8), 1541-1552 (2008)
6. Schmalenstroeer, J., Haeb-Umbach, R.: Online diarization of streaming audio-visual data for smart environments. IEEE Journ. of Selected Topics in Signal Processing 4(5), 845-856 (2010)
7. Naqvi, S.M., Wang, W., Khan, M.S., Barnard, M., Chambers, J.A.: Multimodal (audio-visual) source separation exploiting multi-speaker tracking, robust beamforming and time-frequency masking. IET Signal Processing 6(5), 466-477 (2012)
8. Minotto, V.P., Jung, C.R., Lee, B.: Simultaneous-speaker voice activity detection and localization using mid-fusion of SVM and HMMs. IEEE Trans. on Multimedia 16(4), 1032-1044 (2014)
9. Wang, D., Brown, G.J.: Computational Auditory Scene Analysis - Principles, Algorithms, and Applications. IEEE Press, Wiley Interscience (2006)
10. Algazi, V.R., Duda, R.O., Thompson, D.M., Avendano, C.: The CIPIC HRTF database. In: 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (2001)
11. Kuttruff, H.: Room Acoustics, 4th edn. Taylor & Francis (2000)
12. Stéphenne, A., Champagne, B.: A new cepstral prefiltering technique for estimating time delay under reverberant conditions. Signal Processing 59(3), 253-266 (1997)
13. Parisi, R., Gazzetta, R., Di Claudio, E.: Prefiltering approaches for time delay estimation in reverberant environments. In: Proceedings of ICASSP, vol. 3, pp. III-2997-III-3000 (2002)
14. Zannini, C.M., Parisi, R., Uncini, A.: Binaural sound source localization in the presence of reverberation. In: Proc. of the 17th International Conference on Digital Signal Processing (July 2011)
15. Parisi, R., Camoes, F., Scarpiniti, M., Uncini, A.: Cepstrum prefiltering for binaural source localization in reverberant environments. IEEE Signal Processing Letters 19(2), 99-102 (2012)
16. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. of Computer Vision 57(2), 137-154 (2004)
17.
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119-139 (1997)

A Feasibility Study of Using the NeuCube Spiking Neural Network Architecture for Modelling Alzheimer's Disease EEG Data

Elisa Capecci (1), Francesco Carlo Morabito (2), Maurizio Campolo (2), Nadia Mammone (2), Domenico Labate (2), and Nikola Kasabov (1)
(1) Auckland University of Technology - Knowledge Engineering and Discovery Research Institute, Auckland, New Zealand, {ecapecci,nkasabov}@aut.ac.nz
(2) DICEAM - Mediterranea University of Reggio Calabria, Italy, {morabito,campolo,nadia.mammone,domenico.labate}@unirc.it

Abstract. The paper presents a feasibility analysis of a novel Spiking Neural Network (SNN) architecture called NeuCube [10] for classification and analysis of functional changes in the brain activity of Electroencephalography (EEG) data collected from two groups: control and Alzheimer's Disease (AD). Excellent classification results of 100% test accuracy have been achieved and these have also been compared with traditional machine learning techniques. The outputs confirmed that the NeuCube is better suited to model, classify, interpret and understand EEG data and the brain processes involved. Future applications of a NeuCube model are discussed, including its use as an indicator of the early onset of Mild Cognitive Impairment (MCI) to study degeneration of the pathology toward AD.

Keywords: Spiking Neural Networks, NeuCube, EEG data classification, Alzheimer's Disease.

1 Introduction and Problem Specification

During the past few decades, researchers from all over the world have been concentrating their efforts towards understanding the human brain. As a consequence of the efforts made, relevant progress has been achieved and a huge amount of brain data is becoming available.
Neuroinformatics researchers have been playing a pivotal role in the advancement of these studies, especially with the use of machine learning techniques. Some of the major contributions are the improvements in the understanding of the available Spatio-Temporal Brain Data (STBD) and the development of predictive systems. These are of high importance for society, as the increase in human lifespan has been followed by a dramatic rise in the appearance of neurological disorders such as AD [18].

[In: S. Bassis et al. (eds.), Recent Advances of Neural Networks Models and Applications, Smart Innovation, Systems and Technologies 37, (c) Springer International Publishing Switzerland 2015. DOI: 10.1007/978-3-319-18164-6_16]

We have used spatio-temporal EEG as a type of brain data to study this pathology and its degeneration, as it is one of the most commonly collected data for studying neural processes and it has long been used to analyse and stage the decline from MCI to AD (e.g. [11,14,15]). Moreover, it is an affordable technique, easy to manage, and it is not considered aggressive for the subjects being studied [19].

In this paper, we analyse and classify the available spatio-temporal information (described in Section 2) by use of the brain-inspired SNN model called NeuCube [10]. In Section 3, we introduce the NeuCube model and the experimental design of the study. Section 4 presents the classification results, which are then compared with traditional approaches. In particular, in Section 4.2, new knowledge is also extracted from the data through visualization and analysis of the SNN cube (SNNc) after training. Finally, conclusions and future directions based on the proposed methodology are presented in Section 5.

2 Data Collection and Description

The EEG data has been collected and made available by the IRCCS Centro Neurolesi of Messina, Italy.
For this preliminary analysis, we decided to use just the data recorded from one healthy subject and one subject diagnosed as having AD. The control was a male subject of 58 years of age and the AD patient was a female subject of 80 years of age. They were both selected at random. Each recording session was carried out using 19 electrodes: Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, O2, with the G2 electrode used as reference. Electrodes were placed according to the sites defined by the standard 10-20 international system. Data was recorded for 65 seconds at 256 Hz, resulting in 16640 data points collected per session. A brain-computer interface device was used to collect the EEG data, which was recorded under resting conditions. During the experiment, the subjects were sitting with their eyes closed and always under vigilant control.

The data was band-pass filtered between 0.5 and 32 Hz, which thus includes the relevant bands for AD diagnosis. No further pre-processing of the data was applied, as the NeuCube model is able to accommodate raw data directly; however, screening and selection of the signals that were visually artefact-free was performed prior to data analysis to avoid misleading results. The remaining EEG signal was then concatenated, so as to limit the side effects of the inevitable information loss implied by excluding some components.

For this preliminary study, the EEG data was resized into 3-second epochs. Thus, for each of the two classes we had 21 samples of 768 data points recorded for each of the 19 EEG channels. In total, we used 42 samples to run the NeuCube experiments.

3 The NeuCube Spiking Neural Network Architecture

This paper evaluates the ability of the NeuCube SNN framework [10] (Fig. 1) to classify and analyse the functional brain activity produced by the EEG data recorded from a subject affected by AD and a healthy control.
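The epoching described in Section 2 (65-second recordings at 256 Hz, cut into 3-second segments of 768 points each) can be sketched as follows. This is a hypothetical reimplementation for illustration, not the authors' code; the function name and the choice of non-overlapping windows are our assumptions.

```python
FS = 256        # sampling rate in Hz, as in the recordings described above
EPOCH_SEC = 3   # epoch length used in the study

def epoch(channel, fs=FS, seconds=EPOCH_SEC):
    """Split one EEG channel into non-overlapping fixed-length epochs,
    discarding the incomplete tail."""
    size = fs * seconds
    return [channel[i:i + size] for i in range(0, len(channel) - size + 1, size)]

# One 65-second channel of dummy data: 16640 samples, as reported above.
channel = [0.0] * (65 * FS)
epochs = epoch(channel)
```

A 65-second session yields 21 full 768-point epochs (the last 2 seconds do not fill an epoch and are dropped), matching the 21 samples per class reported above.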
This methodology allows for the creation of different models for STBD based on the following information processing principles, as listed in [10]:

- The model has a spatial structure that maps approximately the spatially located areas of the brain where the STBD is collected.
- The same information paradigm - spiking information processing - that ultimately generates STBD at a low level of brain information processing is used in the model to represent and process this STBD.
- Brain-like learning rules are used in the model to learn the STBD, mapped into designated spatial areas of the model.
- The model evolves in terms of new STBD patterns being learnt, recognised and added incrementally, which is also a principle of brain cognitive development.
- The model always retains a spatio-temporal memory that can be mined and interpreted for a better understanding of the cognitive processes.
- A visualization of the model evolution during learning can be used as bio-feedback.

Such models can be used to learn and reveal complex spatio-temporal patterns "hidden" in the STBD, which would not be possible to achieve using other information processing methods. As a result, a significantly improved understanding of the complex brain processes that generate the data can be gained, along with improved classification and/or prediction accuracy.

Fig. 1. The NeuCube architecture with its three main modules: input data encoding module; a 3D SNN cube module; an output classification module. Also, an optional Gene Regulatory Network (GRN) module can be incorporated if gene information is available. The spiking neurons can be simple leaky integrate-and-fire models or probabilistic models (shown in the lower left section).

3.1 Experimental Design and Implementation

The NeuCube-based model used for this study was implemented with a software simulator written in MATLAB [27]. This particular NeuCube consists of three modules:

1. An input information encoding module.
2. The NeuCube 3D SNNc module.
3. An output module for data classification and knowledge extraction.

The process scheme in Fig. 2 summarises the experimental design applied to the study.

I. The raw time series data, obtained from the EEG device, is directly fed into the model as an ordered sequence of real-valued data vectors. One of the great advantages of the NeuCube framework is that in many cases there is no need for pre-processing (such as normalization of the data, scaling, smoothing, etc.).
II. Each real-valued input stream of data is transformed into a spike train using the Address Event Representation (AER) method [2]. AER is more convenient when using continuous input data, such as EEG STBD, as this algorithm identifies just the differences in consecutive values.
III. The spike sequences are then presented to the SNNc, which was implemented using Leaky Integrate and Fire (LIF) neurons [13], as this mimics the information processing of the human brain and is less computationally expensive [20,5]. The SNNc can also evolve according to the number of input variables (i.e. the EEG channels) and the data available. Given the size of the data set used for this study, we generated a 3D cube of 13x15x11 spiking neurons. 1471 of these spiking neurons were mapped according to a brain atlas, the Talairach Atlas [12,24]. Each of these neurons represented the centre coordinates of a one cubic centimetre area from the 3D Talairach Atlas, including the 19 EEG channels, which also identified the input neurons of the network.
IV. The SNNc is then trained on the input spike trains via an unsupervised learning method, using the Spike Timing Dependent Plasticity (STDP) [23] learning rule. Unsupervised learning is performed to modify the initially set connection weights. The SNNc will learn to activate the same groups of spiking neurons when similar input stimuli are presented [6].
This makes the NeuCube architecture useful for learning consecutive spatio-temporal patterns and therefore representing a more biologically plausible associative type of memory [10].
V. The output classifier is then trained via a supervised method. The same STBD used for the unsupervised training is now propagated again through the trained SNNc, and output neurons are generated (evolved) and trained to classify the spatio-temporal spiking pattern of the SNNc into pre-defined classes (or output spike sequences). Different SNN methods can be used to learn and classify spiking patterns from the SNNc. For this experimental study, the Dynamic Evolving SNN (deSNN) algorithm [9] was used. This classification method combines the rank-order learning rule [26] with STDP [23] temporal learning, so that each output neuron learns a spatio-temporal pattern using only one pass of data propagation.
VI. The classification results are evaluated using repeated random sub-sampling validation or Leave-One-Out Cross-Validation (LOOCV), respectively.

Fig. 2. Process scheme of the NeuCube framework with its three principal modules: the input module, where input data are transformed into trains of spikes that are then presented to the main module, the SNNc; the NeuCube module, where the time and space characteristics of the STBD are captured and learned in order to extract new knowledge through the SNNc visualization; and the output module for data classification and understanding. The scheme also indicates the eight processes (I-VIII) involved in the NeuCube experiment.

VII.
In order to achieve a desirable classification accuracy, the numerous parameters of the NeuCube need to be optimized. Therefore, steps (III) to (VI) are repeated with changing parameter values. That can be done using a grid search method, a genetic algorithm or the Quantum-Inspired Evolutionary Algorithm [17]. In this study, we have used a grid search, as we will explain in the next section.
VIII. The trained SNNc is visualized and its connectivity and spiking activity are analysed for a better understanding of the data and the brain processes that generate it. In fact, it can be observed that new connections are formed between the neurons, and this can be further interpreted in the context of different neural activity. Therefore, this represents another key advantage that NeuCube offers: the possibility of knowledge extraction.

4 Results and Discussion

The NeuCube framework has been used before, and promising results on the analysis of cognitive mental activity [8] and on the classification of complex muscular movements for neuro-rehabilitation [25] have been reported. In this paper, we evaluated the feasibility of a NeuCube-based model to correctly classify data with a known pattern and to extract knowledge from the spatio-temporal EEG signals of a subject affected by AD versus a healthy control. Our aim is to develop an analysis and prediction tool to be used by clinicians for identifying the appearance of MCI and predicting the onset of AD.

To achieve satisfying classification results, the numerous parameters of the NeuCube need to be accurately selected. Based on previous studies that we have conducted (e.g. [8]), we have identified some critical variables requiring careful optimization, and we have fixed the remaining ones to default values. Taking into account that every parameter tuned also involves a considerable amount of processing time, we need to select the proper number of variables to be optimised.
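The threshold-based AER encoding of step II, which emits a spike only when consecutive samples differ by more than a threshold, can be sketched as below. This is a minimal bipolar variant written for illustration; the exact encoding used inside NeuCube may differ.

```python
def aer_encode(signal, threshold):
    """Bipolar threshold encoding of a signal's gradient: a positive (+1) or
    negative (-1) spike is emitted whenever the difference between consecutive
    samples exceeds the threshold in magnitude; otherwise no spike (0)."""
    spikes = []
    for prev, cur in zip(signal, signal[1:]):
        diff = cur - prev
        if diff > threshold:
            spikes.append(1)
        elif diff < -threshold:
            spikes.append(-1)
        else:
            spikes.append(0)
    return spikes

# Toy input: only changes larger than the threshold generate spikes.
x = [0.0, 0.2, 0.3, 0.1, 0.1, 0.6]
```

Raising the threshold lowers the spike rate of the generated trains, which is why this single parameter is a natural target for optimization.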
The AER threshold was chosen for this study, as it is applied to the gradient of the entire signal over time, and therefore the rate of the generated spike trains depends on this threshold. Moreover, since the NeuCube is a stochastic model, altering this value also alters the initial model configuration each time. Thus, using a grid search, we evaluated the classification accuracy of 10 model configurations, adjusting the AER threshold at every new configuration. For that, we used 50% of the entire time series for training and the other 50% for testing. The parameter settings obtained after optimization are summarised in Tab. 1.

Table 1. NeuCube parameters

  AER threshold     0.94    Conn. distance     0.15
  STDP rate         0.01    Firing threshold   0.5
  Refractory time   6       Training time      1
  deSNN mod         0.4     deSNN drift        0.25

Table 2. NeuCube's classification results expressed as accuracy percent. Results are obtained using both 50/50% training/testing and LOOCV.

  CLASS              (50/50% Tra./Test.)   (LOOCV)
  Control            100%                  100%
  AD                 100%                  100%
  Average accuracy   100%                  100%

As is common practice in machine learning, classification accuracy was calculated by statistical processing of the information obtained from the confusion table. Classifier outputs were evaluated using both random sub-sampling validation and LOOCV, as reported in Tab. 2.

4.1 Comparative Analysis

The NeuCube results have also been compared with other approaches, such as the Multi Layer Perceptron (MLP), the Support Vector Machine (SVM), the Inductive Evolving Classification Function (IECF) [7] and the Evolving Clustering Method for Classification (ECMC) [22]. To process these experiments, the NeuCom platform was used [7], which is a self-programmable, learning and reasoning computer environment freely available on-line (www.theneucom.com).
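Leave-one-out evaluation of this kind can be illustrated with a stand-in nearest-class-mean classifier. The classifier, the function names and the two-dimensional toy samples below are our own illustrative substitutes for deSNN and the real EEG data.

```python
def nearest_mean_classify(train, labels, sample):
    """Toy stand-in classifier: assign the label whose class mean is closest
    (squared Euclidean distance)."""
    groups = {}
    for x, y in zip(train, labels):
        groups.setdefault(y, []).append(x)
    best, best_d = None, float("inf")
    for y, xs in groups.items():
        mean = [sum(col) / len(xs) for col in zip(*xs)]
        d = sum((a - b) ** 2 for a, b in zip(sample, mean))
        if d < best_d:
            best_d, best = d, y
    return best

def loocv_accuracy(samples, labels, classify):
    """Leave-One-Out Cross-Validation: each sample is held out once and
    classified by a model built on all the others."""
    hits = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train, train_y, samples[i]) == labels[i]:
            hits += 1
    return hits / len(samples)

# Six well-separated toy samples standing in for the 42 EEG samples.
samples = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [0.9, 1.0], [1.0, 0.8], [0.8, 1.1]]
labels = ["control"] * 3 + ["AD"] * 3
```

On these separable toy data the LOOCV accuracy is 1.0; with 42 samples, LOOCV trains 42 models, each on 41 samples.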
The LOOCV method was used to evaluate the outputs, and the datasets were normalised prior to the experiments to ensure the highest classification accuracy (the normalisation protocol applied to each method consisted in a linear standardization of the data vectors to values between 0 and 1). Classification accuracy was analysed via a supervised learning method, which is based on the classification of data with a known pattern. The results obtained are expressed in the confusion table as the number of True Positives (TP) and True Negatives (TN) against False Positives (FP) and False Negatives (FN). An analysis of the classification outputs obtained by all the different methods was performed based on this information, which was further processed to calculate the following metrics:

- Accuracy percent (A%):

    A% = (TP + TN)/(TP + FN + FP + TN) x 100    (1)

- Sensitivity (S):

    S = TP/(TP + FN)    (2)

- Specificity (SP):

    SP = TN/(FP + TN)    (3)

The results obtained are summarised in Tab. 3 and were used to plot a Receiver Operating Characteristics (ROC) graph (Fig. 3) [3]. The parameter settings used for each method are reported in Tab. 4.

Table 3. Comparison of the results obtained via NeuCube versus traditional machine learning methods (MLP, SVM, IECF and ECMC): confusion tables and resulting metrics.

  Confusion table   TP   FN   FP   TN
  MLP               11   10   14    7
  SVM               11   10   14    7
  IECF              21    0   13    8
  ECMC              21    0    1   20
  NeuCube           21    0    0   21

  METRICS   NeuCube   MLP    SVM    IECF   ECMC
  A% (1)    100       43     43     69     98
  S (2)     1         0.52   0.52   1      1
  SP (3)    1         0.33   0.33   0.38   0.95
  1-SP      0         0.67   0.67   0.62   0.05

Table 4. Parameter settings of the traditional machine learning methods (MLP, SVM, IECF and ECMC)

  MLP parameters                           SVM parameters
  Normalization                yes         Normalization   yes
  Number of hidden units       3           Kernel          Polynomial
  Number of training cycles    300         Degree, gamma   1
  Output value precision       0.0001      Optimization    scg
  Output function precision    0.0001
  Output activation function   linear

  IECF parameters                          ECMC parameters
  Normalization          yes               Normalization          yes
  Max. influence field   1                 Max. influence field   1
  Min. influence field   0.01              Min. influence field   0.01
  MofN                   3                 MofN                   3
  Membership function    2                 Epochs                 4

Fig. 3. ROC graph. 1-Specificity is plotted on the X axis and Sensitivity is plotted on the Y axis.

As far as the ROC graph is concerned, each classifier produces a sensitivity and a specificity value, which results in a single point in the graph's space. Classifiers falling in the top-left area of the graph are considered to achieve desirable results [3]. The NeuCube appears in the top-left corner of the graph (point (0, 1)), performing as a perfect classifier. An interesting performance is also reported for the ECMC method, while the IECF method, even though it classifies nearly all positives correctly, reports a high false positive rate, which brings it too far to the right-hand side of the graph.

Optimization of the results obtained via these techniques is not a trivial process and requires more sophisticated optimization methods. In fact, the few parameters that influence these methods cannot be tuned independently; one of the reasons is that some of them are discrete values and others are continuous. Thus, the work involved in improving the output results is not viable.

We can conclude that, in terms of the comparison with the other classification methods, NeuCube performed significantly better, with the highest accuracy, sensitivity and specificity over all. By these metrics, the closest to NeuCube's results was ECMC, whilst the poorest performing were MLP and SVM.

In addition to the above, the NeuCube-based model has other important benefits, such as:

- It requires only one pass of data propagation for learning, while traditional methods such as SVM require numerous iterations.
– The NeuCube-based model is adaptable to new data and new classes, while the other models are fixed and difficult to adapt to new data.
– There is no need to pre-process the data (normalization, scaling, smoothing, etc.) with the NeuCube model: the raw data can be fed directly into the model as time series transformed into spike trains.
– The NeuCube model demonstrated the ability to achieve better classification accuracy per class than the other methods.
– The NeuCube model also offers a better understanding of the data, and therefore of the brain processes that generate it, through visualization and analysis of the output SNNc state, as discussed in the following section.

4.2 Model Interpretation and Data Understanding

The NeuCube model constitutes an SNN environment based on some of the most important principles governing neural activity in the human brain. Thus, it constitutes a valuable model for on-line learning and recognition of STBD. It also takes into account data features, offering a better understanding of the information and of the phenomena under study. In fact, one of the main advantages of the NeuCube model is that, after training, the SNNc can be visualized and its connectivity and spiking activity observed. This ability of the NeuCube model allows us to trace the development/decline of neurological processes over time and to extract new information and knowledge about them.

Illustrated in Fig. 4 is the SNNc state obtained after it was trained with data from a control subject (top picture) and after it was trained with data from the subject affected by AD (bottom picture). We can observe that new connections are formed between the neurons of the network, especially around the input neurons, which were mapped according to the Talairach coordinates of the 19 EEG electrodes. We can see from Fig. 4 that the neural activity of the healthy subject and that of the subject suffering from AD are quite different.
In fact, in the case of the healthy control, the connections evolved are equally distributed across every brain region. On the other hand, in the case of the patient affected by AD, we can observe that this activity decreased in the left hemisphere, with higher activity evolving in the right hemisphere, perhaps to compensate for the lack of its counterpart and therefore as a consequence of the degeneration caused by the pathology.

Fig. 4. The SNNc connectivity after training (top: control, bottom: AD). The figure shows both the 3D cube and the (x, y) plane of the SNNc. The SNNc can be analysed and interpreted for a better understanding of the EEG data, to identify differences between brain states. Blue lines are positive connections, while red lines are negative connections. The brighter the color of a neuron, the stronger its activity with a neighbouring neuron. The thickness of the lines also identifies enhanced neuron connectivity. The input neurons, with labels corresponding to the 19 EEG channels, are shown in yellow.