Data analysis Machine learning and knowledge discovery

data knowledge and discovery machine learning meets natural science and machine learning and knowledge discovery in databases
ZoeTabbot Profile Pic
ZoeTabbot,Germany,Professional
Published Date:13-07-2017
Your Website URL(Optional)
Comment
Machine Learning (ML) and Knowledge Knowledge Discovery Discovery in in Databases Databases (KDD) (KDD) Instructor: Rich MaclinCourse Information • Class web page: http://www.d.umn.edu/rmaclin/cs8751/ – Syllabus – Lecture notes – Useful links – Programming assignments • Methods for contact: – Email: rmaclind.umn.edu (best option) – Offi Office: 315 315 HH HH – Phone: 726-8256 • Textbooks: – Machine Learning, Mitchell • Notes based on Mitchell’s Lecture Notes CS 8751 ML & KDD Chapter 1 Introduction 2Course Objectives • Specific knowledge of the fields of Machine Learningg g and Knowledge Discovery y in Databases (Data Mining) – Experience with a variety of algorithms – Experience with experimental methodology • In-depth knowledge of several research papers • Programming/implementation practice • Presentation practice CS 8751 ML & KDD Chapter 1 Introduction 3Course Components • 2 Midterms, 1 Final – Midterm 1 (150), February 18 – Midterm 2 (150), April 1 – Final (300), Thursday, May 14, 14:00-15:55 (comprehensive) • Programming Programming assignments assignments (100), (100), 3 3 (C (C++ or or Java, Java, maybe maybe in in Weka?) • Homework assignments (100), 5 • Research Paper Implementation (100) • Research Paper Writeup – Web Page (50) • R Research h P Paper O Oral l P Present tat ti ion (50) (50) • Grading based on percentage (90% gets an A-, 80% B-) – Minimum Minimum E Effort ffort Requirement Requirement CS 8751 ML & KDD Chapter 1 Introduction 4Course Outline • Introduction Mitchell Chapter 1 – Basics/Version Spp aces M2 – ML Terminology and Statistics M5 • Concept/Classification Learning – Decision Trees M3 – Neural Networks M4 – Instance Based Learning M8 – Genetic Algorithms M9 – R Rul le L Learni ing M10 M10 CS 8751 ML & KDD Chapter 1 Introduction 5Course Outline (cont) • Unsupervised Learning – Clustering g Jain et al. review p pap per • Reinforcement Learning M13 • Learningy g Theory – Bayesian Methods M6, Russell & Norvig Chapter – PAC Learning M7 • Support Vector Methods Burges tutorial •Hy ybrid Methods M12 • Ensembles Opitz & Maclin paper, WF7.4 • Mining g Association Rules Appriori ppapper CS 8751 ML & KDD Chapter 1 Introduction 6What is Learning? Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task or taskkd s drawn ffrom thhe same popullatiion more effffectiivelly thhe next time. Simon, 1983 Learning is making useful changes in our minds. Minsky, 1985 Learning is constructing or modifying representations of what is being experienced. McCarthy, 1968 Learning is improving automatically with experience. Mitchell Mitchell, 1997 1997 CS 8751 ML & KDD Chapter 1 Introduction 7Why Machine Learning? • Data, Data, DATA – Examples • World wide web • Human genome project • Business data (WalMart sales “baskets”) – Idea: sift heap of data for nuggets of knowledge • Some tasks beyond programming –El Example: ddriiviing – Idea: learn by doing/watching/practicing (like humans) • Customizing Customizing software software – Example: web browsing for news information – Idea: observe user tendencies and incorporate CS 8751 ML & KDD Chapter 1 Introduction 8Typical Data Analysis Task Data: PatientId=103: EmergencyC-Section=yes Age=23, Time=53, FirstPreg?=no, Anemia=no, Diabetes=no, PrevPremBirth=no, Ult UltraS Sound d=? ?, El Electi tiveC C-St Sectiion=?? Age=23, Time=105, FirstPreg?=no, Anemia=no, Diabetes=yes, PrevPremBirth=no, UltraSound=abnormal, ElectiveC-Section=no Age=23, Time=125, FirstPreg?=no, Anemia=no, Diabetes=yes, PrevPremBirth=no, Ult UltraS Sound d=? ?, El Electi tiveC C-SSt ectiion=no PatientId=231: EmergencyC-Section=no Age=31, Time=30, FirstPreg?=yes, Anemia=no, Diabetes=no, PrevPremBirth=no, UltraSound=?, ElectiveC-Section=? Age=31, Time=91, FirstPreg?=yes, Anemia=no, Diabetes=no, PrevPremBirth=no, UltraSound=normal, ElectiveC-Section=no … Given – 9714 patient records, each describing a pregnancy and a birth – Each patient record contains 215 features (some are unknown) Learn to predict: – Characteristics Characteristics of of patients patients at at high high risk risk for for Emergency Emergency C C-Section Section CS 8751 ML & KDD Chapter 1 Introduction 9Credit Risk Analysis Data: ProfitableCustomer=No, CustId=103, YearsCredit=9, LoanBalance=2400, Income=52,000, OwnHouse=Yes, OtherDelinqAccts=2, MaxBillingCyclesLate=3 ProfitableCustomer=Yes, CustId=231, YearsCredit=3, LoanBalance=500, Income=36,000, OwnHouse=No, OtherDelinqAccts=0, MaxBillingy gCyclesLate=1 ProfitableCustomer=Yes, CustId=42, YearsCredit=15, LoanBalance=0, Income=90,000, OwnHouse=Yes, OtherDelinqAccts=0, MaxBillingCyclesLate=0 … Rules that might be learned from data: IF Other-Delinquent-Accounts 2, AND Number-Delinqquent-Billingg-Cyycles 1 THEN Profitable-Customer? = No Deny Credit Application IF Other-Delinquent-Accounts == 0, AND ((Income 30K) OR (Years-of-Credit 3)) THEN THEN Profitable Profitable-CCusstomer? tomer? = Yes Yes Accept Accept Application Application CS 8751 ML & KDD Chapter 1 Introduction 10Analysis/Prediction Problems • What kind of direct mail customers buy? • What What products products will/won will/won’t t customers customers buy? buy? • What changes will cause a customer to leave a bank? bank? • What are the characteristics of a gene? •Does a piii cture contain an objbject (d(does a piicture off space contain a metereorite especially one hd headiing ttowardds us)?)? •… Lots more CS 8751 ML & KDD Chapter 1 Introduction 11Tasks too Hard to Program ALVINN Pomerleau drives 70 70 MPH MPH on h hi igh hways CS 8751 ML & KDD Chapter 1 Introduction 12Software that Customizes to User CS 8751 ML & KDD Chapter 1 Introduction 13Defining a Learning Problem Learning = improving with experience at some task – improve over task T – with respect to performance measure P – based on experience E Ex Ex 1: 1: Learn Learn to to play play checkers checkers T: play checkers P: % of g games won E: opportunity to play self Ex 2: Sell more CDs T: sell CDs P: of CDs sold E E: : different different locations/prices locations/prices of of CD CD CS 8751 ML & KDD Chapter 1 Introduction 14Key Questions T: play checkers, sell CDs P: P: % % g games ames won, won, CDs CDs sold sold To generate machine learner need to know: – What experience? • Direct or indirect? • Learner controlled? • Is the experience representative? – What exactly should be learned? – How to represent the learning function? – What algorithm used to learn the learning function? CS 8751 ML & KDD Chapter 1 Introduction 15Types of Training Experience Direct or indirect? Di Direct t -obb bservablle, measurablble – sometimes difficult to obtain • Checkers - is a move the best move for a situation? – sometimes straightforward • Sell CDs - how many CDs sold on a day? (look at receipts) Indirect - must be inferred from what is measurable – Checkers - value moves based on outcome of game – Credit Credit assignment assignment p problem roblem CS 8751 ML & KDD Chapter 1 Introduction 16Types of Training Experience (cont) Who controls? – Learner - what is best move at each p point? (Exploitation/Exploration) – Teacher - is teacher’s move the best? (Do we want to j just st em emulate late the the teachers teachers mo moves??) es??) BIG Question: is experience representative of performance goal? – If Checkers learner only plays itself will it be able to pllh ay humans?? – What if results from CD seller influenced by factors not measured ( (holiday y shopp pping, g, weather,, etc.) )? CS 8751 ML & KDD Chapter 1 Introduction 17Choosing Target Function Checkers - what does learner do - make moves ChooseMove - select move based on board ChooseMove : Board→ Move V : Board→ℜ ChooseMove(b) ChooseMove(b): : from from b b pick pick move move with with highest highest value value But how do we define V(b) for boards b? Possible definition: V(b) = 100 if b is a final board state of a win V(b) = -100 if b is a final board state of a loss V(() b) = 0 if b is a final board state of a draw if b not final state, V(b) =V(b´) where b´ is best final board reached by starting at b and playing optimally from there Correct, but not operational CS 8751 ML & KDD Chapter 1 Introduction 18Representation of Target Function • Collection of rules? IF double jjp ump available THEN make double jump • • Neural Neural network? network? • Polynomial function of problem features? w+ w blackPieces(b)+ w redPieces(b)+ 0 1 2 w w blackKings blackKings( (b b) )++ w w redKings redKings( (b b) )++ 3 3 4 4 w redThreatened(b)+ w blackThreatened(b) 5 6 CS 8751 ML & KDD Chapter 1 Introduction 19Obtaining Training Examples V (b) : the true target function ˆ V (b) : the learned function V V ( (b b) ) : : the the training training value value train One rule for estimating training values: ˆ V V ( (b b) )←←V V ( (Successor Successor( (b b)) )) train CS 8751 ML & KDD Chapter 1 Introduction 20