Database processing vs Data mining processing

Dr.JakeFinlay,Germany,Teacher
Published Date:22-07-2017
DATA MINING: Introductory and Advanced Topics, Part I
Source: Margaret H. Dunham, Department of Computer Science and Engineering, Southern Methodist University
Companion slides for the text by Dr. M. H. Dunham, Data Mining, Introductory and Advanced Topics, Prentice Hall, 2002.

Data Mining Outline
■ PART I
◆ Introduction
◆ Related Concepts
◆ Data Mining Techniques
■ PART II
◆ Classification
◆ Clustering
◆ Association Rules
■ PART III
◆ Web Mining
◆ Spatial Mining
◆ Temporal Mining

Introduction Outline
Goal: Provide an overview of data mining.
■ Define data mining
■ Data mining vs. databases
■ Basic data mining tasks
■ Data mining development
■ Data mining issues

Introduction
■ Data is growing at a phenomenal rate
■ Users expect more sophisticated information
■ How?
UNCOVER HIDDEN INFORMATION
DATA MINING

Data Mining Definition
■ Finding hidden information in a database
■ Fit data to a model
■ Similar terms
◆ Exploratory data analysis
◆ Data driven discovery
◆ Deductive learning

Database Processing vs. Data Mining Processing
Database processing
• Query
– Well defined
– SQL
• Data
– Operational data
• Output
– Precise
– Subset of database
Data mining processing
• Query
– Poorly defined
– No precise query language
• Data
– Not operational data
• Output
– Fuzzy
– Not a subset of database

Query Examples
■ Database
– Find all credit applicants with last name of Smith.
– Identify customers who have purchased more than 10,000 in the last month.
– Find all customers who have purchased milk.
■ Data Mining
– Find all credit applicants who are poor credit risks. (classification)
– Identify customers with similar buying habits. (clustering)
– Find all items which are frequently purchased with milk. (association rules)

Data Mining Models and Tasks

Basic Data Mining Tasks
■ Classification maps data into predefined groups or classes
◆ Supervised learning
◆ Pattern recognition
◆ Prediction
■ Regression is used to map a data item to a real valued prediction variable.
■ Clustering groups similar data together into clusters.
◆ Unsupervised learning
◆ Segmentation
◆ Partitioning

Basic Data Mining Tasks (cont’d)
■ Summarization maps data into subsets with associated simple descriptions.
◆ Characterization
◆ Generalization
■ Link Analysis uncovers relationships among data.
◆ Affinity Analysis
◆ Association Rules
◆ Sequential Analysis determines sequential patterns.

Ex: Time Series Analysis
• Example: Stock Market
• Predict future values
• Determine similar patterns over time
• Classify behavior

Data Mining vs. KDD
■ Knowledge Discovery in Databases (KDD): process of finding useful information and patterns in data.
■ Data Mining: Use of algorithms to extract the information and patterns derived by the KDD process.

KDD Process
Modified from FPSS96C
■ Selection: Obtain data from various sources.
■ Preprocessing: Cleanse data.
■ Transformation: Convert to common format. Transform to new format.
■ Data Mining: Obtain desired results.
■ Interpretation/Evaluation: Present results to user in meaningful manner.

KDD Process Ex: Web Log
■ Selection:
◆ Select log data (dates and locations) to use
■ Preprocessing:
◆ Remove identifying URLs
◆ Remove error logs
■ Transformation:
◆ Sessionize (sort and group)
■ Data Mining:
◆ Identify and count patterns
◆ Construct data structure
■ Interpretation/Evaluation:
◆ Identify and display frequently accessed sequences.
■ Potential User Applications:
◆ Cache prediction
◆ Personalization

Data Mining Development
• Similarity Measures
• Hierarchical Clustering
• Relational Data Model
• IR Systems
• SQL
• Imprecise Queries
• Association Rule Algorithms
• Textual Data
• Data Warehousing
• Web Search Engines
• Scalability Techniques
• Bayes Theorem
• Regression Analysis
• EM Algorithm
• K-Means Clustering
• Time Series Analysis
• Algorithm Design Techniques
• Algorithm Analysis
• Neural Networks
• Data Structures
• Decision Tree Algorithms

KDD Issues
■ Human Interaction
■ Overfitting
■ Outliers
■ Interpretation
■ Visualization
■ Large Datasets
■ High Dimensionality

KDD Issues (cont’d)
■ Multimedia Data
■ Missing Data
■ Irrelevant Data
■ Noisy Data
■ Changing Data
■ Integration
■ Application

Social Implications of DM
■ Privacy
■ Profiling
■ Unauthorized use

Data Mining Metrics
■ Usefulness
■ Return on Investment (ROI)
■ Accuracy
■ Space/Time

Database Perspective on Data Mining
■ Scalability
■ Real World Data
■ Updates
■ Ease of Use
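
The contrast drawn in the Query Examples slide (a database query returns a precise subset of stored rows, while a data mining query returns a derived, threshold-dependent pattern) can be illustrated with the milk example. This is only a minimal sketch: the transaction data, the function names, and the 0.6 confidence threshold are all assumptions made for illustration, and the co-occurrence count below is a simplified stand-in for a real association rule algorithm.

```python
from collections import Counter

# Hypothetical toy data: (customer, set of items bought in one transaction).
TRANSACTIONS = [
    ("ann", {"milk", "bread", "eggs"}),
    ("bob", {"milk", "bread"}),
    ("cat", {"beer", "chips"}),
    ("dan", {"milk", "bread", "butter"}),
    ("eve", {"bread", "jam"}),
]


def customers_who_bought(item, transactions):
    """Database-style query: an exact predicate whose result is a subset of the stored rows."""
    return sorted({customer for customer, items in transactions if item in items})


def frequently_bought_with(item, transactions, min_confidence=0.6):
    """Data-mining-style query: derive a pattern that is not stored anywhere.

    Returns {other_item: confidence}, where confidence is the fraction of
    transactions containing `item` that also contain `other_item`.
    The threshold is an assumption; changing it changes the answer.
    """
    baskets_with_item = [items for _, items in transactions if item in items]
    if not baskets_with_item:
        return {}
    co_counts = Counter(
        other for items in baskets_with_item for other in items if other != item
    )
    total = len(baskets_with_item)
    return {
        other: count / total
        for other, count in co_counts.items()
        if count / total >= min_confidence
    }


if __name__ == "__main__":
    print(customers_who_bought("milk", TRANSACTIONS))
    print(frequently_bought_with("milk", TRANSACTIONS))
```

The first function could be written as a single SQL `SELECT` with a `WHERE` clause. The second has no such fixed form: its output depends on a tunable threshold and consists of computed statistics rather than stored records.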
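
The steps of the KDD web log example (selection, preprocessing, transformation, data mining, interpretation) might be sketched as a small pipeline. Everything concrete here is an assumption for illustration: the record layout, the 30-minute session gap, the use of consecutive page pairs as the "pattern", and all function names. The slide's "remove identifying URLs" step is omitted because the slides do not say how such URLs are recognized.

```python
from collections import Counter

# Hypothetical log records: (visitor, timestamp in seconds, url, HTTP status).
RAW_LOG = [
    ("v1", 0, "/home", 200),
    ("v1", 30, "/products", 200),
    ("v1", 60, "/cart", 200),
    ("v2", 10, "/home", 200),
    ("v2", 40, "/products", 200),
    ("v2", 50, "/missing", 404),
    ("v2", 5000, "/home", 200),
    ("v2", 5020, "/products", 200),
]
SESSION_GAP = 1800  # assumed: 30 minutes of inactivity starts a new session


def select(log, start, end):
    """Selection: keep only records inside the chosen time window."""
    return [r for r in log if start <= r[1] <= end]


def preprocess(records):
    """Preprocessing: remove error logs (here, HTTP status 400 and above)."""
    return [r for r in records if r[3] < 400]


def sessionize(records, gap=SESSION_GAP):
    """Transformation: sort by visitor and time, then group into sessions."""
    sessions = []
    last_seen = {}  # visitor -> (last timestamp, that visitor's current session)
    for visitor, ts, url, _ in sorted(records, key=lambda r: (r[0], r[1])):
        previous = last_seen.get(visitor)
        if previous is None or ts - previous[0] > gap:
            session = []
            sessions.append(session)
        else:
            session = previous[1]
        session.append(url)
        last_seen[visitor] = (ts, session)
    return sessions


def count_sequences(sessions, length=2):
    """Data mining: count every run of `length` consecutive pages."""
    counts = Counter()
    for pages in sessions:
        for i in range(len(pages) - length + 1):
            counts[tuple(pages[i:i + length])] += 1
    return counts


def frequent_sequences(counts, min_count=2):
    """Interpretation/evaluation: keep the frequently accessed sequences."""
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    sessions = sessionize(preprocess(select(RAW_LOG, 0, 10_000)))
    for seq, n in frequent_sequences(count_sequences(sessions)):
        print(" -> ".join(seq), n)
```

A cache prediction application could, for instance, prefetch the second page of a frequent pair whenever the first is requested.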
