Big data and NoSQL databases

Dr.GordenMorse,France,Professional
Published Date:22-07-2017
Seminar on big data management
Lecturer: Jiaheng Lu
Spring 2016
Faculty of Science / Big data management / www.helsinki.fi / 25.1.2016

Information on preparing Presentation and Report
Goals for presentation and report are different:
1. Presentation: Help the audience understand your topic.
2. Report: Show your own critical thinking and new ideas.

Contents of Presentation (Length: 35-40 minutes)
• 1. Introduction: please make a clear introduction
• 1.1 Why you are interested in this topic: what kind of problems do you hope to solve?
• 1.2 How has the problem been studied before?
• 1.3 What is the application of this problem for big data?
• 2. Related works:
• 2.1 Make sure you leave sufficient time to present all related prior work. Do not assume that the audience knows the prior work.
• 2.2 Present it on an intuitive level.

Contents of Presentation (Cont.)
• 3. Main algorithms and contributions
• 3.1 Show the main solutions of the paper(s).
• 3.2 Present them with examples. Examples are very important for understanding.
• 4. Your own comments and conclusion
• 4.1 Present your own comments about the paper(s).
• 4.2 It would be very good to identify the weak points of the paper(s) through your own critical thinking.

Contents of Report (6-8 pages, Single column)
• 1. What are the research problems?
• 2. What are the strengths of the paper(s)?
• 3. What are the main weaknesses of the paper(s)?
• 4. If you were to solve this problem, what would you do?
• 5. Why do you like/dislike the paper(s)?
• 6. Conclusion and summary of your report.
Opponent
• Listen carefully to the presentation
• Ask questions after the presentation
• Complete an opponent assessment form and submit it to the teacher after the presentation

Big data and NoSQL databases

Data storage and history
Before the 1950s:
• Data was stored as paper records
• A lot of time was wasted, e.g. when searching, which made this approach inefficient

Magnetic tapes and hard disks
• 1950s and early 1960s: Data processing using magnetic tapes for storage
• Late 1960s and 1970s: Hard disks allow direct access to data
• Data stored in files

Drawbacks of file systems
• Each program has its own data format
• Programs are written in different languages, and so cannot easily access each other's files
• Any new requirement needs a new program

Database Approach
• 1960s: Network databases
• 1970s: Relational databases
• 1990s: Object-oriented and object-relational
• 1995+: XML, Mobile, GeoDB, Embedded DB
• 2005+: NoSQL DB, NewSQL DB

History of databases: Turing awards
• 1973 Charles W. Bachman: Network databases
• 1981 Edgar F. Codd: Relational databases
• 1998 Jim Gray: Distributed databases and transactions
• 2014 Michael Stonebraker: Modern databases (object-relational model, column stores, …)

Network Model
• Physical file pointers are used to model the relations between files
• Most suitable for large databases with well-defined queries and well-defined applications

Relational model
• E. F. Codd introduced the relational model in 1970
• Supports relational algebra and operations
• Data and programs are separated
• Improved data sharing and better integration
• DB2, Oracle and SQL Server are the most prominent commercial DBMS products

Object-oriented data model (1990s)
• The purpose of an OODBMS is to store object-oriented programming objects in a database without having to transform them into relational format

Object-relational model
• Extends the relational data model by including object orientation
• Allows attributes of tuples to have complex types, including non-atomic values such as nested relations

Big Data Challenge

5V's of big data
• Volume: TB → PB → EB
• Variety: Text, audio, video
• Velocity: Real-time operational / analytic applications
• Value: Extract value from big data, complex analytics
• Veracity: Biases, noise and abnormality in data
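The relational and object-relational slides earlier in the deck can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the helper names (`select`, `project`, `join`, `nest`) and the sample departments and employees are hypothetical and do not come from the lecture. It shows three relational algebra operations on flat tuples whose attributes are all atomic, and then builds a nested relation, where one attribute holds a whole relation, as the object-relational model allows.

```python
# Hypothetical sample data: flat relations whose attributes are all atomic.
departments = [
    {"dept_id": 1, "dept": "Databases"},
    {"dept_id": 2, "dept": "Networks"},
]
employees = [
    {"emp": "Ann", "dept_id": 1},
    {"emp": "Bob", "dept_id": 1},
    {"emp": "Eve", "dept_id": 2},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicates."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def join(left, right, attribute):
    """Equi-join of two relations on one shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attribute] == r[attribute]]


def nest(relation, group_attribute, nested_name):
    """Group flat tuples so each group carries a nested relation (non-atomic value)."""
    groups = {}
    for row in relation:
        rest = {k: v for k, v in row.items() if k != group_attribute}
        groups.setdefault(row[group_attribute], []).append(rest)
    return [{group_attribute: g, nested_name: rows} for g, rows in groups.items()]


# Relational style: combine flat tables with join, selection and projection.
joined = join(employees, departments, "dept_id")
db_staff = project(select(joined, lambda r: r["dept"] == "Databases"), ["emp"])

# Object-relational style: one tuple per department with a nested "staff" relation.
nested = nest(project(joined, ["dept", "emp"]), "dept", "staff")

print(db_staff)
print(nested)
```

In the flat form, the department name is repeated on every employee tuple after the join. In the nested form, each department appears once and carries its staff as a nested relation.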