Big Data Massive Parallel Processing (MapReduce)

Dr.GordenMorse, France, Professional
Published Date: 22-07-2017
Ghislain Fourny
Big Data
6. Massive Parallel Processing (MapReduce)

Slide 1: Let's begin with a field experiment
Slide 2: 200 blocks, 8 different shapes
Slide 3: How many of each? ? ? ? ? ? ? ? ?
Slide 4: 200 pieces distributed to 10 volunteers
Slide 5: Task 1 (10 people)
Slide 6: Task 1 (10 people) (figure labels: 1 3 1 2 1 1)
Slide 7: Task 2 (8 people)
Slide 8: Task 2 (8 people) – part 1 aka "The big mess" (figure labels: 1 2 1 3 1)
Slide 9: Task 2 (8 people) – part 2 (figure labels: 1 2 8 1 3 1)
Slide 10: Final summary (figure labels: 10 8 5 6 3 4 7 2)
Slide 11: Let's go
Slide 12: So far, we have... Storage as file system (HDFS)
Slide 13: So far, we have... Storage as tables (HBase); Storage as file system (HDFS)
Slide 14: Data is only useful if we can query it. Querying; Storage as tables (HBase); Storage as file system (HDFS)
Slide 15: ... in parallel. Querying; Storage as tables (HBase); Storage as file system (HDFS)
Slide 16: Data Processing: Input data
Slide 17: Data Processing: Input data, Query
Slide 18: Data Processing: Input data, Query, Output data
Slide 19: MapReduce
Slide 20:
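
The field experiment above (200 blocks, 8 shapes, 10 volunteers, then 8 people) can be read as a small MapReduce job. The sketch below is one possible interpretation, not something the slides spell out: it assumes Task 1 corresponds to a map step in which each volunteer counts their own pile, "The big mess" to a shuffle that groups partial counts by shape, and part 2 of Task 2 to a reduce step with one person per shape. The shape names, the random block data, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical shape names; the slides only say there are 8 different shapes.
SHAPES = [f"shape_{i}" for i in range(8)]


def distribute(blocks, n_volunteers):
    """Split the blocks into one pile per volunteer (round-robin)."""
    return [blocks[i::n_volunteers] for i in range(n_volunteers)]


def map_phase(pile):
    """Assumed Task 1: a volunteer counts each shape in their own pile."""
    return list(Counter(pile).items())  # [(shape, partial_count), ...]


def shuffle_phase(mapped):
    """Assumed 'big mess': gather all partial counts for the same shape."""
    grouped = defaultdict(list)
    for pairs in mapped:
        for shape, count in pairs:
            grouped[shape].append(count)
    return grouped


def reduce_phase(shape, partial_counts):
    """Assumed Task 2, part 2: one person per shape adds up the partial counts."""
    return shape, sum(partial_counts)


def count_shapes(blocks, n_volunteers=10):
    piles = distribute(blocks, n_volunteers)
    mapped = [map_phase(pile) for pile in piles]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(shape, counts) for shape, counts in grouped.items())


# Made-up data: 200 blocks drawn at random from the 8 shapes.
random.seed(0)
blocks = [random.choice(SHAPES) for _ in range(200)]
totals = count_shapes(blocks)
print(totals)
```

In this sketch the map and reduce steps never need to see the whole data set: each volunteer works only on their own pile, and each reducer only on the counts for one shape, which is what lets the work run in parallel.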