
List:       hadoop-commits
Subject:    [Hadoop Wiki] Update of "Hive/JoinOptimization" by LiyinTang
From:       Apache Wiki <wikidiffs@apache.org>
Date:       2010-11-30 23:15:39
Message-ID: 20101130231539.94520.58522@eosnew.apache.org

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/JoinOptimization" page has been changed by LiyinTang.
http://wiki.apache.org/hadoop/Hive/JoinOptimization?action=diff&rev1=8&rev2=9

--------------------------------------------------

  In this case, the query processor will launch the original Common Join task as a Backup Task, which is completely transparent to the user. The basic idea is shown in Fig 7.
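A minimal sketch of what this looks like from the query author's side, assuming the hive.auto.convert.join flag discussed on this page and two hypothetical tables, big_table and small_table:

{{{
set hive.auto.convert.join=true;

-- No MAPJOIN hint is needed. The planner first tries a Map Join: a local task
-- builds an in-memory hash table from small_table. If that local task fails
-- (for example, small_table does not fit in memory), the backup Common Join
-- task runs instead, transparently to the user.
SELECT b.key, s.value
FROM big_table b JOIN small_table s ON (b.key = s.key);
}}}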
  == 2.4 Performance Evaluation ==
+ Here are some performance comparison results. All the benchmark queries here can be converted into Map Join.
+ '''Table 2: Comparison between the previous join and the new optimized join'''
+ 
+ ''' {{attachment:fig8.jpg}} '''
+ 
+ For the previous Common Join, the experiment counts only the average execution time of the map/reduce tasks, because the job finish time also includes the job scheduling overhead: a job sometimes waits for a while before it starts running in the cluster. Likewise, for the new optimized join, the experiment adds the average local task execution time to the average map/reduce execution time. Both results therefore exclude the job scheduling overhead.
+ 
+ From the results, when a Common Join can be converted into a Map Join, the optimization yields a 57% ~ 163% performance improvement.
+ 
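As an illustration of the query shape behind Table 2 (table names are hypothetical), each benchmark query joins against a table small enough to be loaded into memory; the same query is run once as the previous Common Join and once with the new optimization so that it is converted into a Map Join:

{{{
-- Hypothetical benchmark-style query: an equi-join against a small table,
-- so it is eligible for conversion into a Map Join.
SELECT f.key, d.name, count(1)
FROM fact f JOIN dim d ON (f.key = d.key)
GROUP BY f.key, d.name;

-- "Previous join":      executed as a Common Join (shuffle join).
-- "New optimized join": hive.auto.convert.join=true converts it into a Map Join,
--                       with the Common Join kept as the backup task.
}}}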


