List: hadoop-commits
Subject: [Hadoop Wiki] Update of "Hive/JoinOptimization" by LiyinTang
From: Apache Wiki <wikidiffs@apache.org>
Date: 2010-11-30 23:15:39
Message-ID: 20101130231539.94520.58522@eosnew.apache.org
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/JoinOptimization" page has been changed by LiyinTang.
http://wiki.apache.org/hadoop/Hive/JoinOptimization?action=diff&rev1=8&rev2=9
--------------------------------------------------
In this case, the query processor launches the original Common Join task as a Backup Task, which is completely transparent to the user. The basic idea is shown in Fig 7.
== 2.4 Performance Evaluation ==
+ Here are some performance comparison results. All the benchmark queries here can be converted into Map Join.
+ '''Table 2: Comparison between the previous join and the new optimized join'''
+
+ ''' {{attachment:fig8.jpg}} '''
+
+ For the previous common join, the experiment measures only the average map-reduce task execution time, because the job finish time would also include job scheduling overhead: a job may wait for some time before it actually starts running on the cluster. Likewise, for the new optimized join, the experiment adds the average local task execution time to the average map-reduce execution time. Both results therefore exclude the job scheduling overhead.
+
+ From the results, when the common join can be converted into a map join, it achieves a 57%–163% performance improvement.
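For reference, an improvement percentage of this kind is presumably computed as (old time − new time) / new time; a small sketch with made-up timings (the real numbers are in Table 2, and the function name is hypothetical):

```python
def improvement_pct(common_join_s: float, map_join_s: float) -> float:
    """Percent improvement of the map join over the common join,
    assuming the usual definition (old - new) / new * 100."""
    return (common_join_s - map_join_s) / map_join_s * 100.0

# Made-up example: halving the runtime is a 100% improvement.
print(improvement_pct(10.0, 5.0))  # 100.0
```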