[prev in list] [next in list] [prev in thread] [next in thread]
List: sas-l
Subject: Catersion Join Efficiency
From: pigpigpig <pigzhu740 () GMAIL ! COM>
Date: 2009-10-30 18:05:36
Message-ID: 2b09e2a8-7e68-437f-9a51-320d798ca409 () p8g2000yqb ! googlegroups ! com
[Download RAW message or body]
Catersion Join is quite frequently used in my projects.. It is an
inefficient join, but I have to do it.. because I have to match the
data in table a with the data in table b. If table a has 10000 and
table b has 5000, then the table generated by the catesion join will
have 5000*10000=50,000,000 records.
This is very time consuming.. To save time, what I have done is to
cut the observations of one table, and run the join parallelly in a
few SAS sessions (open like 5 SAS window, one of the window runs for
%catersion_join ( Low=0, high=1000); another one run ( Low=1000,
high=2000), .. etc..
I am posting my solution here to share with you all, feel free to
comment , I am not sure if this is a good solution..... I believe
there must be better way to do this.. Any suggestion??
Thanks!!!!!
For example:
%macro catersion_join ( Low=0, high=1000)
Data partial_a;
set a;
if &low <_n_<=&high then output;
run;
Proc sql;
create table c_&high as select partial_a.* , b. * from partial_a, b;
quit;
proc append base=c_1 data=c_&high ;
run;
%mend;
%catersion_join ( Low=0, high=1000);
%catersion_join ( Low=1000, high=2000);
%catersion_join ( Low=2000, high=3000);
%catersion_join ( Low=3000, high=4000);
%catersion_join ( Low=4000, high=5000);
%catersion_join ( Low=5000, high=6000);
%catersion_join ( Low=6000, high=7000);
%catersion_join ( Low=7000, high=8000);
%catersion_join ( Low=8000, high=9000);
%catersion_join ( Low=9000, high=10000);
The one of the best things in the world is sharing.................
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic