'Re: a SET or a MERGE problem,'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       sas-l
Subject:    Re: a SET or a MERGE problem,
From:       Sigurd Hermansen <HERMANS1 () WESTAT ! COM>
Date:       2003-10-31 18:44:58
[Download RAW message or body]

Howard:

O'Reilly lets us view a chapter from Gennick's 2nd Ed.

http://www.oreilly.com/catalog/orsqlpluspr2/chapter/ch01.html

Oracle introduced natural joins some time ago. Check what Gennick has to say
about them. As with many other shortcuts, they work well when implemented by
those who know enough to do so intelligently. Otherwise, they may cost more
time than they save.

Sig

-----Original Message-----
From: Howard Schreier [mailto:Howard_Schreier@ITA.DOC.GOV]
Sent: Friday, October 31, 2003 1:16 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: a SET or a MERGE problem, not a standard merge problem : plea
se help!


I too would make allowance for key mismatches unless I knew that there were
none or that they did not matter. Too often people describe and/or
illustrate problems in terms of special cases. That is, they do not indicate
the possibility of complications such as missing values, gaps in series,
etc. Then people have to guess, or the solutions offered are not as robust
as they need to be.

As to JOIN flavors, NATURAL and outer are not mutually exclusive. As I
understand it, the NATURAL keyword just offers shortcuts.

Here is my explicit code for the FULL join which Sig recommends:

   create table final as
    select coalesce(file1.dpt,file2.dpt) as dpt,
           file1.tree_sp,
           file1.index,
           file2.reg
     from file1 full join file2
     on file1.dpt=file2.dpt
     order by dpt, reg, tree_sp;

Here is the NATURAL version:

   create table final as
    select *
     from file2 natural full join file1
     order by dpt, reg, tree_sp;

I was able to use the asterisk to select columns, and I did not have to code
the COALESCE function to blend the key values into a single column, or code
the ON clause. That's because all of the common columns were keys and all of
the keys were common columns.

Test data:

   DATA File1;
     INPUT dpt tree_sp index;
     CARDS;
   1            1            5
   4           10            6
   ;

   DATA File2;
     INPUT dpt reg;
     CARDS;
   1        2
   5        7
   ;

Result (from either SQL statement):

   Obs    dpt    tree_sp    index    reg

    1      1         1        5       2
    2      4        10        6       .
    3      5         .        .       7

I did this with v. 8.2, though I did not find the NATURAL keyword in the v.
8. documentation. It is in the v. 9 documentation, but I did not see it in
the summary of changes and enhancements.

On Fri, 31 Oct 2003 10:59:25 -0500, Sigurd Hermansen <HERMANS1@WESTAT.COM>
wrote:

>Howard:
>I agree with Jim about defaulting to an outer (full) or left join
>unless
the
>problem description specifically limits a solution to a key set
>intersection. Perhaps I misunderstand the purpose of the natural join.
>I
see
>no essential reason for it. The inner and outer joins cover all of the
>relational operations on keys.
>
>If each of the key values appears in each set, the outer joins yield
>the same result as an inner join. The differences occur when some of
>the key values do not appear in both datasets. The inner join hides the
>mismatches. Sig
>
>-----Original Message-----
>From: Howard Schreier [mailto:Howard_Schreier@ITA.DOC.GOV]
>Sent: Friday, October 31, 2003 9:45 AM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: a SET or a MERGE problem, not a standard merge problem :
please
>help!
>
>
>SQL is well suited to this type of problem. Try:
>
>   proc sql;
>   create table final as
>    select * from file1 natural join file2
>     order by dpt, reg, tree_sp;
>   quit;
>
>The example does not suggest any need for an outer join.
>
>On Fri, 31 Oct 2003 11:41:37 +0100, Groeneveld, Jim
><jim.groeneveld@VITATRON.COM> wrote:
>
>>Hi Jean-Pierre,
>>
>>How about:
>>
>>DATA File1;
>>  INPUT dpt tree_sp index;
>>  CARDS;
>>1            1            5
>>1            2            5
>>1            3            4
>>2            4            9
>>2            5            8
>>2            6            7
>>3            7            2
>>3            8            1
>>3            9            0
>>;
>>RUN;
>>
>>DATA File2;
>>  INPUT dpt reg;
>>  CARDS;
>>1        1
>>1        2
>>2        1
>>2        2
>>2        3
>>2        4
>>2        5
>>3        3
>>3        4
>>3        5
>>;
>>RUN;
>>
>>DATA Both (DROP=ObsNr2 dpt2);
>>  SET File1;
>>  DO ObsNr2 = 1 TO NobsFile2;
>>    SET File2 (RENAME=(dpt=dpt2)) NOBS=NobsFile2 POINT=ObsNr2;
>>    IF (dpt EQ dpt2) THEN OUTPUT;
>>    PUT _ALL_;
>>  END;
>>RUN;
>>
>>PROC PRINT DATA=Both; RUN;
>>
>>Regards - Jim.
>>
>>P.S. An outer SQL join also might be a solution.
>>--
>>.  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
>>
>>Jim Groeneveld, MSc.
>>Biostatistician
>>Science Team
>>Vitatron B.V.
>>Meander 1051
>>6825 MJ  Arnhem
>>Tel: +31/0 26 376 7365
>>Fax: +31/0 26 376 7305
>>Jim.Groeneveld@Vitatron.com
>>www.vitatron.com
>>
>>My computer has the solutions, I have the problems.
>>
>>[common disclaimer]
>>
>>
>>-----Original Message-----
>>From: Jean-Pierre RENAUD [mailto:renaud@NANCY.INRA.FR]
>>Sent: Friday, October 31, 2003 10:23
>>To: SAS-L@LISTSERV.UGA.EDU
>>Subject: a SET or a MERGE problem, not a standard merge problem :
>please
>>help!
>>
>>
>>I have 2 files:
>>    one with department names (dpt) ; tree species (1 dpt could have
>many
>>species); and a 3rd var (index_a).
>>    the other with dpt and forested region names (1 dpt could have
>many
>>forested reg.)
>>
>>I want to multiply file 1 by file 2, in order to obtain for each
>"forested
>>region within a dpt" the info about the "dpt species index" line .
>>
>>It is not just a normal merge... but I am lacking experience to do
>the
>>coding.
>>
>>Could someone help me please!
>>
>>Thank in advance!
>>
>>Here is an example of want I want to do
>>
>>File 1
>>dpt      tree_sp    index
>>1            1            5
>>1            2            5
>>1            3            4
>>2            1            1
>>...
>>
>>File 2
>>dpt    reg
>>1        1
>>1        2
>>2        1
>>...
>>
>>Final file should look like
>>dpt      reg     tree_sp     index 1
>>1        1        1                5
>>1        1        2                5
>>1        1        3                4
>>1        2        1                5
>>1        2        2                5
>>1        2        3                4
>>2        1        1                1
>>...

[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic