List:       sas-l
Subject:    SAS/WPS/R: How to find link between nodes
From:       Roger DeAngelis <rogerjdeangelis () GMAIL ! COM>
Date:       2017-03-31 18:27:04
Message-ID: 1817714064238077.WA.rogerjdeangelisgmail.com () listserv ! uga ! edu

SAS/WPS/R: How to find link between nodes

link to this message
https://goo.gl/KgN8Xw
https://communities.sas.com/t5/General-SAS-Programming/How-to-find-link-between-nodes/td-p/341819

HAVE
====

Up to 40 obs SD1.HAVE total obs=8

Obs                    STR

 1     za1 > email1 > ip1 > address1 > phone1       >> first cluster
 2     za2 > email2 > ip2 > address1 > phone2
 3     za3 > email3 > ip2 > address2 > phone5
 4     za4 > email5 > ip1 > address3 > phone13
 5     za5 > email1 > ip13 > address13 > phone13

 6     za11 > email21 > ip21 > address21 > phone21  >> second cluster
 7     za12 > email22 > ip21 > address22 > phone22
 8     za13 > email22 > ip22 > address23 > phone23

 6 and 7 are connected by ip21
 7 and 8 are connected by email22
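
The links called out above can be listed for every row with a few lines of base R. This is only a sketch for checking the sample data by eye; the names long, rows_by_val and shared are made up for illustration and are not part of the solution below.

```r
# Sample rows copied from HAVE (fields joined with ">")
str <- c("za1>email1>ip1>address1>phone1",
         "za2>email2>ip2>address1>phone2",
         "za3>email3>ip2>address2>phone5",
         "za4>email5>ip1>address3>phone13",
         "za5>email1>ip13>address13>phone13",
         "za11>email21>ip21>address21>phone21",
         "za12>email22>ip21>address22>phone22",
         "za13>email22>ip22>address23>phone23")

# One line per (row number, value) pair
spl  <- strsplit(str, ">", fixed = TRUE)
long <- data.frame(row = rep(seq_along(spl), lengths(spl)),
                   val = unlist(spl),
                   stringsAsFactors = FALSE)

# Row numbers grouped by value; keep only values seen in more than one row
rows_by_val <- split(long$row, long$val)
shared      <- rows_by_val[lengths(rows_by_val) > 1]
print(shared)
```

Each element of shared is one direct link between rows, for example ip21 joins rows 6 and 7. Chaining these direct links together is what produces the clusters in WANT.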


WANT ( There are two non-connected clusters )
==============================================

CLUSTER                    UNIQUE VALUES IN EACH CONNECTED CLUSTER (i.e., each value, such as ip2, is listed only once)

 1     za1>email1>ip1>address1>phone1>za2>email2>ip2>phone2>za3>email3>
       address2>phone5>za4>email5>address3>phone13>za5>ip13>address13

 2     za11>email21>ip21>address21>phone21>za12>email22>address22>phone22>za13>ip22>address23>phone23

WORKING CODE
============

   R - the one statement below does the work; all other code is prep for input and output

        cl <- clusters(graph.data.frame(combspl))$membership[-(1:length(spl))];

    The igraph package is heavily used and well debugged (not true of all R packages)
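
The one-liner above packs several steps together. Here is a sketch that unpacks it, assuming igraph is installed. It uses graph_from_data_frame and components, which I believe are the current igraph names for graph.data.frame and clusters. The names g, memb, row_cluster and val_cluster are mine, for illustration only.

```r
library(igraph)

# Sample rows copied from HAVE (fields joined with ">")
str <- c("za1>email1>ip1>address1>phone1",
         "za2>email2>ip2>address1>phone2",
         "za3>email3>ip2>address2>phone5",
         "za4>email5>ip1>address3>phone13",
         "za5>email1>ip13>address13>phone13",
         "za11>email21>ip21>address21>phone21",
         "za12>email22>ip21>address22>phone22",
         "za13>email22>ip22>address23>phone23")

# Edge list: one edge from each row number to each value on that row
spl     <- strsplit(str, ">", fixed = TRUE)
combspl <- data.frame(grp = rep(seq_along(spl), lengths(spl)),
                      val = unlist(spl),
                      stringsAsFactors = FALSE)

# Vertices are created in order of first appearance: all of column 1
# (the row numbers) first, then the distinct values from column 2
g    <- graph_from_data_frame(combspl, directed = FALSE)
memb <- components(g)$membership

# The first length(spl) entries belong to the row-number vertices;
# dropping them leaves one cluster id per distinct value
row_cluster <- memb[seq_along(spl)]
val_cluster <- memb[-seq_along(spl)]

# Collapse the values of each cluster into one ">" delimited string
a <- vapply(split(names(val_cluster), val_cluster),
            paste, character(1), collapse = ">")
print(a)
```

The row_cluster vector is a by-product the one-liner throws away: it holds the cluster id of each original observation, which is handy if the cluster number has to be merged back onto HAVE. One caution: this approach assumes no value is literally a row number such as "1", because such a value would be treated as the same vertex as that row.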

FULL SOLUTION
=============

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.have(keep=str);
input (Application_ID Email_ID IP_ID Address_ID phone_ID) ( :$20.);
array chr _character_;
str=catx('>',of _character_);
cards4;
za1 email1 ip1 address1 phone1
za2 email2 ip2 address1 phone2
za3 email3 ip2 address2 phone5
za4 email5 ip1 address3 phone13
za5 email1 ip13 address13 phone13
za11 email21 ip21 address21 phone21
za12 email22 ip21 address22 phone22
za13 email22 ip22 address23 phone23
;;;;
run;quit;


%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
library(igraph);
library(haven);
data <-read_sas("d:/sd1/have.sas7bdat");
data<-as.character(data$STR);
spl <- strsplit(data,">");
combspl <- data.frame(
  grp = rep(seq_along(spl),lengths(spl)),
  val = unlist(spl)
);
cl <- clusters(graph.data.frame(combspl))$membership[-(1:length(spl))];
dat <- data.frame(cl);
dat[,2] <- row.names(dat);
a <- character(0);
for (i in 1:max(cl)) {
  a[i] <- paste(paste0(dat[(dat[,1] == i),][,2]), collapse=">");
};
endsubmit;
import r=a data=wrk.linkages;
run;quit;
');

proc print data=linkages width=min;
run;quit;

Up to 40 obs from linkages total obs=2

CLUSTERS

 1     za1 > email1 > ip1 > address1 > phone1 > za2 > email2 > ip2 > phone2 >
       za3 > email3 > address2 > phone5 > za4 > email5 > address3 > phone13 >
       za5 > ip13 > address13

 2     za11 > email21 > ip21 > address21 > phone21 > za12 > email22 >
       address22 > phone22 > za13 > ip22 > address23 > phone23