List:       sas-l
Subject:    Re: SAS Forum: Randomly delete three observations by group
From:       Paul Dorfman <sashole () BELLSOUTH ! NET>
Date:       2018-10-31 19:09:34
Message-ID: 5168012760727772.WA.sasholebellsouth.net () listserv ! uga ! edu

Roger,

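Methinks the good old K/N method, executed independently within the confines of
each BY group, will do just fine. With K records still to be picked out of the N
records still to be read, the current record gets picked with probability K/N: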

data have ;                                
  input ID YEAR PREP ;                     
  cards ;                                  
1    2000     550                          
1    2001     600                          
1    2002     650                          
1    2003     700                          
1    2004     750                          
1    2005     800                          
2    2000     850                          
2    2001     900                          
2    2002     950                          
2    2003    1000                          
2    2004    1050                          
2    2005    1100                          
3    2000    1150                          
3    2001    1200                          
3    2002    1250                          
3    2003    1300                          
3    2004    1350                          
3    2005    1400                          
run ;                                      
                                           
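data want (drop = K N) ;
  /* Pass 1: count the records of the current BY group into N */
  do N = 1 by 1 until (last.id) ;
    set have ;
    by ID ;
  end ;
  /* K = number of records still to be blanked in this group */
  K = 3 ;
  /* Pass 2: re-read the same records; the upper bound N is   */
  /* evaluated once, so decrementing N inside is safe         */
  do _n_ = 1 to N ;
    set have ;
    if ranuni (1) < divide (K, N) then do ;
      NEW_PREP = . ;
      K +- 1 ;
    end ;
    else NEW_PREP = PREP ;
    N +- 1 ;
    output ;
  end ;
run ;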
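
A quick sanity check, offered only as a sketch: since the K/N scheme blanks
exactly K records whenever a group has at least K of them, each ID in WANT
should show 3 missing NEW_PREP values out of its 6 records. The column names
N_ROWS and N_BLANKED below are made up for the illustration.

```sas
proc sql ;
  /* One row per ID: total records and how many were blanked */
  select ID
       , count (*)        as N_ROWS
       , nmiss (NEW_PREP) as N_BLANKED
  from   want
  group  by ID ;
quit ;
```

Which 3 records get blanked depends on the random stream; only the count per ID
is fixed.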

If perchance the file isn't sorted by ID, no big deal, either; the same can be
done via a hash:

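data want (drop = K N) ;
  /* H: one item per ID, holding the record count N for that ID */
  dcl hash h (ordered:"A") ;
  h.definekey  ("ID") ;
  h.definedata ("ID", "N") ;
  h.definedone () ;
  dcl hiter ih ("h") ;
  /* R: every record, keyed by ID, several items per key */
  dcl hash r (multidata:"Y") ;
  r.definekey ("ID") ;
  r.definedata ("YEAR", "PREP") ;
  r.definedone () ;
  /* Load both tables in a single pass through HAVE */
  do until (z) ;
    set have end = z ;
    if h.find() ne 0 then N = 1 ;
    else                  N + 1 ;
    h.replace() ;
    r.add() ;
  end ;
  /* For each ID in ascending order, run K/N over its records in R */
  do while (ih.next() = 0) ;
    /* BY 0 keeps the loop from resetting K, so K +- 1 sticks */
    if r.find() = 0 then do K = 3 by 0 until (r.find_next() ne 0) ;
      if ranuni (1) < divide (K, N) then do ;
        NEW_PREP = . ;
        K +- 1 ;
      end ;
      else NEW_PREP = PREP ;
      N +- 1 ;
      output ;
    end ;
  end ;
run ;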
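
To try the hash step on input that really is out of order, one could first
scramble the sample data. This is only a sketch: the data set name
HAVE_SHUFFLED and the seed 2 are arbitrary choices for the illustration, not
part of the original problem.

```sas
proc sql ;
  /* Copy of HAVE with its rows in pseudo-random order */
  create table have_shuffled as
  select *
  from   have
  order  by ranuni (2) ;
quit ;
```

Pointing the SET statement in the hash step at HAVE_SHUFFLED instead of HAVE
should still yield 3 blanked records per ID, and because H is declared with
ordered:"A", the output comes back grouped by ascending ID.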

Best regards
Paul Dorfman
