
List:       sas-l
Subject:    Re: Extracting data from strings
From:       Nat Wooding <nathani () VERIZON ! NET>
Date:       2013-08-31 0:14:01
Message-ID: 014a01cea5de$ff16f360$fd44da20$ () verizon ! net

Here is my code, modified to look for any line that contains a character
string and no numbers. This gets away from my testing for a string that
starts with "age". You could combine this with Tom's suggestion about the
delimiter.

Furthermore, if you know the maximum number of numeric fields, you could
change the input to read that many.
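
As a rough sketch of the idea in the two paragraphs above (not the exact
code I ran), something like the following treats any line with letters and
no digits as a label and reads every other non-blank line as numbers. The
file path, the MAXN macro variable, and the names VAR, STAT and GOT_VALUES
are all made up for illustration, and it assumes each label sits on its own
line ahead of its values.

```sas
/* Sketch only. Assumptions: the file path, a maximum of 4 numeric     */
/* fields per line, and label lines that never contain digits.         */
%let maxn = 4;

data want (drop=got_values);
  infile "C:\temp\have.txt" truncover;
  length var stat $32;
  retain var stat ' ' got_values 0;
  input @;                                   /* load the line and hold it    */
  _infile_ = left(compbl(translate(_infile_, ' ', '^')));
  if _infile_ = ' ' then delete;             /* line held only padding       */
  if not anydigit(_infile_) then do;         /* no numbers on this line      */
    if anyalpha(_infile_) then do;
      /* A label that was never followed by values is taken to be the  */
      /* heading for the group, for example "Age (Years)".             */
      if stat ne ' ' and not got_values then var = stat;
      stat = _infile_;
      got_values = 0;
    end;
    delete;                                  /* labels produce no row        */
  end;
  input value1-value&maxn;                   /* short lines leave missings   */
  got_values = 1;
run;
```

A label that does contain digits (say "Week 12") would be read as a value
line and throw invalid-data notes, so that case would need its own test.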

You may have answered this already, but why are you reading statistical data
from RTF output? Did SAS produce it? If so, why not use an OUTPUT statement
or ODS to write the results to a file and capture the output that way?
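
For example, if the table came from something like PROC MEANS, either of
these would capture the numbers directly. This is only a sketch:
sashelp.class and the variables SEX and AGE stand in for the real data,
which I have not seen, and the output data set names are made up.

```sas
/* Sketch only: sashelp.class, SEX and AGE are stand-ins for the real data. */

/* Route 1: OUTPUT statement writes the statistics to a data set. */
proc means data=sashelp.class noprint;
  class sex;
  var age;
  output out=stats n=n mean=mean std=sd median=median min=min max=max;
run;

/* Route 2: ODS OUTPUT captures the printed summary table as a data set. */
ods output summary=stats_ods;
proc means data=sashelp.class n mean std median min max;
  class sex;
  var age;
run;
```

Either data set can then be reshaped or exported without parsing any
report text.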

Nat

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Gmail
Sent: Friday, August 30, 2013 12:53 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Extracting data from strings

Nice try!  Unfortunately, this is just a sample table. There are other
tables that do not have N, Mean, etc., so you cannot hard-code text for the
selection logic.

We need a very general approach.

But thanks for trying and offering a solution!

On Aug 30, 2013, at 9:47 AM, Arthur Tabachneck <art297@ROGERS.COM> wrote:

I'm sure the following code can be cleaned up quite a bit, but I'm
interested in finding out whether it does what you want:

filename indd "C:\art\have.txt";

data want;
 infile indd truncover;
 informat var $32. Statistic $6.;
 retain var;
 input @;                                      /* hold the line for inspection  */
 _infile_ = translate( _infile_ , ' ' , '^' ); /* carets are only padding       */
 if _infile_ gt '';                            /* skip lines that are now blank */
 _infile_ = left( compbl ( _infile_ ));
 /* A statistic label is followed by its values on the next line. */
 if _infile_ in ('N', 'Mean', 'SD', 'Median', 'Min', 'Max') then do;
   statistic = _infile_;
   input;                                      /* release the label line        */
   input @;                                    /* load the values line          */
   _infile_ = translate( _infile_ , ' ' , '^' );
   if _infile_ gt '';
   _infile_ = left( compbl ( _infile_ ));
   input value1-value4;
   output;
 end;
 /* Any other text line is taken as the variable heading. */
 else if anyalpha(_infile_) then do;
   var = _infile_;
   input;
 end;
run;

HTH,
Art
------
On Fri, 30 Aug 2013 08:12:11 -0700, Gmail <sdcausa2012@GMAIL.COM> wrote:

> Okay, I am glad you are still interested.
>
> The wrapping is caused by your screen running out of space, so there
> is no wrapping within each string.
>
> The purpose is to read the data in to reconstruct the table.
>
> Hope that helps.
>
> On Aug 30, 2013, at 8:02 AM, "Data _null_;" <iebupdte@gmail.com> wrote:
>
> I don't think you are answering my question.  Let's forget how (PRX or
> other) for now and talk about the input data.  Show exact data lines
> with no wrapping and be specific.  If they have to wrap make sure it
> is clear how to put them back together.
>
> Also the purpose would be useful.  Is this for QC?  If it is, reading
> the data into SAS variables is not necessary and actually makes QC
> harder.
>
> On Fri, Aug 30, 2013 at 9:52 AM, Gmail <sdcausa2012@gmail.com> wrote:
>> Nat's method does not work. But thanks for his try!
>>
>> We can ignore the blank rows, which are not important. I think the only
>> right way to do it is using PRX functions. However, the logic is not
>> easy. I hope someone knows PRX well and can offer the solution.
>>
>> Where is David Cassell?  I know he knows PRX well.
>>
>> On Aug 30, 2013, at 5:24 AM, "Data _null_;" <iebupdte@GMAIL.COM> wrote:
>>
>> That seems like it would work, but how did you decide what makes a
>> record?  Everything up to a blank line?
>>
>> On Fri, Aug 30, 2013 at 6:38 AM, Nat Wooding <nathani@verizon.net> wrote:
>>> Null
>>>
>>> I took the easy way out and copied the text to a file and then
>>> unwrapped the lines.
>>>
>>> Nat
>>>
>>> -----Original Message-----
>>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>>> Data _null_;
>>> Sent: Friday, August 30, 2013 7:10 AM
>>> To: SAS-L@LISTSERV.UGA.EDU
>>> Subject: Re: Extracting data from strings
>>>
>>> Are the "strings" exactly as they appear in the e-mail, with the
>>> wrapping and blank lines?  Flat file or SAS data set?
>>>
>>> On Thu, Aug 29, 2013 at 2:42 PM, David Ford <sdcausa2012@gmail.com>
>>> wrote:
>>>> Dear SAS-Lers,
>>>>
>>>> I'd like to extract the data from the following strings to make a
>>>> table to look like this:
>>>>
>>>> Age (Years)
>>>> N                 188                185              273     0.800
>>>> Mean               42.1               42.0             42.1
>>>> SD                 11.40              12.53            11.95
>>>> Median             42.0               42.0             42.0
>>>> Min                25                 25               25
>>>> Max                70                 70               70
>>>>
>>>> It would be better to use Perl regular expressions.  However, I do
>>>> not know them well.  Please help if you can.
>>>>
>>>> Thanks a lot in advance for the help!
>>>>
>>>> David
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^Age (Years)
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^ ^^^^^ ^^^^^^^^^^^^^^^^N ^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^188^^^^^^^185^^^^^^
>>>> ^27 3^^^^ ^^^0.800^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^Mean ^^^^^^^^^^^^^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^42.1^^^^^^^42.0^^^^
>>>> ^^^
>>>> 42.1^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^ ^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^SD ^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^11.40^^^^^^^12.53^^^^^^^11.95^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^ ^^^^^ ^^^^^^^^^^^^^^^^^^^^^Median ^^^^^^^^^^^^^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^42.0^^^^^^^42.0^^^^
>>>> ^^^ 42.0^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^ ^^^^^ ^^^^^^^^^^^^^^^^^^^^^Min ^^^^^^^^^^^^^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^25^^^^^^^25^^^^^^^2
>>>> 5^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> ^^^ ^^^^^ ^^^^^^^^^^^^^^^^^^^^^Max ^^^^^^^^^^^^^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^70^^^^^^^70^^^^^^^7
>>>> 0^^
>>>> ^^^^^
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^