Friday, April 1, 2011
How to preserve leading zeros when export to csv file
input id $;
/* create formatting to preserve leading zeros in excel */
cid  =  ' = " '  ||  id  ||  ' " ';
cards;
00123
12398
01234
00034
;
run;
data one (drop=id);
      set one;
run;
ods listing close;
ods csv file='lead0.csv' ;
      proc print data=one noobs;
      run;
ods csv close;
Wednesday, February 9, 2011
Create transport file and Read in transport file
libname source "/home/kwxxx/requests/xxxdata";
libname tran xport "/home/kwxx/requests/xxxdata/transport/xxxdata.xpt";
proc copy in=source out=tran memtype=data;
run;
*** Read in transport file;
libname library ‘c:\kwxxx’;
libname tran xport ‘c:\kwxxx\xxxdata.xpt’;
proc copy in=tran out=library;
run;
*** If format was included as a dataset in the transport file, need to convert the dataset back to format library;
proc format library=library cntlin=library.fmtout;
run;
Monday, February 7, 2011
How to kill a Unix SAS job
      ps -ef|grep kwxxxx
And you'll see the output like this:
      kwxxxx 21506 35726 0 11:07:45 pts/15 0:00 vi zzz.sas
      kwxxxx 29200 34432 1 11:27:40 pts/18 0:00 ps -ef
      kwxxxx 30128 34432 0 09:36:54 pts/18 0:00 -ksh
      kwxxxx 30478 1 0 09:36:52 - 0:00 xterm
      kwxxxx 34432 30478 1 09:36:54 pts/18 0:00 -ksh
      kwxxxx 35334 1 0 09:36:51 - 0:00 xterm
      kwxxxx 35726 35334 0 09:36:53 pts/15 0:00 -ksh
The second column is the process number, use the following command to delete the job that you'd like to remove:
      kill -9 process#
      (i.e. kill -9 21506)
Thursday, January 20, 2011
Create a SAS dataset from FORMAT library
proc format library=fmts cntlout=fmts.fmtout;
run;
Friday, January 14, 2011
Use BYTE function to create special ASCII character, e.g. plus/minus sign
      do i=0 to 255;
            x=byte(i);
            put i= x=;
      end;
      y=rank('a');
      put y=;
run;
BYTE function results:
i=65 -- 90 x=A -- Z
i=97 -- 122 x=a -- z
i=33 x=!
i=34 x="
i=35 x=#
i=36 x=$
i=37 x=%
i=38 x=&
i=39 x='
i=40 x=(
i=41 x=)
i=42 x=*
i=43 x=+
i=44 x=,
i=45 x=-
i=46 x=.
i=47 x=/
i=162 x=¢
i=163 x=£
i=164 x=¤
i=165 x=¥
i=166 x=¦
i=167 x=§
i=168 x=¨
i=169 x=©
i=170 x=ª
i=171 x=«
i=172 x=¬
i=173 x=
i=174 x=®
i=175 x=¯
i=176 x=°
i=177 x=±
i=178 x=²
i=179 x=³
i=180 x=´
i=181 x=µ
i=182 x=¶
i=183 x=•
i=184 x=¸
i=185 x=¹
i=186 x=º
i=187 x=»
i=188 x=¼
i=189 x=½
i=190 x=¾
RANK function results:
y=97
Tuesday, January 4, 2011
LAG function
*** Use a third variable to assign the value from the previous record;
array one a b c d;
array two e f g h;
      do over one;
            temp = lag(one);
            if first.id = 0 and last.id=1 and two = . then do;
                  two = temp;
            end;
      end;
Calculate new variables using the sum of other columns in PROC REPORT
proc report data=fmxx center headline headskip split='/';
      column site_id havevis &type.2 &type.3 &type.p1 &type.p2;
      define site_id / group id order=internal width=16 left 'Clinical Center'style(header)={cellwidth= 1.2in just=left} style(column)={cellwidth= 1.2in cellheight=0.17in just=left};
      define havevis / sum style(header)={cellwidth=.9in} 'All/Visits' center ;
      define &type.2 / sum style(header)={cellwidth=.9in} width=15 "Any/&desc" center ;
      define &type.3 / sum style(header)={cellwidth=.9in} width=15 "3+ Volumes/for &desc/n" center ;
      define &type.p1 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (1)" center;
      define &type.p2 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (2)" center;
      compute &type.p1;
            &type.p1=(&type.3.sum*100)/&type.2.sum;
      endcomp;
      compute &type.p2;
            &type.p2=(&type.3.sum*100)/havevis.sum;
      endcomp;
title1 " ";
run;
%mend;
%vol(head, Head);
If data is already summarized, use the following code for compute:
      compute &type.p1;
            _c5_=(_c4_*100)/_c3_;
      endcomp;
      compute &type.p2;
            _c6_=(_c4_*100)/_c2_;
      endcomp;
PROC LOGISTIC odds ratio estimate
==================== From SAS =======================================
The calculation of the Odds Ratios depends upon the parameterization used for the categorical independent variable. By default, Proc LOGISTIC uses effects coding so the odds ratios are not calculated as EXP(estimate). You can change the parameterization to reference cell coding by using the PARAM=GLM option on the CLASS statement. Using this coding does lead to odds ratios being calculated as EXP(estimate). Note that for continuous variables the odds ratios are always calculated as EXP(estimate). Also note that no matter what parameterization is used, the values of the odds ratios are always the same.
For additional details on this topic, as well as setting the reference levels for the CLASS statement please reference the Syntax section (for the CLASS
statement) and the "Odds Ratio Estimation" portion of the Details section of the LOGISTIC chapter in the SAS/STAT User's Guide for V8.
==================== Syntax from SAS online manual =========================
PARAM=keyword
specifies the parameterization method for the classification variable or variables. Design matrix columns are created from CLASS variables according to the following coding schemes. The default is PARAM=EFFECT. If PARAM=ORTHPOLY or PARAM=POLY, and the CLASS levels are numeric, then the ORDER= option in the CLASS statement is ignored, and the internal, unformatted values are used.
EFFECT
specifies effect coding
GLM
specifies less than full rank, reference cell coding; this option can only be used as a global option ORTHPOLY specifies orthogonal polynomial coding POLYNOMIAL | POLY specifies polynomial coding REFERENCE | REF specifies reference cell coding
The EFFECT, POLYNOMIAL, REFERENCE, and ORTHPOLY parameterizations are full rank. For the EFFECT and REFERENCE parameterizations, the REF= option in the CLASS statement determines the reference level.
Consider a model with one CLASS variable A with four levels, 1, 2, 5, and 7. Details of the possible choices for the PARAM= option follow.
EFFECT
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of -1. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Effect Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 -1 -1 -1
Parameter estimates of CLASS main effects using the effect coding scheme estimate the difference in the effect of each nonreference level compared to the average effect over all 4 levels.
GLM
As in PROC GLM, four columns are created to indicate group membership. The design matrix columns for A are as follows.
GLM Coding
A Design Matrix
1 1 0 0 0
2 0 1 0 0
5 0 0 1 0
7 0 0 0 1
Parameter estimates of CLASS main effects using the GLM coding scheme estimate the difference in the effects of each level compared to the last level.
ORTHPOLY
The columns are obtained by applying the Gram-Schmidt orthogonalization to the columns for PARAM=POLY. The design matrix columns for A are as follows.
Orthogonal Polynomial Coding
A Design Matrix
1 -1.153 0.907 -0.921
2 -0.734 -0.540 1.473
5 0.524 -1.370 -0.921
7 1.363 1.004 0.368
POLYNOMIAL
POLY
Three columns are created. The first represents the linear term (x), the second represents the quadratic term (x2), and the third represents the cubic term (x3), where x is the level value. If the CLASS levels are not numeric, they are translated into 1, 2, 3, ... according to their sorting order. The design matrix columns for A are as follows.
Polynomial Coding
A Design Matrix
1 1 1 1
2 2 4 8
5 5 25 125
7 7 49 343
REFERENCE
REF
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of 0. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Reference Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 0 0 0
Parameter estimates of CLASS main effects using the reference coding scheme estimate the difference in the effect of each nonreference level compared to the effect of the reference level.
PROC GPLOT options: AUTOHREF and AUTOVREF
goptions reset=all;
/* Create sample data */
data a;
input x y;
cards;
1 30
2 15
3 40
4 80
5 35
6 40
7 85
8 75
9 55
10 30
;
run;
/* Create an annotate data set to draw darker reference lines */
/* at X values of 3, 6, and 9; Y values of 20, 50, and 80. */
/* The SIZE variable sets the line thickness. */
data anno;
      length function color $ 8;
      /* Vertical reference lines */
      do x=3 to 9 by 3;
            function='move'; xsys='2'; ysys='1';
            y=0; output;
            function='draw'; xsys='2'; ysys='1';
            size=3; line=1; color='black'; y=100; output;
      end;
      /* Horizontal reference lines */
      do y=20 to 80 by 30;
            function='move'; xsys='1'; ysys='2';
            x=0; output;
            function='draw'; xsys='1'; ysys='2';
            size=3; line=1; color='black'; x=100; output;
      end;
run;
/* Create the graph. */
proc gplot data=a;
      plot y*x / haxis=axis1 vaxis=axis2 autohref autovref noframe annotate=anno;
      symbol1 i=none v=dot c=blue;
      axis1 order=(0 to 12 by 1);
      axis2 order=(0 to 100 by 10);
      title1 h=1.5 f=swiss 'Reference Lines of Varying Thickness';
run;
quit;
PROC LOGISTIC options: selection=, hierarchy=
Model hierarchy refers to the requirement that, for any term to be in the model, all effects contained in the term must be present in the model. For example, in order for the interaction A*B to enter the model, the main effects A and B must be in the model. Likewise, neither effect A nor B can leave the model while the interaction A*B is in the model.
The keywords you can specify in the HIERARCHY= option are described as follows:
NONE
Model hierarchy is not maintained. Any single effect can enter or leave the model at any given step of the selection process.
SINGLE
Only one effect can enter or leave the model at one time, subject to the model hierarchy requirement. For example, suppose that you specify the main effects A and B and the interaction of A*B in the model. In the first step of the selection process, either A or B can enter the model. In the second step, the other main effect can enter the model. The interaction effect can enter the model only when both main effects have already been entered. Also, before A or B can be removed from the model, the A*B interaction must first be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
SINGLECLASS
This is the same as HIERARCHY=SINGLE except that only CLASS effects are subject to the hierarchy requirement.
MULTIPLE
More than one effect can enter or leave the model at one time, subject to the model hierarchy requirement. In a forward selection step, a single main effect can enter the model, or an interaction can enter the model together with all the effects that are contained in the interaction. In a backward elimination step, an interaction itself, or the interaction together with all the effects that the interaction contains, can be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
MULTIPLECLASS
This is the same as HIERARCHY=MULTIPLE except that only CLASS effects are subject to the hierarchy requirement.
The default value is HIERARCHY=SINGLE, which means that model hierarchy is to be maintained for all effects (that is, both CLASS and interval effects) and that only a single effect can enter or leave the model at each step.
Create Oracle Tables
proc sql;
      create table oradb.dsn as
      select * from temp;
quit;
run;
proc append base=oradb.dsn data=temp1 force;
run;
Friday, December 17, 2010
Trapezoidal Rule
Trapezoidal Rule
Using Trapezoidal Rule for the Area Under a Curve Calculation
Estimate the Area Under a ROC Curve
SAS Calculations of AUC for Multiple Metabolic Readings
/************************************************************************
AREA.SAS
DISCLAIMER: THIS INFORMATION IS PROVIDED BY SAS INSTITUTE INC. AS A SERVICE TO ITS USERS. IT IS PROVIDED "AS IS". THERE ARE NO WARRANTIES, EXPRESSED OR IMPLIED, AS TO MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE REGARDING THE ACCURACY OF THE MATERIALS OR CODE CONTAINED HEREIN.
PURPOSE: This program uses PROC EXPAND to calculate the approximate area under the curve for some sample data. The sample data should consist of (x,y) pairs.
DETAILS: For this example, the sample data is generated from a high degree polynomial. PROC EXPAND is then used to compute the approximate area under the curve using each of the following methods:
a. Cubic Spline interpolation. b. Trapezoid rule.
The exact area, given by the definite integral, is calculated for the polynomial curve in order to assess the precision of the approximations.
************************************************************************/
%let lower=-2; %let upper=1; %let interval=0.2;
* generate some data according to a high order polynomial; data kvm; do x=&lower to &upper by &interval; y=15+(x-2)*(x-1.5)*(x-1)*(x-.5)*x*(x+.5)*(x+1)*(x+1.5)*(x+2); output; end;
proc sort; by x;
/* PROC EXPAND will include a contribution for the last interval. For an accurate approximation to the integral, we need to make sure that this last contribution is negligible. So we'll append an additional x value which is extremely close to the last x value. Of course, the two Y values will be identical. But the result is that the last interval is extremely short, so any contribution to the integral approximation is negligible. */
proc print data=kvm(obs=50); title 'First few observations of the original data'; run;
data one; set kvm end=eof; output; if eof then do; x=x+(1e-10); output; end; run;
proc print data=one(obs=50); title 'First few observations of the original data'; run;
proc gplot data=one; title 'original series'; plot y*x; run;
proc expand data=one out=three method=spline ; convert y=total/observed=(beginning,total) transformout=(sum); id x; run;
proc sort data=three; by descending total;
proc print data=three(obs=1); var total; title 'Approximate Integral Using Spline method'; run;
*************************************************;
proc expand data=one out=three method=join; convert y=total/observed=(beginning,total) transformout=(sum); id x; run;
proc sort data=three; by descending total;
proc print data=three(obs=1); title 'Approximate Integral Using Trapezoid Rule'; var total; run;
***********************************************;
/* Since this data was generate using a high order polynomial, it's easy to compute the definite integral value. Then we can compare our approximations and assess their precision. */
data four; set one; x3= x*x*x; x5= x*x*x*x*x; x7= x*x*x*x*x*x*x; x9= x*x*x*x*x*x*x*x*x; run;
proc autoreg data=four outest=five noprint; title 'Calculate polynomial coefficients'; model y=x x3 x5 x7 x9; run;
proc print data=five; run;
data six; set five; title 'brute force (hand) integration of polynomial'; val=&upper; eval=intercept*val+x*(val**2)/2+x3/4*(val**4)+x5/6*(val**6) +x7/8*(val**8)+x9/10*(val**10); temp=eval; val=&lower; eval=intercept*val+x*(val**2)/2+x3/4*(val**4)+x5/6*(val**6) +x7/8*(val**8)+x9/10*(val**10); integral=temp-eval; keep integral; run;
proc print; run; title;
Friday, December 10, 2010
Tuesday, November 30, 2010
Harizontal High-Low Plot
goptions reset=all cback=white border;
/* Set system options */
data a;
input TEST $8. BREAKS;
cards;
Cold 5
Cold 12
Cold 14
Cold 22
Cold 52
Heat 20
Heat 25
Heat 10
Heat 22
Heat 47
Gases 12
Gases 25
Gases 33
Gases 48
Gases 24
Pressure 10
Pressure 12
Pressure 14
Pressure 22
Pressure 60
Xrays 20
Xrays 25
Xrays 14
Xrays 22
Xrays 29
Humidity 20
Humidity 25
Humidity 33
Humidity 40
Humidity 24
;
run;
/* Sort data by variable TEST */
proc sort; by TEST; run;
/**************************************************/
/* Create an output data set, B using PROC MEANS */
/* that contain new variables, MEAN, STD, STDERR, */
/* MIN, and MAX. */
/**************************************************/
proc means mean std stderr min max data=a;
by TEST;
output out=b mean=mean min=min max=max;
run;
/****************************************************************/
/* Create an annotate data set, ANNO to draw the bars at +/- 1, */
/* 2, or 3 Standard Deviation or Standard Error of the mean. */
/****************************************************************/
data anno;
retain xsys ysys '2' when 'a';
length color function $8 ;
set b;
/* Draw the horizontal line from min to max */
function='move'; xsys='2'; ysys='2'; yc=TEST; x=min; color='blue'; output; function='draw'; x=max; color='blue'; size=2; output;
/* Draw the MEAN horizontal line making the SIZE bigger */
function='move'; xsys='2';ysys='2';yc=TEST;x=mean; color='red'; output; function='draw'; x=mean; ysys='9'; y=+2; size=4; output;
function='draw'; x=mean; y=-4; size=4; output;
/* Draw the line for the MIN value */
function='move';xsys='2';ysys='2';yc=TEST;x=min;color='red';output;
function='draw';x=min;ysys='9';y=+2;size=2;output;
function='draw';x=min;y=-4;size=2;output;
/* Draw the line for the MAX value */
function='move';xsys='2';ysys='2';yc=TEST;x=max;color='red';output;
function='draw';x=max;ysys='9';y=+2;size=2;output;
function='draw';x=max;y=-4;size=2;output;
axis1 order=(0 to 100 by 10);
symbol1 i=none v=none c=black;
proc gplot data=b ;
plot test*mean / anno=anno haxis=axis1 href=30 60 90; /* The HREF= option draws reference lines */
run;
quit;
Use DO WHILE to read in every word in a string
%let cnt=1;
%let tword=%qscan(&tstring,&cnt,%str( ));
%* let ntot1=%qscan(&tcount,&cnt,%str( ));
%do %while(&tword ne);
      data _null_;
            set &dir..&dsn;
            if &tname="&tword";
            call symput('twlbl',trim(left(&tlname)));
      run;
      %concat; *** Macro program is appended after Macro descrip;
      %let cnt=%eval(&cnt+1);
      %let tword=%qscan(&tstring,&cnt,%str( ));
      %* let ntot1=%qscan(&tcount,&cnt,%str( ));
%end; *** End of While(tword ne);
%let cnt=%eval(&cnt-1);
Tuesday, November 23, 2010
Define delimiter in PROC IMPORT
proc import datafile="..." dbms=dlm out=... replace;
      delimiter='|';
      getnames=yes;
run;
Label variable with special character
ods escapechar='^';
ods rtf file='temp.rtf'; *** Can be HTML or PDF too;
proc report data=sashelp.class nowd ps=40;
      col age new;
      define new / format=$30. ;
/* for the RTF */
      compute new/char ;
            new='^S={font_face=symbol} D';
      endcomp;
/* for the listing */
      label age="my label" 'F0'X ;
run;
ods rtf close;
Create superscript:
Get superscripts for 0, 1, 2, and 3 in printed SAS output by using the extended characters specified in hexadecimal. The following example shows the superscripts for 0, 1, 2, and 3 using 'b0'x, 'b9'x, 'b2'x 'b3'x respectively:
data _null_;
      file print;
      put 'This is a superscript zero' 'b0'x;
      put 'This is a superscript one' 'b9'x;
      put 'This is a superscript two' 'b2'x;
      put 'This is a superscript three' 'b3'x;
run;
proc print label data=sashelp.class;
      label age=age'b2'x;
run;
Thursday, November 18, 2010
Rename all vars in a dataset using SASHELP
data dsn;
a=1;
b=2;
c=3;
d=4;
e=5;
f=6;
run;
%macro test(lib,dsn);
*/1)*/ data _null_;
      set sashelp.vtable(where=(libname="&LIB" and memname="&DSN"));
      call symput('nvars',nvar);
run;
*/2)*/ data dsn;
      set sashelp.vcolumn(where=(libname="&LIB" and memname="&DSN"));
      call symput(cats("var",_n_),name);
run;
*/3)*/ proc datasets library=&LIB;
      modify &DSN;
      rename
      %do i = 1 %to &nvars;
            &&var&i=Rename_&&var&i.
      %end;
      ;
quit;
run;
%mend;
%test(WORK,DSN);
Monday, November 15, 2010
Generate random numbers
data u1(keep=x);
      seed = 104;
      do i = 1 to 5;
            call ranuni(seed, x);
            output;
      end;
      call symputx('seed', seed);
run;
data u2(keep=x);
      seed = &seed;
      do i = 1 to 5;
            call ranuni(seed, x);
            output;
      end;
run;
data all;
      set u1 u2;
      z = ranuni(104);
run;
proc print label;
      label x = 'Separate Streams' z = 'Single Stream';
run;
Output from the CALL RANUNI Routine
              Separate      Single
Obs      Streams      Stream
1      0.23611      0.23611
2      0.88923      0.88923
3      0.58173      0.58173
4      0.97746      0.97746
5      0.84667      0.84667
6      0.80484      0.80484
7      0.46983      0.46983
8      0.29594      0.29594
9      0.17858      0.17858
10      0.92292      0.92292
Thursday, October 28, 2010
Create FORMAT library from a SAS dataset
      set temp;
      fmtname="TRTFMT";
      start=trtgrp;
      label=trtlbl;
run;
proc format library=work cntlin=fmt;
run;
=================
TEMP:
trtgrp      trtlbl
1            Treatment 1
2            Treatment 2
3            Placebo
Tuesday, October 26, 2010
Thursday, October 14, 2010
Friday, August 27, 2010
Delete single format from library
        delete yn.format dnk.formatc;
quit;
*** yn is numeric format, dnk is character format;
Thursday, August 19, 2010
Tuesday, August 3, 2010
Wednesday, June 16, 2010
Handling leading, trailing blanks in macro variable
%let macvar = &macvar;
Friday, June 11, 2010
KUPDATE - replace nth position
KUPDATE(argument,position <,n > , characters-to-replace)
Inserts, deletes, and replaces character value contents.
Product: Base SAS
Document: SAS National Language Support: Reference Guide
Link: kupdate
Wednesday, June 9, 2010
Function to get file path & name
Wednesday, April 28, 2010
Thursday, April 22, 2010
Read CSV file with carriage return characters
/* double quotes with a specified character. This sample */
/* uses '@' and '$', but any character can be used, including */
/* spaces. CR/LFs not in double quotes will not be replaced. */
%let repA=' '; /* replacement character LF */
%let repD=' '; /* replacement character CR */
%let dsnnme="/home/help/zzz.csv"; /* use full path of CSV file */
data _null_;
/* RECFM=N reads the file in binary format. The file consists */
/* of a stream of bytes with no record boundaries. SHAREBUFFERS */
/* specifies that the FILE statement and the INFILE statement */
/* share the same buffer. */
infile &dsnnme recfm=n sharebuffers;
file &dsnnme recfm=n;
/* OPEN is a flag variable used to determine if the CR/LF is within */
/* double quotes or not. Retain this value. */
retain open 0;
input a $char1.;
/* If the character is a double quote, set OPEN to its opposite value. */
if a = '"' then open = ^(open);
/* If the CR or LF is after an open double quote, replace the byte with */
/* the appropriate value. */
if open then do;
if a = '0D'x then put &repD;
else if a = '0A'x then put &repA;
end;
run;
filename infle '/home/help/zzz.csv' termstr=CRLF;
PROC IMPORT OUT= zzz DATAFILE= infle
DBMS=CSV REPLACE;
GUESSINGROWS=1000;
GETNAMES=YES;
DATAROW=2;
RUN;
Monday, April 19, 2010
MSGLEVEL option
The default is MSGLEVEL=N.
OPTIONS MSGLEVEL=I;
See also
Wednesday, April 14, 2010
Read all files in one directory
%macro readraw(dir=.);
%local fileref rc did dnum dmem memname;
%let fileref=thisdir;
%let rc=%sysfunc(filename(fileref,&dir));
%let did=%sysfunc(dopen(&fileref));
%let dnum=%sysfunc(dnum(&did));
%do dmem=1 %to &dnum;
%let memname=%sysfunc(dread(&did, &dmem));
%if %upcase(%scan(&memname,-1,.)) = DAT %then %do;
%let dataset=%scan(&memname,1,.);
data &dataset;
infile "&dir\&memname";
input Course_code $4. Location $15. Begin_date date9.;
format Begin_date date9.;
run;
proc print data=&dataset;
title "%trim(&syslast)";
run;
%end;
%end;
%let rc=%sysfunc(dclose(&did));
%let rc=%sysfunc(filename(fileref));
%mend readraw;
options mprint;
%readraw(dir=c:\workshop\winsas\amacr);
Monday, April 5, 2010
Friday, March 19, 2010
Tuesday, March 16, 2010
Monday, March 15, 2010
RSASUSER System Option: OpenVMS
“WARNING: Unable to copy SASUSER registry to WORK registry. Because of this, you will not see registry customizations during this session.”
Link: RSASUSER
Thursday, March 11, 2010
Proc Import trick
guessingrows = 1000;
specifies the number of rows of the file to scan in order to determine the appropriate data type and length for the columns. The scan data process scans from row 1 to the number that is specified by GUESSINGROWS=. The default value is 20. However, you can change the default value in the SAS Registry under SAS REGISTRY -> PRODUCTS -> BASE -> EFI -> GuessingRows.
| Default: | 20 |
| Requirement: | This number should be greater than the value that is specified for DATAROW=. |
Monday, March 8, 2010
Sunday, February 28, 2010
Wednesday, February 24, 2010
Tuesday, February 23, 2010
ANY* and NOT* Function
data test;
do dec=0 to 255;
byte=byte(dec);
hex=put(dec,hex2.);
anycntrl=anycntrl(byte);
anygraph=anygraph(byte);
anypunct=anypunct(byte);
notalpha=notalpha(byte);
notprint=notprint(byte);
output;
end;
run;
proc print data=test;
run;
Sunday, February 21, 2010
PROC SURVEYSELECT
Select a random sample where no observation can be chosen more than once.
/* WORK.EASTHIGH is a data base of student grade point averages */
/* from East High School, Grades 9 through 12, 100 or more students per grade. */
data EastHigh;
format GPA 3.1;
do Grade=9 to 12;
do StudentID=1 to 100+int(201*ranuni(432098));
GPA=2.0 + (2.1*ranuni(34280));
output;
end;
end;
run;
/* Using PROC SURVEYSELECT */
/* Use METHOD=SRS. N= is the number of observations to select. */
/* The sample is stored in the OUT= data set, SAMPLE1. */
proc surveyselect data=EastHigh method=srs n=15 out=sample1;
run;
title "PROC SURVEYSELECT";
proc print data=sample1;
run;
Simple random sample with replacement
Select a random sample where an observation can be chosen more than once.
/* Using PROC SURVEYSELECT */
/* Use METHOD=URS. N= is the number of observations to select. */
/* The sample is stored in the OUT= data set, SAMPLE. */
/* The OUTHITS option includes an observation in the OUT= data set */
/* for each selected unit. By default, the OUT= will contain one */
/* observation for each unique selected unit and the NumberHits */
/* variable identifies the number of times each unit is selected. */
proc surveyselect data=EastHigh method=urs n=15 out=sample outhits;
run;
title "PROC SURVEYSELECT ";
proc print;
run;
Saturday, February 20, 2010
Wednesday, February 10, 2010
Look up system options and display setting
Superscript, Subscript in RFT
length secnam $15;input sortord secnam $ pvalue;
cards;
1 demog 0.8812
2 ae 0.7112
3 disposition 0.8112
4 medicalhistory 0.9112
;
run;
ods listing close;
ods rtf file="Test.rtf" style=rtfout;
ods escapechar='\'; *** value not limited to \;
proc report data = test missing split="$" spacing=0 headline nowd;
column sortord secnam pvalue;
define sortord / order noprint;
define secnam / order flow "Demographics$Variable\{super a}";
define pvalue / display flow "ANOVA$P-Value\{sub p}";
footnote1 "testing\{super 1}";
footnote2 "new test\{sub 2}";
run;
ods rtf close;
ods trace off;
ods listing;
Monday, February 8, 2010
Changing titles in ODS Content Page
ODS PROCLABEL "" ; SAS Qualification Tools
- The SAS IQ assists in demonstrating the SAS System has been installed and maintained to the manufacturer's specifications. SAS IQ verifies the integrity of each file in the SAS System 9 and provides the customer a set of reports detailing the results.
- The SAS OQ assists in demonstrating the SAS System is operational. SAS OQ uses SAS programs provided by the component development groups and will execute, process, and report the program results.
Thursday, February 4, 2010
Tuesday, February 2, 2010
Creating a Stored Macro Facility
libname macro '/home/dltmp/maclib';
%macro test / store des='This is my program';
%let x=1;
%mend;
%copy test/source;
proc catalog cat=macro.sasmacr;
contents;
run;
Link: http://www2.sas.com/proceedings/forum2008/101-2008.pdf
Handle special character when import data
Link: http://support.sas.com/kb/18/688.html
Sunday, January 31, 2010
Utilize SAS Audit Files to Keep Track of SAS Data Set Updates
Creating an audit trail file for a SAS data set is easily done via the DATASETS Procedure.
proc datasets library=ctemp;
audit orsales;
initiate;
run;
quit;
Executing the code results in a SAS audit file named orsales.sas7baud being built in the CTEMP SAS data library. When updates are made to the orsales SAS data set, entries are made to the attendant orsales SAS audit file.
proc contents data=ctemp.orsales(type=audit);
run;
proc print data=ctemp.orsales(type=audit);
run;
You can determine the variables available in your audit file or print the audit file observations using PROC CONTENTS and PROC PRINT, respectively, using the TYPE=AUDIT data set option.
Note that the Member Type in the Attributes section of the CONTENTS output will state "AUDIT", and that you will get a few additional audit-related variables. You can use other SAS procedures, such as PROC REPORT, on the SAS audit file, too.
Friday, January 29, 2010
Sunday, January 24, 2010
Jump start in PROC REPORT
First, run PROC REPORT against the data set of interest, including the LIST option to list the code in the log and the NOEXEC option to prevent SAS from actually running the code. For example:
proc report data = sashelp.class list noexec ;
run ;
This should appear in the log:
PROC REPORT DATA=SASHELP.CLASS LS=96 PS=55 SPLIT="/" CENTER ;
COLUMN Name Sex Age Height Weight;
DEFINE Name / DISPLAY FORMAT= $8. WIDTH=8 SPACING=2 LEFT "Name" ;
DEFINE Sex / DISPLAY FORMAT= $1. WIDTH=1 SPACING=2 LEFT "Sex" ;
DEFINE Age / SUM FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Age" ;
DEFINE Height / SUM FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Height" ;
DEFINE Weight / SUM FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Weight" ;
RUN;
Copy the code to the program editor, and make any needed changes. Suppose we want average heights and weights by sex and age. Just insert the NOWD option in the PROC statement (assuming that interactive windowing is not needed), comment out the reference to NAME in the COLUMN statement and the DEFINE statement for NAME, and change the first option in each of the remaining DEFINE statements:
PROC REPORT DATA=SASHELP.CLASS LS=96 PS=55 SPLIT="/" CENTER nowd ;
COLUMN /* Name */ Sex Age Height Weight;
* DEFINE Name / DISPLAY FORMAT= $8. WIDTH=8 SPACING=2 LEFT "Name" ;
DEFINE Sex / group FORMAT= $1. WIDTH=1 SPACING=2 LEFT "Sex" ;
DEFINE Age / group FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Age" ;
DEFINE Height / mean FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Height" ;
DEFINE Weight / mean FORMAT= BEST9. WIDTH=9 SPACING=2 RIGHT "Weight" ;
RUN;
The output:
S
e
x Age Height Weight
F 11 51.3 50.5
12 58.05 80.75
13 60.9 91
14 63.55 96.25
15 64.5 112.25
M 11 57.5 85
12 60.366667 103.5
13 62.5 84
14 66.25 107.5
15 66.75 122.5
16 72 150
Cosmetic refinements can be made by changing the WIDTH option for SEX and the FORMAT options for HEIGHT and WEIGHT.
Thursday, January 21, 2010
Use EXIST to check dataset
%MACRO D_EXIST(INPUT); %SYSFUNC(EXIST(&INPUT)) %MEND D_EXIST;
%IF %D_EXIST(SASHELP.CLASS) %THEN %DO;
Wednesday, January 20, 2010
Monday, January 18, 2010
Determining the number of observations in a SAS data set efficiently
- The name of the data set and its member type
- The date and time the data set was created
- The number of observations
- The number of variables
- The engine type
Here there are some examples:
/* Before SAS V9*/
data _NULL_;
if 0 then set sashelp.class nobs=n;
call symput('nrows',trim(left(put(n,8.))));
stop;
run;
%put nobs=&nrows;
/* SAS V9.x and higher */
data _NULL_;
if 0 then set sashelp.class nobs=n;
call symputx('nrows',n);
stop;
run;
%put nobs=&nrows;- The "if 0" is a a conditional statement designed to never be executed at run-time and works since the "set sashelp.class" gives the header information of the data set class to the compiler which sets up the pdv accordingly, but is skipped in the execution time because the if condition is false.
- Nobs is a SAS automatic variable which contains the number of records in the dataset named in the set statement. The code, nobs = n, places the value of nobs (the number of records in the sashlep.class dataset) in the variable n. [NOBS option limitations: 1."The number of observations includes those observations that are marked for deletion but are not yet deleted."2."For certain SAS views, SAS cannot determine the number of observations. In these cases, SAS sets the value of the NOBS= variable to the largest positive integer value that is available in your operating environment.]
- The STOP statement is used to prevent an endless loop because EOF is never reached.
data _null_;
call symput('AnyObs','0'); /* set to 0 for No */
set &data(obs=1);
call symput('AnyObs','1'); /* if got here there are obs so set to 1 for Yes */
run;The meta data can also be accessed by the macro language using DATA step functions. The ATTRN function is used with the NLOBS argument in the %OBSCNT macro shown below to retrieve the number of non-deleted observations from the meta data.
%macro obscnt(dsn); %local nobs dsnid; %let nobs=.; %* Open the data set of interest; %let dsnid = %sysfunc(open(&dsn)); %* If the open was successful get the; %* number of observations and CLOSE &dsn; %if &dsnid %then %do; %let nobs=%sysfunc(attrn(&dsnid,nlobs)); %let rc =%sysfunc(close(&dsnid)); %end; %else %do; %put Unable to open &dsn - %sysfunc(sysmsg()); %end; %* Return the number of observations; &nobs %mend obscnt;
Monday, January 11, 2010
Forcing the Concatenation of Data Sets
PROC APPEND BASE=master DATA=txn FORCE; RUN;
Friday, January 1, 2010
Systask Statement
For example, in your BIGPROGRAM.SAS, you could have the following code:
systask command "sas your_clever_program1.sas" taskname=program1;
systask command "sas your_clever_program2.sas" taskname=program2;
waitfor _all_ program1 program2;
/* Other SAS code that processes the data sets created by the two afore-mentioned programs */
Note that the _all_ option states that your program is to wait for both program1 _AND_ program2 to complete before resuming execution. As an alternative, you could have coded _any_, which would have your program resume the moment either program1 _OR_ program2 completed execution.
Link: http://support.sas.com/documentation/cdl/en/hostwin/61924/HTML/default/win-stmt-systask.htm
Link: http://www2.sas.com/proceedings/sugi30/021-30.pdf