SAS & Statistics
Friday, December 1, 2023
Thursday, July 9, 2020
Functions can be used in fuzzy match
Notes taken from PharmSUG seminar
SOUNDEX (Sound Alike)
Ignore case, embedded blanks and punctuations; works best with English-sounding names;
Example:
WHERE varx = * "Michael";
SPEDIS (Spelling Distance)
Translating a keyword into a query containing the smallest value distance;
Example:
Spedis_Value = SPEDIS (title, "Michael"); (exact match is value 0)
COMPELVE (Levenshtein Edit Distance)
Provides an indication of how close tow strings are;
Example:
COMPLEV (Category, "Drama") as Complev_Number; (exact match is value 0)
COMPGED (Generalized Edit Distance)
Measure of dissimilarity between two strings;
Example:
COMPGED (M.Title, A.Title, 'ILN') as Compged_Score; (lowest score indicates better match)
I: ignore case, L: ignore leading blank; N:ignore quotation mark
SOUNDEX (Sound Alike)
Ignore case, embedded blanks and punctuations; works best with English-sounding names;
Example:
WHERE varx = * "Michael";
SPEDIS (Spelling Distance)
Translating a keyword into a query containing the smallest value distance;
Example:
Spedis_Value = SPEDIS (title, "Michael"); (exact match is value 0)
COMPELVE (Levenshtein Edit Distance)
Provides an indication of how close tow strings are;
Example:
COMPLEV (Category, "Drama") as Complev_Number; (exact match is value 0)
COMPGED (Generalized Edit Distance)
Measure of dissimilarity between two strings;
Example:
COMPGED (M.Title, A.Title, 'ILN') as Compged_Score; (lowest score indicates better match)
I: ignore case, L: ignore leading blank; N:ignore quotation mark
Monday, June 17, 2019
Compress - Keep Writable
varname = compress(varname, , 'kw');
The modifier “k” stands for ‘KEEP’ and the modifier “w” stands for ‘WRITABLE’. When compress function is used in combination of K & W modifiers, it keeps all the writable characters which means it deletes all the non writable characters.
The modifier “k” stands for ‘KEEP’ and the modifier “w” stands for ‘WRITABLE’. When compress function is used in combination of K & W modifiers, it keeps all the writable characters which means it deletes all the non writable characters.
Wednesday, March 13, 2019
Add shade to Kaplan Meier plot
data km;
seed=12345;
do loc=1 to 2;
do time=2 to 22 by 1+int(4*ranuni(seed));
status=int(2*ranuni(seed));
output;
end;
end;
run;
proc lifetest data=km plots=s outsurv=os;
ods select survivalplot;
time time*status(1);
strata loc;
run;
data os;
retain survhold 1;
set os;
if _censor_ = 1 then SURVIVAL= survhold;
else survhold= SURVIVAL;
run;
proc sgplot data=os;
step x=time y=Survival / name="survival" legendlabel="Survival" group=stratum;
band x=time lower=0 upper=survival / modelname="survival" transparency=.5;
run;
seed=12345;
do loc=1 to 2;
do time=2 to 22 by 1+int(4*ranuni(seed));
status=int(2*ranuni(seed));
output;
end;
end;
run;
proc lifetest data=km plots=s outsurv=os;
ods select survivalplot;
time time*status(1);
strata loc;
run;
data os;
retain survhold 1;
set os;
if _censor_ = 1 then SURVIVAL= survhold;
else survhold= SURVIVAL;
run;
proc sgplot data=os;
step x=time y=Survival / name="survival" legendlabel="Survival" group=stratum;
band x=time lower=0 upper=survival / modelname="survival" transparency=.5;
run;
Friday, November 16, 2018
Friday, December 22, 2017
Tuesday, December 12, 2017
Make contents in legend in ASCENDING order
Include the GROUPORDER=ASCENDING option in the VBARPARM
statements. For example:
vbarparm category=subject response=&var /
group=&group datalabel=&byvar dataskin=pressed datalabelattrs=(size=6 weight=bold)
groupdisplay=cluster clusterwidth=1 grouporder=ascending;
or
keylegend / location=inside position=topright title="Highest Grade" sortorder=ascending;
group=&group datalabel=&byvar dataskin=pressed datalabelattrs=(size=6 weight=bold)
groupdisplay=cluster clusterwidth=1 grouporder=ascending;
or
keylegend / location=inside position=topright title="Highest Grade" sortorder=ascending;
Monday, September 11, 2017
INSET in Proc SGPLOT
INSET "Mean of the best
percent change = &pmean." / position =
bottomleft;
Example: Click Here and Here
Wednesday, June 21, 2017
Import password protected EXCEL into SAS
Click Here
%macro readpass(xlsfile1,xlsfile2,passwd,outfile,sheetname,getnames); options macrogen symbolgen mprint nocaps; options noxwait noxsync; %* we start excel here using this routine here *; filename cmds dde 'excel|system'; data _null_; length fid rc start stop time 8; fid=fopen('cmds','s'); if (fid le 0) then do; rc=system('start excel'); start=datetime(); stop=start+20; do while (fid le 0); fid=fopen('sas2xl','s'); time=datetime(); if (time ge stop) then fid=1; end; end; rc=fclose(fid); run; quit; %* then we open the excel sheet here with its password *; filename cmds dde 'excel|system'; data _null_; file cmds; put '[open("'"&xlsfile1"'",,,,"'"&passwd"'")]'; run; %* then we save it without the password *; data _null_; file cmds; put '[error("false")]'; put '[save.as("'"&xlsfile2"'",51,"")]'; put '[quit]'; run; %* Then we import the file here *; proc import datafile="&xlsfile2" out=&outfile dbms=xlsx replace; %* sheet="%superq(datafilm&i)"; sheet="&sheetname"; getnames=&getnames; run; quit; %* then we destroy the non password excel file here *; systask command "del ""&xlsfile2"" "; proc contents data=&outfile varnum; run; %mend readpass;
%readpass(j:\access\accpcff\excelfiles\passpro.xlsx, /* name of the xlsx 2007 file */
c:\sastest\nopass.xlsx, /* temporary xls file for translation for import */
mypass, /* password of the excel spreadsheet */
work.temp1, /* name of the sas dataset you want to write */
sheet1, /* name of the sheet */
yes) ; /* getnames */
Monday, June 19, 2017
Monday, January 30, 2017
Fix for invalid characters in data
For "ERROR: Some character data was lost during transcoding in the dataset DB.XXXDAT. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding." use the following code in program:
proc options option=config; run;
proc options group=languagecontrol; run;
/* Show the encoding value for the problematic data set */
%let dsn=db.xxxdat;
%let dsid=%sysfunc(open(&dsn,i));
%put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
/*Renaming item desc file (encoding=any) allowed reading */
data temp;
set db.xxxdat (encoding=any);
run;
proc options option=config; run;
proc options group=languagecontrol; run;
/* Show the encoding value for the problematic data set */
%let dsn=db.xxxdat;
%let dsid=%sysfunc(open(&dsn,i));
%put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
/*Renaming item desc file (encoding=any) allowed reading */
data temp;
set db.xxxdat (encoding=any);
run;
Wednesday, November 9, 2016
Tuesday, August 16, 2016
{nbspace x} in PROC REPORT
ods escapechar="^";
proc report data = statsp nowd split = '|'
headline headskip
style(report) = [asis = on PROTECTSPECIALCHARS=off outputwidth=9in]
center missing contents="" spanrows;
define rowlbl /order "Parameter" left width=20
style(column)=[/*font_weight=bold*/ cellwidth=2in asis=on]
style(header)=[just=left];
*** Use asis=on in DEFINE statement to make {nbspace x} in ROWLBL work
Monday, October 26, 2015
Read XLS with data starts on 3rd row, and column names on 2nd row
proc import file='C:\Users\procx\sample.xls'
out=test dbms=xls replace;
sheet=disposition;
namerow=2;
startrow=3;
getnames=yes;
run;
Thursday, October 22, 2015
Get around ‘WARNING: Fisher's exact test is not computed when the total sample size exceeds 32767.’
Fisher's
Exact test can be memory and computation extensive when the sample size is
large. For the large sample size problem, you can use Monte-Carlo estimates of
the exact p-value to get around the issue. For example,
exact fisher /mc;
More
details of the MC option can be found here -
Wednesday, September 30, 2015
Apply color to PROC REPORT
Highlight a row:
compute __stresc;
if index(__stresc,'>') then call define(_row_,"style","style={backgroundcolor=red}");
endcomp;
Highlight a cell:
compute __stresc;
if index(__stresc,'>') then call define(_col_,"style","style={color=red}");
endcomp;
Color a row:
compute __stresc;
if index(__stresc,'>') then call define(_row_,"style","style={color=red}");
endcomp;
Thursday, July 30, 2015
Thursday, April 9, 2015
Wednesday, January 28, 2015
Tuesday, December 9, 2014
Friday, November 21, 2014
Avoid the error of "no statistic associated with XXX." in PROC REPORT
One can avoid this error by putting a dummy analysis variable on the end of the COLUMN statement. For example:
proc report data=test headline headskip split='~' nowd spanrows;
column n vargrp trt, (timer, (statc)) dummy;
define n /group noprint;
define vargrp /group order=internal style(column)={cellwidth=1.2in} left ' ' id;
define trt /across order=internal style(column)={cellwidth=0.6in} center ' ' ;
define timer /across order=internal style(column)={cellwidth=0.6in} center ' ' ;
define statc /display style(header column)={cellwidth=0.6in} center ' ';
define dummy /computed noprint;
run;
proc report data=test headline headskip split='~' nowd spanrows;
column n vargrp trt, (timer, (statc)) dummy;
define n /group noprint;
define vargrp /group order=internal style(column)={cellwidth=1.2in} left ' ' id;
define trt /across order=internal style(column)={cellwidth=0.6in} center ' ' ;
define timer /across order=internal style(column)={cellwidth=0.6in} center ' ' ;
define statc /display style(header column)={cellwidth=0.6in} center ' ';
define dummy /computed noprint;
run;
Wednesday, November 19, 2014
Tuesday, September 9, 2014
Monday, July 21, 2014
Use SAS to score SF-36
Click Here
The following codes are copied from the above link.
The following codes are copied from the above link.
options ls=132 ps=53 nocenter; *****************************************************************; * SF36V2-4_PUBLIC.SAS * SAS CODE FOR SCORING 36-ITEM HEALTH SURVEY VERSION 2.0 * STANDARD FORM (NOT ACUTE!); * WRITTEN BY K. SPRITZER, 9/29/2003 * * MODIFIED: 2/14/2006 TO ADD AGE/GENDER ADJUSTMENT AND ADD PCS AND MCS SCORES. ALSO REQUIRED MODIFICATION OF AGE GROUPINGS: V1 HAD 18-24, 25-34, 35-44, 45-54, 55-64, 65+ WHEREAS V2 BREAKS 65+ INTO 65-74 AND 75+. * MODIFIED: 2/27/2007 TO USE SEPI FARIVAR WEIGHTS. SEE COMMENTS FOR REFERENCE. ALSO SOME RESTRUCTURING OF CODE. * EVERY EFFORT IS MADE TO TEST CODE AS THOROUGHLY AS POSSIBLE BUT USER MUST ACCEPT RESPONSIBILITY FOR USE. PLEASE REPORT ANY CONCERNS OR PROBLEMS WITH THE CODE TO K. SPRITZER AT KSPRITZER@MEDNET.UCLA.EDU - THANKS. ; *****************************************************************; /* INPUT A FEW DUMMY TEST CASES */ /* INCLUDES SOME OUT OF RANGE DATA */ DATA TEMP1; INPUT I1 I2 I3A I3B I3C I3D I3E I3F I3G I3H I3I I3J I4A I4B I4C I4D I5A I5B I5C I6 I7 I8 I9A I9B I9C I9D I9E I9F I9G I9H I9I I10 I11A I11B I11C I11D AGE MALE; CARDS; 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 42 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 33 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 25 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 45 0 4 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 19 0 2 1 1 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 67 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 1 1 1 2 1 1 1 1 1 1 1 1 2 2 2 1 1 2 1 1 1 . 1 1 1 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 1 1 1 1 5 5 1 1 5 5 1 5 5 5 1 5 1 77 1 1 1 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 1 1 1 1 6 6 1 1 6 6 1 6 5 5 1 5 1 56 0 5 5 1 1 1 . . . . . . . 1 1 1 1 1 1 1 5 6 5 6 1 1 6 6 1 1 6 1 1 . . . 5 48 1 ; RUN; DATA TEMP1; SET TEMP1; /** RENAME ITEMS TO CONFORM TO SURVEY **/ RENAME I1 = I1 I2 = I2 I3A = I3 I3B = I4 I3C = I5 I3D = I6 I3E = I7 I3F = I8 I3G = I9 I3H = I10 I3I = I11 I3J = I12 I4A = I13 I4B = I14 I4C = I15 I4D = I16 I5A = I17 I5B = I18 I5C = I19 I6 = I20 I7 = I21 I8 = I22 I9A = I23 I9B = I24 I9C = I25 I9D = I26 I9E = I27 I9F = I28 I9G = I29 I9H = I30 I9I = I31 I10 = I32 I11A = I33 I11B = I34 I11C = I35 I11D = I36 ; RUN; *********************************************************; DATA TEMP1; SET TEMP1; *********************************************************; ** CODE OUT-OF-RANGE VALUES TO MISSING; *********************************************************; ARRAY PT3 I3-I12; DO OVER PT3; IF PT3 NOT IN (1,2,3) THEN PT3=.; END; ARRAY PT5 I1 I2 I13-I20 I22-I36; DO OVER PT5; IF PT5 NOT IN (1,2,3,4,5) THEN PT5=.; END; IF I21 NOT IN (1,2,3,4,5,6) THEN I21=.; RUN; *********************************************************; *********************************************************; DATA TEMP1; SET TEMP1; * KEEP AS IS: I3-I19, I24, I25, I28, I29, I31, I32, I33, I35; *********************************************************; * RAND versions of items; RAND21=I21; RAND22=I22; RAND1=I1; ***************************************************************************; ** WHEN NECESSARY, REVERSE CODE ITEMS SO A HIGHER SCORE MEANS BETTER HEALTH; ***************************************************************************; IF I21=1 THEN RI21=6.0; ELSE IF I21=2 THEN RI21=5.4; ELSE IF I21=3 THEN RI21=4.2; ELSE IF I21=4 THEN RI21=3.1; ELSE IF I21=5 THEN RI21=2.2; ELSE IF I21=6 THEN RI21=1.0; IF I21>. THEN DO; IF I22=1 & I21 =1 THEN RI22=6; ELSE IF I22=1 & I21>=2 THEN RI22=5; ELSE IF I22=2 & I21>=1 THEN RI22=4; ELSE IF I22=3 & I21>=1 THEN RI22=3; ELSE IF I22=4 & I21>=1 THEN RI22=2; ELSE IF I22=5 & I21>=1 THEN RI22=1; END; ELSE IF I21=. THEN DO; IF I22=1 THEN RI22=6.0 ; ELSE IF I22=2 THEN RI22=4.75; ELSE IF I22=3 THEN RI22=3.5 ; ELSE IF I22=4 THEN RI22=2.25; ELSE IF I22=5 THEN RI22=1.0 ; END; IF I1=1 THEN RI1=5.0; ELSE IF I1=2 THEN RI1=4.4; ELSE IF I1=3 THEN RI1=3.4; ELSE IF I1=4 THEN RI1=2.0; ELSE IF I1=5 THEN RI1=1.0; ARRAY CAT5 I2 I20 I23 I26 I27 I30 I34 I36 ; ARRAY RCAT5 RI2 RI20 RI23 RI26 RI27 RI30 RI34 RI36; DO OVER CAT5; IF CAT5=1 THEN RCAT5=5; ELSE IF CAT5=2 THEN RCAT5=4; ELSE IF CAT5=3 THEN RCAT5=3; ELSE IF CAT5=4 THEN RCAT5=2; ELSE IF CAT5=5 THEN RCAT5=1; END; IF RAND21=1 THEN RRAND21=6; ELSE IF RAND21=2 THEN RRAND21=5; ELSE IF RAND21=3 THEN RRAND21=4; ELSE IF RAND21=4 THEN RRAND21=3; ELSE IF RAND21=5 THEN RRAND21=2; ELSE IF RAND21=6 THEN RRAND21=1; IF RAND22=1 THEN RRAND22=5; ELSE IF RAND22=2 THEN RRAND22=4; ELSE IF RAND22=3 THEN RRAND22=3; ELSE IF RAND22=4 THEN RRAND22=2; ELSE IF RAND22=5 THEN RRAND22=1; IF RAND1=1 THEN RRAND1=5; ELSE IF RAND1=2 THEN RRAND1=4; ELSE IF RAND1=3 THEN RRAND1=3; ELSE IF RAND1=4 THEN RRAND1=2; ELSE IF RAND1=5 THEN RRAND1=1; *****************************************************************; ** RE-CODE ITEMS TO 0-100; *****************************************************************; ARRAY FIVEPT RI1 RI2 I13-I16 I17-I19 RI20 RI23 I24 I25 RI26 RI27 I28 I29 RI30 I31 I32 I33 RI34 I35 RI36 RRAND1 RRAND22; ARRAY SIXPT RI21 RI22 RRAND21; ARRAY THREEPT I3-I12; DO OVER FIVEPT; FIVEPT=(FIVEPT-1)*25; END; DO OVER SIXPT; SIXPT=(SIXPT-1)*20; END; DO OVER THREEPT; THREEPT=(THREEPT-1)*50; END; ** CREATE SCALES; PHYFUN10=MEAN(I3,I4,I5,I6,I7,I8,I9,I10,I11,I12); ROLEP4=MEAN(I13,I14,I15,I16); SFPAIN2=MEAN(RI21,RI22); SFGENH5=MEAN(RI1,I33,RI34,I35,RI36); ENFAT4=MEAN(RI23,RI27,I29,I31); SOCFUN2=MEAN(RI20,I32); ROLEE3=MEAN(I17,I18,I19); EMOT5=MEAN(I24,I25,RI26,I28,RI30); PAIN2=MEAN(RRAND21,RRAND22); GENH5=MEAN(RRAND1,I33,RI34,I35,RI36); label phyfun10="Physical functioning scale"; label rolep4="Physical health problems scale"; label sfpain2="SF-36 pain scale"; label sfgenh5="SF-36 general health perceptions scale"; label enfat4="Energy/fatigue scale"; label socfun2="Social functioning scale"; label rolee3="Emotional health problems scale"; label emot5="Emotional well-being scale"; label pain2="Pain scale (RAND)"; label genh5="General health perceptions scale (RAND)"; ** APPLY OPTION THAT REQUIRES AT LEAST HALF OF ITEMS IN A SCALE TO BE PRESENT IN ORDER THAT THE SCALE BE NON-MISSING; ** NEED AT LEAST HALF (OR HALF + 1 FOR SCALES WITH ODD # OF ITEMS) OF ITEMS PRESENT, ELSE SET SCALE TO MISSING; /* UNCOMMENT THIS SECTION IF YOU WANT TO USE THE OPTION */ /* IF N(OF I3,I4,I5,I6,I7,I8,I9,I10,I11,I12) <=4 THEN PHYFUN10=.; IF N(OF I13,I14,I15,I16) <=1 THEN ROLEP4 =.; * IF N(OF RI21,RI22) <=0 THEN SFPAIN2 =.; IF N(OF RI1,I33,RI34,I35,RI36) <=2 THEN SFGENH5 =.; IF N(OF RI23,RI27,I29,I31) <=1 THEN ENFAT4 =.; * IF N(OF RI20,I32) <=0 THEN SOCFUN2 =.; IF N(OF I17,I18,I19) <=1 THEN ROLEE3 =.; IF N(OF I24,I25,RI26,I28,RI30) <=2 THEN EMOT5 =.; * IF N(OF RRAND21,RRAND22) <=0 THEN PAIN2 =.; IF N(OF RRAND1,I33,RI34,I35,RI36) <=2 THEN GENH5 =.; */ *****************************************************************; *** NEMC PHYSICAL AND MENTAL HEALTH COMPOSITE - SF36 ************; *****************************************************************; *** 1) Transform SF-36 scores (NEMC Scoring) to z-scores; ******* ; *** 1998 US general population means and SDs are used here ** ; *** (not age/gender based) *********************************** ; PF_Z = (PHYFUN10 - 83.29094) / 23.75883 ; RP_Z = (ROLEP4 - 82.50964) / 25.52028 ; BP_Z = (SFPAIN2 - 71.32527) / 23.66224 ; GH_Z = (SFGENH5 - 70.84570) / 20.97821 ; EM_Z = (EMOT5 - 74.98685) / 17.75604 ; RE_Z = (ROLEE3 - 87.39733) / 21.43778 ; SF_Z = (SOCFUN2 - 84.30250) / 22.91921 ; EN_Z = (ENFAT4 - 58.31411) / 20.01923 ; label pf_Z="NEMC physical functioning Z-score"; label rp_Z="NEMC role limitation physical Z-score"; label bp_Z="NEMC pain Z-score"; label gh_Z="NEMC general health Z-score"; label em_Z="NEMC emotional well-being Z-score"; label re_Z="NEMC role limitation emotional Z-score"; label sf_Z="NEMC social functioning Z-score"; label en_Z="NEMC energy/fatigue Z-score"; *** 2) Create physical and mental health composite scores: **********; *** Multiply z-scores by varimax-rotated factor scoring **********; *** coefficients and sum the products ****************************; *** NOTE: Aggregates are missing if any subscale score is missing - Qualitymetric has a missing data estimator (MDE) product that will allow scoring of aggregates when 1 or more subscales are missing. See: http://www.qualitymetric.com/products/SFScoring/ScoringMDE.aspx for more information. ; AGG_PHYS = (PF_Z * 0.42402) + (RP_Z * 0.35119) + (BP_Z * 0.31754) + (GH_Z * 0.24954) + (EM_Z * -.22069) + (RE_Z * -.19206) + (SF_Z * -.00753) + (EN_Z * 0.02877); AGG_MENT = (PF_Z * -.22999) + (RP_Z * -.12329) + (BP_Z * -.09731) + (GH_Z * -.01571) + (EM_Z * 0.48581) + (RE_Z * 0.43407) + (SF_Z * 0.26876) + (EN_Z * 0.23534); label agg_phys="NEMC physical health - SF36 (raw)"; label agg_ment="NEMC mental health - SF36 (raw)"; *** 3) Transform composite and scale scores to T-scores: ****** ; AGG_PHYS_T = 50 + (AGG_PHYS * 10); AGG_MENT_T = 50 + (AGG_MENT * 10); label agg_phys_t="NEMC physical health T-score - SF36"; label agg_ment_t="NEMC mental health T-score - SF36"; PF_T = 50 + (PF_Z * 10) ; RP_T = 50 + (RP_Z * 10) ; BP_T = 50 + (BP_Z * 10) ; GH_T = 50 + (GH_Z * 10) ; EM_T = 50 + (EM_Z * 10) ; RE_T = 50 + (RE_Z * 10) ; SF_T = 50 + (SF_Z * 10) ; EN_T = 50 + (EN_Z * 10) ; label pf_t="NEMC physical functioning T-score"; label rp_t="NEMC role limitation physical T-score"; label bp_t="NEMC pain T-score"; label gh_t="NEMC general health T-score"; label em_t="NEMC emotional well-being T-score"; label re_t="NEMC role limitation emotional T-score"; label sf_t="NEMC social functioning T-score"; label en_t="NEMC energy/fatigue T-score"; *****************************************************************; *** PHYSICAL and MENTAL HEALTH FACTORS **************************; *****************************************************************; *** 1) Derive z-scores using age/gender adjusted US general *****; *** population means and SDs, then multiply each z-score by *; *** oblique factor scoring coefficient and sum the products **; *** Feb 2007: oblique factor scoring coeffs come from: ** Farivar, S. S., Cunningham, W. E., & Hays, R. D. (in press). Correlated physical and mental health summary scores for the SF-36 and SF-12 Health Survey, V.1., Journal of Health and Quality of Life Outcomes. ; FFM1= 0.19868 * (PHYFUN10 - 79.9348918)/ 22.709038+ 0.31059 * (ROLEP4 - 76.2860947)/ 34.945958+ 0.22602 * (SFPAIN2 - 72.6616442)/ 23.807217+ 0.19548 * (SFGENH5 - 69.3384676)/ 20.141323+ -0.03009 * (EMOT5 - 74.6684030)/ 18.209059+ 0.02738 * (ROLEE3 - 79.9677043)/ 34.229373+ 0.10511 * (SOCFUN2 - 82.1785891)/ 23.568127+ 0.13093 * (ENFAT4 - 59.5636224)/ 21.334831; FFM2= -0.01520 * (PHYFUN10 - 79.9348918)/ 22.709038+ 0.03496 * (ROLEP4 - 76.2860947)/ 34.945958+ 0.04327 * (SFPAIN2 - 72.6616442)/ 23.807217+ 0.10379 * (SFGENH5 - 69.3384676)/ 20.141323+ 0.35362 * (EMOT5 - 74.6684030)/ 18.209059+ 0.19588 * (ROLEE3 - 79.9677043)/ 34.229373+ 0.13950 * (SOCFUN2 - 82.1785891)/ 23.568127+ 0.28769 * (ENFAT4 - 59.5636224)/ 21.334831 ; label ffm1 = "Physical health factor score"; label ffm2 = "Mental health factor score"; *** 2) Transform composite scores to t-scores *******************; FSFM1 = (FFM1 * 10) +50; FSFM2 = (FFM2 * 10) +50; label fsfm1="Physical health factor T-score"; label fsfm2="Mental health factor T-score"; run; *****************************************************************; TITLE1 "SF36 V2 (all cases)"; PROC MEANS data=TEMP1; VAR phyfun10 rolep4 sfpain2 pain2 sfgenh5 genh5 emot5 rolee3 socfun2 enfat4 agg_phys agg_ment pf_t rp_t bp_t gh_t em_t re_t sf_t en_t AGG_PHYS_T AGG_MENT_T ffm1 ffm2 fsfm1 fsfm2; run; *****************************************************************; TITLE1 "SF36 V2 (cases with non-missing age and gender)"; PROC MEANS data=TEMP1; where age > . & male > .; VAR phyfun10 rolep4 sfpain2 pain2 sfgenh5 genh5 emot5 rolee3 socfun2 enfat4 agg_phys agg_ment pf_t rp_t bp_t gh_t em_t re_t sf_t en_t AGG_PHYS_T AGG_MENT_T ffm1 ffm2 fsfm1 fsfm2; run; *****************************************************************;Click Here too
Tuesday, June 17, 2014
Computing Geometric Means
This sample is from the SAS Sample Library. For additional information refer to SAS Help and Online Documentation.
/****************************************************************/
/* S A S S A M P L E L I B R A R Y */
/* */
/* NAME: GEOMEAN */
/* TITLE: Computing Geometric Means */
/* PRODUCT: SAS */
/* SYSTEM: ALL */
/* KEYS: DATASTEP ARRAY MEANS DO FUNCTIONS LOG EXP PRINT */
/* PROCS: MEANS PRINT */
/* DATA: */
/* */
/* SUPPORT: UPDATE: */
/* REF: */
/* MISC: */
/* */
/****************************************************************/
OPTIONS LS=72;
TITLE 'GEOMETRIC MEANS';
DATA;
INPUT X Y Z@@;
CARDS;
2 3 1 2 5 6 4 3 2 1 2 3 45 68 6 5 4 3 2 2 1
2 3 2 2 1 3 -46 7 765 4 33 2 2 32 3 5 5 3 2 25 4 32 23
4 3 2 32 32 23 4 4 -55 2 23 1 1 22 32 32
;
RUN;
DATA LOGS;
SET;
ARRAY L (3) X Y Z;
DO I=1 TO 3;
IF L(I)>0 THEN L(I)=LOG(L(I));
ELSE L(I)=.;
END;
RUN;
PROC MEANS;
VAR X Y Z;
OUTPUT OUT=MEANLOGS MEAN=X1 Y1 Z1;
RUN;
DATA GEOMEAN;
SET MEANLOGS;
ARRAY MLOG(3) X1 Y1 Z1;
DO I=1 TO 3;
MLOG(I)=EXP(MLOG(I));
END;
RUN;
PROC PRINT;
RUN;
Also See Geometric Statistics in PK Analysis
Geometric Statistics, geometric CV
Tuesday, June 3, 2014
Send Emails from SAS
For Video Click Here
For Article Click Here and Here
For Article Click Here and Here
Change SAS email options and use SMTP email. Refer to:
Usage Note
19767: Using the SAS® System to send SMTP e-mail
For troubleshooting refer to:
Usage Note
18363: Troubleshooting guidelines for successfully sending an SMTP e-mail from
SAS® software
Monday, May 12, 2014
Wednesday, April 16, 2014
Friday, March 28, 2014
A Couple of Proc Template for Report
proc template;
define style timesrtf;
parent=styles.journal;
replace fonts /
'BatchFixedFont' = ('Times',9pt)
'TitleFont' = ('Times',12pt)
'TitleFont2' = ('Times',9pt)
'FootFont' = ('Times',9pt,Italic)
'StrongFont' = ('Times',9pt,Bold)
'EmphasisFont' = ('Times',9pt)
'FixedEmphasisFont' = (Times,9pt)
'FixedStrongFont' = (Times,9pt,Bold)
'FixedHeadingFont' = (Times,9pt)
'FixedFont' = (Times,9pt)
'headingEmphasisFont' = ('Times',9pt)
'headingFont' = ('Times',9pt)
'docFont' = ('Times',9pt);
style table from output /
cellpadding = 2pt
background=white
rules=groups
frame=void;
style body from document /
leftmargin=1in
rightmargin=1in
topmargin=1in
bottommargin=0.5in;
replace color_list /
'bgH' = white
'fg' = black
'bg' = white;
end;
run;
proc template;
define style mystyle;
parent=styles.printer;
replace fonts /
'BatchFixedFont' = ('Courier',8pt)
'TitleFont2' = ('Courier',9pt,Bold)
'TitleFont' = ('Courier',9pt,Bold)
'StrongFont' = ('courier',8pt,Bold)
'EmphasisFont' = ('Courier',8pt)
'FixedEmphasisFont' = (Courier,8pt)
'FixedStrongFont' = (Courier,8pt,Bold)
'FixedHeadingFont' = (Courier,8pt)
'FixedFont' = (Courier,8pt)
'headingEmphasisFont' = ('Courier',8pt)
'headingFont' = ('Courier',8pt)
'docFont' = ('Courier',8pt);
style table from output /
cellpadding = 1pt
background=white
rules=groups
frame=void;
* rules=all/*rows*//*groups*/
* frame=hsides/*box*/;
style body from document /
leftmargin=0.36in
rightmargin=0.25in
topmargin=0.60in
bottommargin=0.36in;
replace color_list /
'link' = black
'bgH' = white
'fg' = black
'bg' = white;
end;
run;
define style timesrtf;
parent=styles.journal;
replace fonts /
'BatchFixedFont' = ('Times',9pt)
'TitleFont' = ('Times',12pt)
'TitleFont2' = ('Times',9pt)
'FootFont' = ('Times',9pt,Italic)
'StrongFont' = ('Times',9pt,Bold)
'EmphasisFont' = ('Times',9pt)
'FixedEmphasisFont' = (Times,9pt)
'FixedStrongFont' = (Times,9pt,Bold)
'FixedHeadingFont' = (Times,9pt)
'FixedFont' = (Times,9pt)
'headingEmphasisFont' = ('Times',9pt)
'headingFont' = ('Times',9pt)
'docFont' = ('Times',9pt);
style table from output /
cellpadding = 2pt
background=white
rules=groups
frame=void;
style body from document /
leftmargin=1in
rightmargin=1in
topmargin=1in
bottommargin=0.5in;
replace color_list /
'bgH' = white
'fg' = black
'bg' = white;
end;
run;
proc template;
define style mystyle;
parent=styles.printer;
replace fonts /
'BatchFixedFont' = ('Courier',8pt)
'TitleFont2' = ('Courier',9pt,Bold)
'TitleFont' = ('Courier',9pt,Bold)
'StrongFont' = ('courier',8pt,Bold)
'EmphasisFont' = ('Courier',8pt)
'FixedEmphasisFont' = (Courier,8pt)
'FixedStrongFont' = (Courier,8pt,Bold)
'FixedHeadingFont' = (Courier,8pt)
'FixedFont' = (Courier,8pt)
'headingEmphasisFont' = ('Courier',8pt)
'headingFont' = ('Courier',8pt)
'docFont' = ('Courier',8pt);
style table from output /
cellpadding = 1pt
background=white
rules=groups
frame=void;
* rules=all/*rows*//*groups*/
* frame=hsides/*box*/;
style body from document /
leftmargin=0.36in
rightmargin=0.25in
topmargin=0.60in
bottommargin=0.36in;
replace color_list /
'link' = black
'bgH' = white
'fg' = black
'bg' = white;
end;
run;
Tuesday, January 7, 2014
Tuesday, November 19, 2013
Convert CHAR to NUM in PROC SQL
Use TO_NUMBER function in PROC SQL.
proc sql;
connect to oracle (user=xxx orapw=yyy path="@zzz");
create table temp as
select * from connection to oracle
(select aa, to_number(bb) as bb, to_number(cc) as cc
from xxx.dddd)
where bb ne . or cc ne .
order by aa;
disconnect from oracle;
quit;
run;
proc sql;
connect to oracle (user=xxx orapw=yyy path="@zzz");
create table temp as
select * from connection to oracle
(select aa, to_number(bb) as bb, to_number(cc) as cc
from xxx.dddd)
where bb ne . or cc ne .
order by aa;
disconnect from oracle;
quit;
run;
Tuesday, November 5, 2013
Wednesday, July 10, 2013
Power Calculation
Example of finding power of a balanced one way ANOVA:
proc power;
onewayanova test = overall
groupmeans = 59 | 66 | 42
std = 5 8 10 12
npergroup = 3 4 5 6 7 8 9 10
power = . ;
plot ;
run;
Contrast power:
proc power;
onewayanova test = contrast
contrast = (1 -1 0 0 0 0) (0 1 0 0 -1 0) (0 0 1 -1 0 0)
groupmeans = (5 4 6 11 12 9)
std = 4
npergroup = 5
power = .
;
run;
Example of finding n of an unbalanced one way ANOVA:
proc power;
onewayanova test = overall
groupmeans = 3 | 7 | 8
stddev = 4
groupweights = (1 2 2)
ntotal = .
power = 0.8
;
run;
Example of finding power of a chi-squared test:
proc power;
twosamplefreq test = pchi
groupproportions = (.6 .4) (.7 .3) (.55 .45) (.49 .51)
nullproportiondiff = 0
npergroup =25 50 75 100 200
power = . ;
plot;
run;
Click Here
Example of finding power for a survival analysis:
proc power;
twosamplesurvival test = logrank
gexphs = 0.3567 | 0.5978 0.6931
grouplossexphazards = (0.3567 0.3567)
accrualtime = 1
followuptime = 1
groupweights = (1 2)
power = .
ntotal = 225 ;
run;
Click Here
Example of finding power of a two-sample t test:
proc power;
twosamplemeans test=diff
meandiff = 5 6
stddev = 12 18
alpha = 0.05 0.1
ntotal = 100 200
power = . ;
plot ;
run;
Click Here
proc power;
onewayanova test = overall
groupmeans = 59 | 66 | 42
std = 5 8 10 12
npergroup = 3 4 5 6 7 8 9 10
power = . ;
plot ;
run;
Contrast power:
proc power;
onewayanova test = contrast
contrast = (1 -1 0 0 0 0) (0 1 0 0 -1 0) (0 0 1 -1 0 0)
groupmeans = (5 4 6 11 12 9)
std = 4
npergroup = 5
power = .
;
run;
Example of finding n of an unbalanced one way ANOVA:
proc power;
onewayanova test = overall
groupmeans = 3 | 7 | 8
stddev = 4
groupweights = (1 2 2)
ntotal = .
power = 0.8
;
run;
Example of finding power of a chi-squared test:
proc power;
twosamplefreq test = pchi
groupproportions = (.6 .4) (.7 .3) (.55 .45) (.49 .51)
nullproportiondiff = 0
npergroup =25 50 75 100 200
power = . ;
plot;
run;
Click Here
Example of finding power for a survival analysis:
proc power;
twosamplesurvival test = logrank
gexphs = 0.3567 | 0.5978 0.6931
grouplossexphazards = (0.3567 0.3567)
accrualtime = 1
followuptime = 1
groupweights = (1 2)
power = .
ntotal = 225 ;
run;
Click Here
Example of finding power of a two-sample t test:
proc power;
twosamplemeans test=diff
meandiff = 5 6
stddev = 12 18
alpha = 0.05 0.1
ntotal = 100 200
power = . ;
plot ;
run;
Click Here
Sunday, April 14, 2013
Friday, November 9, 2012
Friday, June 22, 2012
Friday, May 11, 2012
Wednesday, March 28, 2012
Create Transport File
This example uses the DATA step to create a transport file for one data set.
libname source 'SAS-data-library';
libname xportout xport 'transport-file';
data xportout.grades;
set source.grades;
run;
This example uses the COPY procedure to create a transport file for multiple data sets.
libname source 'SAS-data-library';
libname xportout xport 'transport-file';
proc copy in=source out=xportout memtype=data;
run;
Click Here
PROC CPORT/CIMPORT vs PROC COPY
libname source 'SAS-data-library';
libname xportout xport 'transport-file';
data xportout.grades;
set source.grades;
run;
This example uses the COPY procedure to create a transport file for multiple data sets.
libname source 'SAS-data-library';
libname xportout xport 'transport-file';
proc copy in=source out=xportout memtype=data;
run;
Click Here
PROC CPORT/CIMPORT vs PROC COPY
Monday, March 5, 2012
Use GREPLAY to create multiple graphs per page
Desired result:
This sample program uses PROC GREPLAY together with macro code to dynamically replay a specific number of graphs per page.
The graphics output in the Results tab was produced using SAS® 9.2. Submitting the sample code with releases of SAS prior to SAS 9.2 might produce different results.
/* Delete any old entries in WORK.GSEG catalog */
proc greplay nofs igout=work.gseg;
delete _all_;
run;
quit;
/* Specify macro debugging options */
options mprint mlogic symbolgen;
goptions reset=all device=gif;
/* Macro MAKEGRAF creates GSLIDE graph entries that are TREPLAYED */
/* by PROC GREPLAY later in this program. Remove the %DO loop and GSLIDE */
/* procedure and place your code within the MAKEGRAF macro when you are */
/* ready to create your own individual graph entries. */
%macro makegraf;
%do i=1 %to 6;
proc gslide;
title1 "GRAPH &i";
run;
quit;
%end;
%mend makegraf;
/* This macro will dynamically treplay a specified number of grseg entries */
/* per page/screen. The specified values for the parameters */
/* ACROSS and DOWN will determine the number of panels in the template */
/* and the number of graphs per page/screen. */
%MACRO PERPAGE(across,down);
/* Create macro variable perpage */
%let perpage=%eval(&across*&down);
/* Use DSGI functions HSIZE/VSIZE to determine default HSIZE and VSIZE */
/* and modify for correct aspect ratio. */
data garea;
rc=ginit();
call gask('hsize',hsize,rc);
hsze=hsize/&across;
call symput('hsize',left(trim(hsze)));
call gask('vsize',vsize,rc);
vsze=vsize/&down;
call symput('vsize',left(trim(vsze)));
rc=gterm();
run;
/* Adjust VSIZE and HSIZE to reflect dimension of template panel. */
/* Turn DISPLAY off. */
goptions nodisplay vsize=&vsize hsize=&hsize;
/* Invoke macro MAKEGRAF, */
/* gslide entries created in WORK.GSEG catalog. */
%makegraf
/* Use DSGI function NUMGRAPH to determine the number of entries in */
/* WORK.GSEG catalog. */
data numgraf;
rc=gset('catalog','work','gseg');
rc=ginit();
call gask('numgraph',grsegcnt,rc);
call symput('grsegcnt',grsegcnt);
call symput('loopit',ceil(grsegcnt/&perpage));
ymult=100/&down;
xmult=100/&across;
rc=gterm();
run;
/* Calculate the panel coordinates for template */
data coord;
set numgraf;
do x=0 to (100-xmult) by xmult;
llx=x;
ulx=x;
urx=x+xmult;
lrx=x+xmult;
do y=0 to (100-ymult) by ymult;
lly=y;
uly=y+ymult;
ury=y+ymult;
lry=y;
output;
end;
end;
run;
proc sort;
by descending y;
run;
/* Create macro variables that resolve to panel coordinates */
data mccoord;
set coord end=eof;
call symput('llx'||left(_n_),llx);
call symput('ulx'||left(_n_),ulx);
call symput('urx'||left(_n_),urx);
call symput('lrx'||left(_n_),lrx);
call symput('lly'||left(_n_),lly);
call symput('uly'||left(_n_),uly);
call symput('ury'||left(_n_),ury);
call symput('lry'||left(_n_),lry);
if eof then call symput('total',_n_);
run;
/* Macro to create template */
%macro tempdef;
%do i=1 %to &total;
&i / llx=&&llx&i lly=&&lly&i
ulx=&&ulx&i uly=&&uly&i
urx=&&urx&i ury=&&ury&i
lrx=&&lrx&i lry=&&lry&i
color=black
%end;
%mend tempdef;
/* Macro to generate TREPLAY statement for GREPLAY */
%macro tplay;
%do j=1 %to &perpage;
%if %eval(&i*&perpage-(&perpage-&j)) <= &grsegcnt %then
&j:%eval(&i*&perpage-(&perpage-&j));
%end;
%mend tplay;
/* Macro to generate templated grseg entries */
%macro greplay;
proc greplay nofs igout=work.gseg tc=tempcat;
tdef spec&perpage
%tempdef;
template spec&perpage;
%do i=1 %to &loopit;
treplay %tplay;
run;
%end;
quit;
%mend greplay;
/* Reset the graphics options */
goptions reset=all device=gif
gsfname=grafout gsfmode=replace
xpixels=600 ypixels=400;
filename grafout 'c:\temp\test.gif';
/* Invoke the GREPLAY macro */
%greplay
%MEND PERPAGE;
/* Invoke the PERPAGE macro, specifying how you */
/* want the graphs laid out on the page. */
%PERPAGE(across=2,down=3);
Copied from SAS KNOWLEDGE BASE / SAMPLES & SAS NOTES
This sample program uses PROC GREPLAY together with macro code to dynamically replay a specific number of graphs per page.
The graphics output in the Results tab was produced using SAS® 9.2. Submitting the sample code with releases of SAS prior to SAS 9.2 might produce different results.
/* Delete any old entries in WORK.GSEG catalog */
proc greplay nofs igout=work.gseg;
delete _all_;
run;
quit;
/* Specify macro debugging options */
options mprint mlogic symbolgen;
goptions reset=all device=gif;
/* Macro MAKEGRAF creates GSLIDE graph entries that are TREPLAYED */
/* by PROC GREPLAY later in this program. Remove the %DO loop and GSLIDE */
/* procedure and place your code within the MAKEGRAF macro when you are */
/* ready to create your own individual graph entries. */
%macro makegraf;
%do i=1 %to 6;
proc gslide;
title1 "GRAPH &i";
run;
quit;
%end;
%mend makegraf;
/* This macro will dynamically treplay a specified number of grseg entries */
/* per page/screen. The specified values for the parameters */
/* ACROSS and DOWN will determine the number of panels in the template */
/* and the number of graphs per page/screen. */
%MACRO PERPAGE(across,down);
/* Create macro variable perpage */
%let perpage=%eval(&across*&down);
/* Use DSGI functions HSIZE/VSIZE to determine default HSIZE and VSIZE */
/* and modify for correct aspect ratio. */
data garea;
rc=ginit();
call gask('hsize',hsize,rc);
hsze=hsize/&across;
call symput('hsize',left(trim(hsze)));
call gask('vsize',vsize,rc);
vsze=vsize/&down;
call symput('vsize',left(trim(vsze)));
rc=gterm();
run;
/* Adjust VSIZE and HSIZE to reflect dimension of template panel. */
/* Turn DISPLAY off. */
goptions nodisplay vsize=&vsize hsize=&hsize;
/* Invoke macro MAKEGRAF, */
/* gslide entries created in WORK.GSEG catalog. */
%makegraf
/* Use DSGI function NUMGRAPH to determine the number of entries in */
/* WORK.GSEG catalog. */
data numgraf;
rc=gset('catalog','work','gseg');
rc=ginit();
call gask('numgraph',grsegcnt,rc);
call symput('grsegcnt',grsegcnt);
call symput('loopit',ceil(grsegcnt/&perpage));
ymult=100/&down;
xmult=100/&across;
rc=gterm();
run;
/* Calculate the panel coordinates for template */
data coord;
set numgraf;
do x=0 to (100-xmult) by xmult;
llx=x;
ulx=x;
urx=x+xmult;
lrx=x+xmult;
do y=0 to (100-ymult) by ymult;
lly=y;
uly=y+ymult;
ury=y+ymult;
lry=y;
output;
end;
end;
run;
proc sort;
by descending y;
run;
/* Create macro variables that resolve to panel coordinates */
data mccoord;
set coord end=eof;
call symput('llx'||left(_n_),llx);
call symput('ulx'||left(_n_),ulx);
call symput('urx'||left(_n_),urx);
call symput('lrx'||left(_n_),lrx);
call symput('lly'||left(_n_),lly);
call symput('uly'||left(_n_),uly);
call symput('ury'||left(_n_),ury);
call symput('lry'||left(_n_),lry);
if eof then call symput('total',_n_);
run;
/* Macro to create template */
%macro tempdef;
%do i=1 %to &total;
&i / llx=&&llx&i lly=&&lly&i
ulx=&&ulx&i uly=&&uly&i
urx=&&urx&i ury=&&ury&i
lrx=&&lrx&i lry=&&lry&i
color=black
%end;
%mend tempdef;
/* Macro to generate TREPLAY statement for GREPLAY */
%macro tplay;
%do j=1 %to &perpage;
%if %eval(&i*&perpage-(&perpage-&j)) <= &grsegcnt %then
&j:%eval(&i*&perpage-(&perpage-&j));
%end;
%mend tplay;
/* Macro to generate templated grseg entries */
%macro greplay;
proc greplay nofs igout=work.gseg tc=tempcat;
tdef spec&perpage
%tempdef;
template spec&perpage;
%do i=1 %to &loopit;
treplay %tplay;
run;
%end;
quit;
%mend greplay;
/* Reset the graphics options */
goptions reset=all device=gif
gsfname=grafout gsfmode=replace
xpixels=600 ypixels=400;
filename grafout 'c:\temp\test.gif';
/* Invoke the GREPLAY macro */
%greplay
%MEND PERPAGE;
/* Invoke the PERPAGE macro, specifying how you */
/* want the graphs laid out on the page. */
%PERPAGE(across=2,down=3);
Copied from SAS KNOWLEDGE BASE / SAMPLES & SAS NOTES
Tuesday, November 22, 2011
Tuesday, November 15, 2011
Wednesday, November 2, 2011
Monday, August 29, 2011
Suppress Engine/Host Section in Proc Contents
ODS EXCLUDE EngineHost ;
SAS default behavior is for this statement to be in effect only until the next procedure it encounters. To tell SAS to continuously exclude these two objects from the output, add the PERSIST option as follows:
ODS EXCLUDE ATTRIBUTES (PERSIST) ENGINEHOST (PERSIST) ;
This one-line change can be used by most users, regardless of skill level, to easily reduce the amount of output from PROC CONTENTS. It’s something that may be useful in an AUTOEXEC file. To return to the default behavior, issue this command:
ODS SELECT ALL ;
Go To Page 5
SAS default behavior is for this statement to be in effect only until the next procedure it encounters. To tell SAS to continuously exclude these two objects from the output, add the PERSIST option as follows:
ODS EXCLUDE ATTRIBUTES (PERSIST) ENGINEHOST (PERSIST) ;
This one-line change can be used by most users, regardless of skill level, to easily reduce the amount of output from PROC CONTENTS. It’s something that may be useful in an AUTOEXEC file. To return to the default behavior, issue this command:
ODS SELECT ALL ;
Go To Page 5
Friday, April 1, 2011
How to preserve leading zeros when export to csv file
data one;
input id $;
/* create formatting to preserve leading zeros in excel */
cid  =  ' = " '  ||  id  ||  ' " ';
cards;
00123
12398
01234
00034
;
run;
data one (drop=id);
      set one;
run;
ods listing close;
ods csv file='lead0.csv' ;
      proc print data=one noobs;
      run;
ods csv close;
input id $;
/* create formatting to preserve leading zeros in excel */
cid  =  ' = " '  ||  id  ||  ' " ';
cards;
00123
12398
01234
00034
;
run;
data one (drop=id);
      set one;
run;
ods listing close;
ods csv file='lead0.csv' ;
      proc print data=one noobs;
      run;
ods csv close;
Wednesday, February 9, 2011
Create transport file and Read in transport file
*** Create transport file;
libname source "/home/kwxxx/requests/xxxdata";
libname tran xport "/home/kwxx/requests/xxxdata/transport/xxxdata.xpt";
proc copy in=source out=tran memtype=data;
run;
*** Read in transport file;
libname library ‘c:\kwxxx’;
libname tran xport ‘c:\kwxxx\xxxdata.xpt’;
proc copy in=tran out=library;
run;
*** If format was included as a dataset in the transport file, need to convert the dataset back to format library;
proc format library=library cntlin=library.fmtout;
run;
libname source "/home/kwxxx/requests/xxxdata";
libname tran xport "/home/kwxx/requests/xxxdata/transport/xxxdata.xpt";
proc copy in=source out=tran memtype=data;
run;
*** Read in transport file;
libname library ‘c:\kwxxx’;
libname tran xport ‘c:\kwxxx\xxxdata.xpt’;
proc copy in=tran out=library;
run;
*** If format was included as a dataset in the transport file, need to convert the dataset back to format library;
proc format library=library cntlin=library.fmtout;
run;
Monday, February 7, 2011
How to kill a Unix SAS job
First, enter the following command to see your SAS job that are currently running:
      ps -ef|grep kwxxxx
And you'll see the output like this:
      kwxxxx 21506 35726 0 11:07:45 pts/15 0:00 vi zzz.sas
      kwxxxx 29200 34432 1 11:27:40 pts/18 0:00 ps -ef
      kwxxxx 30128 34432 0 09:36:54 pts/18 0:00 -ksh
      kwxxxx 30478 1 0 09:36:52 - 0:00 xterm
      kwxxxx 34432 30478 1 09:36:54 pts/18 0:00 -ksh
      kwxxxx 35334 1 0 09:36:51 - 0:00 xterm
      kwxxxx 35726 35334 0 09:36:53 pts/15 0:00 -ksh
The second column is the process number, use the following command to delete the job that you'd like to remove:
      kill -9 process#
      (i.e. kill -9 21506)
      ps -ef|grep kwxxxx
And you'll see the output like this:
      kwxxxx 21506 35726 0 11:07:45 pts/15 0:00 vi zzz.sas
      kwxxxx 29200 34432 1 11:27:40 pts/18 0:00 ps -ef
      kwxxxx 30128 34432 0 09:36:54 pts/18 0:00 -ksh
      kwxxxx 30478 1 0 09:36:52 - 0:00 xterm
      kwxxxx 34432 30478 1 09:36:54 pts/18 0:00 -ksh
      kwxxxx 35334 1 0 09:36:51 - 0:00 xterm
      kwxxxx 35726 35334 0 09:36:53 pts/15 0:00 -ksh
The second column is the process number, use the following command to delete the job that you'd like to remove:
      kill -9 process#
      (i.e. kill -9 21506)
Thursday, January 20, 2011
Create a SAS dataset from FORMAT library
libname fmts "/temp/formats";
proc format library=fmts cntlout=fmts.fmtout;
run;
proc format library=fmts cntlout=fmts.fmtout;
run;
Friday, January 14, 2011
Use BYTE function to create special ASCII character, e.g. plus/minus sign
data _null_;
      do i=0 to 255;
            x=byte(i);
            put i= x=;
      end;
      y=rank('a');
      put y=;
run;
BYTE function results:
i=65 -- 90 x=A -- Z
i=97 -- 122 x=a -- z
i=33 x=!
i=34 x="
i=35 x=#
i=36 x=$
i=37 x=%
i=38 x=&
i=39 x='
i=40 x=(
i=41 x=)
i=42 x=*
i=43 x=+
i=44 x=,
i=45 x=-
i=46 x=.
i=47 x=/
i=162 x=¢
i=163 x=£
i=164 x=¤
i=165 x=¥
i=166 x=¦
i=167 x=§
i=168 x=¨
i=169 x=©
i=170 x=ª
i=171 x=«
i=172 x=¬
i=173 x=
i=174 x=®
i=175 x=¯
i=176 x=°
i=177 x=±
i=178 x=²
i=179 x=³
i=180 x=´
i=181 x=µ
i=182 x=¶
i=183 x=•
i=184 x=¸
i=185 x=¹
i=186 x=º
i=187 x=»
i=188 x=¼
i=189 x=½
i=190 x=¾
RANK function results:
y=97
      do i=0 to 255;
            x=byte(i);
            put i= x=;
      end;
      y=rank('a');
      put y=;
run;
BYTE function results:
i=65 -- 90 x=A -- Z
i=97 -- 122 x=a -- z
i=33 x=!
i=34 x="
i=35 x=#
i=36 x=$
i=37 x=%
i=38 x=&
i=39 x='
i=40 x=(
i=41 x=)
i=42 x=*
i=43 x=+
i=44 x=,
i=45 x=-
i=46 x=.
i=47 x=/
i=162 x=¢
i=163 x=£
i=164 x=¤
i=165 x=¥
i=166 x=¦
i=167 x=§
i=168 x=¨
i=169 x=©
i=170 x=ª
i=171 x=«
i=172 x=¬
i=173 x=
i=174 x=®
i=175 x=¯
i=176 x=°
i=177 x=±
i=178 x=²
i=179 x=³
i=180 x=´
i=181 x=µ
i=182 x=¶
i=183 x=•
i=184 x=¸
i=185 x=¹
i=186 x=º
i=187 x=»
i=188 x=¼
i=189 x=½
i=190 x=¾
RANK function results:
y=97
Tuesday, January 4, 2011
LAG function
LAG Function (SAS 9.2 Doc)
*** Use a third variable to assign the value from the previous record;
array one a b c d;
array two e f g h;
      do over one;
            temp = lag(one);
            if first.id = 0 and last.id=1 and two = . then do;
                  two = temp;
            end;
      end;
*** Use a third variable to assign the value from the previous record;
array one a b c d;
array two e f g h;
      do over one;
            temp = lag(one);
            if first.id = 0 and last.id=1 and two = . then do;
                  two = temp;
            end;
      end;
Calculate new variables using the sum of other columns in PROC REPORT
%macro vol(type,desc);
proc report data=fmxx center headline headskip split='/';
      column site_id havevis &type.2 &type.3 &type.p1 &type.p2;
      define site_id / group id order=internal width=16 left 'Clinical Center'style(header)={cellwidth= 1.2in just=left} style(column)={cellwidth= 1.2in cellheight=0.17in just=left};
      define havevis / sum style(header)={cellwidth=.9in} 'All/Visits' center ;
      define &type.2 / sum style(header)={cellwidth=.9in} width=15 "Any/&desc" center ;
      define &type.3 / sum style(header)={cellwidth=.9in} width=15 "3+ Volumes/for &desc/n" center ;
      define &type.p1 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (1)" center;
      define &type.p2 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (2)" center;
      compute &type.p1;
            &type.p1=(&type.3.sum*100)/&type.2.sum;
      endcomp;
      compute &type.p2;
            &type.p2=(&type.3.sum*100)/havevis.sum;
      endcomp;
title1 " ";
run;
%mend;
%vol(head, Head);
If data is already summarized, use the following code for compute:
      compute &type.p1;
            _c5_=(_c4_*100)/_c3_;
      endcomp;
      compute &type.p2;
            _c6_=(_c4_*100)/_c2_;
      endcomp;
proc report data=fmxx center headline headskip split='/';
      column site_id havevis &type.2 &type.3 &type.p1 &type.p2;
      define site_id / group id order=internal width=16 left 'Clinical Center'style(header)={cellwidth= 1.2in just=left} style(column)={cellwidth= 1.2in cellheight=0.17in just=left};
      define havevis / sum style(header)={cellwidth=.9in} 'All/Visits' center ;
      define &type.2 / sum style(header)={cellwidth=.9in} width=15 "Any/&desc" center ;
      define &type.3 / sum style(header)={cellwidth=.9in} width=15 "3+ Volumes/for &desc/n" center ;
      define &type.p1 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (1)" center;
      define &type.p2 / computed width=10 format=5.1 "3+ Volumes/for &desc/% (2)" center;
      compute &type.p1;
            &type.p1=(&type.3.sum*100)/&type.2.sum;
      endcomp;
      compute &type.p2;
            &type.p2=(&type.3.sum*100)/havevis.sum;
      endcomp;
title1 " ";
run;
%mend;
%vol(head, Head);
If data is already summarized, use the following code for compute:
      compute &type.p1;
            _c5_=(_c4_*100)/_c3_;
      endcomp;
      compute &type.p2;
            _c6_=(_c4_*100)/_c2_;
      endcomp;
PROC LOGISTIC odds ratio estimate
I ran into some problems with the interpretation of the parameter estimate of the logistic model recently. When I contacted SAS, here's their response. Please make sure you specify PARAM=GLM option in the CLASS statement when you want the parameterization to the reference cell coding, 'cause the default is EFFECT. Also I attach the syntax for PARAM below, and you can go to online manual for a better view.
==================== From SAS =======================================
The calculation of the Odds Ratios depends upon the parameterization used for the categorical independent variable. By default, Proc LOGISTIC uses effects coding so the odds ratios are not calculated as EXP(estimate). You can change the parameterization to reference cell coding by using the PARAM=GLM option on the CLASS statement. Using this coding does lead to odds ratios being calculated as EXP(estimate). Note that for continuous variables the odds ratios are always calculated as EXP(estimate). Also note that no matter what parameterization is used, the values of the odds ratios are always the same.
For additional details on this topic, as well as setting the reference levels for the CLASS statement please reference the Syntax section (for the CLASS
statement) and the "Odds Ratio Estimation" portion of the Details section of the LOGISTIC chapter in the SAS/STAT User's Guide for V8.
==================== Syntax from SAS online manual =========================
PARAM=keyword
specifies the parameterization method for the classification variable or variables. Design matrix columns are created from CLASS variables according to the following coding schemes. The default is PARAM=EFFECT. If PARAM=ORTHPOLY or PARAM=POLY, and the CLASS levels are numeric, then the ORDER= option in the CLASS statement is ignored, and the internal, unformatted values are used.
EFFECT
specifies effect coding
GLM
specifies less than full rank, reference cell coding; this option can only be used as a global option ORTHPOLY specifies orthogonal polynomial coding POLYNOMIAL | POLY specifies polynomial coding REFERENCE | REF specifies reference cell coding
The EFFECT, POLYNOMIAL, REFERENCE, and ORTHPOLY parameterizations are full rank. For the EFFECT and REFERENCE parameterizations, the REF= option in the CLASS statement determines the reference level.
Consider a model with one CLASS variable A with four levels, 1, 2, 5, and 7. Details of the possible choices for the PARAM= option follow.
EFFECT
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of -1. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Effect Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 -1 -1 -1
Parameter estimates of CLASS main effects using the effect coding scheme estimate the difference in the effect of each nonreference level compared to the average effect over all 4 levels.
GLM
As in PROC GLM, four columns are created to indicate group membership. The design matrix columns for A are as follows.
GLM Coding
A Design Matrix
1 1 0 0 0
2 0 1 0 0
5 0 0 1 0
7 0 0 0 1
Parameter estimates of CLASS main effects using the GLM coding scheme estimate the difference in the effects of each level compared to the last level.
ORTHPOLY
The columns are obtained by applying the Gram-Schmidt orthogonalization to the columns for PARAM=POLY. The design matrix columns for A are as follows.
Orthogonal Polynomial Coding
A Design Matrix
1 -1.153 0.907 -0.921
2 -0.734 -0.540 1.473
5 0.524 -1.370 -0.921
7 1.363 1.004 0.368
POLYNOMIAL
POLY
Three columns are created. The first represents the linear term (x), the second represents the quadratic term (x2), and the third represents the cubic term (x3), where x is the level value. If the CLASS levels are not numeric, they are translated into 1, 2, 3, ... according to their sorting order. The design matrix columns for A are as follows.
Polynomial Coding
A Design Matrix
1 1 1 1
2 2 4 8
5 5 25 125
7 7 49 343
REFERENCE
REF
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of 0. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Reference Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 0 0 0
Parameter estimates of CLASS main effects using the reference coding scheme estimate the difference in the effect of each nonreference level compared to the effect of the reference level.
==================== From SAS =======================================
The calculation of the Odds Ratios depends upon the parameterization used for the categorical independent variable. By default, Proc LOGISTIC uses effects coding so the odds ratios are not calculated as EXP(estimate). You can change the parameterization to reference cell coding by using the PARAM=GLM option on the CLASS statement. Using this coding does lead to odds ratios being calculated as EXP(estimate). Note that for continuous variables the odds ratios are always calculated as EXP(estimate). Also note that no matter what parameterization is used, the values of the odds ratios are always the same.
For additional details on this topic, as well as setting the reference levels for the CLASS statement please reference the Syntax section (for the CLASS
statement) and the "Odds Ratio Estimation" portion of the Details section of the LOGISTIC chapter in the SAS/STAT User's Guide for V8.
==================== Syntax from SAS online manual =========================
PARAM=keyword
specifies the parameterization method for the classification variable or variables. Design matrix columns are created from CLASS variables according to the following coding schemes. The default is PARAM=EFFECT. If PARAM=ORTHPOLY or PARAM=POLY, and the CLASS levels are numeric, then the ORDER= option in the CLASS statement is ignored, and the internal, unformatted values are used.
EFFECT
specifies effect coding
GLM
specifies less than full rank, reference cell coding; this option can only be used as a global option ORTHPOLY specifies orthogonal polynomial coding POLYNOMIAL | POLY specifies polynomial coding REFERENCE | REF specifies reference cell coding
The EFFECT, POLYNOMIAL, REFERENCE, and ORTHPOLY parameterizations are full rank. For the EFFECT and REFERENCE parameterizations, the REF= option in the CLASS statement determines the reference level.
Consider a model with one CLASS variable A with four levels, 1, 2, 5, and 7. Details of the possible choices for the PARAM= option follow.
EFFECT
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of -1. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Effect Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 -1 -1 -1
Parameter estimates of CLASS main effects using the effect coding scheme estimate the difference in the effect of each nonreference level compared to the average effect over all 4 levels.
GLM
As in PROC GLM, four columns are created to indicate group membership. The design matrix columns for A are as follows.
GLM Coding
A Design Matrix
1 1 0 0 0
2 0 1 0 0
5 0 0 1 0
7 0 0 0 1
Parameter estimates of CLASS main effects using the GLM coding scheme estimate the difference in the effects of each level compared to the last level.
ORTHPOLY
The columns are obtained by applying the Gram-Schmidt orthogonalization to the columns for PARAM=POLY. The design matrix columns for A are as follows.
Orthogonal Polynomial Coding
A Design Matrix
1 -1.153 0.907 -0.921
2 -0.734 -0.540 1.473
5 0.524 -1.370 -0.921
7 1.363 1.004 0.368
POLYNOMIAL
POLY
Three columns are created. The first represents the linear term (x), the second represents the quadratic term (x2), and the third represents the cubic term (x3), where x is the level value. If the CLASS levels are not numeric, they are translated into 1, 2, 3, ... according to their sorting order. The design matrix columns for A are as follows.
Polynomial Coding
A Design Matrix
1 1 1 1
2 2 4 8
5 5 25 125
7 7 49 343
REFERENCE
REF
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of 0. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.
Reference Coding
A Design Matrix
1 1 0 0
2 0 1 0
5 0 0 1
7 0 0 0
Parameter estimates of CLASS main effects using the reference coding scheme estimate the difference in the effect of each nonreference level compared to the effect of the reference level.
PROC GPLOT options: AUTOHREF and AUTOVREF
Use the AUTOHREF and AUTOVREF options on the PLOT statement of PROC GPLOT to draw reference lines at all major tickmarks. Create an annotate data to draw thicker reference lines at the desired tickmarks.
goptions reset=all;
/* Create sample data */
data a;
input x y;
cards;
1 30
2 15
3 40
4 80
5 35
6 40
7 85
8 75
9 55
10 30
;
run;
/* Create an annotate data set to draw darker reference lines */
/* at X values of 3, 6, and 9; Y values of 20, 50, and 80. */
/* The SIZE variable sets the line thickness. */
data anno;
      length function color $ 8;
      /* Vertical reference lines */
      do x=3 to 9 by 3;
            function='move'; xsys='2'; ysys='1';
            y=0; output;
            function='draw'; xsys='2'; ysys='1';
            size=3; line=1; color='black'; y=100; output;
      end;
      /* Horizontal reference lines */
      do y=20 to 80 by 30;
            function='move'; xsys='1'; ysys='2';
            x=0; output;
            function='draw'; xsys='1'; ysys='2';
            size=3; line=1; color='black'; x=100; output;
      end;
run;
/* Create the graph. */
proc gplot data=a;
      plot y*x / haxis=axis1 vaxis=axis2 autohref autovref noframe annotate=anno;
      symbol1 i=none v=dot c=blue;
      axis1 order=(0 to 12 by 1);
      axis2 order=(0 to 100 by 10);
      title1 h=1.5 f=swiss 'Reference Lines of Varying Thickness';
run;
quit;
goptions reset=all;
/* Create sample data */
data a;
input x y;
cards;
1 30
2 15
3 40
4 80
5 35
6 40
7 85
8 75
9 55
10 30
;
run;
/* Create an annotate data set to draw darker reference lines */
/* at X values of 3, 6, and 9; Y values of 20, 50, and 80. */
/* The SIZE variable sets the line thickness. */
data anno;
      length function color $ 8;
      /* Vertical reference lines */
      do x=3 to 9 by 3;
            function='move'; xsys='2'; ysys='1';
            y=0; output;
            function='draw'; xsys='2'; ysys='1';
            size=3; line=1; color='black'; y=100; output;
      end;
      /* Horizontal reference lines */
      do y=20 to 80 by 30;
            function='move'; xsys='1'; ysys='2';
            x=0; output;
            function='draw'; xsys='1'; ysys='2';
            size=3; line=1; color='black'; x=100; output;
      end;
run;
/* Create the graph. */
proc gplot data=a;
      plot y*x / haxis=axis1 vaxis=axis2 autohref autovref noframe annotate=anno;
      symbol1 i=none v=dot c=blue;
      axis1 order=(0 to 12 by 1);
      axis2 order=(0 to 100 by 10);
      title1 h=1.5 f=swiss 'Reference Lines of Varying Thickness';
run;
quit;
PROC LOGISTIC options: selection=, hierarchy=
An additional option that you should be aware of when using SELECTION= with a model that has the interaction as a possible variable is the HIERARCHY= option. It specifies whether and how the model hierarchy requirement is applied and whether a single effect or multiple effects are allowed to enter or leave the model in one step. You can specify that only CLASS effects, or both CLASS and interval effects, be subject to the hierarchy requirement. The HIERARCHY= option is ignored unless you also specify one of the following options: SELECTION=FORWARD, SELECTION=BACKWARD, or SELECTION=STEPWISE.
Model hierarchy refers to the requirement that, for any term to be in the model, all effects contained in the term must be present in the model. For example, in order for the interaction A*B to enter the model, the main effects A and B must be in the model. Likewise, neither effect A nor B can leave the model while the interaction A*B is in the model.
The keywords you can specify in the HIERARCHY= option are described as follows:
NONE
Model hierarchy is not maintained. Any single effect can enter or leave the model at any given step of the selection process.
SINGLE
Only one effect can enter or leave the model at one time, subject to the model hierarchy requirement. For example, suppose that you specify the main effects A and B and the interaction of A*B in the model. In the first step of the selection process, either A or B can enter the model. In the second step, the other main effect can enter the model. The interaction effect can enter the model only when both main effects have already been entered. Also, before A or B can be removed from the model, the A*B interaction must first be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
SINGLECLASS
This is the same as HIERARCHY=SINGLE except that only CLASS effects are subject to the hierarchy requirement.
MULTIPLE
More than one effect can enter or leave the model at one time, subject to the model hierarchy requirement. In a forward selection step, a single main effect can enter the model, or an interaction can enter the model together with all the effects that are contained in the interaction. In a backward elimination step, an interaction itself, or the interaction together with all the effects that the interaction contains, can be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
MULTIPLECLASS
This is the same as HIERARCHY=MULTIPLE except that only CLASS effects are subject to the hierarchy requirement.
The default value is HIERARCHY=SINGLE, which means that model hierarchy is to be maintained for all effects (that is, both CLASS and interval effects) and that only a single effect can enter or leave the model at each step.
Model hierarchy refers to the requirement that, for any term to be in the model, all effects contained in the term must be present in the model. For example, in order for the interaction A*B to enter the model, the main effects A and B must be in the model. Likewise, neither effect A nor B can leave the model while the interaction A*B is in the model.
The keywords you can specify in the HIERARCHY= option are described as follows:
NONE
Model hierarchy is not maintained. Any single effect can enter or leave the model at any given step of the selection process.
SINGLE
Only one effect can enter or leave the model at one time, subject to the model hierarchy requirement. For example, suppose that you specify the main effects A and B and the interaction of A*B in the model. In the first step of the selection process, either A or B can enter the model. In the second step, the other main effect can enter the model. The interaction effect can enter the model only when both main effects have already been entered. Also, before A or B can be removed from the model, the A*B interaction must first be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
SINGLECLASS
This is the same as HIERARCHY=SINGLE except that only CLASS effects are subject to the hierarchy requirement.
MULTIPLE
More than one effect can enter or leave the model at one time, subject to the model hierarchy requirement. In a forward selection step, a single main effect can enter the model, or an interaction can enter the model together with all the effects that are contained in the interaction. In a backward elimination step, an interaction itself, or the interaction together with all the effects that the interaction contains, can be removed. All effects (CLASS and interval) are subject to the hierarchy requirement.
MULTIPLECLASS
This is the same as HIERARCHY=MULTIPLE except that only CLASS effects are subject to the hierarchy requirement.
The default value is HIERARCHY=SINGLE, which means that model hierarchy is to be maintained for all effects (that is, both CLASS and interval effects) and that only a single effect can enter or leave the model at each step.
Create Oracle Tables
libname oradb Oracle User=orauser Password=xxxxxxx Path="@orapth";
proc sql;
      create table oradb.dsn as
      select * from temp;
quit;
run;
proc append base=oradb.dsn data=temp1 force;
run;
proc sql;
      create table oradb.dsn as
      select * from temp;
quit;
run;
proc append base=oradb.dsn data=temp1 force;
run;
Subscribe to:
Posts (Atom)