dweston50's profile - activity

2017-05-15 14:35:00 -0600 asked a question How to automatically determine lengths and display length frequencies in SAS

Hello I am trying to automate this. We routinely receive data in unknown formats. We bring it in as character, we create length variables to display. We are looking for the max as well as the frequencies. Typing this up each time is cumbersome. I wondered if there was a macro for this.

I have worked up some sample code here any help to make this more efficient would be appreciated

/***************Setting initial lengths****************************/
/***************Creating number as a character variable************/
data lengths (keep=CLIENT SERVER DESC NUMBER LEN_CLIENT LEN_SERVER LEN_DESC LEN_NUMBER);
length     CLIENT $ 100
           SERVER $ 100
           DESC   $ 100
           NUMBER $ 100;
INFORMAT   CLIENT $100.
           SERVER $100.
           DESC   $100.
           NUMBER $100.
                        ;
SET SASHELP.MONIKER;
           NUMBER="10";
/*************************left-aligning all character strings and determining lengths************************/
            CLIENT          = LEFT   (CLIENT    )                          ;
            SERVER          = LEFT   (SERVER    )                          ;
            DESC            = LEFT   (DESC      )                          ;
            NUMBER          = LEFT   (NUMBER    )                          ;
            LEN_CLIENT      = LENGTH (CLIENT        ) * (CLIENT         NE " ") ;
            LEN_SERVER      = LENGTH (SERVER        ) * (SERVER         NE " ") ;
            LEN_DESC        = LENGTH (DESC          ) * (DESC           NE " ") ;
            LEN_NUMBER      = LENGTH (NUMBER        ) * (NUMBER         NE " ") ;

/*************************looking at the distribution of lengths*********************************************/
/*************************the distribution of the lengths is equally informative to the max******************/
PROC FREQ   DATA = lengths ;
TABLES      LEN_CLIENT
            LEN_SERVER
            LEN_DESC
            LEN_NUMBER
                        ;
RUN;
/**********************Setting the variable to the Max Length**************************************************/
/**********************Creating Number as a numeric variable***************************************************/
DATA WORK.FINAL (KEEP = CLIENT SERVER DESC NUMBER) ;
LENGTH      CLIENT   $ 51
            SERVER   $ 30
            DESC     $ 60
            NUMBER    8.
                          ;
SET lengths  (RENAME = (NUMBER    = VAR_A));
           NUMBER            = INPUT (VAR_A,  COMMA2.)      ;
RUN                                                         ;
2017-05-15 14:35:00 -0600 asked a question SAS variable lengths distribution macro

Hello, I routinely am reading in data into SAS from various formats and must analyze the data's integrity. It is important for me to set the length of the variable large enough to accommodate the longest variable but knowing the distribution of lengths also goes a long way to informing me of data integrity problems. I am searching for a more efficient, more automated way of looking at the distribution of lengths. I always bring everything over as a character variable. Please see my method below

/***************Setting initial lengths****************************/
/***************Creating number as a character variable************/
data lengths (keep=CLIENT SERVER DESC NUMBER LEN_CLIENT LEN_SERVER LEN_DESC LEN_NUMBER);
length     CLIENT $ 100
           SERVER $ 100
           DESC   $ 100
           NUMBER $ 100;

INFORMAT CLIENT $100. SERVER $100. DESC $100. NUMBER $100. ; SET SASHELP.MONIKER; NUMBER="10"; /*******left-aligning all character strings and determining lengths********/ CLIENT = LEFT (CLIENT ) ; SERVER = LEFT (SERVER ) ; DESC = LEFT (DESC ) ; NUMBER = LEFT (NUMBER ) ; LEN_CLIENT = LENGTH (CLIENT ) * (CLIENT NE " ") ; LEN_SERVER = LENGTH (SERVER ) * (SERVER NE " ") ; LEN_DESC = LENGTH (DESC ) * (DESC NE " ") ; LEN_NUMBER = LENGTH (NUMBER ) * (NUMBER NE " ") ;

/*************************looking at the distribution of lengths*********************************************/
/*************************the distribution of the lengths is equally informative to the max******************/
PROC FREQ   DATA = lengths ;
TABLES      LEN_CLIENT
                    LEN_SERVER
                    LEN_DESC
            LEN_NUMBER
                        ;
RUN;
/**********************Setting the variable to the Max Length**************************************************/
/**********************Creating Number as a numeric variable***************************************************/
DATA WORK.FINAL (KEEP = CLIENT SERVER DESC NUMBER) ;
LENGTH CLIENT   $ 51
             SERVER   $ 30
             DESC        $ 60
             NUMBER    8.
                          ;
SET lengths  (RENAME = (NUMBER    = VAR_A));
         NUMBER  = INPUT (VAR_A,  COMMA2.)      ;
RUN                                                         ;

Typing this all out takes a large part of my day and I am convinced there is a macro that will solve my problem. Thanks in advanced!