Basic Concepts of SAS Programming

 

Here we have gathered some information for students who have just started the journey toward becoming a SAS certified programmer. Believe me, when I started my studies, collecting this took almost a week of web searches and tons of reference books. I am just putting all my research here for you guys. As a new programmer, understanding the following basic points about SAS programming was a challenge.

Components of SAS Programs

SAS programs consist of two types of steps: DATA steps and PROC (procedure) steps. A SAS program can consist of a DATA step, a PROC step, or any combination of DATA and PROC steps. DATA steps typically create or modify SAS data sets, but they can also be used to produce custom-designed reports. PROC steps are prewritten routines that enable you to analyze and process the data in a SAS data set and to present the data in the form of a report. PROC steps sometimes create new SAS data sets that contain the results of the procedure.
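As a minimal sketch of how the two kinds of steps fit together, the program below uses one DATA step and two PROC steps. The data set names (scores, score_stats), the variable names, and the data values are made up for illustration.

```sas
/* DATA step: create a small temporary data set from inline data */
data work.scores;
   input name $ score;
   datalines;
Ann 88
Bob 75
Cara 92
;
run;

/* PROC step: present the data as a simple report */
proc print data=work.scores;
run;

/* PROC step that creates a new data set holding its results */
proc means data=work.scores noprint;
   var score;
   output out=work.score_stats mean=avg_score;
run;
```

The first PROC step only produces a report, while the second one writes its result (the mean score) into a new data set instead of printing it.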

Characteristics of SAS Programs

SAS programs consist of SAS statements. A SAS statement usually begins with a SAS keyword and always ends with a semicolon. A DATA step begins with the keyword DATA. A PROC step begins with the keyword PROC. SAS statements are in free format so that they can begin and end anywhere on a line. One statement can continue over several lines, and several statements can be on a line. Blanks or special characters separate "words" in a SAS statement.
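To illustrate the free-format rule, here is a small sketch with two DATA steps that do the same thing. The names demo, demo2, x, and y are placeholders chosen for this example.

```sas
/* Conventional layout: one statement per line */
data work.demo;
   x = 1;
   y = x + 1;
run;

/* Same logic: one statement split across lines,
   and several statements sharing a line */
data
   work.demo2;
   x = 1; y = x
   + 1;
run;
```

Both steps are valid because SAS looks for the semicolon, not the line break, to find the end of a statement. The first layout is still much easier to read.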

Processing SAS Programs

When you submit a SAS program, SAS reads SAS statements and checks them for errors. When it encounters a subsequent DATA, PROC, RUN, or QUIT statement, SAS executes the previous step in the program. Each time a step is executed, SAS generates a log of the processing activities and the results of the processing. The SAS log collects messages about the processing of SAS programs and about any errors that occur. The results of processing can vary. Some SAS programs open an interactive window or invoke procedures that create output in the form of a report. Other SAS programs perform tasks such as sorting and managing data, which have no visible results other than messages in the log.
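The step boundaries described above can be sketched as follows. The data set names a and b are hypothetical, and the comments mark where each step ends.

```sas
data work.a;                        /* DATA begins a step            */
   x = 1;
run;                                /* RUN ends it; the step executes */

/* Sorting produces no report, only messages in the log */
proc sort data=work.a out=work.b;
   by x;
run;

/* PROC DATASETS stays active after RUN, so QUIT is used to end it */
proc datasets library=work nolist;
   delete b;
run;
quit;
```

After submitting this, the log should contain a note for each step, while no report output is produced.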

SAS Libraries

Every SAS file is stored in a SAS library, which is a collection of SAS files such as SAS data sets and catalogs. In some operating environments, a SAS library is a physical collection of files. In others, the files are only logically related. In the Windows and UNIX environments, a SAS library is typically a group of SAS files in the same folder or directory. Depending on the library you use, you can store SAS files in temporary SAS libraries or in permanent SAS libraries.

Temporary SAS files that are created during the session are held in a special workspace that is assigned the default libref Work. If you don't specify a libref when you create a file (or if you specify Work), then the file is stored in the temporary SAS library. When you end the session, the temporary library and all of its files are deleted.

To store a file permanently in a SAS library, you assign it a libref other than the default Work. For example, by assigning the libref Clinic to a SAS library, you specify that files within the library are to be stored until you delete them.
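Here is a minimal sketch of temporary versus permanent storage using the LIBNAME statement. The folder path is a placeholder that you must replace with an existing folder on your machine, and the data set and variable names are invented for this example.

```sas
/* Assign the libref Clinic to a folder (placeholder path) */
libname clinic 'C:\Users\Student\clinicdata';

/* Temporary: stored in Work, gone when the session ends */
data work.visits;
   patient_id = 101;
   weight = 70;
run;

/* Permanent: stored in the Clinic library until you delete it */
data clinic.visits;
   set work.visits;
run;

/* A one-level name has no libref, so it also goes to Work */
data visits2;
   set clinic.visits;
run;
```

The libref itself lasts only for the current session; in a new session you run the LIBNAME statement again to reach the same permanent files.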

Overview of SAS Data Sets

For many of the data processing tasks that you perform with SAS, you access data in the form of a SAS data set and use SAS programs to analyze, manage, or present the data. Conceptually, a SAS data set is a file that consists of two parts: a descriptor portion and a data portion. Some SAS data sets also contain one or more indexes, which enable SAS to locate records in the data set more efficiently.

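The descriptor portion of a SAS data set contains information about the data set, such as its name, the date and time it was created, the number of observations, and the number of variables.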

The data portion of a SAS data set is a collection of data values that are arranged in a rectangular table. Observations in the data set correspond to rows or data lines in a raw data file or an external database. An observation is the set of data values that relate to a single object in the data set. Variables in the data set correspond to columns in a raw data file or in an external database. A variable is the set of data values that describe a particular characteristic. If a data value is unknown for a particular observation, a missing value is recorded in the SAS data set.

Variable Attributes

In addition to general information about the data set, the descriptor portion contains attribute information for each variable in the data set. The attribute information includes the variable's name, length, and type. A variable's type determines how missing values for a variable are displayed by SAS. For character variables, a blank represents a missing value. For numeric variables, a period represents a missing value.
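To see the descriptor portion and the two kinds of missing values for yourself, something like the following sketch should work. The data set name attrs, the variables, and the values are all made up; the periods in the data lines stand for unknown values.

```sas
data work.attrs;
   length name $ 12;       /* character variable with length 12 */
   input id name $ age;    /* id and age are numeric            */
   datalines;
1 Ann 34
2 . 29
3 Cara .
;
run;

/* Descriptor portion: data set information plus each
   variable's name, type, and length */
proc contents data=work.attrs;
run;

/* Data portion: the missing name shows as a blank,
   the missing age shows as a period */
proc print data=work.attrs;
run;
```

PROC CONTENTS reads only the descriptor portion, so it is a quick way to check variable attributes without printing any data values.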

Points to Remember

  • Before referencing SAS files, you must assign a name (libref, or library reference) to the library in which the files are stored (or specify that SAS is to assign the name automatically).
  • You can store SAS files either temporarily or permanently.
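  • Variable names follow the same rules as SAS data set names: they can be 1 to 32 characters long, must begin with a letter or an underscore, and can continue with letters, numbers, or underscores.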
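As a small sketch of the naming point above, the step below uses only names that are valid under the default naming rules (I am assuming the default VALIDVARNAME setting here). All of the names are invented for illustration.

```sas
data work.names_ok;        /* data set name: starts with a letter      */
   total_sales = 100;      /* letters and an underscore                 */
   _temp = 1;              /* may start with an underscore              */
   Year2024 = 2024;        /* digits are fine after the first character */
run;
```

A name such as 2024Year would be rejected under these rules because it starts with a digit.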

 
