Data Analytics and Tools
The word “analytics” has been trending for a while now, yet there is no single way to define it. Let us try to understand analytics with a simple example. Imagine you want to buy a shirt. Before you pay for it, you perform some sort of analysis, even if you do not notice it. To simplify further, your brain does two things here:
It collects information that matches your requirements, and it uses its understanding of that information to help you decide whether to buy the shirt.
This is what you can do using analytics. You can gather information, analyze it and make better decisions. The above example was easy, so you could make a decision based on a few assumptions. What if the problem and the decision making weren’t this easy?
Consider this problem from a business point of view. Suppose an e-commerce company wants to study the buying patterns of its customers based on historical data. The company will have to consider thousands of records. Now imagine the volume of that data, along with the permutations and combinations of preferences that different customers may have.
Also, the company may not have all the data. For example, if a customer did not buy a shirt, what factors led the customer to that decision? This missing data may create problems. How do we deal with these problems? How do we handle such data? These problems become easier when we use analytics. By using analytics you can eliminate unnecessary data and focus on the relevant information to find patterns that help you make better decisions.
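To make the idea concrete, here is a minimal sketch in Python (one of the tools discussed later in this article). The purchase records, field names and the `purchase_rate_by` helper are all made up for illustration. The sketch drops records with missing values and then looks for a simple pattern: how often a view ends in a purchase, grouped by colour.

```python
from collections import Counter

# Hypothetical purchase records; None marks a value that was never captured.
records = [
    {"customer": "A", "item": "shirt", "colour": "blue", "bought": True},
    {"customer": "B", "item": "shirt", "colour": None, "bought": False},
    {"customer": "C", "item": "shirt", "colour": "blue", "bought": True},
    {"customer": "D", "item": "shirt", "colour": "red", "bought": False},
    {"customer": "E", "item": "shirt", "colour": "red", "bought": True},
]


def complete(record):
    """Return True when no field of the record is missing."""
    return all(value is not None for value in record.values())


def purchase_rate_by(rows, key):
    """Share of complete records that ended in a purchase, grouped by `key`."""
    seen, bought = Counter(), Counter()
    for record in filter(complete, rows):
        seen[record[key]] += 1
        bought[record[key]] += record["bought"]  # True counts as 1, False as 0
    return {group: bought[group] / seen[group] for group in seen}


rates = purchase_rate_by(records, "colour")
print(rates)
```

Dropping incomplete records is only one way to handle missing data. It is used here because it is the simplest to show, not because it is always the right choice.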
We have many tools at our disposal for analysis and to simplify such problems. One prominent tool is SAS. This SAS Tutorial will help you understand SAS and how it can be used to solve our problems.
The analytics market has grown immensely in the last few years. This has resulted in an increase in the number of tools used. All of these are beneficial in one way or the other. So let us move ahead with our SAS tutorial and take a look at a few of the most widely used tools in the market.
SAS: It is one of the most widely used tools in the commercial analytics market. It offers a large set of statistical functions and a good GUI (Enterprise Guide and Enterprise Miner).
R: It is open-source software. R is well documented, which makes it easier to learn. It is cost effective and has strong statistical capabilities.
Python: It is another widely used open-source scripting language. Python usage has grown over time. Today, it offers libraries such as NumPy, SciPy and Matplotlib. You can perform most statistical operations and build many kinds of models using these libraries.
Let us move ahead with our SAS tutorial and take a look at a few important SAS components:
Base SAS: It is the most widely used component. It provides data management facilities, and you can also use it for data analysis.
SAS/GRAPH: With SAS/GRAPH you can represent data as graphs. This makes data visualization easy.
SAS/STAT: It lets you perform statistical analysis, such as analysis of variance, regression, multivariate, survival and psychometric analysis.
SAS/ETS: It is suited for Time Series Analysis.
Since this is an introductory article, we will focus on Base SAS, which should be easy for everyone to follow.
All The Key Considerations Before Installing SAS 9.4
- The new version of SAS 9.4 (TS1M1) will work on 32-bit or 64-bit Operating Systems.
- SAS 9.4 can only be installed on the Professional, Ultimate or Enterprise versions of Windows 7; the Pro or Enterprise versions of Windows 8 and Windows 8.1; and Windows 10.
- SAS requires 10 - 15 GB of hard drive space.
- This installation will take at least 3 hours to complete after you have downloaded and decompressed the installation media.
- It is strongly recommended that you download the installation files over a wired network connection. A wireless connection can dramatically increase your download time (some users have reported 24 hours or more), and files downloaded that slowly can end up corrupted or damaged.
- If you are migrating from a previous version of SAS, please read all vendor documentation and follow the recommended steps for your current version. http://support.sas.com/rnd/migration/utility/upgrade.html
- Make sure to back up any data files before installing or upgrading.
Some checks before starting Installation of SAS 9.4:
- If you have not reviewed them yet, please refer to the System Requirements.
- Review the appropriate requirements for the SAS Installer account:
- For Windows, the installer account must have Administrator rights (the user must be a local administrator on the machine and/or a member of the Administrators group).
- Check the compatibility of your computer's operating system. SAS 9.4 runs on the Professional, Ultimate and Enterprise versions of Windows 7 and the Pro and Enterprise versions of Windows 8 and Windows 8.1.
- Determine if your computer is running a 32-bit or 64-bit version of the operating system. SAS 9.4 is available in a 32-bit edition or a 64-bit edition.
- Make sure you have sufficient storage space on your computer. SAS 9.4 requires 10 to 15 gigabytes of disk space, including documentation. If you are licensed for the Teaching and Research version of SAS, installing all components of the program requires over 15 gigabytes of disk space, including documentation.
- Close all open software applications and disable any virus scanning software on your computer.
- Make sure that your computer is connected to a wired network connection. Downloading the installation files from a wireless connection is not recommended as these files are very large and require a significant amount of time and bandwidth to download. If you get an error at any point during your download, we strongly recommend that you start the download over again.
- For complete details on system requirements for SAS 9.4 please visit the documentation prepared by SAS on SAS System Requirements Windows x64 or SAS System Requirements Windows.
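Two of the checks above (32-bit versus 64-bit, and free disk space) can be scripted. The following is a minimal sketch in Python using only the standard library. The 15 GB threshold is taken from the figures quoted above. The helper names are made up for illustration, and the "contains 64" test on the machine name is a rough heuristic rather than an official check.

```python
import platform
import shutil

REQUIRED_GB = 15  # upper end of the 10 to 15 GB range quoted above


def looks_64bit(machine):
    """Heuristic: machine names such as 'AMD64' or 'x86_64' contain '64'."""
    return "64" in machine


def free_gb(path):
    """Free space on the drive that holds `path`, in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_space(free, required=REQUIRED_GB):
    """True when the free space (in GB) meets the required amount."""
    return free >= required


machine = platform.machine()
free = free_gb(".")  # "." means the drive of the current working directory
print(f"Machine type: {machine or 'unknown'} (64-bit: {looks_64bit(machine)})")
print(f"Free space: {free:.1f} GB (enough: {has_enough_space(free)})")
```

Run the script from the drive on which you plan to install SAS, or replace "." with a path on that drive. The Windows edition (Professional, Enterprise and so on) still has to be confirmed manually.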
Step by step SAS 9.4 Installation guide
- Check your computer system to verify that you are using a 32-bit or 64-bit enterprise version of Windows 7 or Windows 8.
- Make sure your copy of Microsoft Windows is up to date and that all critical Windows updates have been installed. You can verify this by going to Windows Update and checking for available updates.
- When you download the SAS files from the ITS Software Download site, the files will be compressed as zip files. After you have downloaded the appropriate file, you will need a program such as WinZip to decompress it before you can install the software. Once such a program is installed on your computer, choose one of the folders and extract all the files in that folder to another folder on your desktop.
- If you are modifying an existing SAS deployment, perform a backup before you install your new SAS software.
- Back up the existing SASHOME directory (for example, C:\Program Files\SAS).
- To install your software, use the SAS Deployment Wizard in your SAS Software Depot and follow the steps outlined below.
- Double-click the SAS EXE (Setup file) at the root of the media to start the SAS Deployment Wizard.
- The SAS Deployment Wizard should launch. Select the language you wish to use for the installation process (English is the default) and click OK.
- The SAS Deployment Wizard may take several minutes to load and does not show a progress bar. This is normal.
- On the SAS Deployment Task window, choose Install SAS Software. Click Next.
- On the Specify SAS Home window, keep the default installation directory. Click Next.
- On the Select Deployment Type window, select Install SAS Foundation and Related Software. Click Next.
- Select “Install SAS software” in the End User Tasks section and then click Next.
- On the Select Products to Install window, check the box next to each individual SAS product you wish to install. The default configuration should have all the options selected that are covered by a standard SAS license. When you have finished selecting the SAS products to install, click Next.
- On the Select Java Runtime Environment window, we recommend that you select Use Recommended Java Runtime Environment unless you know you have a specific need for a particular different version. Click Next.
- If you are installing on a computer running the 64-bit version of Windows, you will see the following additional screens:
1. On the Select SAS Foundation Mode window, you must choose whether to install the 64-bit or 32-bit version of SAS Foundation.
2. The 32-bit version is more compatible with previous SAS files and other applications, while the 64-bit version allows more powerful computation. Click the radio button next to the version you wish to install. (We recommend that you accept the default values.) Click Next.
3. On the Select SAS Enterprise Guide Mode window, select which mode of SAS Enterprise Guide to install. (We recommend that you accept the default value.) Click Next.
4. On the Select SAS Add-In for Microsoft Office Mode window, select the version corresponding to the version of Microsoft Office installed on the machine. Click Next.
- On the Select SAS Foundation Products window, choose which individual components of the SAS Foundation you wish to install. We recommend that you accept the default choices. Click Next.
- On the Specify SAS Installation Data window, browse to the location of your current license file. This should already be in the default location. Click Next.
- On the Select Language Support window, choose which languages you wish to be supported on your installation. By default, all languages will be installed. We recommend that you first click the Clear All button, which deselects all languages except English. You may then add back any individual languages you would prefer. Click Next.
- On the Select Regional Settings window, the language should default to the primary language you chose when you started the installation. Click Next.
- On the Default Product for SAS File Types window, choose either the SAS Foundation or SAS Enterprise Guide. Click Next.
- On the Specify Document Conversion Host and Port window, we recommend that you accept the default values. Click Next.
- SAS should then run a system configuration check to make sure that the components and values you have selected are valid for your machine and operating system. This can take several minutes. If it returns errors, click < Back to return to the problematic selection and choose a different value. If it passes, click Next.
- Review the installation options. If they are correct, click Start. Once the installation has completed, click Next.
- On the Select Support Option window, choose the Send option to submit your license information to SAS. After choosing your option, click Next.
- On the Additional Resources window, click Finish.
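The two preparatory steps in the list above, decompressing the downloaded zip file and backing up an existing SASHOME directory, can also be done from a script instead of WinZip and a manual copy. This is a minimal sketch using Python's standard library. The function names are made up for illustration, and the paths in the usage note are placeholders that you would replace with your own.

```python
import shutil
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Decompress every file in zip_path into target_dir; return the file names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


def backup_directory(source_dir, backup_dir):
    """Copy source_dir (for example an existing SASHOME) to backup_dir.

    copytree raises an error if backup_dir already exists, which prevents
    an older backup from being silently overwritten.
    """
    return Path(shutil.copytree(source_dir, backup_dir))


# Example calls with placeholder paths (not run here):
# extract_archive(r"C:\Users\me\Downloads\sas_media.zip", r"C:\Users\me\Desktop\sas_media")
# backup_directory(r"C:\Program Files\SAS", r"D:\backups\SASHOME")
```

Only the C:\Program Files\SAS location comes from the steps above. The zip file name and the backup destination are assumptions.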
SAS as a Programming Language
Most programming environments are either menu driven (point-and-click) or command driven (enter and execute commands). SAS is neither. Instead, it uses a series of instructions, or statements, known as a SAS program. This program describes what you want to do and is written in the SAS language.
Data is central to every data set. In SAS, data is available in tabular form, where variables occupy the columns and observations occupy the rows.
SAS treats numbers as numeric data and everything else falls under character data. Hence SAS has two data types, numeric and character. Easy, isn’t it?
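The table layout and the two data types can be pictured with a small sketch. The code below is Python, not SAS, and the sample values and helper names are made up. It stores each observation as a row, treats each key as a variable, and classifies every variable as numeric or character using the rule described above.

```python
# Each dict is one observation (row); each key is one variable (column).
observations = [
    {"name": "Asha", "city": "Pune", "age": 31, "height_in": 64.5},
    {"name": "Ravi", "city": "Delhi", "age": 27, "height_in": 70.0},
    {"name": "Meera", "city": "Kochi", "age": 45, "height_in": 62.0},
]


def is_number(value):
    """True for ints and floats; bool is excluded although Python treats it as an int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def variable_type(values):
    """Numbers are numeric; everything else falls under character."""
    return "numeric" if all(is_number(v) for v in values) else "character"


def describe(rows):
    """Map each variable name to its type, based on the values in its column."""
    return {name: variable_type([row[name] for row in rows]) for name in rows[0]}


types = describe(observations)
for name, kind in types.items():
    print(f"{name}: {kind}")
```

Here `name` and `city` come out as character variables, while `age` and `height_in` come out as numeric variables.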
The DATA step and the PROC step form the basic building blocks of a SAS program. What these building blocks do is the next topic of this SAS tutorial.
Building Blocks of SAS
We start a program with a DATA step to create a SAS data set and then pass the data on to a PROC step. The PROC step processes the data. To understand how DATA and PROC steps work, let us consider the following example.
Suppose I want to convert a number from inches to centimetres, store the result in a variable called ‘size’ and print it. The DATA step would convert the number from inches to centimetres, and the PROC step would print the result.
DATA and PROC steps are made up of statements. A step may contain anywhere from one to more than a hundred statements. It is important to remember that DATA steps are used to read and modify data, whereas PROC steps are used to analyse data, perform utility functions, or print reports.
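The division of labour in this example can be sketched as follows. This is a Python analogy, not SAS syntax. The function names `data_step` and `proc_print` and the input value of 10 inches are made up for illustration. One function plays the role of the DATA step by building the data, and the other plays the role of the PROC step by printing it.

```python
CM_PER_INCH = 2.54


def data_step(inches):
    """Role of the DATA step: build a one-row data set that holds `size`."""
    return [{"inches": inches, "size": inches * CM_PER_INCH}]


def format_value(value):
    """Show floats with two decimals and everything else unchanged."""
    return f"{value:.2f}" if isinstance(value, float) else str(value)


def proc_print(dataset):
    """Role of a printing PROC step: return the table as lines of text."""
    header = list(dataset[0].keys())
    lines = ["  ".join(header)]
    for row in dataset:
        lines.append("  ".join(format_value(row[name]) for name in header))
    return lines


size = data_step(10)           # create the data
for line in proc_print(size):  # process (print) the data
    print(line)
```

The point of the analogy is the hand-off. The first step only creates data and the second step only consumes it, which is how a DATA step followed by a PROC step behaves.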
DATA steps begin with the keyword DATA, which is followed by a name that you choose for your SAS data set. In the inches-to-centimetres example, the DATA step produces a data set named size. DATA steps read data from external data files and can also include loops and conditional logic. They can be used to merge, combine and concatenate data sets (sorting is typically handled by the SORT procedure). Similarly, procedures start with a PROC statement, in which the keyword PROC is followed by the name of the procedure (for example PRINT, SORT, or MEANS). Most SAS procedures have only a handful of possible statements. Each time SAS comes across a new step (marked by a DATA or PROC statement), it ends the previous step and starts a new one.
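The rule that a new DATA or PROC statement ends the previous step can be illustrated with a small sketch. The Python function below is a teaching aid, not part of SAS, and its name is made up. It splits a program into statements at each semicolon and opens a new step whenever a statement begins with DATA or PROC. The short SAS program in the string is an assumed version of the inches-to-centimetres example.

```python
def split_into_steps(program):
    """Group the statements of a SAS program into steps.

    A statement whose first word is DATA or PROC opens a new step and
    thereby closes the previous one. Simplification: the text is split at
    every semicolon, so semicolons inside quotes or comments are not handled.
    """
    steps = []
    for raw in program.split(";"):
        statement = " ".join(raw.split())  # collapse whitespace and newlines
        if not statement:
            continue
        keyword = statement.split()[0].upper()
        if keyword in ("DATA", "PROC") or not steps:
            steps.append([])
        steps[-1].append(statement)
    return steps


# Assumed SAS program for the inches-to-centimetres example.
program = """
data size;
    inches = 10;
    size = inches * 2.54;
proc print data=size;
run;
"""

steps = split_into_steps(program)
for number, step in enumerate(steps, start=1):
    print(f"Step {number}: {step}")
```

The sketch finds two steps. The DATA step ends where the PROC PRINT statement begins, even though no separate statement closes it.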
While a typical program starts with a DATA step to input or modify data and then passes the data to a PROC step, this is certainly not the only pattern for mixing DATA and PROC steps. Just as you can stack building blocks in any order, you can arrange DATA and PROC steps in any order. A program could even contain only DATA steps or only PROC steps. Nonetheless, you will find it much easier to write SAS programs if you understand what each kind of step does. These are a few basics every SAS beginner should know. Moving on to the next part of our SAS tutorial, let us understand how to install SAS University Edition.
SAS Institute Inc. has released SAS University Edition, which is available for free, so beginners can now learn and practice SAS programming. It includes all the features needed to learn Base SAS, and learning Base SAS will make it easier for you to learn the other components. The following steps will help you install SAS University Edition.