This page is a one-stop resource for information about the SAS Big Data Preparation, Statistics, and Visual Exploration (A00-220) certification exam. The SAS A00-220 exam summary, syllabus topics, and sample questions provide the basis for preparing for the actual SAS Certified Big Data Professional Using SAS 9 exam. We have designed these resources to help you get ready to take the exam.
The SAS Big Data Preparation, Statistics, and Visual Exploration credential is globally recognized for validating SAS Big Data Professional knowledge. With the SAS Certified Big Data Professional Using SAS 9 credential, you stand out in a crowd and show that you have the SAS Big Data Professional knowledge to make a difference within your organization. The SAS Big Data Preparation, Statistics, and Visual Exploration (A00-220) exam tests the candidate's knowledge in the following areas.
SAS A00-220 Exam Summary:
Exam Name: SAS Big Data Preparation, Statistics, and Visual Exploration
Exam Code: A00-220
Exam Duration: 110 minutes
Exam Questions: 55 to 60 multiple-choice questions
Passing Score: 67%
Exam Price: $180 (USD)
Training: SAS Academy for Data Science: Big Data
Exam Registration: Pearson VUE
Sample Questions: SAS Big Data Professional Certification Sample Question
Practice Exam: SAS Big Data Professional Certification Practice Exam
SAS A00-220 Exam Topics:
Objective  Details 

Data Management (50%)

Navigate within the Data Management Studio Interface 
- Register a new QKB
- Create and connect to a repository
- Define a data connection
- Specify Data Management Studio options
- Access the QKB
- Create a name-value macro pair
- Access the business rules manager
- Access the appropriate monitoring report
- Attach and detach primary tabs
Create, design and be able to explore data explorations and interpret results  
Define and create data collections from exploration results  
Create and explore a data profile 
- Create a data profile from different sources (text file, filtered table, SQL query)
- Interpret results (frequency distribution and pattern)
- Use collections from profile results
Design data standardization schemes 
- Build a scheme from profile results
- Build a scheme manually
- Update existing schemes
Create Data Jobs 
- Rename output fields
- Add nodes and preview nodes
- Run a data job
- View a log and settings
- Work with data job settings and data job displays
- Best practices (how to ensure that you are following a particular best practice), for example: insert notes, establish naming conventions
- Work with branching
- Join tables
- Apply the Field Layout node to control field order
- Work with the Data Validation node:
  - Add it to the job flow
  - Specify properties/review properties
  - Edit settings for the Data Validation node
- Work with data inputs
- Work with data outputs
- Profile data from within data jobs
- Interact with the repository from within data jobs
- Determine how data is processed
- Data job variables
- Set sorting properties for the Data Sorting node
- Set appropriate advanced properties options for the Data Sorting node
Apply a Standardization definition and scheme 
- Use a definition
- Use a scheme
- Be able to determine the differences between a definition and a scheme
- Explain what happens when you use both a definition and a scheme
- Review and interpret standardization results
- Be able to explain the different steps involved in the process of standardization
Apply Parsing definitions 
- Distinguish between different data types and their tokens
- Review and interpret parsing results
- Be able to explain the different steps involved in the process of parsing
- Use a parsing definition
Compare and contrast the Identification Analysis and Right Fielding nodes
- Review results
- Explain the technique used for identification (process of the definition)
Apply the Gender Analysis node to determine gender 
- Use a gender definition
- Interpret results
- Explain different techniques for accomplishing gender analysis
Create an Entity Resolution Job 
- Use the Clustering node in the data job and explain why you would want to use it
- Survivorship (surviving record identification):
  - Record rules
  - Field rules
  - Options for survivorship
- Discuss and apply the Cluster Diff node
- Apply cross-field matching (new option)
- Use the Match Codes node to select match definitions for selected fields:
  - Outline the various uses for match codes (join)
  - Use the definition
  - Interpret the results
  - Match versus match parsed
  - Explain the process for creating a match code
  - Select sensitivity for a selected match definition
  - Apply matching best practices
Define and create business rules 
- Use Business Rules Manager
- Create a new business rule:
  - Name/label rule
  - Specify type of rule
  - Define checks
  - Specify fields
- Distinguish between different types of business rules:
  - Row
  - Set
  - Group
- Apply business rules:
  - Profile
  - Execute Business Rule node
- Use of Expression Builder
- Apply best practices
Describe the organization, structure and basic navigation of the QKB 
- Identify and describe locale levels (global, language, country)
- Navigate the QKB (tab structure, copy definitions, etc.)
- Identify data types and tokens
Be able to articulate when to use the various components of the QKB 
- Components include:
  - Regular expressions
  - Schemes
  - Phonetics library
  - Vocabularies
  - Grammar
  - Chop tables
Define the processing steps and components used in the different definition types 
- Identify/describe the different definition types:
  - Parsing
  - Standardization
  - Match
  - Identification
  - Casing
  - Extraction
  - Locale guess
  - Gender
  - Patterns
ANOVA and Regression (30%)

Verify the assumptions of ANOVA 
- Explain the central limit theorem and when it must be applied
- Examine the distribution of continuous variables (histogram, box-and-whisker, Q-Q plots)
- Describe the effect of skewness on the normal distribution
- Define H0, H1, Type I/II error, statistical power, p-value
- Describe the effect of sample size on p-value and power
- Interpret the results of hypothesis testing
- Interpret histograms and normal probability charts
- Draw conclusions about your data from histogram, box-and-whisker, and Q-Q plots
- Identify the kinds of problems that may be present in the data (biased sample, outliers, extreme values)
- For a given experiment, verify that the observations are independent
- For a given experiment, verify that the errors are normally distributed
- Use the UNIVARIATE procedure to examine residuals
- For a given experiment, verify that all groups have equal response variance
- Use the HOVTEST option of the MEANS statement in PROC GLM to assess response variance
Analyze differences between population means using the GLM and TTEST procedures 
- Use the GLM procedure to perform ANOVA:
  - CLASS statement
  - MODEL statement
  - MEANS statement
  - OUTPUT statement
- Evaluate the null hypothesis using the output of the GLM procedure
- Interpret the statistical output of the GLM procedure (variance derived from MSE, F value, p-value, R², Levene's test)
- Interpret the graphical output of the GLM procedure
- Use the TTEST procedure to compare means
Perform ANOVA post hoc tests to evaluate treatment effect
- Use the LSMEANS statement in the GLM or PLM procedure to perform pairwise comparisons
- Use the PDIFF option of the LSMEANS statement
- Use the ADJUST option of the LSMEANS statement (TUKEY and DUNNETT)
- Interpret diffograms to evaluate pairwise comparisons
- Interpret control plots to evaluate pairwise comparisons
- Compare/contrast use of pairwise t-tests, Tukey, and Dunnett comparison methods
- PLM
Detect and analyze interactions between factors 
- Use the GLM procedure to produce reports that help determine the significance of the interaction between factors:
  - MODEL statement
  - LSMEANS with the SLICE= option (also using PROC PLM)
  - ODS SELECT
- Interpret the output of the GLM procedure to identify interaction between factors:
  - p-value
  - F value
  - R-squared
  - Type I SS
  - Type III SS
Fit a multiple linear regression model using the REG and GLM procedures 
- Use the REG procedure to fit a multiple linear regression model
- Use the GLM procedure to fit a multiple linear regression model
Analyze the output of the REG, PLM, and GLM procedures for multiple linear regression models 
- Interpret REG or GLM procedure output for a multiple linear regression model:
  - Convert models to algebraic expressions
  - Identify missing degrees of freedom
  - Identify variance due to model/error, and total variance
  - Calculate a missing F value
  - Identify the variable with the largest impact on the model
  - For output from two models, identify which model is better
  - Identify how much of the variation in the dependent variable is explained by the model
- Conclusions that can be drawn from REG, GLM, or PLM output (about H0, model quality, graphics)
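The ANOVA-table items in this objective (missing degrees of freedom, a missing F value, share of variation explained) rest on a few standard identities: total df = model df + error df, mean square = sum of squares / df, F = model mean square / error mean square, and R-squared = model SS / total SS. The following is a minimal Python sketch of that arithmetic only. The function name and every number in it are made up for illustration; they are not SAS output and do not come from the exam.

```python
def complete_anova_table(ss_model, ss_error, df_model, df_total):
    """Fill in the missing cells of an ANOVA table from the known ones.

    Hypothetical helper for illustration; it mirrors the layout of the
    Analysis of Variance table but is not part of any SAS product.
    """
    df_error = df_total - df_model          # df total = df model + df error
    ss_total = ss_model + ss_error          # SS total = SS model + SS error
    ms_model = ss_model / df_model          # mean square = SS / df
    ms_error = ss_error / df_error          # this is the MSE
    return {
        "df_error": df_error,
        "ss_total": ss_total,
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_value": ms_model / ms_error,     # F = MS model / MS error
        "r_square": ss_model / ss_total,    # share of variation explained
    }


# Made-up example: 3 model df, 23 total df, SS split 600 / 400.
table = complete_anova_table(ss_model=600.0, ss_error=400.0,
                             df_model=3, df_total=23)
for name, value in table.items():
    print(f"{name}: {value}")
```

Working backwards is the same algebra: if the table shows F and the error mean square but not the model mean square, multiply the two to recover it.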
Use the REG or GLMSELECT procedure to perform model selection 
- Use the SELECTION option of the MODEL statement in the GLMSELECT procedure
- Compare the different model selection methods (STEPWISE, FORWARD, BACKWARD)
- Enable ODS graphics to display graphs from the REG or GLMSELECT procedure
- Identify best models by examining the graphical output (fit criterion from the REG or GLMSELECT procedure)
- Assign names to models in the REG procedure (multiple MODEL statements)
Assess the validity of a given regression model through the use of diagnostic and residual analysis 
- Explain the assumptions for linear regression
- From a set of residual plots, assess which assumption about the error terms has been violated
- Use REG procedure MODEL statement options to identify influential observations (Student Residuals, Cook's D, DFFITS, DFBETAS)
- Explain options for handling influential observations
- Identify collinearity problems by examining REG procedure output
- Use MODEL statement options to diagnose collinearity problems (VIF, COLLIN, COLLINOINT)
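For the collinearity items above, it can help to see what a variance inflation factor measures. The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. In the special case of exactly two predictors, that R² equals their squared Pearson correlation, which keeps the sketch below dependency-free. This is plain Python for the arithmetic, not SAS code; the function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model.

    Assumes exactly two predictors, so the auxiliary R-squared is r**2.
    With more predictors a full auxiliary regression would be needed.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up data: moderately correlated predictors, then uncorrelated ones.
correlated = vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
uncorrelated = vif_two_predictors([-1, 0, 1], [1, -2, 1])
print("VIF, correlated predictors:", correlated)
print("VIF, uncorrelated predictors:", uncorrelated)
```

A VIF of 1 means the predictor shares no linear information with the others, and larger values mean its coefficient's variance is inflated by that factor. Any cutoff for "too large" is a rule of thumb and not something this sketch decides.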
Perform logistic regression with the LOGISTIC procedure 
- Identify experiments that require analysis via logistic regression
- Identify logistic regression assumptions
- Logistic regression concepts (log odds, logit transformation, sigmoidal relationship between p and X)
- Use the LOGISTIC procedure to fit a binary logistic regression model (MODEL and CLASS statements)
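The logistic regression concepts listed above (log odds, the logit transformation, and the sigmoidal relationship between p and X) can be illustrated with a few lines of arithmetic. The sketch below is plain Python, not SAS, and the intercept and slope are invented values chosen only to show the shape of the curve.

```python
import math


def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Map a linear predictor (log odds) back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for a one-predictor model: logit(p) = b0 + b1 * x
b0, b1 = -2.0, 0.5

# The log odds are linear in x, while p follows an S-shaped curve in x.
curve = [(x, inverse_logit(b0 + b1 * x)) for x in range(0, 9, 2)]
for x, p in curve:
    print(f"x={x}  log odds={b0 + b1 * x:+.1f}  p={p:.3f}")

# A one-unit increase in x multiplies the odds by exp(b1): the odds ratio.
odds_ratio = math.exp(b1)
print("odds ratio per unit of x:", odds_ratio)
```

The same relationship is what lets a coefficient estimate be read as a change in log odds and its exponential as an odds ratio.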
Optimize model performance through input selection 
- Use the LOGISTIC procedure to fit a multiple logistic regression model
- LOGISTIC procedure SELECTION=SCORE option
- Perform model selection (STEPWISE, FORWARD, BACKWARD) within the LOGISTIC procedure
Interpret the output of the LOGISTIC procedure 
- Interpret the output from the LOGISTIC procedure for binary logistic regression models:
  - Model Convergence section
  - Testing Global Null Hypothesis table
  - Type 3 Analysis of Effects table
  - Analysis of Maximum Likelihood Estimates table
  - Association of Predicted Probabilities and Observed Responses
Visual Data Exploration (20%)

Examine, modify, and create data items 
- Create and use parameterized data items
- Examine data item properties and measure details
- Change data item properties
- Create custom sorts
- Create distinct counts
- Create aggregated measures
- Create calculated items
- Create hierarchies
- Create custom categories
Select and work with data sources 
- Work with multiple data sources
- Change data sources
- Refresh data sources
Create, modify, and interpret automatic chart visualizations in Visual Analytics Explorer 
- Identify default visualizations
- Identify the properties available in an automatic chart
Create, modify, and interpret graph and table visualizations in Visual Analytics Explorer 
- Work with list table visualizations
- Work with crosstab visualizations
- Work with bar chart visualizations
- Work with line chart visualizations
- Work with scatter plot visualizations
- Work with bubble plot visualizations
- Work with histogram visualizations
- Work with box plot visualizations
- Work with heat map visualizations
- Work with geo map visualizations
- Work with treemap visualizations
- Work with correlation matrix visualizations
Enhance visualizations with analytics within Visual Analytics Explorer 
- Add fit lines to visualizations
- Create forecasts
- Interpret word clouds
Interact with visualizations and explorations within Visual Analytics Explorer 
- Control appearance of visualizations within explorations
- Add comments to visualizations and explorations
- Use filters on data source and visualizations
- Share explorations
- Share visualizations
SAS created this credential to assess a candidate's knowledge and understanding of the areas listed above through the certification exam. The SAS Big Data Professional (A00-220) certification carries weight in the market because of the SAS brand behind it. Candidates are strongly advised to study thoroughly and get plenty of hands-on practice in order to pass the SAS Big Data Preparation, Statistics, and Visual Exploration exam without difficulty.