Welcome to your one-stop solution for all the information you need to excel in the SAS Advanced Predictive Modeling (A00-225) Certification exam. This page provides an in-depth overview of the SAS A00-225 Exam Summary, Syllabus Topics, and Sample Questions, designed to lay the foundation for your exam preparation. We aim to help you achieve your SAS Advanced Analytics Professional certification goals. Our detailed syllabus outlines each topic covered in the exam, so you can focus on the areas that matter most. With our sample questions and practice exams, you can gauge your readiness and build confidence before taking the SAS Advanced Predictive Modeling exam.
Why SAS Advanced Predictive Modeling Certification Matters
The SAS A00-225 exam is globally recognized for validating your knowledge and skills. With the SAS Advanced Analytics Professional credential, you stand out in a competitive job market and demonstrate the expertise to make significant contributions within your organization. The SAS Advanced Predictive Modeling Certification exam tests your proficiency in the syllabus topics listed below.
SAS A00-225 Exam Summary:
Exam Name  SAS Advanced Predictive Modeling 
Exam Code  A00-225
Exam Duration  110 minutes 
Exam Questions  50-55
Passing Score  67% 
Exam Price  $180 (USD) 
Books / Training
 SAS Academy for Data Science: Advanced Analytics Professional Subscription
 Neural Network Modeling
 Predictive Modeling Using Logistic Regression
 Data Mining Techniques: Predictive Analytics on Big Data
 Using SAS to Put Open Source Models into Production
Exam Registration  Pearson VUE 
Sample Questions  SAS Advanced Predictive Modeling Certification Sample Question 
Practice Exam  SAS Advanced Predictive Modeling Certification Practice Exam 
SAS A00-225 Exam Syllabus Topics:
Objective  Details 

Neural Networks  20% 

Describe key concepts underlying neural networks 
 Use SAS procedures to perform nonlinear modeling
 Explain advantages and disadvantages of using neural networks compared to other approaches

Use two architectures offered by the Neural Network node to model either linear or nonlinear input-output relationships
 Define the linear perceptron neural network
 Demonstrate how a linear perceptron is a generalized linear model that can model many target distributions
 Construct multilayer perceptrons
 Construct radial basis function networks
 Identify advantages of using a radial basis function network over using a multilayer perceptron (and the reverse)
Use optimization methods offered by the SAS Enterprise Miner Neural Network node to efficiently search the parameter space in a neural network 
 Describe the problem of local minima
 Explain the rationale behind the initialization settings
 Explain how early stopping and weight decay can be used to help avoid bad local minima
 Describe parameter estimation methods and determine the best method to use
 List the assortment of error functions that are available in the Neural Networks node and determine the appropriate one to use based upon statistical considerations
 List the optimization (training) techniques available in the Neural Networks node and determine the appropriate method to use based upon statistical considerations

Construct custom network architectures by using the NEURAL procedure (PROC NEURAL)
 Working with SAS Enterprise Miner, use selected NEURAL procedure statements and PROC DMDB to construct neural networks
 Define Sequential Network Construction (SNC) and use it to build an MLP (multilayer perceptron)
Based upon statistical considerations, use either time-delayed neural networks or surrogate models to augment neural networks
 Given a particular scenario/problem, use the time-delayed neural network (TDNN) model to conduct time series analysis
 Apply a surrogate model to help understand a neural network's predictions

Use the HP Neural node to perform high-speed training of a neural network
Logistic Regression  30% 

Score new data sets using the LOGISTIC and PLM procedures 
 Use the SCORE statement in the PLM procedure to score new cases
 Use the CODE statement in PROC LOGISTIC to score new data
 Describe when you would use the SCORE statement vs the CODE statement in PROC LOGISTIC
 Use the INMODEL/OUTMODEL options in PROC LOGISTIC
 Explain how to score new data when you have developed a model from a biased sample
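The last detail above, scoring with a model developed from a biased (oversampled) sample, usually comes down to correcting predicted probabilities for the gap between the sample and population event rates. The Python sketch below illustrates that arithmetic only; it is not SAS syntax, and the function name `adjust_probability` and its arguments (`rho1` for the sample event proportion, `pi1` for an assumed population event proportion) are hypothetical.

```python
def adjust_probability(p, rho1, pi1):
    """Move a probability from the oversampled scale to the population scale.

    p    -- predicted event probability from the model fit on the biased sample
    rho1 -- proportion of events in the (oversampled) training data
    pi1  -- assumed proportion of events in the population
    """
    rho0 = 1.0 - rho1
    pi0 = 1.0 - pi1
    numerator = p * rho0 * pi1
    # Equivalent to subtracting log((rho1 * pi0) / (rho0 * pi1)) from the logit.
    return numerator / ((1.0 - p) * rho1 * pi0 + numerator)


if __name__ == "__main__":
    # A 50/50 sample drawn from a population with a 2% event rate:
    # a sample-scale probability of 0.5 maps back to roughly 0.02.
    print(adjust_probability(0.5, rho1=0.5, pi1=0.02))
```

When the sample and population event rates are equal, the function returns the probability unchanged, which is a quick sanity check on the formula.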
Identify the potential challenges when preparing input data for a model 
 Identify problems that missing values can cause in creating predictive models and scoring new data sets
 Identify limitations of Complete Case Analysis
 Explain problems caused by categorical variables with numerous levels
 Discuss the problem of irrelevant and redundant variables
 Discuss nonlinearities and the problems they create in predictive models
 Discuss outliers and the problems they create in predictive models
 Describe quasi-complete separation
 Discuss the effect of interactions
 Determine when it is necessary to oversample data
Use the DATA step to manipulate data with loops, arrays, conditional statements and functions 
 Use ARRAYs to create missing indicators
 Use ARRAYs, DO loops, IF statements, and explicit OUTPUT statements
Improve the predictive power of categorical inputs 
 Reduce the number of levels of a categorical variable
 Explain thresholding
 Explain Greenacre's method
 Cluster the levels of a categorical variable via Greenacre's method using the CLUSTER procedure
 Convert categorical variables to continuous using smoothed weight of evidence
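As a rough illustration of the smoothed weight of evidence idea, the Python sketch below uses one common formulation: each level's log odds is shrunk toward the overall log odds by adding `c` pseudo-observations split according to the overall event rate. This is an assumption about the formula, not SAS code; the name `smoothed_woe` and the default `c=24` are illustrative choices only.

```python
import math


def smoothed_woe(events, nonevents, rho1, c=24.0):
    """Smoothed log odds for one level of a categorical input.

    events, nonevents -- target counts observed in this level
    rho1              -- overall proportion of events in the training data
    c                 -- smoothing strength (assumed value; larger means more shrinkage)
    """
    return math.log((events + c * rho1) / (nonevents + c * (1.0 - rho1)))


if __name__ == "__main__":
    overall_event_rate = 0.1
    # A rare level with one event and no nonevents has an infinite raw log odds,
    # but the smoothed value stays finite and is pulled toward the overall log odds.
    print(smoothed_woe(1, 0, overall_event_rate))
    # A well-populated level is barely affected by the smoothing.
    print(smoothed_woe(500, 500, overall_event_rate))
```

The design point is that levels with few cases get values close to the overall log odds, while levels with many cases keep values close to their own observed log odds.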
Screen variables for irrelevance and nonlinear association using the CORR procedure 
 Explain how Hoeffding's D and Spearman statistics can be used to find irrelevant variables and nonlinear associations
 Produce Spearman and Hoeffding's D statistics using the CORR procedure (VAR, WITH statement)
 Interpret a scatter plot of Hoeffding's D and Spearman statistics to identify irrelevant variables and nonlinear associations
Screen variables for nonlinearity using empirical logit plots 
 Use the RANK procedure to bin continuous input variables (GROUPS=, OUT= option; VAR, RANK statements)
 Interpret RANK procedure output
 Use the MEANS procedure to calculate the sum and means for the target cases and total events (NWAY option; CLASS, VAR, OUTPUT statements)
 Create empirical logit plots with the GPLOT procedure
 Interpret empirical logit plots
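The bin-then-summarize workflow above can be sketched in Python to show what an empirical logit plot actually plots. This is an illustration under stated assumptions, not a translation of the SAS procedures: the equal-count binning only approximates what PROC RANK with GROUPS= produces, the `sqrt(n)/2` term is a commonly used small-sample adjustment that keeps the logit finite, and the name `empirical_logits` is hypothetical.

```python
import math


def empirical_logits(x, y, groups=4):
    """Return (mean of x, empirical logit) for each rank-based bin.

    x -- values of a continuous input
    y -- binary target values (0 or 1), same length as x
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    bins = {}
    for rank, i in enumerate(order):
        b = rank * groups // n  # equal-count bin index, 0 .. groups-1
        count, events, x_sum = bins.get(b, (0, 0, 0.0))
        bins[b] = (count + 1, events + y[i], x_sum + x[i])

    points = []
    for b in sorted(bins):
        count, events, x_sum = bins[b]
        adjust = math.sqrt(count) / 2.0  # avoids log(0) when a bin has no events or no nonevents
        logit = math.log((events + adjust) / (count - events + adjust))
        points.append((x_sum / count, logit))
    return points


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4, 5, 6, 7]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    for mean_x, logit in empirical_logits(x, y, groups=2):
        print(mean_x, logit)
```

Plotting the returned points with the bin mean on the horizontal axis and the logit on the vertical axis gives the empirical logit plot; a roughly straight line suggests the input can enter the model linearly.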
Apply the principles of honest assessment to model performance measurement 
 Explain techniques to honestly assess classifier performance
 Explain overfitting
 Explain differences between validation and test data
 Identify the impact of performing data preparation before data is split
Assess classifier performance using the confusion matrix 
 Explain the confusion matrix
 Define: Accuracy, Error Rate, Sensitivity, Specificity, PV+, PV-
 Explain the effect of oversampling on the confusion matrix
 Adjust the confusion matrix for oversampling
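The measures named above follow directly from the four cells of the confusion matrix. The Python sketch below computes them and then shows one standard way to adjust the matrix for oversampling: sensitivity and specificity are conditional on the actual class, so they carry over from the oversampled data and are reweighted by an assumed population event rate. Function and argument names (`confusion_metrics`, `adjust_for_oversampling`, `pi1`) are hypothetical, and the code assumes no cell sum is zero.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Standard measures from confusion-matrix counts (or proportions)."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "pv_plus": tp / (tp + fp),      # positive predicted value
        "pv_minus": tn / (tn + fn),     # negative predicted value
    }


def adjust_for_oversampling(sensitivity, specificity, pi1):
    """Rebuild confusion-matrix proportions for a population event rate pi1."""
    pi0 = 1.0 - pi1
    return {
        "tp": pi1 * sensitivity,
        "fn": pi1 * (1.0 - sensitivity),
        "tn": pi0 * specificity,
        "fp": pi0 * (1.0 - specificity),
    }


if __name__ == "__main__":
    # Counts from a balanced (oversampled) validation set.
    sample = confusion_metrics(tp=40, fp=20, fn=10, tn=30)
    print(sample["pv_plus"])

    # Same classifier, re-expressed for an assumed 2% population event rate.
    cells = adjust_for_oversampling(sample["sensitivity"], sample["specificity"], pi1=0.02)
    population = confusion_metrics(cells["tp"], cells["fp"], cells["fn"], cells["tn"])
    print(population["pv_plus"])
```

In this example PV+ drops sharply after adjustment even though sensitivity and specificity are unchanged, which is the effect of oversampling that the syllabus item refers to.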
Model selection and validation using training and validation data 
 Divide data into training and validation data sets using the SURVEYSELECT procedure
 Discuss the subset selection methods available in PROC LOGISTIC
 Discuss methods to determine interactions (forward selection, with bar and @ notation)
 Create an interaction plot with the results from PROC LOGISTIC
 Select the model with fit statistics (BIC, AIC, KS, Brier score)
Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection 
 Explain and interpret charts (ROC, Lift, Gains)
 Create a ROC curve (OUTROC option of the SCORE statement in the LOGISTIC procedure)
 Use the ROC and ROCCONTRAST statements to create an overlay plot of ROC curves for two or more models
 Explain the concept of depth as it relates to the gains chart
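To make the notion of depth concrete, the Python sketch below ranks cases by predicted probability, takes the top fraction (the depth), and reports the cumulative gain, the response rate, and the lift at that depth. It is a minimal illustration with hypothetical names (`gains_at_depth`), not the way SAS produces these charts.

```python
def gains_at_depth(probs, actuals, depth):
    """Cumulative gain, response rate, and lift for the top `depth` fraction.

    probs   -- predicted event probabilities
    actuals -- binary outcomes (0 or 1), same length as probs
    depth   -- fraction of cases selected, between 0 and 1
    """
    ranked = sorted(zip(probs, actuals), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    k = max(1, round(depth * n))  # number of cases selected at this depth
    captured = sum(actual for _, actual in ranked[:k])
    total_events = sum(actuals)

    gain = captured / total_events              # share of all events captured
    response_rate = captured / k                # event rate among selected cases
    lift = response_rate / (total_events / n)   # relative to the overall event rate
    return gain, response_rate, lift


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    actuals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(gains_at_depth(probs, actuals, depth=0.2))
```

Evaluating the function over a grid of depths and plotting the results gives the gains and lift charts; at a depth of 1.0 the gain is 1 and the lift is 1 by construction.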
Establish effective decision cutoff values for scoring 
 Illustrate a decision rule that maximizes the expected profit
 Explain the profit matrix and how to use it to estimate the profit per scored customer
 Calculate decision cutoffs using Bayes rule, given a profit matrix
 Determine optimum cutoff values from profit plots
 Given a profit matrix and model results, determine the model with the highest average profit
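The Bayes rule cutoff follows from comparing the expected profit of the two decisions: act when p * profit_tp + (1 - p) * profit_fp exceeds p * profit_fn + (1 - p) * profit_tn. The Python sketch below solves that inequality for p and then averages realized profit over scored cases. The function names and the example profit values are assumptions made for illustration.

```python
def bayes_cutoff(profit_tp, profit_fp, profit_fn=0.0, profit_tn=0.0):
    """Probability above which acting has the higher expected profit."""
    gain_if_event = profit_tp - profit_fn      # benefit of acting on a true event
    loss_if_nonevent = profit_tn - profit_fp   # cost of acting on a nonevent
    if gain_if_event <= 0 or loss_if_nonevent <= 0:
        raise ValueError("profit matrix does not define a cutoff between 0 and 1")
    return loss_if_nonevent / (loss_if_nonevent + gain_if_event)


def average_profit(probs, actuals, cutoff, profit_tp, profit_fp,
                   profit_fn=0.0, profit_tn=0.0):
    """Average profit per scored case when acting on cases with p >= cutoff."""
    total = 0.0
    for p, actual in zip(probs, actuals):
        if p >= cutoff:
            total += profit_tp if actual == 1 else profit_fp
        else:
            total += profit_fn if actual == 1 else profit_tn
    return total / len(probs)


if __name__ == "__main__":
    # Assumed profit matrix: +99 for a solicited responder, -1 for a solicited
    # nonresponder, 0 when no action is taken.
    cutoff = bayes_cutoff(profit_tp=99, profit_fp=-1)
    print(cutoff)
    print(average_profit([0.005, 0.02, 0.5, 0.009], [0, 0, 1, 1], cutoff, 99, -1))
```

Running `average_profit` for each candidate model at its own cutoff gives the comparison the last syllabus item asks for: the preferred model is the one with the highest average profit on validation data.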
Predictive Analytics on Big Data  40% 

Build and interpret a cluster analysis in SAS Visual Statistics 
 Assign roles for cluster analysis
 Set cluster matrix properties (number, seed, etc.)
 Select the proper inputs for the k-means algorithm for a given cluster analysis scenario
 Choose the number of clusters for a given cluster analysis scenario
 Set parallel coordinates properties for cluster analysis
 Interpret a cluster matrix
 Interpret a parallel coordinates plot
 Display summary statistics for clusters
 Interpret summary statistics for clusters
 Assign cluster IDs to the data within Visual Statistics
 Score observations into clusters based on the results from Visual Statistics
Explain SAS high-performance computing
 Identify limitations of traditional computing environments
 Describe the characteristics of SAS High-Performance Analytics procedures
 Compare SMP and MPP computing modes
 Distinguish between HPA and the LASR-related operation
Perform principal component analysis 
 Explain how principal component analysis is performed
 List the benefits and problems of principal component analysis
 Distinguish between clustering, variable clustering, and principal component analysis
 Determine the number of principal components to retain
 Compare IMSTAT, Visual Statistics, and High-Performance Computing nodes in Enterprise Miner
Analyze categorical targets using logistic regression in SAS Visual Statistics 
 Assign roles for logistic regression
 Assign properties for logistic regression
 Filter data used for logistic regression
 Interpret logistic regression results (fit summary, residual plots, ROC/Lift charts, etc.)
 Use Group By variables to perform binary logistic regression
Analyze categorical targets using decision trees in SAS Visual Statistics 
 Assign roles for decision trees
 Assign properties for decision trees
 Interpret decision tree results (trees, leaf statistics, assessment, etc.)
 Identify variable importance with decision trees for use in other analysis techniques
 Identify the splitting criteria used by Visual Statistics
Analyze categorical targets using decision trees in PROC IMSTAT 
 Use the DECISIONTREE statement to create decision trees
 Define input variables with the INPUT and NOMINAL options
 Create and retrieve saved trees for input data scoring with the SAVE, TREETAB, and ASSESS options
 Evaluate the output of ODS tables (DTREE, DTreeVarImpInfo, DTREESCORE, etc.) from decision trees
 Use the ASSESS statement to create data sets for evaluating the decision tree model
 Perform honest assessment on PROC IMSTAT decision trees
 Assess decision trees using ODS statistical graphics (SGPLOT)
Analyze categorical targets using logistic regression in PROC IMSTAT 
 Assign variables to roles for logistic regression in PROC IMSTAT
 Create logistic regression in PROC IMSTAT using the LOGISTIC statement
 Use selected options of the LOGISTIC statement (ROLEVAR, INPUTS, SCORE, CODE, SHOWSELECTED, SLSTAY=)
 Assess logistic regression models using ODS statistical graphics (SGPLOT)
 Perform honest assessment on PROC IMSTAT logistic regression
Build random forest models with PROC IMSTAT 
 Describe random forests
 Use the RANDOMWOODS statement to build a forest of trees
 Score data with the RANDOMWOODS score code
 List benefits of forests
 Interpret random forests
 Identify variable importance with forests for use in other analysis techniques
Analyze interval targets with SAS Visual Statistics 
 Build linear regression models in SAS Visual Statistics
 Assign roles for linear regression models
 Set properties for linear regression models
 Assess a linear regression model (evaluate fit summary statistics, residual plot, influence plot, summary table, etc.)
 Assess linear model assumption violations and recognize when a linear model is inadequate
 Build generalized linear models in SAS Visual Statistics
 Assign roles for generalized linear models
 Set properties for generalized linear models
 Assess a generalized linear model (evaluate fit summary statistics, residual plot, assessment, etc.)
Analyze interval targets with PROC IMSTAT 
 Use GENMODEL and GLM statements
 Distinguish between GENMODEL and GLM statements and the results of each
 Assign variables to roles for GENMODEL and GLM statements in PROC IMSTAT
 Create models with GENMODEL and GLM statements in PROC IMSTAT
 Use selected options of the GENMODEL and GLM statements in PROC IMSTAT
 Assess models using ODS statistical graphics (SGPLOT)
 Perform honest assessment on PROC IMSTAT linear models
Analyze zero-inflated models with the HP GLM node in Enterprise Miner
 Identify when it would be appropriate to use a mixture distribution
 Describe the link functions and distributions available in the HP GLM node
 Build a zero-inflated generalized linear model in Enterprise Miner
 Describe restrictions on roles and levels in input data sources for generalized linear models in Enterprise Miner
 Assess a zero-inflated generalized linear model (evaluate fit summary statistics, residual plot, assessment, etc.)
Open Source Models in SAS  10% 

Incorporate an existing R program into SAS Enterprise Miner 
 Enable R language statements to connect SAS to R
 Use the Open Source Integration node in SAS Enterprise Miner
 Use Enterprise Miner variable handles to alter an R script

Incorporate an existing Python program into SAS Enterprise Miner 
 Determine steps to perform in SAS to incorporate a Python model
 Determine nodes in Enterprise Miner to incorporate a Python model
 Determine the necessary setup requirements for running Python models in SAS
SAS has created this credential to assess your knowledge and understanding in the specified areas through the A00-225 certification exam. The SAS Advanced Analytics Professional exam holds significant value in the market due to the brand reputation of SAS. We highly recommend thorough study and extensive practice to ensure you pass the SAS Advanced Predictive Modeling exam with confidence.