## Predictive Modeling with R & RStudio

**1. Getting & Loading Your Data**

Before you can work with data, you have to get some. This lesson covers the basic ways data can be obtained: from the web, from APIs, from databases, and in various file formats. It also covers the basics of data cleaning and how to make data "tidy". Tidy data dramatically speeds up downstream data analysis tasks.

A. Mining Datasets

B. Access Your Data Anywhere

C. Big Data, EDW, CRM, ERP or Social Media

D. All Data Formats: CSV, XLSX, Hive, Spark, HTML, APIs etc.

E. Load Data from a Local File or Remote URL
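As a quick preview, loading a CSV in R is a one-liner whether the file is local or remote. A minimal sketch (the file names and URL below are placeholders, not real datasets):

```r
# Read a CSV from a local file (path is a placeholder)
sales <- read.csv("sales.csv")

# The same function reads directly from a remote URL
remote <- read.csv("https://example.com/data/sales.csv")

# Excel workbooks need an add-on package such as readxl
# install.packages("readxl")
library(readxl)
budget <- read_excel("budget.xlsx", sheet = 1)
```

Other formats follow the same pattern: a connector package (e.g. for Hive or Spark) exposes a read function that returns a data frame.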

**2. Understand Your Data Using Descriptive Analytics**

Descriptive analytics is a preliminary stage of data processing that creates a summary of historical data to yield useful information and possibly prepare the data for further analysis. Descriptive analytics is sometimes said to provide information about what happened.

A. Class Distribution

B. Data Summary

C. Standard Deviations

D. Skewness

E. Correlations

F. Hands On Lab
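The built-in `iris` dataset is enough to try each of these summaries; a minimal sketch (skewness comes from the e1071 package, an assumption of this example):

```r
data(iris)

table(iris$Species)           # class distribution
summary(iris)                 # per-column data summary
sapply(iris[, 1:4], sd)       # standard deviations

library(e1071)                # provides skewness()
sapply(iris[, 1:4], skewness)

cor(iris[, 1:4])              # correlation matrix
```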

**3. Understand Your Data Using Exploratory Data Visualization**

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

A. Get the Best Results with qplot, ggplot2 & ggvis

B. Univariate and Multivariate Visualization

C. Tips For Data Visualization

D. Hands On Lab
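A minimal sketch of univariate and multivariate plots with ggplot2 on the `iris` data (note that `qplot()` is a convenience wrapper that newer ggplot2 releases deprecate in favor of `ggplot()`):

```r
library(ggplot2)

# Univariate: distribution of a single variable
ggplot(iris, aes(Sepal.Length)) +
  geom_histogram(bins = 20)

# Multivariate: two variables, coloured by class
ggplot(iris, aes(Sepal.Length, Petal.Length, colour = Species)) +
  geom_point()
```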

**4. Prepare Data for Predictive Modeling**

Data preprocessing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviors or trends, and is likely to contain many errors. Data preprocessing is a proven method of resolving such issues.

A. Data Pre-Processing in R

B. Scale, Center, Standardize and Normalize Data

C. Box-Cox and Yeo-Johnson Transform

D. Principal Component Analysis Transform

E. Independent Component Analysis Transform

F. Tips For Data Transforms

G. Hands On Lab
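The caret package's `preProcess()` bundles most of these transforms behind a single interface; a sketch on the `iris` predictors (assuming caret is installed):

```r
library(caret)

# Standardize: center to mean 0, scale to sd 1
pp <- preProcess(iris[, 1:4], method = c("center", "scale"))
standardized <- predict(pp, iris[, 1:4])

# Box-Cox transform (values must be strictly positive)
bc <- preProcess(iris[, 1:4], method = "BoxCox")

# PCA transform, keeping components that explain 95% of variance
pca <- preProcess(iris[, 1:4], method = "pca", thresh = 0.95)
head(predict(pca, iris[, 1:4]))
```

Base R's `scale()` does the center-and-scale step on its own when no other transform is needed.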

**5. Inferential Statistics**

Statistical inference is the process of deducing properties of an underlying distribution by analysis of data. Inferential statistical analysis infers properties about a population: this includes testing hypotheses and deriving estimates.

A. Frequentists vs Bayesians

B. Hypothesis Testing

C. Central Limit Theorem

D. CI, P Value, ANOVA & T-test

E. Common Pitfalls of Hypothesis Testing

F. Model Selection

G. Hands On Lab
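Base R covers the core tests directly; a small sketch using simulated groups (the seed and group means are arbitrary choices for illustration):

```r
set.seed(42)
a <- rnorm(30, mean = 5.0)   # control group
b <- rnorm(30, mean = 5.5)   # treatment group

# Two-sample t-test: p-value and 95% confidence interval
tt <- t.test(a, b)
tt$p.value
tt$conf.int

# One-way ANOVA: does mean Sepal.Length differ by species?
fit <- aov(Sepal.Length ~ Species, data = iris)
summary(fit)
```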

**6. Resampling Methods & Estimating Model Accuracy**

There are many different tests to determine whether the predictive models you create are accurate, meaningful representations that will prove valuable to your organization, but which are best? Here, you will learn and apply the tests used to measure different models' results, and what makes each test so effective.

A. Estimating Model Accuracy

B. Data Split

C. Bootstrap & Bagging

D. k-fold Cross Validation

E. Tips For Evaluating Algorithms

F. Hands On Lab
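With caret, the split, bootstrap, and k-fold schemes are all configured through `trainControl()`. A sketch on `iris` (caret and the MASS package behind `method = "lda"` are assumed installed):

```r
library(caret)
set.seed(7)

# 80/20 train/test split, stratified by class
idx       <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# Estimate accuracy with 10-fold cross-validation
ctrl <- trainControl(method = "cv", number = 10)
fit  <- train(Species ~ ., data = train_set, method = "lda", trControl = ctrl)
fit$results$Accuracy

# Or with the bootstrap (25 resamples)
ctrl_boot <- trainControl(method = "boot", number = 25)
```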

**7. Statistical Modeling Algorithms**

A statistical model is a class of mathematical model that embodies a set of assumptions about how some sample data, and similar data from a larger population, are generated. In simple terms, statistical modeling is a simplified, mathematically formalized way to approximate reality and, optionally, to make predictions from that approximation. The statistical model is the mathematical equation used to make those predictions.

A. Linear Regression

B. Non-linear Regression

C. Logistic Regression

D. Hands On Lab
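All three model families are available in base R; a sketch on the built-in `mtcars` data (the exponential form and starting values for `nls()` are illustrative choices, not the course's prescribed model):

```r
# Linear regression
lin <- lm(mpg ~ wt + hp, data = mtcars)
summary(lin)$r.squared

# Logistic regression: transmission type (0 = automatic, 1 = manual)
logi <- glm(am ~ wt, data = mtcars, family = binomial)
head(predict(logi, type = "response"))   # predicted probabilities

# Non-linear regression: mpg modelled as exponential decay in weight
nl <- nls(mpg ~ a * exp(b * wt), data = mtcars,
          start = list(a = 50, b = -0.3))
coef(nl)
```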

**8. Compare Performance of Multiple Algorithms**

Comparison here means finding the alternative with the most cost-effective or highest achievable performance under the given constraints, by maximizing desired factors and minimizing undesired ones. Maximization, by contrast, means trying to attain the highest possible result or outcome without regard to cost or expense.

A. Pre-Processing Dataset

B. Train, Tune and Test Models

C. Comparison & Evaluation Metrics in R

D. Accuracy, Kappa, RMSE and R2

E. AUC, ROC Curve, and Logarithmic Loss

F. Choose The Best Predictive Model

G. Hands On Lab
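caret's `resamples()` makes side-by-side comparison straightforward once several models are trained on identical folds. A sketch (the lda, rpart, and knn methods assume the MASS and rpart packages are installed):

```r
library(caret)
set.seed(7)
ctrl <- trainControl(method = "cv", number = 10)

m_lda  <- train(Species ~ ., data = iris, method = "lda",   trControl = ctrl)
m_cart <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)
m_knn  <- train(Species ~ ., data = iris, method = "knn",   trControl = ctrl)

# Collect per-fold Accuracy and Kappa, then compare
results <- resamples(list(LDA = m_lda, CART = m_cart, KNN = m_knn))
summary(results)
dotplot(results)
```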

**9. Combine Prediction Models with Ensembles**

Ensemble modeling is the process of running two or more related but different analytical models and then synthesizing the results into a single score or spread in order to improve the accuracy of predictive analytics and data mining applications.

A. Increase The Accuracy Of Your Models

B. Test Dataset

C. Boosting Algorithms

D. Bagging Algorithms

E. Stacking Algorithms

F. Hands On Lab
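A sketch of the three ensemble styles (the randomForest, gbm, and caretEnsemble packages are assumptions of this example, not requirements stated by the course):

```r
set.seed(7)

# Bagging: a random forest averages many trees fit on bootstrap samples
library(randomForest)
rf <- randomForest(Species ~ ., data = iris, ntree = 200)

# Boosting: trees are added sequentially, each correcting the last
library(gbm)
gb <- gbm(Species ~ ., data = iris,
          distribution = "multinomial", n.trees = 100)

# Stacking: a meta-learner combines the base models' predictions
# library(caretEnsemble)   # see caretList() and caretStack()
```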

**10. Moving Your Model to Production**

The solution to operationalizing predictive models involves the effective combination of a Decision Management approach with a robust, modern analytic technology platform. Such a combination focuses analytics on the right problems and effectively integrates analytical results directly into operational systems for faster and more profitable decisions. You will learn how to:

A. Finalize Your Machine Learning Model

B. Make Predictions On New Data

C. Create A Standalone Model

D. Save and Load Your Model

E. Hands On Lab
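In R, the save-and-score cycle is typically just `saveRDS()`/`readRDS()`. A sketch with a random forest (the file name and the new observation are illustrative):

```r
library(randomForest)
set.seed(7)

# Finalize: train the standalone model on all available data
final_model <- randomForest(Species ~ ., data = iris)

# Save the finalized model to disk
saveRDS(final_model, "final_model.rds")

# Later, in production: load the model and score new data
model   <- readRDS("final_model.rds")
new_obs <- data.frame(Sepal.Length = 5.1, Sepal.Width = 3.5,
                      Petal.Length = 1.4, Petal.Width = 0.2)
predict(model, new_obs)
```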

If you are interested in learning more about this course, please complete the following simple form and we will contact you at your convenience. Course payment is due prior to the first day of the course. This is a recurring course, so we can accommodate emergency changes in students' schedules.