Data Science Online Training Course Content

Module:1 – Descriptive & Inferential Statistics

1. Turning Data into Information

• Data Visualization
• Measures of Central Tendency
• Measures of Variability
• Measures of Shape
• Covariance, Correlation
• Using Software-Real Time Problems

2. Probability Distributions

• Probability Distributions: Discrete Random Variables
• Mean, Expected Value
• Binomial Random Variable
• Poisson Random Variable
• Continuous Random Variable
• Normal distribution
• Using Software-Real Time Problems

3. Sampling Distributions

• Central Limit Theorem
• Sampling Distributions for Sample Proportion, p-hat
• Sampling Distribution of the Sample Mean, x-bar
• Using Software-Real Time Problems

4. Confidence Intervals

• Statistical Inference
• Constructing Confidence Intervals for a Population Mean, Variance, and Proportion
• Using Software-Real Time Problems

5. Hypothesis Testing

• Hypothesis Testing
• Type I and Type II Errors
• Decision Making in Hypothesis Testing
• Hypothesis Testing for a Mean, Variance, Proportion
• Power in Hypothesis Testing
• Using Software-Real Time Problems

6. Comparing Two Groups

• Comparing Two Groups
• Comparing Two Independent Means, Proportions
• Pairwise Testing for Means
• Two Variances Test (F-Test)
• Using Software-Real Time Problems

7. Analysis of Variance (ANOVA)

• One-Way and Two-way ANOVA
• ANOVA Assumptions
• Multiple Comparisons (Tukey, Dunnett)
• Using Software-Real Time Problems

8. Association Between Categorical Variables

• Two Categorical Variables Relation
• Statistical Significance of Observed Relationship / Chi-Square Test
• Calculating the Chi-Square Test Statistic
• Contingency Table
• Using Software-Real Time Problems

Module:2 – Applied Regression Methods

  1. Simple Linear Regression (SLR)
  • Prerequisite Mathematics
  • The Simple Linear Regression Model
  • What is The Common Error Variance?
  • The Coefficient of Determination
  • Hypothesis Test for the Population Correlation Coefficient
  • Using Software-Real Time Problems
  2. SLR Model Evaluation
  • Inference for the Population Intercept and Slope
  • The Analysis of Variance (ANOVA) table and the F-test
  • Equivalent linear relationship tests
  • Decomposing the Error
  • The Lack of Fit F-test
  • Using Software-Real Time Problems
  3. SLR Estimation & Prediction
  • Confidence Interval for the Mean Response
  • Prediction Interval for a New Response
  • Using Software-Real Time Problems
  4. SLR Model Assumptions
  • Model Assumptions Diagnostics
  • Using Software-Real Time Problems
  5. Multiple Linear Regression (MLR)
  • The Multiple Linear Regression Model
  • Using Software-Real Time Problems
  6. MLR Model Evaluation
  • The General Linear Test
  • Sequential (or Extra) Sums of Squares
  • The Hypothesis Tests for the Slopes
  • Partial R-squared
  • Lack of Fit Testing in the Multiple Regression Setting
  • Using Software-Real Time Problems
  7. MLR Estimation, Prediction & Model Assumptions
  • Confidence Interval for the Mean Response
  • Prediction Interval for a New Response
  • Model Assumptions Diagnostics
  • Using Software-Real Time Problems
  8. Categorical Predictors
  • Coding Qualitative Variables
  • Additive Effects
  • Interaction Effects
  • Using Software-Real Time Problems
  9. Data Transformations
  • Using Software-Real Time Problems
  10. Model Building
  • Forward Selection/Backward Elimination
  • Stepwise Regression
  • Adjusted R-squared, Mallows' Cp, PRESS, AIC, BIC, SBC, AICc
  • Outliers and Influential Data Points
  • Cook's Distance/DFBETAS/DFFITS
  • Using Software-Real Time Problems

Module:3 – Applied Time Series Analysis

1. Time Series Basics

• Overview
• ACF and AR(1) Model

2. MA Models, PACF

• Moving Average Models (MA models)
• Using Software-Real Time Problems

3. ARIMA models

• Non-seasonal ARIMA
• Diagnostics
• Forecasting
• Using Software-Real Time Problems

4. Seasonal Models

• Seasonal ARIMA
• Identifying Seasonal Models
• Using Software-Real Time Problems

5. Smoothing and Decomposition Methods

• Decomposition Models
• Smoothing Time Series
• Using Software-Real Time Problems

6. Periodogram

• Periodogram
• Using Software-Real Time Problems

7. Regression with ARIMA errors; CCF; 2 Time Series

• Linear Regression Models with Autoregressive Errors
• CCF and Lagged Regressions
• Using Software-Real Time Problems

Module:4 – Machine Learning

1. Introduction

• Application Examples
• Supervised Learning
• Unsupervised Learning

2. Regression Shrinkage Methods

• Ridge Regression
• Lasso
• Using Software-Real Time Problems

3. Classification

• Logistic Regression
• Discriminant Analysis
• Nearest-Neighbor Methods
• Using Software-Real Time Problems

4. Tree-based Methods

• The Basics of Decision Trees
• Regression Trees
• Classification Trees
• Ensemble Methods
• Bagging, Boosting, Bootstrap, Random Forests
• Using Software-Real Time Problems

5. Neural Networks

• Introduction
• Single Layer Perceptron
• Multi-layer Perceptron
• Feedforward and Backpropagation
• Using Software-Real Time Problems

6. Support Vector Machine

• Support Vector Classifier
• Support Vector Machine
• SVMs with More than Two Classes
• Using Software-Real Time Problems

7. Dimension Reduction Methods

• Principal Components Regression (PCR)
• Partial Least Squares (PLS)
• Using Software-Real Time Problems

8. Association Rules

• Market Basket Analysis
• Using Software-Real Time Problems

Module:5 – SAS/R Programming

1. Base SAS

• Working with SAS program syntax
• Examining SAS data sets
• Accessing SAS libraries
• Producing Detail Reports
• Sorting and grouping report data
• Enhancing reports
• Formatting Data Values
• Creating user-defined formats
• Reading SAS Data Sets
• Customizing a SAS data set
• Handling missing data
• Manipulating Data
• Combining SAS Data Sets
• Creating Summary Reports
• Controlling Input and Output
• Summarizing Data
• Reading Raw Data Files
• Data Transformations
• Debugging Techniques
• Using the PUTLOG statement
• Processing Data Iteratively
• Restructuring a Data Set
• Creating and Maintaining Permanent Formats


2. SAS SQL

• Basic Queries
• Sub-Queries
• Joins (SQL)
• Operators
• Creating Tables and Views
• Managing Tables

3. SAS Macros

• Macro Variables
• Definitions
• Data Step and SQL Interfaces

4. R Programming

• RCMDR Package
• Rattle Package

