Machine Learning Comprehensive (Using R)

Overview

Machine learning is the basis for many of the most exciting careers in data analysis today. You’ll learn the models and methods and apply them to real-world situations, from identifying trending news topics and building recommendation engines to ranking sports teams and plotting the path of movie zombies.

Objective

The course objective is to give you an in-depth understanding of, and hands-on experience with, current business problems using machine learning classification and regression techniques.

Prerequisites

  • Basic knowledge of statistics and mathematics
  • Basic programming knowledge

What you'll learn

  • Statistical concepts used in machine learning
  • R Programming
  • Data Pre-processing & Exploratory Data Analysis
  • Advanced regression and classification techniques
  • Model evaluation techniques and choosing the best-fit model
  • How to solve a real business problem using machine learning

Course Outline

  • Applications of Machine Learning in various fields
  • How is Machine Learning different from traditional programming and reporting?
  • Who are data scientists?
  • What do they do, and what kinds of projects do they work on?
  • Data types
    • Continuous variable
    • Ordinal variable
    • Categorical Variable
    • Time series
    • Miscellaneous
  • Descriptive statistics
  • Inferential statistics
  • What is sampling?
  • Different types of sampling
  • Simple random sampling
  • Systematic sampling
  • Stratified Sampling
  • Normal Distribution
  • Binomial Distribution
  • Skewness
  • Mean, Median, Mode
  • Variance and Standard Deviation
  • What is Normalization?
  • Different Types of Normalization
  • Z score
  • Null and alternative hypotheses
  • Type I and Type II errors
  • What is Correlation?
  • Correlation Coefficient
  • Positive and Negative Correlation
  • An introduction to R programming
  • Types of objects in R
  • Creating new variables or updating existing variables
  • If statements and for loops
  • String searching and manipulation
  • Reading data from data frames and text files
  • Casting and melting data to different formats
  • Merging datasets
  • Filtering data using dplyr
  • Getting data into R – reading from files
  • Linear vs non-linear data
  • Bi-variate and Multi-variate analysis
  • Cleaning and preparing the data – converting data types (character to numeric, etc.)
  • Handling missing values – imputation or replacing with placeholder values
  • Visualization in R using ggplot2 (plots and charts) – histograms, bar charts, box plots, scatter plots
  • Adding more dimensions to the plots – geom_*() layers, dodge, etc.
  • Correlation – Positive, negative and no correlation
  • Correlation vs causation
  • Data transformation
  • Different types of predictive analytics – prediction, forecasting, etc.
  • Supervised learning
  • Assumptions
  • Model development and interpretation
  • Model validation – tests to validate assumptions
  • Multiple linear regression
  • Disadvantages of linear models
  • Need for logistic regression
  • Model development and interpretation – Example
  • Confusion matrix – error measurement
  • ROC curve
  • Measuring sensitivity and specificity
  • Advantages and disadvantages of logistic regression models
  • Process of tree building
  • Entropy and Gini index
  • Problem of overfitting
  • Pruning a tree back
  • Classification model development and validation – Example
  • CART and CTREE – Example
  • Advantages and disadvantages of tree-based models
  • What is KNN?
  • Model development and validation – Example
  • Advantages and disadvantages of KNN
  • What is SVM?
  • Maximum margin classifier
  • SVM for Non-Linear data – Kernels
  • Model development and validation – Example
  • Advantages and disadvantages of SVM
  • Different types of cross-validation techniques
  • Bagging
  • Random Feature Selection
  • Hyperparameter tuning
  • Model development and validation – Example
  • Boosting – Gradient boosting machines
  • Model development and validation – Example
  • XGBoost – Extreme Gradient Boosting
  • Model development and validation – Example
  • What is unsupervised learning?
  • Distance measures and Linkage Criteria
  • Cluster analysis
  • Hierarchical clustering
    • Model development and interpretation – Example
    • Cluster Dendrogram
  • Model development and interpretation – Example
  • Choosing the optimal value of k (elbow, average silhouette, and gap statistic methods)
  • Need for PCA (curse of dimensionality)
  • Advantages of principal components
  • Applications of PCA – Example
  • Error measurement
    • RMSE – Root mean squared error
    • Area under the curve
  • Cross-validation
    • Different types of cross-validation techniques
  • Business problem to an analytical problem
  • Problem definition and analytical method selection
  • Guidelines in model development

Course ID: A101