Becoming a data scientist takes more than an understanding of basic skills such as statistics and programming in various languages. It is crucial to develop deep expertise in one area of technical analytics while remaining conversant in many others. Going beyond descriptive analytics has become essential to meet complex information requirements for decision making and to develop strategies that drive greater profitability, improved performance, and competitiveness. This course builds expertise in advanced analytics, data mining, and predictive modeling. Its aim is to introduce trainees to the important big data management techniques and analytical tools.


Data Science Fundamentals

Introduction to Visualization

Data Processing and Cleaning using pandas in Python

Managing Big Data using Apache Hadoop and MongoDB

Exploratory Data Analysis and Visualization

Data Mining using Python

Data Extraction for Enterprise Reporting

Advanced Analytics

Linear Regression

Logistic Regression

Big Data Model Diagnostics

Supervised and Unsupervised Learning

Random Forest, SVM, Clustering

Dimensionality Reduction

Validation and Evaluation of Machine Learning Methods

Advanced Analytic Techniques and Text Mining

Simulation of Sentiment Analysis

Optimization and Causal Mechanistic Analysis

Time Series and Forecasting

Big Data Security

Big Data and Apache Hadoop

Data Science / Big Data Frameworks and RDDs

SQL and Data Frames Module

