Diploma in Data Science and Business Analytics in collaboration with MAKAUT, WB

COURSE NAME: One Year Diploma in Data Science and Analytics

COURSE CODE:

CONTACT HOURS: 180 Hours (60 Hours Theory, 120 Hours Laboratory)

Prerequisite:

Bachelor's degree (B.Sc/B.E/B.Tech) or Diploma in Computer Science, Information Technology, or allied streams.

Course Objective:

· Expose the students to the basic statistical techniques that provide the foundation of data science.

· Illustrate the various steps of the data science process, viz. cleaning, visualisation, modeling, and presentation.

· Provide practice in the software tools used for data science: Python, R, and Tableau.

Course Outcome: At the end of the course, students will be able to:

1: Clean and prepare data for analysis

2: Perform basic visualisation of data

3: Model and curve-fit the data

4: Present findings of the analysis to stakeholders

| # | Topic | Theory (hours) | Lab (hours) |
|---|---|---|---|
| 1. | Introduction to Data Science and Analytics | 2 | - |
| 2. | Overview of Statistics | 6 | - |
| 3. | Statistical computing in Python – I | 2 | 4 |
| 4. | Data visualizations in Python | 2 | 4 |
| 5. | Statistical computing in Python – II | 2 | 4 |
| 6. | Data cleaning and Preparation in Python | 2 | 4 |
| 7. | Statistical computing in R – I | 2 | 4 |
| 8. | Data visualizations in R | 2 | 4 |
| 9. | Statistical computing in R – II | 2 | 4 |
| 10. | Data cleaning and preparation using R | 2 | 4 |
| 11. | Creating data visualisations in Tableau | 4 | 8 |
| 12. | Predictive Analytics | 4 | 6 |
| 13. | Time Series Forecasting | 4 | 4 |
| 14. | Introduction to Machine Learning | 4 | 8 |
| 15. | Analytics for Business Domains | 8 | 8 |
| 16. | Industry Project | 12 | 54 |
| | TOTAL | 60 | 120 |

Course Content:

Introduction to Data Science [2Theory]:

What is data science? – Applications of data science – Skills required – Tools required – Models and methods – The data science process – Types of data – Nominal data – Ordinal data – Interval data – Ratio data – Relationship between different types of data – Use of graphs to see characteristics of data

Overview of Statistics [6Theory]:

Descriptive Statistics – Central tendency – Spread – Distributions – Inferential Statistics – Hypothesis testing – Chi-Square – Correlation – Regression

Statistical computing in Python – I [2Theory, 4Laboratory]:

Using Jupyter Notebooks – Statements and comments – Data types and Variables – Introduction to Numpy and Pandas – Descriptive statistics in Python
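A lab exercise for this unit might resemble the following sketch, which assumes NumPy and pandas are installed. The table of city sales is invented for illustration and does not come from the course material.

```python
import numpy as np
import pandas as pd

# Invented data set: sales figures for three cities.
df = pd.DataFrame({
    "city": ["Kolkata", "Delhi", "Kolkata", "Mumbai", "Delhi", "Mumbai"],
    "sales": [120.0, 95.0, 130.0, 110.0, 105.0, 90.0],
})

# Count, mean, standard deviation, min, quartiles, and max in one call.
summary = df["sales"].describe()

# Mean sales per city.
by_city = df.groupby("city")["sales"].mean()

# The same column as a NumPy array, for array-style computations.
sales_array = df["sales"].to_numpy()
p90 = np.percentile(sales_array, 90)

print(summary)
print(by_city)
print("90th percentile:", p90)
```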

Data visualizations in Python [2Theory, 4Laboratory]:

Perceptions of visual cues – Bar chart – dot plot – scatter plot – histogram – plotting in Python – numerical – categorical – time series – Matplotlib
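The chart types named above could be produced with Matplotlib roughly as follows. This is a minimal sketch with invented values; the non-interactive `Agg` backend and the in-memory buffer are assumptions made so that the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no window needed
import matplotlib.pyplot as plt

# Invented data for each chart type.
categories = ["A", "B", "C"]          # categorical
counts = [12, 7, 15]
x = [1, 2, 3, 4, 5]                   # numerical pairs
y = [2.1, 3.9, 6.2, 8.1, 9.8]
values = [3, 4, 4, 5, 5, 5, 6, 6, 7, 9]

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].bar(categories, counts)
axes[0].set_title("Bar chart (categorical)")

axes[1].scatter(x, y)
axes[1].set_title("Scatter plot (numerical)")

axes[2].hist(values, bins=5)
axes[2].set_title("Histogram (distribution)")

fig.tight_layout()

# Write the figure as PNG into an in-memory buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In a Jupyter notebook the backend line and the buffer are unnecessary, since the figure is displayed inline.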

Statistical computing in Python – II [2Theory, 4Laboratory]:

Inferential statistics using Pandas and Scipy.stats library – Chi-Square Test – Correlation – T-test – ANOVA
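The four tests listed above map onto functions in `scipy.stats`. The sketch below assumes SciPy and NumPy are installed and uses invented samples; it shows the calls only and is not a guide to interpreting the results.

```python
import numpy as np
from scipy import stats

# Chi-square test of independence on an invented 2x2 contingency table.
observed = np.array([[30, 10],
                     [20, 20]])
chi2, p_chi, dof, expected = stats.chi2_contingency(observed)

# Pearson correlation between two invented variables.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 73]
r, p_corr = stats.pearsonr(hours, scores)

# Independent two-sample t-test on invented groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [6.2, 6.8, 7.1, 6.5, 7.0]
group_c = [5.0, 5.2, 5.5, 5.1, 5.3]
t_stat, p_t = stats.ttest_ind(group_a, group_b)

# One-way ANOVA across the three groups.
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)

print(f"chi2={chi2:.3f} (dof={dof}), p={p_chi:.4f}")
print(f"r={r:.3f}, p={p_corr:.4f}")
print(f"t={t_stat:.3f}, p={p_t:.4f}")
print(f"F={f_stat:.3f}, p={p_anova:.4f}")
```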

Data cleaning and Preparation in Python [2Theory, 4Laboratory]:

Missing values – outliers – sorting – merging – Dropping Columns in a DataFrame – Changing the Index of a DataFrame – Tidying up Fields in the Data – Combining str Methods with NumPy to Clean Columns – Cleaning the Entire Dataset Using the applymap Function
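Several of the cleaning steps above can be sketched in pandas as follows. The records, column names, and the 1.5 × IQR outlier rule are assumptions chosen for illustration. The sketch tidies text with per-column `str` methods rather than `applymap`, which newer pandas releases deprecate in favour of `DataFrame.map`.

```python
import numpy as np
import pandas as pd

# Invented raw records with untidy names, a missing age, and one extreme income.
raw = pd.DataFrame({
    "id": [3, 1, 2, 4],
    "name": ["  asha ", "Ravi", "MEENA", "john  "],
    "age": [25, np.nan, 31, 29],
    "income": [40000, 42000, 39000, 900000],
    "notes": ["x", "y", "z", "w"],
})

# Drop a column that is not needed for analysis.
clean = raw.drop(columns=["notes"])

# Tidy a text field: strip whitespace and normalise capitalisation.
clean["name"] = clean["name"].str.strip().str.title()

# Fill the missing age with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Flag outliers above Q3 + 1.5 * IQR (one common rule of thumb).
q1, q3 = clean["income"].quantile([0.25, 0.75])
upper = q3 + 1.5 * (q3 - q1)
clean["income_outlier"] = clean["income"] > upper

# Merge with a second invented table, then index and sort by id.
regions = pd.DataFrame({"id": [1, 2, 3, 4],
                        "region": ["East", "North", "East", "West"]})
clean = clean.merge(regions, on="id", how="left")
clean = clean.set_index("id").sort_index()

print(clean)
```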

Statistical computing in R – I [2Theory, 4Laboratory]:

Basic data types – variables – vectors – matrices – control structures – functions – Factors – Data frames – lists – Useful R packages – Basic statistics in R – Reading in data – Descriptive statistics in R

Data visualizations in R [2Theory, 4Laboratory]:

Basic plotting in R – Using GGPlot2 – Aesthetics – Faceting – Geoms – Position Adjustments – Saving Graphs

Statistical computing in R – II [2Theory, 4Laboratory]:

Inferential statistics using R – Chi-Square Test – Covariance – Correlation – T-test – Wilcoxon test – ANOVA

Data cleaning and preparation using R [2Theory, 4Laboratory]:

Reshaping – melt, dcast, rbind, cbind – Treating missing values – Using dplyr – Using tidyr – Working with Continuous and Categorical Variables – Joining Data Sets – Grouping Data

Creating data visualisations in Tableau [4Theory, 8Laboratory]:

What is Tableau – Features of Tableau – Applications of Tableau – The Tableau products – Install Tableau Public – Tableau Workspace – Build views – Connect to data source – Creating dashboards – Data blending

Predictive Analytics [4Theory, 6Laboratory]:

Multiple Linear Regression – Classification – Logistic Regression – Linear Discriminant Analysis – Dimensionality reduction – RapidMiner tool
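The unit lists RapidMiner as its tool. As a language-level illustration of the first topic only, the sketch below fits a multiple linear regression by ordinary least squares with NumPy. The data is generated without noise from known coefficients (an assumption made for the example), so the fit recovers them exactly.

```python
import numpy as np

# Two invented predictors for five observations.
X = np.array([[1, 2],
              [2, 1],
              [3, 4],
              [4, 3],
              [5, 5]], dtype=float)

# Noise-free response built from known coefficients:
# y = 2 + 3 * x1 - 1 * x2
y = 2 + 3 * X[:, 0] - 1 * X[:, 1]

# Add a column of ones so the model includes an intercept.
A = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise ||A @ coef - y||^2.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, b1, b2 = coef

# Predict the response for a new observation (x1=6, x2=2).
prediction = intercept + b1 * 6 + b2 * 2

print("intercept, b1, b2:", coef)
print("prediction:", prediction)
```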

Time Series Forecasting [4Theory, 4Laboratory]:

Examples of time series – Forecasting – ETS models – Auto-regressive models – ARIMA – KNIME tool
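The unit lists a dedicated tool for its labs. Purely to illustrate the idea behind an auto-regressive model, the sketch below fits a first-order model, y_t = c + phi * y_(t-1), by least squares with NumPy and makes a one-step forecast. The series is invented and noise-free, generated with c = 1 and phi = 0.5.

```python
import numpy as np

# Invented series following y_t = 1 + 0.5 * y_(t-1), starting at 10.
series = np.array([10.0, 6.0, 4.0, 3.0, 2.5, 2.25, 2.125])

# Pair each value with its predecessor.
lagged = series[:-1]   # y_(t-1)
target = series[1:]    # y_t

# Least-squares fit of target = c + phi * lagged.
A = np.column_stack([np.ones(len(lagged)), lagged])
(c, phi), *_ = np.linalg.lstsq(A, target, rcond=None)

# One-step-ahead forecast from the last observed value.
forecast = c + phi * series[-1]

print(f"c={c:.3f}, phi={phi:.3f}, next value={forecast:.4f}")
```

Real series contain noise, trend, and seasonality, which is where the ETS and ARIMA models in this unit come in.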

Introduction to Machine Learning [4Theory, 8Laboratory]:

What is machine learning? – Applications of machine learning – How does ML work? – Training data – Model/algorithm – Testing data – Evaluation – Prediction – Tools required – Orange tool – Types of ML – Supervised – Unsupervised – Common techniques – Deep Learning
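The unit lists Orange as its tool. The workflow it names (training data, model, testing data, evaluation, prediction) can also be sketched in a few lines of NumPy. The nearest-centroid classifier and the synthetic two-cluster data below are assumptions chosen for brevity, not the course's prescribed method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two well-separated clusters of 20 points each.
class0 = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
class1 = rng.normal(loc=5.0, scale=0.5, size=(20, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 20 + [1] * 20)

# Shuffle, then split into training data (30) and testing data (10).
order = rng.permutation(len(X))
train_idx, test_idx = order[:30], order[30:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# "Training": the model is just the mean point (centroid) of each class.
centroids = np.array([X_train[y_train == k].mean(axis=0) for k in (0, 1)])

def predict(points):
    """Assign each point to the class with the nearest centroid."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Evaluation on data the model has not seen.
accuracy = (predict(X_test) == y_test).mean()

# Prediction for a new, unlabeled point.
new_label = predict(np.array([[4.8, 5.1]]))[0]

print("test accuracy:", accuracy)
print("predicted class for new point:", new_label)
```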

Analytics for Business Domains [8Theory, 8Laboratory]:

Marketing and Retail – Web and Social Media – Banking and Finance – Supply chain and Logistics

Industry Project [12Theory, 54Laboratory]:

Each student will be required to work on a Data Science project relevant to the industry. This will involve performing business requirements analysis, solutions design, and implementation.

Reference Books:

  1. Think Stats, Allen B. Downey, O’Reilly Media.
  2. R for Data Science, Garrett Grolemund and Hadley Wickham, O’Reilly Media.
  3. Python Data Science Handbook, Jake VanderPlas, O’Reilly Media.

For more information and registration please contact

Aunwesha Academy

Learning and Development partner of Aunwesha Knowledge Technologies Pvt. Ltd

120A Linton Street Kolkata 700014 | email: enq@aunweshaacademy.com | Call/Whatsapp: 9088998585/9051952573/9830379592/6290622433