Grade 10 Unit 4 - Data Science

Uploaded by

suhanidevgan2009


Unit 4: Data Science

• Introduction to Data Science
• Applications of Data Science
• Data Collection
    Sources of Data
    Types of Data
• Data Access
• Activities: Game: Rock, Paper & Scissors
Introduction to Data Science
AI can be classified into three broad domains, one of which is Data Science:

• Data science is the study of data to extract meaningful insights for business.
• Data Science mainly revolves around analysing data; when it comes to AI, this analysis helps in
making the machine intelligent enough to perform tasks by itself.
• Data science combines statistics, data analysis, machine learning and their related
methods in order to understand and analyse actual phenomena with data.
• Data Science employs techniques and theories drawn from many fields within the context of
Mathematics, Statistics, Computer Science, and Information Science to analyse large amounts of data.
Applications of Data Science
Data Science is a branch of computer science where we study how to store, use and analyze data for deriving information from it.
There exist various applications of Data Science in today’s world. Some of them are:
1. Fraud and Risk Detection:
• Fraud and risk detection is crucial for protecting businesses, customers, and individuals from
financial losses and other negative impacts.
• By using data analysis and intelligent algorithms, organizations can identify and respond to
fraudulent activities and potential risks more effectively, enhancing security and trust.
E.g.: when a customer applies for a bank loan, data science analyses the customer's data, such as the
customer's profile, their past debts, and whether they settled those debts properly or failed to do so.
• The earliest applications of data science were in finance. Companies were fed up with bad debts and
losses every year. However, they had a lot of data that was collected during the initial
paperwork while sanctioning loans. They decided to bring in data scientists to rescue
them from losses.

2. Genetics & Genomics:

• Genetics is the branch of biology that studies heredity, which involves the passing of
traits from parents to offspring.
• Genomics is a broader field that includes the study of how a set of genes (the genome) behaves.
• Data science empowers geneticists and genomic researchers to handle, analyze, and interpret large-scale
genetic datasets, leading to insights into the structure, function, and evolution of genomes.
Genomic data scientists develop comprehensive models. These models can do things like predict the risk of common
diseases based on an individual's genetic makeup.
Genomic data is used to diagnose and monitor diseases such as cancer, genetic disorders, and inherited diseases.
Specific genetic markers are identified and monitored to determine the progression of a disease and treatment.
Preventive health care also uses genomics research to treat issues early and improve outcomes.
Scientists use human genomic data to investigate diseases or medical conditions, identify and assess drug targets, and
develop new treatments. Genomic data helps them develop effective drugs and personalized treatments as well as screen
and test potential drugs.
As soon as we acquire reliable personal genome data, we will achieve a deeper understanding of human
DNA. Advanced genetic risk prediction will be a major step towards more individualized care.
3. Internet Search:
Many search engines, such as Google, Yahoo, Bing, Ask, and AOL,
make use of data science algorithms to deliver the
best result for our search query in a fraction of a second. Google
processes more than 20 petabytes of data every day. Had there been no
data science, Google wouldn’t have been the ‘Google’ we know today.
4. Targeted Advertising
From the display banners on various websites to the digital billboards at airports, almost
all of them are decided by using data science algorithms. This is the reason why digital
ads have been able to get a much higher CTR (Click-Through Rate) than traditional
advertisements. They can be targeted based on a user’s past behaviour. E.g.: one
person may receive ads for fashion while another receives ads for electronics, based on their past
activities.
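CTR is commonly computed as the number of clicks divided by the number of times an ad was shown (impressions), expressed as a percentage. The sketch below illustrates that arithmetic only; the function name `click_through_rate` and the campaign figures are made up for illustration and do not come from these notes.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a percentage: clicks divided by impressions, times 100."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions


# Hypothetical campaign figures, invented for illustration only.
targeted_ctr = click_through_rate(clicks=50, impressions=1000)
untargeted_ctr = click_through_rate(clicks=5, impressions=1000)

print("Targeted ad CTR (%):", targeted_ctr)
print("Untargeted ad CTR (%):", untargeted_ctr)
```

With these invented numbers the targeted ad has the higher CTR, which is the kind of comparison the paragraph above describes.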
5. Website Recommendations
Product suggestions on Amazon not only help us find relevant products from the billions of products available
there but also add a lot to the user experience.
Many companies make heavy use of such recommendation engines to promote their products in accordance
with the user’s interests and the relevance of information. Internet giants like Amazon, Twitter,
Google Play, Netflix, LinkedIn, IMDB and many more use this system to improve the user
experience. The recommendations are made based on a user’s previous search results.
6. Airline Route Planning
Airline companies started using Data Science to identify strategic areas of improvement and to overcome heavy
losses. Now, using Data Science, airline companies can:
• Predict flight delays
• Decide which class of airplanes to buy
• Decide whether to land directly at the destination or take a halt in between (for example, a flight can take a direct route from New
Delhi to New York, or it can halt in another country on the way)
• Effectively drive customer loyalty programs
System Maps
• A system map is a visual representation that shows all the components in the process, the boundaries of a system, and the components
of the environment at a specific point in time.
• Systems mapping is an effective tool that we can use for understanding and redesigning systems.
• It shows the relationships among various factors and their impact on the project goal.
Use of System Map
• A System Map helps us to find relationships between different elements of the problem which we have scoped.
• A System Map helps in strategizing the solution for achieving the goal of our project.
• A System Map is used to understand complex issues with multiple factors that affect each other.
• The main use of a system map is to help structure a system and communicate the result to others.
Components of System Map:
S.No | Component | Represents
1. | Circle | Elements of the problem
2. | Arrows | Relationship
3. | Longer arrow | Longer time for a change to happen; also called a time delay
4. | Arrow with + sign | Both the elements are directly related to each other
5. | Arrow with - sign | Both the elements are inversely related to each other
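The components listed above can also be written down as data, which may help when checking a map before drawing it. The sketch below is only one possible encoding: the element names and the relationships between them are hypothetical examples chosen for illustration, not the contents of the maps shown in these notes.

```python
# Each arrow is (from_element, to_element, sign, delayed).
# sign "+" means directly related, "-" means inversely related;
# delayed=True stands for a longer arrow (a time delay).
arrows = [
    ("vehicles on the road", "traffic congestion", "+", False),
    ("vehicles on the road", "air pollution", "+", False),
    ("public transport use", "vehicles on the road", "-", False),
    ("air pollution", "health problems", "+", True),
]


def describe(arrow):
    """Turn one arrow into a plain-English sentence."""
    source, target, sign, delayed = arrow
    relation = "increases" if sign == "+" else "decreases"
    note = " (with a time delay)" if delayed else ""
    return f"More {source} {relation} {target}{note}"


# The circles of the map are all elements that appear at either end of an arrow.
elements = sorted({name for arrow in arrows for name in arrow[:2]})

for arrow in arrows:
    print(describe(arrow))
print("Elements:", elements)
```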
Figure: System Map showing stress management.

Figure: System Map showing the effect of an increase in the number of vehicles on the road.
Data Collection
• Data collection is an exercise which does not require even a tiny bit of technological knowledge.
• When it comes to analysing the data, however, it becomes a tedious process for humans, as it is all about
numbers and alphanumeric data. That is where Data Science comes into the picture.
• Data Science not only gives us a clearer idea of the dataset, but also adds value to it by
providing deeper and clearer analyses of it.
Some examples of datasets which you must already be aware of are:

Sources of Data
There exist various sources of data from which we can collect any type of data required, and the data
collection process can be categorised in two ways:
• Offline Data Collection - Sensors, Surveys, Interviews, Observations
• Online Data Collection - Open-sourced Government Portals, Reliable Websites (Kaggle), World
Organisations’ open-sourced statistical websites
While accessing data from any of the data sources, the following points should be kept in mind:
1. Only data which is available for public usage should be taken up.
2. Personal datasets should only be used with the consent of the owner.
3. One should never breach someone’s privacy to collect data.
4. Data should only be taken from reliable sources, as data collected from random sources can be
wrong or unusable.
5. Reliable sources of data ensure the authenticity of data, which helps in proper training of the AI
model.
Types of Data
For Data Science, usually the data is collected in the form of tables. These tabular datasets can be stored in
different formats. Some of the commonly used formats are:
1. CSV: CSV stands for comma-separated values. It is a simple file format used to store tabular data. Each
line of the file is a data record, and each record consists of one or more fields separated by
commas. Because the values in each record are separated by commas, these files are known as CSV files.
2. Spreadsheet: A spreadsheet is a piece of paper or a computer program used for accounting and
recording data, with rows and columns into which information can be entered. Microsoft Excel is a
program which helps in creating spreadsheets.
3. SQL: SQL stands for Structured Query Language. It is a domain-specific
language designed for managing data held in different kinds of DBMS
(Database Management Systems). It is particularly useful in handling structured data.
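To make the CSV and SQL formats concrete, here is a minimal sketch using only Python's standard library (`csv`, `io`, `sqlite3`). The student names, marks, and the table name `students` are hypothetical; a real CSV file would normally be opened with `open(...)` instead of being held in a string.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content: the first line holds the field names,
# and every following line is one record with comma-separated fields.
csv_text = "name,marks\nAsha,82\nRavi,67\nMeena,91\n"

# csv.DictReader splits each record at the commas and labels the fields.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Load the same records into an in-memory SQLite database and query them with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students (name, marks) VALUES (?, ?)",
    [(row["name"], int(row["marks"])) for row in rows],
)
top = conn.execute(
    "SELECT name FROM students WHERE marks > 80 ORDER BY marks DESC"
).fetchall()
conn.close()

print(rows[0])
print(top)
```

Note that the CSV reader returns every field as text, so the marks are converted with `int(...)` before being stored in the SQL table.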
Data Access
After collecting the data, we should be able to use it for programming purposes, so we need to know how to
access it in Python code. To make our lives easier, there exist various Python packages which help
us in accessing structured data (in tabular form) inside the code. Some of the Python packages are:
1. NumPy 2. Matplotlib 3. Pandas 4. Statistics
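The list above names Statistics, which these notes do not describe further. Assuming it refers to Python's standard-library `statistics` module, a minimal sketch of its basic functions looks like this; the marks are hypothetical sample values.

```python
import statistics

marks = [82, 67, 91, 67, 75]  # hypothetical sample values

mean_marks = statistics.mean(marks)      # arithmetic average
median_marks = statistics.median(marks)  # middle value once sorted
mode_marks = statistics.mode(marks)      # most frequent value

print(mean_marks, median_marks, mode_marks)
```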

1. NumPy
• NumPy, which stands for Numerical Python, is the fundamental package for mathematical and logical
operations on arrays in Python.
• NumPy works with numbers and provides a wide range of arithmetic operations, giving
us an easier approach to working with them.
• NumPy also works with arrays. An array is nothing but a set of multiple values of the same
datatype; that is, it is a homogeneous collection of data.
• In NumPy, the arrays used are known as ND-arrays (N-Dimensional Arrays), as NumPy comes with a
feature for creating n-dimensional arrays in Python.
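The points above can be tried out with a few lines of NumPy. This is a small sketch with made-up values, assuming the `numpy` package is installed; it shows a one-dimensional array, arithmetic applied to every element, and a two-dimensional array.

```python
import numpy as np

a = np.array([1, 2, 3, 4])   # 1-D array; all values share one datatype
doubled = a * 2              # arithmetic is applied to every element
total = a.sum()              # sum of all elements

grid = np.array([[1, 2, 3], [4, 5, 6]])  # 2-D array (2 rows, 3 columns)

print(doubled, total)
print(grid.ndim, grid.shape)
```

Here `ndim` reports the number of dimensions and `shape` reports the size along each dimension.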
Difference Between Arrays and Lists
NumPy Arrays | Lists
Homogeneous collection of data: an array can contain only one type of data. | Heterogeneous collection of data: a list can contain multiple types of data.
Cannot be directly initialized; can be created and operated on only with the NumPy package. | Can be directly initialized, as lists are part of Python syntax.
Widely used for arithmetic operations. | Widely used for data management.
Arrays take less memory space. | Lists take more memory space.
Arrays have a fixed size, so functions like concatenation and appending do not change an array in place (numpy.concatenate and numpy.append return a new array). | Functions like concatenation and appending work directly on lists.
Can be accessed only through package support. | Can be used directly in Python without any package support.
Example: | Example:
import numpy | A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
A = numpy.array([1, 2, 3, 4]) |
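The differences between arrays and lists can be observed directly. The sketch below uses made-up values and assumes `numpy` is installed. One detail worth noting: `np.append` does exist, but it returns a new array and leaves the original array unchanged, whereas a list's `append` changes the list itself.

```python
import numpy as np

mixed_list = [1, "two", 3.0]      # a list keeps each item's own type
numbers = np.array([1, 2, 3.5])   # the whole array is stored as one type (float)

list_sum = [1, 2, 3] + [4, 5, 6]                        # "+" joins two lists
array_sum = np.array([1, 2, 3]) + np.array([4, 5, 6])   # "+" adds element by element

original = np.array([1, 2, 3])
extended = np.append(original, 4)  # new array; `original` is left unchanged

grades = [1, 2, 3]
grades.append(4)                   # the list itself grows in place

print(numbers.dtype)
print(list_sum, array_sum)
print(original, extended, grades)
```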
Matplotlib
• Matplotlib is a visualization library in Python for 2D plots of arrays (NumPy arrays).
• One of the greatest benefits of visualization is that it gives us visual access to huge amounts of data in
easily digestible visuals.
• Matplotlib comes with a wide variety of plots, which help us to understand trends and patterns and to
make correlations.
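As a small illustration of plotting NumPy arrays, the sketch below draws a line plot of hypothetical daily visitor counts. It assumes `matplotlib` and `numpy` are installed, and it selects the off-screen "Agg" backend so that it runs without a display; in a notebook or desktop session that line can be dropped and `plt.show()` used instead.

```python
import io

import matplotlib

matplotlib.use("Agg")  # draw off-screen so no display window is needed
import matplotlib.pyplot as plt
import numpy as np

days = np.array([1, 2, 3, 4, 5])
visitors = np.array([120, 135, 150, 170, 165])  # hypothetical values

fig, ax = plt.subplots()
ax.plot(days, visitors, marker="o")  # line plot with a marker at each point
ax.set_xlabel("Day")
ax.set_ylabel("Visitors")
ax.set_title("Visitors per day")

# Save the figure as PNG data in memory; fig.savefig("plot.png") would write a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```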
Pandas [ panel data ]
Pandas is a software library written for the Python programming language for data manipulation and
analysis.
Pandas offers data structures and operations for manipulating numerical tables and time series.
Panel data is an econometrics term for data sets that include observations over multiple time periods for
the same individuals.
Pandas is also able to delete rows that are not relevant or that contain wrong values, such as empty or NULL
values. This is called cleaning the data.
Pandas is a Python library used for working with data sets. It is well suited for many different kinds of data:
• Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet
• Ordered and unordered (not necessarily fixed-frequency) time series data
• Any other form of data set: the data actually need not be
labelled at all to be placed into a Pandas data structure.
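Cleaning data, as described above, can be sketched with a tiny table. The names and marks below are hypothetical, and the example assumes `pandas` is installed. `DataFrame.dropna()` removes rows that contain empty (NaN) values and returns a new table.

```python
import pandas as pd

# Hypothetical table with one missing value in the "marks" column
df = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meena"],
        "marks": [82.0, None, 91.0],
    }
)

cleaned = df.dropna()              # drop rows that contain empty values
average = cleaned["marks"].mean()  # average of the remaining marks

print(cleaned)
print(average)
```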

What is data mining?

In simple words, data mining is defined as a process used to extract usable data from a larger set of
raw data. It involves analysing data patterns in large batches of data using one or more software tools.
Data mining has applications in multiple fields, like science and research.
1. What is Data Science? List the applications of Data Science and explain them.
2. What is a System Map? List its components and uses.
3. List all possible sources of data collection, with examples.
4. What essential points should we consider while accessing data from
different data sources?
5. What types of data formats are commonly used in the field of data
science?
6. List a few Python packages which help us in accessing structured data.
