
UNIT-I

The Role of Statistics in Engineering: The Engineering Method and Statistical Thinking - Collecting Engineering Data - Basic Principles - Retrospective Study - Observational Study - Designed Experiments - Observing Processes Over Time - Mechanistic and Empirical Models.
Data Description and Representation: Collection of Data - Classification and Tabulation of Data - Stem-and-Leaf Diagrams - Frequency Distributions and Histograms - Box Plots - Time Sequence Plots - Probability Plots.
 
UNIT-II
Descriptive Statistics: Measures of Central Tendency - Measures of Dispersion - Skewness and Kurtosis.
Correlation and Regression: Scatter Diagram - Types of Correlation - Karl Pearson's Coefficient of Correlation and Spearman's Rank Correlation - Method of Least Squares - Linear Regression.
 
UNIT-III
Sampling: Different types of sampling - Sampling Distributions - Sampling
Distribution of Mean.
Point Estimation of Parameters: General Concepts of Point Estimation -
Unbiased Estimators -Variance of a Point Estimator - Standard Error- Methods of
Point Estimation (Method of Moments - Method of Maximum Likelihood).
Statistical Intervals for a Single Sample: Confidence Interval on the Mean of a
Normal Distribution with Variance Known - Confidence Interval on the Mean of a
Normal Distribution with Variance Unknown - Confidence Interval on the Variance
and Standard Deviation of a Normal Distribution - A Large-Sample Confidence
Interval for a Population Proportion
UNIT-IV
Tests of Hypotheses for a Single Sample: Tests of Statistical Hypotheses - General
Procedure for Hypothesis Testing –Tests on the Mean of a Normal Distribution with
Variance Known - Tests on the Mean of a Normal Distribution with Variance Unknown
- Tests on the Variance and Standard Deviation of a Normal Distribution.
Statistical Inference for Two Samples: Inference For a Difference in Means of Two
Normal Distributions with Variances Known - Inference For a Difference in Means of
Two Normal Distributions with Variances Unknown -Inference on the Variances of
Two Normal Distributions – Inference on Two Population Proportions.
UNIT-V
The Analysis of Variance: Concept - Assumptions - One-Way and Two-Way Classifications.
Designing Engineering Experiments: Concepts of Randomization, Replication and Local Control - Completely Randomized Design - Randomized Block Design - Latin Square Design.
Text Books
1. Douglas C. Montgomery and George C. Runger, Applied Statistics and Probability for Engineers, (3rd Edn.), John Wiley and Sons, Inc., New York, 2003.
2. Robert H. Carver and Jane Gradwohl Nash, Doing Data Analysis with SPSS Version 18.0, (Indian Edition), Cengage Learning, New Delhi, 2012.
3. Richard A. Johnson and C. B. Gupta, Probability and Statistics for Engineers, (7th Edn.), Pearson Education, Indian Impression, 2006.
References
1. Mohammed A. Shayib, Applied Statistics, (1st Edn.), eBook, Bookboon.com, 2013.
2. Peter R. Nelson, Marie Coffin and Karen A. F. Copeland, Introductory Statistics for Engineering Experimentation, Elsevier Science and Technology Books, New York, 2003.
3. Sheldon M. Ross, Introduction to Probability and Statistics, (3rd Edn.), Elsevier Science and Technology Books, New York, 2004.
4. T. T. Soong, Fundamentals of Probability and Statistics for Engineers, John Wiley and Sons, Ltd., New York, 2004.
5. J. P. Marques de Sá, Applied Statistics using SPSS, STATISTICA, MATLAB and R, (2nd Edn.), Springer Verlag, Heidelberg, 2007.
The Role of Statistics in Engineering
THE ENGINEERING METHOD AND STATISTICAL THINKING

The engineering, or scientific, method is the approach to formulating and solving problems of interest to society by the efficient application of scientific principles.
The steps in the engineering method:
 Develop a clear and concise description of the problem.
 Identify, at least tentatively, the important factors that affect this problem or that may play a role in its solution.
 Propose a model for the problem, using scientific or engineering knowledge of the phenomenon being studied. State any limitations or assumptions of the model.
 Conduct appropriate experiments and collect data to test or validate the tentative model or conclusions made in steps 2 and 3.
 Refine the model on the basis of the observed data.
 Manipulate the model to assist in developing a solution to the problem.
 Conduct an appropriate experiment to confirm that the proposed solution to the problem is both effective and efficient.
 Draw conclusions or make recommendations based on the problem solution.
 The field of statistics deals with the collection, presentation,
analysis, and use of data to make decisions, solve problems, and
design products and processes.
 Statistical methods are used to help us describe and understand
variability.
 Statistical thinking gives us a useful way to incorporate this variability into our decision-making processes.
 Statistics gives us a framework for describing this variability and for learning which potential sources of variability are the most important or have the greatest impact on the performance measure of interest (for example, gasoline mileage).
• POPULATION

• SAMPLE

• ENUMERATION STUDY

• ANALYTIC STUDY
Collecting Engineering Data
Basic Principles

In the engineering environment, the data is almost always a sample that has been selected from some population.

Three basic methods of collecting data are

• A retrospective study using historical data

• An observational study

• A designed experiment
Retrospective Study

• A retrospective study may involve a lot of data, but that data may contain relatively little
useful information about the problem. Furthermore, some of the relevant data may be
missing, there may be transcription or recording errors resulting in outliers (or unusual
values), or data on other important factors may not have been collected and archived.

• For example, in a distillation-column study the specific concentrations of butyl alcohol and acetone in the input feed stream are a very important factor, but they are not archived because the concentrations are too hard to obtain on a routine basis.

• As a result of these types of issues, statistical analysis of historical data sometimes identifies interesting phenomena, but solid and reliable explanations of these phenomena are often difficult to obtain.
Observational Study
• In an observational study, the engineer observes the process or population, disturbing it as
little as possible, and records the quantities of interest. Because these studies are usually
conducted for a relatively short time period, sometimes variables that are not routinely
measured can be included.

• In the distillation-column example, the engineer would design a form to record the two temperatures and the reflux rate whenever acetone concentration measurements are made. It may even be possible to measure the input feed stream concentrations so that the impact of this factor could be studied. Generally, an observational study helps with problem solving and goes a long way toward obtaining accurate and reliable data.
Designed Experiments
• In a designed experiment the engineer makes deliberate or purposeful
changes in the controllable variables of the system or process, observes
the resulting system output data, and then makes an inference or
decision about which variables are responsible for the observed changes
in output performance.
Observing Processes Over Time
• Often data are collected over time. In this case, it is usually very helpful to plot
the data versus time in a time series plot. Phenomena that might affect the
system or process often become more visible in a time-oriented plot and the
concept of stability can be better judged.
MECHANISTIC AND EMPIRICAL MODELS
Models play an important role in the analysis of nearly all engineering
problems. Much of the formal education of engineers involves learning
about the models relevant to specific fields and the techniques for
applying these models in problem formulation and solution. As a simple
example, suppose we are measuring the flow of current in a thin copper
wire. Our model for this phenomenon might be Ohm’s law:

Current = voltage/resistance
We call this type of model a mechanistic model because it is built from our underlying
knowledge of the basic physical mechanism that relates these variables.
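As a minimal sketch (in Python, with made-up numbers), the mechanistic model above can be written down directly from the physical law; in practice, repeated measurements of current would scatter around this prediction, which is often expressed by adding a random error term to the model.

# Mechanistic model: Ohm's law, Current = Voltage / Resistance.
# The numbers below are illustrative only.

def current(voltage, resistance):
    """Current predicted from first principles (Ohm's law)."""
    return voltage / resistance

print(current(5.0, 100.0))  # 0.05 A predicted for a 5 V source across 100 ohms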
EMPIRICAL MODEL
• An empirical model uses our engineering and scientific knowledge of the phenomenon, but it is not directly developed from our theoretical or first-principles understanding of the underlying mechanism.
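In contrast, an empirical model is estimated from observed data. A minimal sketch (Python with NumPy, synthetic numbers) fits a straight line by least squares; the coefficients come from the data rather than from a physical mechanism.

import numpy as np

# Synthetic observations of an output y at several settings of an input x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Fit y = b0 + b1*x by least squares (an empirical model).
b1, b0 = np.polyfit(x, y, 1)
print(f"empirical model: y = {b0:.2f} + {b1:.2f} x")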
COLLECTION OF DATA
Collection of statistical data forms the fundamental basis for all statistical analysis. Care
must be taken to see that the data collected are reliable and useful for the purpose of the
inquiry.
Before collecting statistical data one should clearly define
(1) The purpose of inquiry
(2) The source of information
(3) Scope of inquiry
(4) The degree of accuracy desired
(5) Methods of collecting data
(6) The unit of data collection
The purpose of inquiry
• Statistical data are collected to draw the desired conclusions based on the data.
• This information may also be useful for some other statistical survey.

Example: determining whether a product that is produced is popular in the market.
SCOPE OF INQUIRY

• For an investigation of a statistical problem, one has to decide the geographical area to be covered and the field of inquiry to which it is to be confined.

Note: the scope is usually constrained by the time and money available for the inquiry.

Sources of Information

• Primary data

• Secondary data
Statistical Unit

• Statistical data cannot be presented or interpreted without a unit. The unit should be simple and unambiguous. It should also be stable and uniform.

• The nature of the unit depends upon the purpose of the inquiry.

STANDARD OF ACCURACY

• Statistics is a science of estimates.
• Absolute accuracy is neither possible nor desirable in a statistical inquiry.
• Since a statistical inquiry involves a large collection of data, a reasonable standard of accuracy is sufficient.
• Statistical data collected to this standard of accuracy will not sacrifice the object of the inquiry.
CLASSIFICATION AND TABULATION OF DATA
• The process of arranging huge masses of data in a proper way is called classification.

• Classification may be defined as the process of arranging or bringing together all the enumerated individuals or items under separate heads or classes, according to some common characteristics possessed by them.
TYPES OF CLASSIFICATION

• Classification according to qualitative basis

• Classification according to quantitative basis

• Classification according to chronological basis

• Classification according to geographical basis


Classification according to qualitative basis

If the statistical data collected are facts about qualities such as male or female, employed or unemployed, Indian or foreigner, etc., the classification of the data is done according to these characteristics.
Classification according to quantitative basis

• The arrangement of statistical data according to numerical measurements such as age, height, weight, amount of savings, or number of members in a family comes under quantitative classification.
Geographical Classification
• Statistical data classified according to different areas such as states, districts, towns, villages, etc. come under this classification.

Example:
• The production of fertilizer from different parts of the country.
CHRONOLOGICAL CLASSIFICATION
Statistical data arranged according to the time of occurrence come
under this classification.

Example:
Production of wheat from the year 1980 to 1985
TABULATION
• The classified data has to be presented in a tabular form in an orderly
way before analysis and interpretation of the data.
• Tabulation is defined as “the orderly or systematic presentation of numerical data in rows and columns, designed to facilitate comparison between the figures”.
• Tabulation is a statistical tool used for condensation of the data in a
statistical process.
Characteristics of a good table

• A statistical table should contain a clear and precise title.
• When a number of tables are presented in the analysis of statistical data, serial numbers should be given to the tables.
• Descriptions of columns, rows, sub-columns and sub-rows should be well defined.
• The units of measurement used should be clearly indicated; these units are normally mentioned at the top of the columns.
• Data which are comparable should be given side by side.
• The table should be neat and attractive.
• Horizontal and vertical lines should be drawn to separate rows and columns. Thin and thick lines may be drawn to distinguish sub-rows and sub-columns from the main rows and main columns.
• Column totals and row totals should be shown.
• If the information given in the table is not self-explanatory, sufficient information should be given as footnotes for proper understanding of the table.
• If the data has several sub-classifications, it can be presented in more than one table.
• The date of preparation of the table and the source of information should be mentioned at the bottom of the table.
STEM AND LEAF DIAGRAM
In a stem-and-leaf plot, numerical data are listed in ascending
or descending order. The digits in the greatest place value of
the data are used for the stems. The digits in the next greatest
place value form the leaves.
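A minimal sketch (Python, made-up two-digit data) of how the stems and leaves are formed: the tens digit of each value is the stem and the units digit is the leaf.

from collections import defaultdict

data = [12, 15, 21, 24, 24, 33, 38, 41, 45, 47]  # illustrative values

stems = defaultdict(list)
for value in sorted(data):                   # list the data in ascending order
    stems[value // 10].append(value % 10)    # tens digit = stem, units digit = leaf

for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
# 1 | 2 5
# 2 | 1 4 4
# 3 | 3 8
# 4 | 1 5 7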
Box plots

Box plots (also called box-and-whisker plots or box-whisker plots) give a good graphical image of the concentration of the data. They also show how far the extreme values are from most of the data. A box plot is constructed from five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. We use these values to compare how close other data values are to them.

To construct a box plot, use a horizontal or vertical number line and a rectangular box. The smallest and largest data values label the endpoints of the axis. The first quartile marks one end of the box and the third quartile marks the other end of the box. Approximately the middle 50 percent of the data fall inside the box. The “whiskers” extend from the ends of the box to the smallest and largest data values. The median, or second quartile, lies between the first and third quartiles, and in some data sets it coincides with one or both of them. The box plot gives a good, quick picture of the data.
Construct a box plot for the following data:
12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25
Solution:
Step 1: Arrange the data in ascending order.
Step 2: Find the median, lower quartile and upper quartile
Median (middle value) = 22 
Lower quartile (middle value of the lower half) = 12 
Upper quartile (middle value of the upper half) = 36
(If there is an even number of data items, then we need to get the average of the middle
numbers.)

Step 3: Draw a number line that will include the smallest and the largest data.

Step 4: Draw three vertical lines at the lower quartile (12), median (22) and
the upper quartile (36), just above the number line.
Step 5: Join the lines for the lower quartile and the upper quartile to form a
box.

Step 6: Draw a line from the smallest value (5) to the left side of the box and a line from the right side of the box to the largest value (53).
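The five-number summary above can be checked with a short script. The sketch below (Python) uses the same convention as the steps, taking the median of each half of the sorted data; note that plotting libraries may use a slightly different quartile rule, so a drawn box can differ a little from the hand computation.

import statistics
import matplotlib.pyplot as plt

data = sorted([12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25])

n = len(data)
median = statistics.median(data)
q1 = statistics.median(data[: n // 2])        # median of the lower half
q3 = statistics.median(data[(n + 1) // 2 :])  # median of the upper half

print(data[0], q1, median, q3, data[-1])      # 5 12 22 36 53

plt.boxplot(data, vert=False)                 # library quartile convention may differ slightly
plt.title("Box plot of the sample data")
plt.show()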
Consider another data set:
1 1 2 2 4 6 6.8 7.2 8 8.3 9 10 10 11.5
The first quartile is 2, the median is 7, and the third quartile is 9. The smallest value is 1, and the largest value is 11.5. [Figure: box plot constructed from this five-number summary.]
Times Series Plots
A time series plot is a graph where some measure of time is the unit on
the x-axis. In fact, we label the x-axis the time-axis. The y-axis is for the
variable that is being measured. Data points are plotted and generally
connected with straight lines, which allows for the analysis of the graph
generated.
From the graph generated by the plotted points, we can see any trends in the data. A trend is a change that occurs in a general direction. For example, if we watch a car waiting at a red light and the light then turns green, we could plot the distance the car travels against the time taken to reach its current position. We would notice a trend of increasing distance from the starting point.
[Figure: distance versus time graph for the car example.]
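A minimal time series plot sketch (Python with Matplotlib, made-up hourly readings): time goes on the x-axis, the measured variable on the y-axis, and the points are joined with straight lines so any trend is easy to see.

import matplotlib.pyplot as plt

hours = list(range(10))                                  # time axis
readings = [84, 86, 85, 88, 90, 89, 92, 91, 93, 95]      # illustrative measurements

plt.plot(hours, readings, marker="o", linestyle="-")
plt.xlabel("Time (hours)")
plt.ylabel("Measured value")
plt.title("Time series plot")
plt.show()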
PROBABILITY PLOT
• The probability plot (Chambers et al., 1983) is a graphical technique
for assessing whether or not a data set follows a given distribution
such as the normal or Weibull. The data are plotted against a
theoretical distribution in such a way that the points should form
approximately a straight line.
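As a minimal sketch (Python with SciPy and Matplotlib, using a randomly generated sample), scipy.stats.probplot plots the sample against theoretical normal quantiles; rough linearity of the points suggests the normal distribution is a reasonable fit.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=5, size=40)   # illustrative sample

# Plot sample quantiles against theoretical normal quantiles.
stats.probplot(sample, dist="norm", plot=plt)
plt.title("Normal probability plot")
plt.show()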
