SAMPLING by Naresh Vasant Afre 13.04.23 Shareable

Uploaded by

Swapnil Oza
SAMPLING
By Naresh Vasant Afre 9320344503

Contents
Introduction
Z-test for means
Z-test for difference of means
T-test for means
T-test for difference of means
Paired T-test for difference of means
Chi-Square distribution
References

Introduction:
Often in practice we are interested in drawing valid conclusions about a large group of individuals or objects. Instead of examining the entire group, called the population, which may be difficult or impossible to do, one may examine only a small part of this population, called a sample. This is done with the aim of deducing certain facts about the population from the results found in the sample, a process known as statistical inference. The process of obtaining samples is called sampling.

Clearly, the reliability of conclusions drawn concerning a population depends on whether the sample is properly chosen so as to represent the population sufficiently well, and one of the important problems of statistical inference is just how to choose a sample. One way to do this for finite populations is to make sure that each member of the population has the same chance of being in the sample; such a sample is often called a random sample. Fairly good samples can be obtained by throwing a die, drawing lots, etc.

• PARAMETER: The statistical constants such as mean ($\mu$), variance ($\sigma^2$), etc., obtained from the population are called parameters.

• STATISTIC: The statistical constants such as mean ($\bar{x}$), variance ($s^2$), etc., obtained from the sample are called statistics.

• SAMPLING DISTRIBUTION: The distribution of all possible values that can be assumed by some statistic, computed from samples of the same size randomly drawn from the same population, is called the sampling distribution of that statistic.

• CONSTRUCTION OF SAMPLING DISTRIBUTION:
1. From a finite population of size N, construct all possible random samples of size n.
2. Compute the statistic of interest for each sample (mean $\bar{x}$, variance $s^2$, etc.).
3. List in one column the distinct observed values of the statistic, and in another column the corresponding frequency of occurrence.

For example: Suppose we have a population of size N = 5, consisting of the ages of five children. The ages are as follows: $x_1 = 6, x_2 = 8, x_3 = 10, x_4 = 12, x_5 = 14$. Then the mean of this population is

$$\mu = \frac{\sum x_i}{N} = \frac{50}{5} = 10$$

and the variance of this population is

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{40}{5} = 8.$$

Let us draw all possible samples of size n = 2 from this population, and suppose we are interested in the statistic mean ($\bar{x}$). There are $5^2 = 25$ samples possible with replacement, and they are as follows:

Name | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9
Sample | (6,6) | (6,8) | (8,6) | (6,10) | (10,6) | (6,12) | (12,6) | (6,14) | (14,6)
Mean | 6 | 7 | 7 | 8 | 8 | 9 | 9 | 10 | 10

Name | S10 | S11 | S12 | S13 | S14 | S15 | S16 | S17
Sample | (8,8) | (8,10) | (10,8) | (8,12) | (12,8) | (8,14) | (14,8) | (10,10)
Mean | 8 | 9 | 9 | 10 | 10 | 11 | 11 | 10

Name | S18 | S19 | S20 | S21 | S22 | S23 | S24 | S25
Sample | (10,12) | (12,10) | (10,14) | (14,10) | (12,12) | (12,14) | (14,12) | (14,14)
Mean | 11 | 11 | 12 | 12 | 12 | 13 | 13 | 14

We can observe the sampling distribution of the statistic $\bar{x}$ in the following table:

$\bar{x}$ | Frequency
6 | 1
7 | 2
8 | 3
9 | 4
10 | 5
11 | 4
12 | 3
13 | 2
14 | 1

[Figure: Sampling Distribution of mean]

Note that the sampling distribution of the mean ($\bar{x}$) is symmetric about its centre and peaks there, resembling the shape of a normal distribution.

• SAMPLING DISTRIBUTION OF ($\bar{x}$): MEAN
Let us compute the mean of the distribution of $\bar{x}$, which we call $\mu_{\bar{x}}$.

$$\mu_{\bar{x}} = \frac{6 + 7 + 7 + 8 + 8 + \cdots + 14}{25} = \frac{250}{25} = 10$$

We note with interest that the mean of the sampling distribution of $\bar{x}$ has the same value as the mean of the original population.

• SAMPLING DISTRIBUTION OF ($\bar{x}$): VARIANCE
Let us compute the variance of the distribution of $\bar{x}$, which we call $\sigma_{\bar{x}}^2$.

$$\sigma_{\bar{x}}^2 = \frac{(6 - 10)^2 + (7 - 10)^2 + (7 - 10)^2 + \cdots + (14 - 10)^2}{25} = \frac{100}{25} = 4$$

We note that the variance of the sampling distribution of $\bar{x}$ is not equal to the population variance. It is of interest to observe, however, that the variance of the sampling distribution of $\bar{x}$ is equal to the population variance divided by the size of the sample used to obtain the sampling distribution, i.e.,

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{8}{2} = 4.$$

The square root of the variance of the sampling distribution of $\bar{x}$,

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}},$$

is called the standard error of the mean or, simply, the standard error.

The above discussion was about the mean and variance of the sampling distribution of the statistic $\bar{x}$. In general, a statistic $t$ may be regarded as a random variable which can take values $t_1, t_2, \ldots, t_k$, and we can compute the various statistical constants like mean, variance, skewness, kurtosis, etc., for its distribution. For example, the mean ($\mu_t$) and the variance ($\sigma_t^2$) of the sampling distribution of the statistic $t$ are given by

$$\mu_t = \frac{1}{k}(t_1 + t_2 + \cdots + t_k), \qquad \sigma_t^2 = \frac{1}{k}\sum_{i=1}^{k}(t_i - \mu_t)^2.$$

• STANDARD ERROR: The standard deviation of the sampling distribution of a statistic is known as its Standard Error (S.E.). The standard errors of some well-known statistics, for large samples, are given below, where n is the sample size and $\sigma^2$ is the population variance.

Sr. No | Statistic | Standard error
1 | Sample mean: $\bar{x}$ | $\sigma/\sqrt{n}$
2 | Sample standard deviation: $s$ | $\sqrt{\sigma^2/2n}$
3 | Sample variance: $s^2$ | $\sigma^2\sqrt{2/n}$
4 | Difference of two sample means: $(\bar{x}_1 - \bar{x}_2)$ | $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$

• UTILITY OF STANDARD ERROR: If $t$ is any statistic, then for large samples

$$Z = \frac{t - E(t)}{S.E.(t)} \sim N(0, 1).$$

• STATISTICAL DECISION: Very often in practice we are called upon to make decisions about populations based on sample information. Such decisions are called statistical decisions. For example, we may wish to decide whether a new serum is really effective in curing a disease, whether one educational procedure is better than another, whether a given coin is biased, etc.
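The ages example from the construction of the sampling distribution (population 6, 8, 10, 12, 14 with samples of size 2 drawn with replacement) is small enough to enumerate by machine. The following is a minimal Python sketch of that check, using only the standard library; the variable names are illustrative and not part of the original notes.

```python
from collections import Counter
from itertools import product
from statistics import fmean, pvariance

ages = [6, 8, 10, 12, 14]   # population from the example (N = 5)
n = 2                       # sample size

# All N**n ordered samples drawn with replacement, and the mean of each.
samples = list(product(ages, repeat=n))
sample_means = [fmean(s) for s in samples]

# Sampling distribution of the mean: distinct values and their frequencies.
frequency = Counter(sample_means)

mean_of_means = fmean(sample_means)      # mean of the sampling distribution
var_of_means = pvariance(sample_means)   # divides by 25, as in the text
pop_var = pvariance(ages)                # population variance, divides by N

for value in sorted(frequency):
    print(value, frequency[value])
print(mean_of_means, var_of_means, pop_var / n)
```

The last line prints the mean of the sampling distribution, its variance, and the population variance divided by n, so the relation between the last two can be read off directly.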
• STATISTICAL HYPOTHESIS: In attempting to reach decisions, it is useful to make assumptions or guesses about the population involved. Such assumptions, which may or may not be true, are called statistical hypotheses. In many instances we formulate a statistical hypothesis for the sole purpose of rejecting or nullifying it. For example, if we want to decide whether a given coin is biased, we formulate the hypothesis that the coin is fair, i.e., p = 0.5, where p is the probability of heads. Similarly, if we want to decide whether one procedure is better than another, we formulate the hypothesis that there is no difference between the procedures. Such hypotheses are often called null hypotheses and are denoted by $H_0$. Any hypothesis which differs from the null hypothesis is called an alternative hypothesis. For example, if the null hypothesis is p = 0.5, then possible alternative hypotheses are p ≠ 0.5, p > 0.5, or p < 0.5. A hypothesis alternative to the null hypothesis is denoted by $H_1$.

• TESTS OF HYPOTHESES AND SIGNIFICANCE: Procedures which enable us to decide whether to accept or reject hypotheses, or to determine whether observed samples differ significantly from expected results, are called tests of hypotheses, tests of significance, or rules of decision. Some of the well-known tests of significance for studying such differences are the Z-test, T-test and F-test.

• TYPE I AND TYPE II ERRORS: If we reject the hypothesis when it should be accepted, we say that a Type I error has been made. If, on the other hand, we accept the hypothesis when it should be rejected, we say that a Type II error has been made. In either case a wrong decision or error in judgement has occurred.
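To make the two kinds of error concrete, consider the fair-coin hypothesis p = 0.5 mentioned above. The sketch below assumes a decision rule that is not in the notes: toss the coin 100 times and accept $H_0$ when the number of heads lies between 40 and 60 inclusive. The alternative value p = 0.7 is also an assumption chosen only for illustration. Under those assumptions the exact binomial probabilities of both errors can be computed with the standard library.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n independent tosses with P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_accept(p, n=100, low=40, high=60):
    """Probability that the head count falls in the assumed acceptance region."""
    return sum(binom_pmf(k, n, p) for k in range(low, high + 1))

# Type I error: H0 (p = 0.5) is true, but the count falls outside 40..60.
type_1 = 1 - prob_accept(0.5)

# Type II error: the coin is actually biased (assumed p = 0.7),
# but the count still falls inside 40..60, so H0 is accepted.
type_2 = prob_accept(0.7)

print(round(type_1, 4), round(type_2, 4))
```

Widening the acceptance region lowers the Type I probability and raises the Type II probability, which is the trade-off behind choosing a decision rule.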
• CONFIDENCE INTERVAL: A confidence interval, also known as the acceptance region, is a set of values of the test statistic for which the null hypothesis is accepted, i.e., if the observed test statistic is in the confidence interval, then we accept the null hypothesis and reject the alternative hypothesis.

• LEVEL OF SIGNIFICANCE: In testing a given hypothesis, the maximum probability with which we would be willing to risk a Type I error is called the level of significance of the test. This probability is often denoted by $\alpha$. We perform a hypothesis test with a $100(1 - \alpha)\%$ confidence interval. Confidence intervals are usually calculated at the 5% or 1% significance level, for which $\alpha = 0.05$ and $\alpha = 0.01$ respectively. Note that a 95% confidence interval does not mean there is a 95% chance that the true value being estimated lies in the calculated interval. Rather, it means that if the sampling were repeated many times, about 95% of the intervals calculated in this way would contain the population value. For example, if a 0.05 or 5% level of significance is chosen in designing a test of a hypothesis, then there are about 5 chances in 100 that we would reject the hypothesis when it should be accepted.

Z-test for means:
For large samples many statistics have normal distributions (or at least nearly normal) with mean $\mu_s$ and standard deviation $\sigma_s$. In such cases we can use the Z-test. To test the null hypothesis $H_0$ that the population mean is $\mu = a$, we would use the following statistic:

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

where $\bar{X}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard deviation and n is the sample size. When necessary, the observed sample standard deviation s is used to estimate $\sigma$.

Critical values for Z-test:

Level of significance $\alpha$ | 0.10 | 0.05 | 0.02 | 0.01
Critical value, two tailed ($Z_\alpha$) | 1.645 | 1.96 | 2.33 | 2.58
Critical value, one tailed ($Z_\alpha$) | 1.28 | 1.645 | 2.05 | 2.33

We would accept $H_0: \mu = a$ at the 0.05 level if the calculated value of Z satisfies $-1.96 < Z < 1.96$ and would reject it otherwise.
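The critical values quoted for the Z-test come from the standard normal distribution, so they can be recomputed rather than looked up. The following is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an illustrative choice, not something defined in the notes.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Critical value of the standard normal for significance level alpha.

    For a two-tailed test the rejection probability alpha is split equally
    between both tails; for a one-tailed test it sits in a single tail.
    """
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

for alpha in (0.10, 0.05, 0.02, 0.01):
    two = round(z_critical(alpha), 3)
    one = round(z_critical(alpha, two_tailed=False), 3)
    print(alpha, two, one)
```

Each printed row gives the level of significance followed by the two-tailed and one-tailed critical values to three decimals, which can be compared with the rounded table entries.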
Challenge 01: The mean lifetime of a sample of 100 fluorescent light bulbs produced by a company is computed to be 1570 hours with a standard deviation of 120 hours. If $\mu$ is the mean lifetime of all the bulbs produced by the company, test the hypothesis $\mu = 1600$ hours against the alternative hypothesis $\mu \neq 1600$ hours, using a level of significance of (a) 0.05 and (b) 0.01.

Solution:
(i) Null hypothesis $H_0: \mu = 1600$ hours.
Alternative hypothesis $H_1: \mu \neq 1600$ hours.
(ii) Test statistic: Here n = 100, $\bar{X} = 1570$, s = 120 and $\mu = 1600$.
$$Z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{1570 - 1600}{120/\sqrt{100}} = -2.50$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The value of $Z_\alpha$ at 5% level of significance from the table is 1.96.
(v) Decision: As the computed value does not satisfy $-1.96 < Z < 1.96$, the null hypothesis is rejected at 5% level of significance.
Now for 1% level of significance, the critical value is $Z_\alpha = 2.58$, and since the computed value satisfies $-2.58 < -2.50 < 2.58$, we accept $H_0$ at 1% level of significance.

Challenge 02: A tyre company claims that the lives of tyres have mean 42,000 kms with standard deviation of 4000 kms. A change in the production process is believed to result in a better product. A test sample of 81 new tyres has a mean life of 42,500 kms. Test at 10% level of significance that the new product is significantly better than the old one.

Solution:
(i) Null hypothesis $H_0: \mu = 42000$ kms.
Alternative hypothesis $H_1: \mu > 42000$ kms.
(ii) Test statistic: Here n = 81, $\bar{X} = 42500$, s = 4000 and $\mu = 42000$.
$$Z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{42500 - 42000}{4000/\sqrt{81}} = 1.125$$
(iii) Level of significance: $\alpha = 0.10$.
(iv) Critical value: The value of $Z_\alpha$ at 10% level of significance for a one tailed test from the table is 1.28.
(v) Decision: As the computed value satisfies $0 < Z < 1.28$, the null hypothesis is accepted at 10% level of significance, which means that no significant improvement has been shown.
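The arithmetic in Challenge 01 and Challenge 02 can be reproduced in a few lines. This is a minimal sketch under the same assumption the solutions make, namely that the quoted standard deviation stands in for the population standard deviation; the function name `z_statistic` is hypothetical.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sd, n):
    """Z = (sample mean - hypothesised mean) / (sd / sqrt(n))."""
    return (sample_mean - mu0) / (sd / sqrt(n))

# Challenge 01: light bulbs, two-tailed test of mu = 1600 hours.
z_bulbs = z_statistic(1570, 1600, 120, 100)
reject_bulbs_5pct = abs(z_bulbs) > 1.96   # critical value at alpha = 0.05
reject_bulbs_1pct = abs(z_bulbs) > 2.58   # critical value at alpha = 0.01

# Challenge 02: tyres, one-tailed (right) test of mu = 42000 kms.
z_tyres = z_statistic(42500, 42000, 4000, 81)
reject_tyres_10pct = z_tyres > 1.28       # one-tailed critical value, alpha = 0.10

print(z_bulbs, reject_bulbs_5pct, reject_bulbs_1pct)
print(z_tyres, reject_tyres_10pct)
```

The Boolean flags mirror the decision step: a two-tailed test compares the absolute value of Z with the critical value, while a right-tailed test compares Z itself.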
Challenge 03: The mean monthly sale of a particular household item in department stores was observed as 133.3 units per store. After an advertising campaign, data on sales of this item from 25 stores were collected, and it is found that the mean sales have increased to 141.5 units with standard deviation of 15.2 units. Was the advertising campaign successful? Assume that the level of significance is 5%.

Solution:
(i) Null hypothesis $H_0: \mu = 133.3$ units. That is, we assume that the average sale in the population is 133.3 units, and hence that the advertising campaign did not increase this average sale.
Alternative hypothesis $H_1: \mu > 133.3$ units. This means that the average sale after the advertising campaign has increased, and hence the campaign was successful. Hence it is a one-tailed test.
(ii) Test statistic: Here n = 25, $\bar{X} = 141.5$, s = 15.2 and $\mu = 133.3$.
$$Z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{141.5 - 133.3}{15.2/\sqrt{25}} = 2.697$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The value of $Z_\alpha$ at 5% level of significance for a one tailed test from the table is 1.645.
(v) Decision: As the computed value does not satisfy $0 < Z < 1.645$, the null hypothesis is rejected at 5% level of significance. So we accept the alternative hypothesis $H_1: \mu > 133.3$, which concludes that the advertising campaign was successful.

Z-test for difference of Means:
Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means obtained in large samples of sizes $n_1$ and $n_2$ drawn from respective populations having means $\mu_1$ and $\mu_2$ and standard deviations $\sigma_1$ and $\sigma_2$. Consider the null hypothesis that there is no difference between the population means, i.e., $\mu_1 = \mu_2$. Then we can use the following test to conclude about the hypothesis:

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{s}$$

where

$$s = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

We can, if necessary, use the observed sample standard deviations $s_1$ and $s_2$ as estimates of $\sigma_1$ and $\sigma_2$.
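As a rough illustration of the difference-of-means statistic, the sketch below computes the standard error $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ and the resulting Z. That form of the standard error is inferred from the worked solutions in these notes, and the function name is hypothetical. The numbers are those of the two-class examination example (sizes 40 and 50, means 74 and 78, standard deviations 8 and 7).

```python
from math import sqrt

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Return (Z, standard error) for the difference of two large-sample means."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of (mean1 - mean2)
    return (mean1 - mean2) / se, se

z, se = two_sample_z(74, 78, 8, 7, 40, 50)
print(round(se, 3), round(z, 2))

# Two-tailed decision at the 0.01 level (critical value 2.58).
accept_h0 = -2.58 < z < 2.58
print(accept_h0)
```

Returning the standard error alongside Z makes it easy to compare each intermediate value with a hand calculation.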
• We would accept $H_0: \mu_1 = \mu_2$ at the 0.05 level if the calculated value of Z satisfies $-1.96 < Z < 1.96$ and would reject it otherwise.

Challenge 04: An examination was given to two classes consisting of 40 and 50 students respectively. In the first class the mean grade was 74 with a standard deviation of 8, while in the second class the mean grade was 78 with a standard deviation of 7. Is there a significant difference between the performance of the two classes at a level of significance of 0.01?

Solution: Here $n_1 = 40$, $n_2 = 50$, $\bar{X}_1 = 74$, $\bar{X}_2 = 78$, $s_1 = 8$ and $s_2 = 7$.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$.
(ii) Test statistic:
$$s = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{8^2}{40} + \frac{7^2}{50}} = 1.606$$
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{s} = \frac{74 - 78}{1.606} = -2.49$$
(iii) Level of significance: $\alpha = 0.01$.
(iv) Critical value: The value of $Z_\alpha$ at 1% level of significance from the table is 2.58.
(v) Decision: As the computed value satisfies $-2.58 < Z < 2.58$, the null hypothesis is accepted at 1% level of significance, which concludes that at a 0.01 level there is no significant difference between the classes.

Challenge 05: The means of two samples of sizes 1000 and 2000 respectively are 67.5 and 68 inches. Can the samples be regarded as drawn from the same population of standard deviation 2.5 inches?

Solution: Here $n_1 = 1000$, $n_2 = 2000$, $\bar{X}_1 = 67.5$, $\bar{X}_2 = 68$, $\sigma_1 = 2.5 = \sigma_2$.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$.
(ii) Test statistic:
$$s = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{2.5^2}{1000} + \frac{2.5^2}{2000}} = 0.097$$
$$Z = \frac{67.5 - 68}{0.097} = -5.15$$
(iii) Level of significance: $\alpha = 0.01$.
(iv) Critical value: The value of $Z_\alpha$ at 1% level of significance from the table is 2.58.
(v) Decision: As the computed value does not satisfy $-2.58 < Z < 2.58$, the null hypothesis is rejected at 1% level of significance.

Challenge 06: [...]

Solution: [...]
Alternative hypothesis $H_1: \mu_1 > \mu_2$.
Test statistic:
$$s = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 1.35$$
$$Z = \frac{173.3 - 171.5}{1.35} = 1.33$$
Level of significance: $\alpha = 0.1$.
Critical value: The value of $Z_\alpha$ at 10% level of significance for a one tailed test from the table is 1.28.
Decision: As the computed value does not satisfy $Z < 1.28$, the null hypothesis is rejected at 10% level of significance.
Therefore, we conclude that the male students who participate in college athletics are taller than the ones who showed no interest, at a 10% level of significance. It should be noted, however, that the null hypothesis can be accepted at a level of 0.05.

Challenge 07: On an elementary school examination in Mathematics, the mean grade of 32 boys was 72 with a standard deviation of 8, while the mean grade of 36 girls was 75 with a standard deviation of 6. Test the hypothesis at a (a) 0.05 and (b) 0.01 level of significance that the girls are better in computation than the boys.

Solution: Here $n_1 = 32$, $n_2 = 36$, $\bar{X}_1 = 72$, $\bar{X}_2 = 75$, $\sigma_1 = 8$ and $\sigma_2 = 6$.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$.
Alternative hypothesis $H_1: \mu_1 < \mu_2$.
(ii) Test statistic:
$$s = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{8^2}{32} + \frac{6^2}{36}} = \sqrt{3}$$
$$Z = \frac{72 - 75}{\sqrt{3}} = -1.73$$
(iii) Level of significance: $\alpha_1 = 0.05$ and $\alpha_2 = 0.01$.
(iv) Critical value: For a one tailed test the critical values at 0.05 and 0.01 level of significance are $Z_{0.05} = 1.645$ and $Z_{0.01} = 2.33$ respectively.
(v) Decision: As the computed value $-1.73$ lies outside $(-1.645, 0)$, the null hypothesis is rejected at 5% level of significance, i.e., we conclude that the girls are better in computation than the boys. But as the computed value $-1.73$ lies inside $(-2.33, 0)$, the null hypothesis is accepted at 1% level of significance, i.e., there is no significant difference between the performance of the boys and girls in the exam.

More challenges:
Based on tests of means
1. A random sample of 50 items gives the mean 6.2 and variance 10.24. Can it be regarded as drawn from a normal population with mean 5.4 at 5% level of significance? Answer: Yes, Z = 1.77.
2. Can it be concluded that the average lifespan of an Indian is more than 70 years, if a random sample of 100 Indians has an average lifespan of 71.8 years with standard deviation of 8.9 years, at 5% level of significance? Answer: Yes, Z = 2.02.
3. A machine is set to produce metal plates of thickness 1.5 cms with standard deviation of 0.2 cms. A sample of 100 plates produced by the machine gave an average thickness of 1.52 cms. Is the machine fulfilling the purpose? (Take $\alpha = 0.1$.) Answer: Yes, Z = 1.
4. An ambulance service claims that it takes on the average 8.9 minutes to reach its destination in emergency calls. To check on this claim, the agency which licenses ambulance services has them timed on 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.6 minutes. What can they conclude at the level of significance 0.05? Answer: Z = 1.768.
5. An insurance agent has claimed that the average age of policy holders who insure through him is less than the average for all agents, which is 30.5 years. A random sample of 100 policy holders who had insured through him gave the mean 28.8 years and standard deviation of 6.35 years. Test his claim at 5% level of significance. Answer: Z = −2.681.
6. A sample of 100 items, drawn from the universe with mean value 64 and standard deviation 3, has mean value 63.5. Is the difference between the means significant? What will be your inference if the sample had 200 items?
7. The mean breaking strength of cables supplied by a manufacturer is 1800 with a standard deviation 100. By a new technique in the manufacturing process it is claimed that the breaking strength of the cables has increased. In order to test the claim a sample of 50 cables is tested. It is found that the mean breaking strength is 1850. Can we support the claim at 0.01 level of significance? Answer: Z = 3.535.
8. A paper mill has agreed to buy waste paper for recycling from a waste collection firm, under the agreement that the waste collection firm will supply the waste paper in packages of 300 kg each, for which the paper mill will pay by the package. To speed up their work the waste collection firm is making packages by some approximation procedure. The paper mill does not object to this procedure if it gets 300 kg per package on the average. The waste collection firm has an interest not to exceed 300 kg per package, because it is not being paid more, and not to go under 300 kg, because the paper mill might terminate the agreement if it does. To estimate the mean weight of waste paper per package, the waste collection firm weighed 75 randomly selected packages and found that the mean weight was 290 kg and standard deviation was 15 kg. Can we infer that the mean weight per package in the entire supply was 300 kg? Answer: Z = 5.77.
9. A sample of 450 items is taken from a population whose standard deviation is 20. The mean of the sample is 30. Test whether the sample has come from a population with mean 29.

Based on difference of means
10. A sample of 100 electric bulbs produced by the manufacturer A showed a mean lifetime of 1190 hours and a standard deviation of 90 hours. A sample of 75 bulbs produced by the manufacturer B showed a mean lifetime of 1230 hours with a standard deviation of 120 hours. Is there a difference between the mean lifetimes of the two brands of bulbs at a significance level of (a) 0.05 and (b) 0.01?
11. A sample of 200 fish of a particular kind taken at random from one end of a lake had a mean weight of 20 lbs and standard deviation of 2 lbs. At the other end of the lake, a sample of 80 fish of the same kind had mean weight of 20.5 lbs and a standard deviation of 2 lbs. Is the difference between the mean weights significant?
12. In a survey of buying habits, 400 women shoppers are chosen at random in supermarket 'A' located in a certain section of the city. Their average weekly food expenditure is Rs. 250 with a standard deviation of Rs. 40. For 400 women shoppers chosen at random in supermarket 'B' in another section of the city, the average weekly food expenditure is Rs. 220 with a standard deviation of Rs. 55. Test at 1% level of significance whether the average weekly food expenditure of the two populations of shoppers is equal. Answer: No, Z = 8.82.
13. A sample of heights of 6400 Englishmen has a mean of 67.85 inches and standard deviation of 2.56 inches, while a sample of heights of 1600 Australians has a mean of 68.55 inches and a standard deviation of 2.52 inches. Do the data indicate that Australians are, on the average, taller than Englishmen?
14. A storekeeper wanted to buy a large quantity of light bulbs from two companies labelled 'one' and 'two'. He bought 100 bulbs from each brand and found by testing that brand 'one' had a mean lifetime of 1120 hours and standard deviation of 75 hours, and brand 'two' had a mean lifetime of 1062 hours and standard deviation of 82 hours. Examine whether the difference of means is significant.
15. In a survey of incomes of two classes of workers, two random samples gave the following details. Examine whether the difference between the means is significant.

Sample | Size | Mean annual income (in rupees) | Standard deviation (in rupees)
I | 100 | 582 | 2
II | 100 | 546 | [illegible]

16. A random sample of 1200 men from one state gives the mean pay as Rs. 400 per month with a standard deviation of Rs. 60, and a random sample of 1000 men from another state gives the mean pay as Rs. 500 per month with a standard deviation of Rs. 80. Discuss whether the mean levels of pay of men from the two states differ significantly.

T- test for means:
In case samples are small (n < 30), we can formulate tests of hypotheses and significance using other distributions besides the normal, such as Student's t, chi-square and F.
The following is the Student's T-test statistic to test the hypothesis $H_0$ that a normal population has mean $\mu$:

$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

where $\bar{X}$ is the mean of a sample of size n and s is the standard deviation, which can be computed using the following:

$$s = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}}$$

We use Table 1 to find the critical value $T_\alpha$ for Student's T-test. The table requires the value of $v$, called the degrees of freedom, which is $v = n - 1$.

t Table
cum. prob | t.50 | t.75 | t.80 | t.85 | t.90 | t.95 | t.975 | t.99 | t.995 | t.999 | t.9995
one-tail | 0.50 | 0.25 | 0.20 | 0.15 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.001 | 0.0005
two-tails | 1.00 | 0.50 | 0.40 | 0.30 | 0.20 | 0.10 | 0.05 | 0.02 | 0.01 | 0.002 | 0.001
df
1 | 0.000 | 1.000 | 1.376 | 1.963 | 3.078 | 6.314 | 12.71 | 31.82 | 63.66 | 318.31 | 636.62
2 | 0.000 | 0.816 | 1.061 | 1.386 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 | 22.327 | 31.599
3 | 0.000 | 0.765 | 0.978 | 1.250 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 | 10.215 | 12.924
4 | 0.000 | 0.741 | 0.941 | 1.190 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 | 7.173 | 8.610
5 | 0.000 | 0.727 | 0.920 | 1.156 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 5.893 | 6.869
...
11 | 0.000 | 0.697 | 0.876 | 1.088 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 | 4.025 | 4.437
12 | 0.000 | 0.695 | 0.873 | 1.083 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 | 3.930 | 4.318
13 | 0.000 | 0.694 | 0.870 | 1.079 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 | 3.852 | 4.221
14 | 0.000 | 0.692 | 0.868 | 1.076 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 | 3.787 | 4.140
15 | 0.000 | 0.691 | 0.866 | 1.074 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 | 3.733 | 4.073
...
21 | 0.000 | 0.686 | 0.859 | 1.063 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 | 3.527 | 3.819
22 | 0.000 | 0.686 | 0.858 | 1.061 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 | 3.505 | 3.792
23 | 0.000 | 0.685 | 0.858 | 1.060 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 | 3.485 | 3.768
24 | 0.000 | 0.685 | 0.857 | 1.059 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 | 3.467 | 3.745
25 | 0.000 | 0.684 | 0.856 | 1.058 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 | 3.450 | 3.725
...
40 | 0.000 | 0.681 | 0.851 | 1.050 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 | 3.307 | 3.551
60 | 0.000 | 0.679 | 0.848 | 1.045 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 | 3.232 | 3.460
80 | 0.000 | 0.678 | 0.846 | 1.043 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 | 3.195 | 3.416
100 | 0.000 | 0.677 | 0.845 | 1.042 | 1.290 | 1.660 | 1.984 | 2.364 | 2.626 | 3.174 | 3.390
1000 | 0.000 | 0.675 | 0.842 | 1.037 | 1.282 | 1.646 | 1.962 | 2.330 | 2.581 | 3.098 | 3.300
Confidence Level | 0% | 50% | 60% | 70% | 80% | 90% | 95% | 98% | 99% | 99.8% | 99.9%

Table 1, source: T Table (tdistributiontable.com)

Challenge 01: In the past a machine has produced washers having mean thickness of 0.050 inch. To determine whether the machine is in proper working order a sample of 9 washers is chosen, for which the mean thickness is 0.053 inch and the standard deviation is 0.003 inch. Test the hypothesis that the machine is in proper working order using a level of (a) 0.05 and (b) 0.01.

Solution: Here n = 9, $\bar{X} = 0.053$ and s = 0.003.
(i) Null hypothesis $H_0: \mu = 0.050$. (Machine is in proper working order.)
Alternative hypothesis $H_1: \mu \neq 0.050$. (Machine is not in proper working order.)
(ii) Test statistic:
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{0.053 - 0.050}{0.003/\sqrt{9}} = 3$$
(iii) Level of significance: $\alpha_1 = 0.05$ and $\alpha_2 = 0.01$.
(iv) Critical value: For a two tailed test the critical values at 0.05 and 0.01 level of significance and $v = n - 1 = 9 - 1 = 8$ degrees of freedom are $T_{0.05} = 2.306$ and $T_{0.01} = 3.355$ respectively.
(v) Decision: (a) As the computed value 3 lies outside $(-2.306, 2.306)$, the null hypothesis is rejected at 5% level of significance and $v = 8$ degrees of freedom.
(b) As the computed value 3 lies inside $(-3.355, 3.355)$, the null hypothesis is accepted at 1% level of significance and $v = 8$ degrees of freedom.

Challenge 02: A soap manufacturing company was distributing a particular brand of soap through a large number of retail shops. Before a heavy advertisement campaign, the mean sales per week per shop was 140 dozens. After the campaign a sample of 26 shops was taken and the mean sale was found to be 147 dozens with standard deviation of 16. Can you consider the advertisement effective using a level of (a) 0.05 and (b) 0.01?
Solution: Here n = 26, $\bar{X} = 147$ and s = 16.
(i) Null hypothesis $H_0: \mu = 140$. (Advertisement is not effective.)
Alternative hypothesis $H_1: \mu > 140$. (Advertisement is effective.)
(ii) Test statistic:
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{147 - 140}{16/\sqrt{26}} = 2.23$$
(iii) Level of significance: $\alpha_1 = 0.05$ and $\alpha_2 = 0.01$.
(iv) Critical value: For a one tailed test the critical values at 0.05 and 0.01 level of significance and $v = n - 1 = 26 - 1 = 25$ degrees of freedom are $T_{0.05} = 1.708$ and $T_{0.01} = 2.485$ respectively.
(v) Decision: (a) As the computed value 2.23 lies outside $(0, 1.708)$, the null hypothesis is rejected at 5% level of significance and $v = 25$ degrees of freedom.
(b) As the computed value 2.23 lies inside $(0, 2.485)$, the null hypothesis is accepted at 1% level of significance and $v = 25$ degrees of freedom.

Challenge 03: A random sample of 10 boys had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100?

Solution: The sample mean is $\bar{X} = 97.2$.

$x_i$ | $(x_i - \bar{X})^2 = (x_i - 97.2)^2$
70 | 739.84
120 | 519.84
110 | 163.84
101 | 14.44
88 | 84.64
83 | 201.64
95 | 4.84
98 | 0.64
107 | 96.04
100 | 7.84

$$\sum (x_i - \bar{X})^2 = 1833.6, \qquad s = \sqrt{\frac{1833.6}{9}} = \sqrt{203.73} = 14.27$$

(i) Null hypothesis $H_0: \mu = 100$. (Population mean I.Q. is 100.)
Alternative hypothesis $H_1: \mu \neq 100$. (Population mean I.Q. is not 100.)
(ii) Test statistic:
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{97.2 - 100}{14.27/\sqrt{10}} = -0.62$$
(iii) Level of significance: Not given in the question, so let $\alpha = 0.05$.
(iv) Critical value: For a two tailed test the critical value at 0.05 level of significance and $v = n - 1 = 10 - 1 = 9$ degrees of freedom is $T_{0.05} = 2.262$.
(v) Decision: As the computed value $-0.62$ lies inside $(-2.262, 2.262)$, the null hypothesis is accepted at 5% level of significance and $v = 9$ degrees of freedom.

Challenge 04: A random sample of 8 envelopes is taken from the letter box of a post office and their weights in grams are found to be 12.1, 11.9, 12.4, 12.3, 11.9, 12.1, 12.4, 12.1.
Does this sample indicate at 1% level that the average weight of envelopes received at that post office is 12.35 gms?

Solution: The sample mean is $\bar{X} = 12.15$.

$x_i$ | $(x_i - \bar{X})^2 = (x_i - 12.15)^2$
12.1 | 0.0025
11.9 | 0.0625
12.4 | 0.0625
12.3 | 0.0225
11.9 | 0.0625
12.1 | 0.0025
12.4 | 0.0625
12.1 | 0.0025

$$\sum (x_i - \bar{X})^2 = 0.28, \qquad s = \sqrt{\frac{0.28}{7}} = \sqrt{0.04} = 0.2$$

(i) Null hypothesis $H_0: \mu = 12.35$.
Alternative hypothesis $H_1: \mu \neq 12.35$.
(ii) Test statistic:
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{12.15 - 12.35}{0.2/\sqrt{8}} = -2.83$$
(iii) Level of significance: $\alpha = 0.01$.
(iv) Critical value: For a two tailed test the critical value at 0.01 level of significance and $v = n - 1 = 8 - 1 = 7$ degrees of freedom is $T_{0.01} = 3.499$.
(v) Decision: As the computed value $-2.83$ lies inside $(-3.499, 3.499)$, the null hypothesis is accepted at 1% level of significance and $v = 7$ degrees of freedom.

T- test for difference of means:
Suppose two random samples of sizes $n_1$ and $n_2$ are drawn from normal (or approximately normal) populations whose standard deviations are equal, i.e., $\sigma_1 = \sigma_2$. Suppose further that these two samples have means and standard deviations given by $\bar{X}_1, \bar{X}_2$ and $s_1, s_2$ respectively. To test the hypothesis $H_0$ that the samples come from the same population (i.e., $\mu_1 = \mu_2$ as well as $\sigma_1 = \sigma_2$) we use the statistic given by

$$T = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where

$$s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}.$$

The distribution of T is Student's distribution with $v = n_1 + n_2 - 2$ degrees of freedom.

Challenge 05: Samples of two types of electric bulbs were tested for length of life and the following data were obtained.

 | Type I | Type II
No. of samples | 8 | 7
Mean of the sample in hrs | 1134 | 1024
Standard deviation in hrs | 35 | 40

Test at 5% level of significance whether the difference in the sample means is significant.

Solution: We have $\bar{X}_1 = 1134$, $\bar{X}_2 = 1024$, $s_1 = 35$, $s_2 = 40$, $n_1 = 8$ and $n_2 = 7$.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$.
(ii) Test statistic:
$$s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{8 \times 35^2 + 7 \times 40^2}{8 + 7 - 2}} = 40.19$$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{1134 - 1024}{40.19\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = \frac{110}{20.8} = 5.288$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The table value at $\alpha = 0.05$ for $v = n_1 + n_2 - 2 = 13$ degrees of freedom is $T_{0.05} = 2.16$.
(v) Decision: As the computed value $|T| = 5.288$ is greater than the table value, the null hypothesis is rejected. The difference is significant.

Challenge 06: A medicine was found to be effective for 9 patients in 8 days on an average with standard deviation of 2.2 days. Another medicine administered to another group of 8 patients was found to be effective in 6 days on an average with standard deviation of 2.6 days. Use 5% level of significance to test the null hypothesis that the two medicines are equally effective.

Solution: We have $\bar{X}_1 = 8$, $\bar{X}_2 = 6$, $s_1 = 2.2$, $s_2 = 2.6$, $n_1 = 9$ and $n_2 = 8$.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$.
(ii) Test statistic:
$$s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{9 \times 2.2^2 + 8 \times 2.6^2}{9 + 8 - 2}} = 2.55$$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{8 - 6}{2.55\sqrt{\dfrac{1}{9} + \dfrac{1}{8}}} = 1.613$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The table value at $\alpha = 0.05$ for $v = n_1 + n_2 - 2 = 15$ degrees of freedom is $T_{0.05} = 2.131$.
(v) Decision: As the computed value $|T| = 1.613$ is less than the table value, the null hypothesis is accepted at 5% level of significance.

Challenge 07: The IQs (intelligence quotients) of 16 students from one area of a city showed a mean of 107 with a standard deviation of 10, while the IQs of 14 students from another area of the city showed a mean of 112 with standard deviation of 8. Is there a significant difference between the IQs of the two groups at a (i) 0.01, (ii) 0.05 level of significance?

Solution: We have $\bar{X}_1 = 107$, $\bar{X}_2 = 112$, $s_1 = 10$, $s_2 = 8$, $n_1 = 16$ and $n_2 = 14$.
If $\mu_1$ and $\mu_2$ denote the population mean IQs of the students from the two areas, we have to decide between the following null and alternative hypotheses.
(i) Null hypothesis $H_0: \mu_1 = \mu_2$, there is no difference between the groups. Alternative hypothesis $H_1: \mu_1 \neq \mu_2$, there is a significant difference between the groups.
(ii) Test statistic:
$$S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{16 \times 10^2 + 14 \times 8^2}{16 + 14 - 2}} = 9.44$$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{107 - 112}{9.44\sqrt{\dfrac{1}{16} + \dfrac{1}{14}}} = \frac{-5}{3.455} = -1.45$$
(iii) Level of significance: $\alpha_1 = 0.01$ and $\alpha_2 = 0.05$.
(iv) Critical value: The table values at $\alpha_1 = 0.01$ and $\alpha_2 = 0.05$ for $v = n_1 + n_2 - 2 = 28$ degrees of freedom are $T_{0.01} = 2.763$ and $T_{0.05} = 2.048$.
(v) Decision: As the computed value $|T| = 1.45$ is less than the table values $T_{0.01}$ and $T_{0.05}$, the null hypothesis is accepted at both the 1% and 5% levels of significance.

Paired T-test for difference of means:

Let us consider the case when (i) the sample sizes are equal, i.e., $n_1 = n_2 = n$ (say), and (ii) the two samples are not independent but the sample observations are paired together, i.e., the pair of observations $(x_i, y_i)$, $(i = 1, 2, \ldots, n)$ corresponds to the same $i$-th sample unit. The problem is to test whether the sample means differ significantly or not.

For example, suppose we want to test the efficacy of a particular drug, say, for inducing sleep. Let $x_i$ and $y_i$ $(i = 1, 2, \ldots, n)$ be the readings, in hours of sleep, on the $i$-th individual, before and after the drug is given respectively. Here, instead of applying the difference of means T-test, we apply the paired T-test given below.

Here we consider the increments $d_i = x_i - y_i$ $(i = 1, 2, \ldots, n)$. Under the null hypothesis $H_0$ that the increments are due to fluctuations of sampling, i.e., the drug is not responsible for these increments, the following statistic follows Student's T-distribution with $(n - 1)$ degrees of freedom:
$$T = \frac{\bar{d}}{S/\sqrt{n}}, \quad \text{where} \quad \bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (d_i - \bar{d})^2.$$
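The paired statistic described above can be cross-checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `paired_t_statistic` is hypothetical, and it assumes the increments are taken as $d_i = x_i - y_i$ and that $S$ uses the divisor $n - 1$, as the worked challenges in these notes do. The data are the pig-weight gains from the first paired challenge that follows.

```python
import math


def paired_t_statistic(x, y):
    """Return (d_bar, s, t) for paired samples, with d_i = x_i - y_i.

    Assumption: s is the standard deviation of the differences
    computed with divisor n - 1.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal size n >= 2")
    n = len(x)
    d = [xi - yi for xi, yi in zip(x, y)]
    d_bar = sum(d) / n
    s = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    if s == 0:
        raise ValueError("all differences are equal; T is undefined")
    t = d_bar / (s / math.sqrt(n))
    return d_bar, s, t


# Gains in weight (lbs) of 8 pigs on Food A and Food B.
food_a = [49, 53, 51, 52, 47, 50, 52, 53]
food_b = [52, 55, 52, 53, 50, 54, 54, 53]

d_bar, s, t = paired_t_statistic(food_a, food_b)
print(f"d_bar = {d_bar:.4f}, S = {s:.4f}, T = {t:.4f}, df = {len(food_a) - 1}")
```

The computed $T$ is then compared with the tabulated critical value for $n - 1$ degrees of freedom; the sketch deliberately leaves the table lookup to the reader.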
Challenge 01: Below are given the gains in weight (in lbs) of pigs fed on two foods A and B.

| Pig number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|
| Gain in weight (lbs), Food A | 49 | 53 | 51 | 52 | 47 | 50 | 52 | 53 | 407 |
| Gain in weight (lbs), Food B | 52 | 55 | 52 | 53 | 50 | 54 | 54 | 53 | 423 |

Can we conclude that Food B is better than Food A in increasing the weight?

Solution:

| X: Food A | Y: Food B | $d_i = X - Y$ | $(d_i - \bar{d})^2$ |
|---|---|---|---|
| 49 | 52 | -3 | 1 |
| 53 | 55 | -2 | 0 |
| 51 | 52 | -1 | 1 |
| 52 | 53 | -1 | 1 |
| 47 | 50 | -3 | 1 |
| 50 | 54 | -4 | 4 |
| 52 | 54 | -2 | 0 |
| 53 | 53 | 0 | 4 |
| Total | | -16 | 12 |

(i) Null hypothesis $H_0: \mu_X - \mu_Y = 0$, there is no significant difference in the increase in weight between Food A and Food B. Alternative hypothesis $H_1: \mu_X < \mu_Y$, Food B is superior to Food A.
(ii) Test statistic:
$$\bar{d} = \frac{1}{n}\sum d_i = \frac{-16}{8} = -2, \quad S^2 = \frac{1}{n-1}\sum (d_i - \bar{d})^2 = \frac{12}{7} = 1.7143, \quad S = 1.3093$$
Therefore,
$$T = \frac{\bar{d}}{S/\sqrt{n}} = \frac{-2}{1.3093/\sqrt{8}} = -4.3205.$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The table value at $\alpha = 0.05$ for $v = 8 - 1 = 7$ degrees of freedom for a one-tailed test is $T_{0.05} = 1.895$.
(v) Decision: As the computed value $|T| = 4.3205$ is greater than the table value $T_{0.05}$, the null hypothesis is rejected at the 5% level of significance and we conclude that Food B is superior to Food A.

Challenge 02: Ten soldiers visit a rifle range for two consecutive weeks. For the first week their scores are 67, 24, 57, 55, 63, 54, 56, 68, 33, 43 and during the second week they score, in the same order, 70, 38, 58, 58, 56, 67, 68, 72, 42, 38. Examine if there is any significant difference in their performance.

Solution:

| X: 1st week score | Y: 2nd week score | $d_i = X - Y$ | $(d_i - \bar{d})^2$ |
|---|---|---|---|
| 67 | 70 | -3 | 2.89 |
| 24 | 38 | -14 | 86.49 |
| 57 | 58 | -1 | 13.69 |
| 55 | 58 | -3 | 2.89 |
| 63 | 56 | 7 | 136.89 |
| 54 | 67 | -13 | 68.89 |
| 56 | 68 | -12 | 53.29 |
| 68 | 72 | -4 | 0.49 |
| 33 | 42 | -9 | 18.49 |
| 43 | 38 | 5 | 94.09 |
| Total | | -47 | 478.1 |

(i) Null hypothesis $H_0: \mu_X - \mu_Y = 0$, there is no significant difference in the scores of the 1st and the 2nd week. Alternative hypothesis $H_1: \mu_X \neq \mu_Y$, there is a significant difference in the scores of the 1st and the 2nd week.
(ii) Test statistic:
$$\bar{d} = \frac{1}{n}\sum d_i = \frac{-47}{10} = -4.7, \quad S^2 = \frac{1}{n-1}\sum (d_i - \bar{d})^2 = \frac{478.1}{9} = 53.1222, \quad S = \sqrt{53.1222} = 7.2885$$
Therefore,
$$T = \frac{\bar{d}}{S/\sqrt{n}} = \frac{-4.7}{7.2885/\sqrt{10}} = -2.0392.$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The table value at $\alpha = 0.05$ for $v = 10 - 1 = 9$ degrees of freedom for a two-tailed test is $T_{0.05} = 2.262$.
(v) Decision: As the computed value $|T| = 2.0392$ is less than the table value $T_{0.05}$, the null hypothesis is accepted at the 5% level of significance and we conclude that there is no significant difference in the scores of the 1st and the 2nd week.

Challenge 03: Two researchers adopted different sampling techniques while investigating the same group of students to find the number of students falling in different intelligence levels. The result is as follows:

| Researcher | Below average | Average | Above average | Genius |
|---|---|---|---|---|
| X | 86 | 60 | 44 | 10 |
| Y | 40 | 33 | 25 | 2 |

Would you say that the sampling techniques adopted by the two researchers are significantly different?

Solution:

| X | Y | $d_i = X - Y$ | $(d_i - \bar{d})^2$ |
|---|---|---|---|
| 86 | 40 | 46 | 441 |
| 60 | 33 | 27 | 4 |
| 44 | 25 | 19 | 36 |
| 10 | 2 | 8 | 289 |
| Total | | 100 | 770 |

(i) Null hypothesis $H_0: \mu_X - \mu_Y = 0$, there is no significant difference in the sampling techniques adopted by the two researchers. Alternative hypothesis $H_1: \mu_X \neq \mu_Y$, there is a significant difference in the sampling techniques adopted by the two researchers.
(ii) Test statistic:
$$\bar{d} = \frac{1}{n}\sum d_i = \frac{100}{4} = 25, \quad S^2 = \frac{1}{n-1}\sum (d_i - \bar{d})^2 = \frac{770}{3} = 256.67, \quad S = \sqrt{256.67} = 16.0209$$
Therefore,
$$T = \frac{\bar{d}}{S/\sqrt{n}} = \frac{25}{16.0209/\sqrt{4}} = 3.1209.$$
(iii) Level of significance: $\alpha = 0.05$.
(iv) Critical value: The table value at $\alpha = 0.05$ for $v = 4 - 1 = 3$ degrees of freedom for a two-tailed test is $T_{0.05} = 3.182$.
(v) Decision: As the computed value $|T| = 3.1209$ is less than the table value $T_{0.05}$, the null hypothesis is accepted at the 5% level of significance and we conclude that there is no significant difference in the sampling techniques adopted by the two researchers.

More challenges:

Based on tests of means

1.
A test of the breaking strength of 6 ropes manufactured by a company showed a mean breaking strength of 7750 lb and a standard deviation of 145 lb, whereas the manufacturer claimed a mean breaking strength of 8000 lb. Can we support the manufacturer's claim at a level of significance of (a) 0.05, (b) 0.01?
Answer: $T = -3.86$.
2. A random sample of size 16 from a normal population showed a mean of 103.75 and a standard deviation of 7.5. Can we say that the population has a mean of 108.75?
3. A company supplies toothpaste in a packing of 100 gms. A sample of 10 packings gave the following weights in gms: 100.5, 100.3, 100.1, 99.8, 99.7, 99.7, 100.3, 100.4, 99.2, 99.3. Does the sample support the claim of the company that the packing weighs 100 gms?

Based on difference of means

4. On an examination in psychology, 12 students in one class had a mean grade of 78 with a standard deviation of 6, while 15 students in another class had a mean grade of 74 with a standard deviation of 8. Using a significance level of 0.05, determine whether the first group is superior to the second group.
5. Six guinea pigs injected with 0.5 mg of a medication took on average 15.4 secs to fall asleep with a standard deviation of 2.2 secs, while six other guinea pigs injected with 1.5 mg of the medication took on average 11.2 secs to fall asleep with a standard deviation of 2.6 secs. Use the 5% level of significance to test the null hypothesis that the difference in dosage has no effect.
Answer: $|T| = 3.02$.
6. Two independent random samples of sizes 15 and 8 have respectively the following means and population standard deviations: $\bar{X}_1 = 980$, $\bar{X}_2 = 1012$, $\sigma_1 = 75$, $\sigma_2 = 80$. Test the hypothesis that $\mu_1 = \mu_2$ at the 5% level of significance.

Based on difference of means, paired T-test

7. The scores of 10 candidates prior to and after training are given below:

| Prior | 84 | 48 | 36 | 37 | 54 | 69 | 83 | 96 | 90 | 65 |
|---|---|---|---|---|---|---|---|---|---|---|
| After | 90 | 58 | 56 | 49 | 62 | 81 | 84 | 86 | 84 | 75 |

Is the training effective?
8. Eleven school boys were given a test in Statistics. They were given a month's tuition and a second test was held at the end of it. Do the marks give evidence that the students have benefited by the extra coaching?

| Boys | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1st test | 23 | 20 | 19 | 21 | 18 | 20 | 18 | 17 | 23 | 16 | 19 |
| 2nd test | 24 | 19 | 22 | 18 | 20 | 22 | 20 | 20 | 23 | 20 | 18 |

Chi-Square distribution:

If $X_1, X_2, \ldots, X_n$ are independent standard normal random variables, then it is known that $X_1^2 + X_2^2 + \cdots + X_n^2$ follows a probability distribution called the Chi-Square ($\chi^2$) distribution with $n$ degrees of freedom. The probability density function of the $\chi^2$-distribution is given by
$$f(\chi^2) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,(\chi^2)^{\frac{n}{2} - 1}\, e^{-\chi^2/2}, \quad 0 < \chi^2 < \infty.$$

To test how well a set of observed frequencies $O_i$ agrees with the frequencies $E_i$ expected under a null hypothesis $H_0$, we compute
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
and compare it with the table value $\chi^2_\alpha(v)$ at level of significance $\alpha$ and $v$ degrees of freedom. If $\chi^2 < \chi^2_\alpha(v)$, we accept $H_0$. If $\chi^2 > \chi^2_\alpha(v)$, we will reject $H_0$ and conclude that the difference is significant.

Chi-square Distribution Table (entries are $\chi^2_\alpha(v)$, the value exceeded with probability $\alpha$)

| $v$ | 0.99 | 0.975 | 0.95 | 0.90 | 0.10 | 0.05 | 0.025 | 0.01 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.00 | 0.00 | 0.00 | 0.02 | 2.71 | 3.84 | 5.02 | 6.63 |
| 2 | 0.02 | 0.05 | 0.10 | 0.21 | 4.61 | 5.99 | 7.38 | 9.21 |
| 3 | 0.11 | 0.22 | 0.35 | 0.58 | 6.25 | 7.81 | 9.35 | 11.34 |
| 4 | 0.30 | 0.48 | 0.71 | 1.06 | 7.78 | 9.49 | 11.14 | 13.28 |
| 5 | 0.55 | 0.83 | 1.15 | 1.61 | 9.24 | 11.07 | 12.83 | 15.09 |
| 6 | 0.87 | 1.24 | 1.64 | 2.20 | 10.64 | 12.59 | 14.45 | 16.81 |
| 7 | 1.24 | 1.69 | 2.17 | 2.83 | 12.02 | 14.07 | 16.01 | 18.48 |
| 8 | 1.65 | 2.18 | 2.73 | 3.49 | 13.36 | 15.51 | 17.53 | 20.09 |
| 9 | 2.09 | 2.70 | 3.33 | 4.17 | 14.68 | 16.92 | 19.02 | 21.67 |
| 10 | 2.56 | 3.25 | 3.94 | 4.87 | 15.99 | 18.31 | 20.48 | 23.21 |
| 11 | 3.05 | 3.82 | 4.57 | 5.58 | 17.28 | 19.68 | 21.92 | 24.72 |
| 12 | 3.57 | 4.40 | 5.23 | 6.30 | 18.55 | 21.03 | 23.34 | 26.22 |
| 13 | 4.11 | 5.01 | 5.89 | 7.04 | 19.81 | 22.36 | 24.74 | 27.69 |
| 14 | 4.66 | 5.63 | 6.57 | 7.79 | 21.06 | 23.68 | 26.12 | 29.14 |
| 15 | 5.23 | 6.26 | 7.26 | 8.55 | 22.31 | 25.00 | 27.49 | 30.58 |
| 16 | 5.81 | 6.91 | 7.96 | 9.31 | 23.54 | 26.30 | 28.85 | 32.00 |
| 17 | 6.41 | 7.56 | 8.67 | 10.09 | 24.77 | 27.59 | 30.19 | 33.41 |
| 18 | 7.01 | 8.23 | 9.39 | 10.86 | 25.99 | 28.87 | 31.53 | 34.81 |
| 19 | 7.63 | 8.91 | 10.12 | 11.65 | 27.20 | 30.14 | 32.85 | 36.19 |
| 20 | 8.26 | 9.59 | 10.85 | 12.44 | 28.41 | 31.41 | 34.17 | 37.57 |
| 22 | 9.54 | 10.98 | 12.34 | 14.04 | 30.81 | 33.92 | 36.78 | 40.29 |
| 24 | 10.86 | 12.40 | 13.85 | 15.66 | 33.20 | 36.42 | 39.36 | 42.98 |
| 26 | 12.20 | 13.84 | 15.38 | 17.29 | 35.56 | 38.89 | 41.92 | 45.64 |
| 28 | 13.56 | 15.31 | 16.93 | 18.94 | 37.92 | 41.34 | 44.46 | 48.28 |
| 30 | 14.95 | 16.79 | 18.49 | 20.60 | 40.26 | 43.77 | 46.98 | 50.89 |
| 32 | 16.36 | 18.29 | 20.07 | 22.27 | 42.58 | 46.19 | 49.48 | 53.49 |
| 34 | 17.79 | 19.81 | 21.66 | 23.95 | 44.90 | 48.60 | 51.97 | 56.06 |
| 38 | 20.69 | 22.88 | 24.88 | 27.34 | 49.51 | 53.38 | 56.90 | 61.16 |
| 42 | 23.65 | 26.00 | 28.14 | 30.77 | 54.09 | 58.12 | 61.78 | 66.21 |
| 46 | 26.66 | 29.16 | 31.44 | 34.22 | 58.64 | 62.83 | 66.62 | 71.20 |
| 50 | 29.71 | 32.36 | 34.76 | 37.69 | 63.17 | 67.50 | 71.42 | 76.15 |
| 55 | 33.57 | 36.40 | 38.96 | 42.06 | 68.80 | 73.31 | 77.38 | 82.29 |
| 60 | 37.48 | 40.48 | 43.19 | 46.46 | 74.40 | 79.08 | 83.30 | 88.38 |
| 65 | 41.44 | 44.60 | 47.45 | 50.88 | 79.97 | 84.82 | 89.18 | 94.42 |
| 70 | 45.44 | 48.76 | 51.74 | 55.33 | 85.53 | 90.53 | 95.02 | 100.43 |
| 75 | 49.48 | 52.94 | 56.05 | 59.79 | 91.06 | 96.22 | 100.84 | 106.39 |
| 80 | 53.54 | 57.15 | 60.39 | 64.28 | 96.58 | 101.88 | 106.63 | 112.33 |
| 85 | 57.63 | 61.39 | 64.75 | 68.78 | 102.08 | 107.52 | 112.39 | 118.24 |
| 90 | 61.75 | 65.65 | 69.13 | 73.29 | 107.57 | 113.15 | 118.14 | 124.12 |
| 95 | 65.90 | 69.92 | 73.52 | 77.82 | 113.04 | 118.75 | 123.86 | 129.97 |
| 100 | 70.06 | 74.22 | 77.93 | 82.36 | 118.50 | 124.34 | 129.56 | 135.81 |

Challenge 01: The following table shows the distribution of the digits in numbers chosen at random from a telephone directory:

| Digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 1026 | 1107 | 997 | 966 | 1075 | 993 | 1107 | 972 | 964 | 853 | 10000 |

Test whether the digits may be taken to occur equally frequently in the directory.
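A goodness-of-fit computation of this kind is easy to check by machine. The following is a minimal Python sketch, not part of the original notes: the function name `chi_square_gof` is hypothetical, the observed frequencies are copied as printed in the challenge, and the expected frequency of 10000/10 = 1000 per digit is the value a uniform hypothesis would give.

```python
def chi_square_gof(observed, expected):
    """Return the statistic sum((O - E)^2 / E) over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Digit frequencies as printed in the challenge (digits 0 to 9).
observed = [1026, 1107, 997, 966, 1075, 993, 1107, 972, 964, 853]
# Assumption: under a uniform hypothesis each digit is expected 1000 times.
expected = [10000 / 10] * len(observed)

statistic = chi_square_gof(observed, expected)
df = len(observed) - 1   # no parameter is estimated from the sample
critical = 16.92         # chi-square table value at alpha = 0.05, 9 d.f.

print(f"chi-square = {statistic:.3f}, df = {df}")
print("reject H0" if statistic > critical else "accept H0")
```

The critical value is typed in from the chi-square table rather than computed, so the sketch needs nothing beyond the standard library.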
Solution: Let $H_0$: The digits occur equally frequently, i.e., they follow a uniform distribution.
The total frequency of digits is 10000. If the digits occur uniformly, then each digit will occur $\frac{10000}{10} = 1000$ times.

| $O_i$ | $E_i$ | $(O_i - E_i)^2$ |
|---|---|---|
| 1026 | 1000 | 676 |
| 1107 | 1000 | 11449 |
| 997 | 1000 | 9 |
| 966 | 1000 | 1156 |
| 1075 | 1000 | 5625 |
| 993 | 1000 | 49 |
| 1107 | 1000 | 11449 |
| 972 | 1000 | 784 |
| 964 | 1000 | 1296 |
| 853 | 1000 | 21609 |
| | 10000 | $\sum (O_i - E_i)^2 = 54102$ |

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{54102}{1000} = 54.102$$
Here the degrees of freedom are $v = 10 - 1 = 9$. From the table, $\chi^2_{0.05}(9) = 16.92$.
As the calculated value $\chi^2 > \chi^2_{0.05}(9)$, $H_0$ is rejected. That is, the digits do not occur uniformly in the telephone directory.

Challenge 02: The following data give the number of aircraft accidents that occurred during the various days of a week.

| Day | Mon | Tue | Wed | Thu | Fri | Sat |
|---|---|---|---|---|---|---|
| No. of accidents | 15 | 19 | 13 | 12 | 16 | 15 |

Test whether the accidents are uniformly distributed over the week.

Solution: Let $H_0$: Accidents occur uniformly over the week.
The total number of accidents is 90. If the accidents occur uniformly, then each day will have $\frac{90}{6} = 15$ accidents.

| $O_i$ | $E_i$ | $(O_i - E_i)^2$ |
|---|---|---|
| 15 | 15 | 0 |
| 19 | 15 | 16 |
| 13 | 15 | 4 |
| 12 | 15 | 9 |
| 16 | 15 | 1 |
| 15 | 15 | 0 |
| | | $\sum (O_i - E_i)^2 = 30$ |

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{30}{15} = 2$$
Here the degrees of freedom are $v = 6 - 1 = 5$. From the table, $\chi^2_{0.05}(5) = 11.07$.
As the calculated value $\chi^2 < \chi^2_{0.05}(5)$, $H_0$ is accepted. That is, accidents occur uniformly over the week.

Challenge 03: A survey of 320 families with 5 children revealed the following distribution:

| No. of boys | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| No. of girls | 5 | 4 | 3 | 2 | 1 | 0 |
| No. of families | 12 | 40 | 88 | 110 | 56 | 14 |

Is this result consistent with the hypothesis that male and female births are equally probable?

Solution: Let $H_0$: Male and female births are equally probable, i.e., $P(\text{male birth}) = \frac{1}{2}$ and $P(\text{female birth}) = \frac{1}{2}$.
Using the Binomial distribution, the probability that a family of 5 children has $r$ male children is
$$P(X = r) = {}^5C_r \left(\frac{1}{2}\right)^r \left(\frac{1}{2}\right)^{5-r}.$$
| $x$ | $P(X = x)$ | $E_i = 320 \times P(X = x)$ | $O_i$ | $(O_i - E_i)^2$ |
|---|---|---|---|---|
| 0 | $\frac{1}{32}$ | 10 | 12 | 4 |
| 1 | $5 \times \frac{1}{32}$ | 50 | 40 | 100 |
| 2 | $10 \times \frac{1}{32}$ | 100 | 88 | 144 |
| 3 | $10 \times \frac{1}{32}$ | 100 | 110 | 100 |
| 4 | $5 \times \frac{1}{32}$ | 50 | 56 | 36 |
| 5 | $\frac{1}{32}$ | 10 | 14 | 16 |

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{4}{10} + \frac{100}{50} + \frac{144}{100} + \frac{100}{100} + \frac{36}{50} + \frac{16}{10} = 7.16$$
Here the degrees of freedom are $v = 6 - 1 = 5$. From the table, $\chi^2_{0.05}(5) = 11.07$.
As the calculated value $\chi^2 < \chi^2_{0.05}(5)$, $H_0$ is accepted. That is, male and female births are equally probable.

Challenge 04: Fit a Poisson distribution to the following distribution and also test the goodness of fit.

| $x$ | 0 | 1 | 2 | 3 | 4 | 5 | Total |
|---|---|---|---|---|---|---|---|
| Frequency | 142 | 156 | 69 | 27 | 5 | 1 | 400 |

Solution: Let $H_0$: The given distribution fits a Poisson distribution.
To fit a Poisson distribution, we need to find the parameter $m$ (the mean) of the distribution:
$$m = \frac{\sum f_i x_i}{\sum f_i} = \frac{0 \times 142 + 1 \times 156 + 2 \times 69 + 3 \times 27 + 4 \times 5 + 5 \times 1}{400} = \frac{400}{400} = 1$$
The Poisson distribution has the probability mass function
$$P(X = x) = \frac{e^{-m} m^x}{x!} = \frac{e^{-1}}{x!}.$$

| $x$ | $P(X = x)$ | $E_i = 400 \times P(X = x)$ | $O_i$ |
|---|---|---|---|
| 0 | 0.368 | 147 | 142 |
| 1 | 0.368 | 147 | 156 |
| 2 | 0.184 | 74 | 69 |
| 3 | 0.0613 | 25 | 27 |
| 4 | 0.0153 | 6 | 5 |
| 5 | 0.0031 | 1 | 1 |

As the last two frequencies are less than 10, we pool the last three classes ($x \geq 3$) and get the following:

| $O_i$ | 142 | 156 | 69 | 33 |
|---|---|---|---|---|
| $E_i$ | 147 | 147 | 74 | 32 |
| $(O_i - E_i)^2$ | 25 | 81 | 25 | 1 |

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{25}{147} + \frac{81}{147} + \frac{25}{74} + \frac{1}{32} = 1.09$$
Here we have used the sample data to compute both the total expected frequency and the mean $m$. So $v = n - 2 = 4 - 2 = 2$ and $\chi^2_{0.05}(2) = 5.99$.
As the calculated value $\chi^2 < \chi^2_{0.05}(2)$, $H_0$ is accepted. That is, the given distribution fits a Poisson distribution.

Challenge 05: A theory predicts that the proportion of beans in the four groups A, B, C and D should be 9 : 3 : 3 : 1. In an experiment among 1600 beans, the numbers in the four groups were 882, 313, 287 and 118. Does the experimental result support the theory?

Solution: Let $H_0$: The experimental result supports the theory.

| $E_i$ | $O_i$ | $(O_i - E_i)^2$ |
|---|---|---|
| $\frac{9}{16} \times 1600 = 900$ | 882 | 324 |
| $\frac{3}{16} \times 1600 = 300$ | 313 | 169 |
| $\frac{3}{16} \times 1600 = 300$ | 287 | 169 |
| $\frac{1}{16} \times 1600 = 100$ | 118 | 324 |

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{324}{900} + \frac{169}{300} + \frac{169}{300} + \frac{324}{100} = 4.7267$$
Here the degrees of freedom are $v = 4 - 1 = 3$. From the table, $\chi^2_{0.05}(3) = 7.81$.
As the calculated value $\chi^2 < \chi^2_{0.05}(3)$, $H_0$ is accepted. The experimental result supports the theory.

$\chi^2$ test for independence of attributes:

If the population is known to have two major attributes A and B, then A can be divided into $m$ categories $A_1, A_2, \ldots, A_m$ and B can be divided into $n$ categories $B_1, B_2, \ldots, B_n$. Accordingly the members of the population, and hence those of the sample, can be divided into $mn$ classes. In this case the sample data may be presented in the form of a matrix containing $m$ rows and $n$ columns, and hence $mn$ cells, showing the observed frequencies $O_{ij}$ in the various cells, where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. That is, $O_{ij}$ is the number of observations possessing the attributes $A_i$ and $B_j$. The matrix or tabular form of the sample data is called a contingency table and is given below:

| A \ B | $B_1$ | $B_2$ | … | $B_j$ | … | $B_n$ | Row total |
|---|---|---|---|---|---|---|---|
| $A_1$ | $O_{11}$ | $O_{12}$ | … | $O_{1j}$ | … | $O_{1n}$ | $O_{1\cdot}$ |
| $A_2$ | $O_{21}$ | $O_{22}$ | … | $O_{2j}$ | … | $O_{2n}$ | $O_{2\cdot}$ |
| … | … | … | … | … | … | … | … |
| $A_i$ | $O_{i1}$ | $O_{i2}$ | … | $O_{ij}$ | … | $O_{in}$ | $O_{i\cdot}$ |
| … | … | … | … | … | … | … | … |
| $A_m$ | $O_{m1}$ | $O_{m2}$ | … | $O_{mj}$ | … | $O_{mn}$ | $O_{m\cdot}$ |
| Column total | $O_{\cdot 1}$ | $O_{\cdot 2}$ | … | $O_{\cdot j}$ | … | $O_{\cdot n}$ | $N$ |

Now, based on the null hypothesis $H_0$ that the two attributes A and B are independent, we compute the expected frequency $E_{ij}$ for the various cells using the formula
$$E_{ij} = \frac{O_{i\cdot} \times O_{\cdot j}}{N}.$$
Then we compute
$$\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}.$$
The number of degrees of freedom is $v = (m - 1)(n - 1)$.
If $\chi^2 < \chi^2_\alpha(v)$, $H_0$ is accepted at the $\alpha$ level of significance and $v$ degrees of freedom, i.e., the attributes A and B are independent.
If $\chi^2 > \chi^2_\alpha(v)$, $H_0$ is rejected at the $\alpha$ level of significance and $v$ degrees of freedom, i.e., the attributes A and B are not independent.

Challenge 06: The following data were collected on two characters. Based on this, can you say that there is no relation between smoking and literacy?

| Literacy \ Smoking | Smokers | Non-smokers |
|---|---|---|
| Literates | 83 | 57 |
| Illiterates | 45 | 68 |

Solution: Let $H_0$: Literacy and smoking habits are independent.
| Literacy \ Smoking | Smokers | Non-smokers | Row total |
|---|---|---|---|
| Literates | 83 | 57 | 140 |
| Illiterates | 45 | 68 | 113 |
| Column total | 128 | 125 | 253 |

| $O$ | $E$ | $\frac{(O - E)^2}{E}$ |
|---|---|---|
| 83 | $\frac{128 \times 140}{253} = 70.83 \approx 71$ | 2.03 |
| 57 | $\frac{125 \times 140}{253} = 69.17 \approx 69$ | 2.09 |
| 45 | $\frac{128 \times 113}{253} = 57.17 \approx 57$ | 2.53 |
| 68 | $\frac{125 \times 113}{253} = 55.83 \approx 56$ | 2.57 |

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 9.22$$
The degrees of freedom are $v = (2 - 1)(2 - 1) = 1$. From the table, $\chi^2_{0.05}(1) = 3.84$.
As $\chi^2 > \chi^2_{0.05}(1)$, $H_0$ is rejected. That is, there is some association between literacy and smoking.

Challenge 07: The following data show the result of an experiment to investigate the effect of vaccination of laboratory animals against a particular disease. Using the (a) 0.01, (b) 0.05 significance level, test the hypothesis that there is no difference between the vaccinated and unvaccinated groups, i.e., that vaccination and this disease are independent.

| Vaccination \ Disease | Got disease | Did not get disease |
|---|---|---|
| Vaccinated | 9 | 42 |
| Not vaccinated | 17 | 28 |

Solution: Let $H_0$: Vaccination and getting the disease are independent.

| Vaccination \ Disease | Got disease | Did not get disease | Row total |
|---|---|---|---|
| Vaccinated | 9 | 42 | 51 |
| Not vaccinated | 17 | 28 | 45 |
| Column total | 26 | 70 | 96 |

| $O$ | $E$ | $\frac{(O - E)^2}{E}$ |
|---|---|---|
| 9 | $\frac{26 \times 51}{96} = 13.81 \approx 14$ | 1.786 |
| 42 | $\frac{70 \times 51}{96} = 37.19 \approx 37$ | 0.676 |
| 17 | $\frac{26 \times 45}{96} = 12.19 \approx 12$ | 2.083 |
| 28 | $\frac{70 \times 45}{96} = 32.81 \approx 33$ | 0.758 |

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.30$$
The degrees of freedom are $v = (2 - 1)(2 - 1) = 1$. From the table, $\chi^2_{0.01}(1) = 6.63$ and $\chi^2_{0.05}(1) = 3.84$.
(a) As $\chi^2 < \chi^2_{0.01}(1)$, $H_0$ is accepted at the 1% level of significance.
(b) As $\chi^2 > \chi^2_{0.05}(1)$, $H_0$ is rejected at the 5% level of significance. That is, at the 5% level there is some association between vaccination and getting the disease.

Challenge 08: Two sample polls of votes for two candidates A and B for a public office are taken, one from among the residents of rural areas and the other from among the residents of urban areas. The result is given in the table. Examine whether the nature of the area is related to voting preference in this election.

| Area \ Votes for | A | B | Total |
|---|---|---|---|
| Rural | 620 | 380 | 1000 |
| Urban | 550 | 450 | 1000 |
| Total | 1170 | 830 | 2000 |

Solution: Let $H_0$: The nature of the area is independent of the voting preference in the election.
| $O$ | $E$ | $\frac{(O - E)^2}{E}$ |
|---|---|---|
| 620 | $\frac{1170 \times 1000}{2000} = 585$ | 2.0940 |
| 380 | $\frac{830 \times 1000}{2000} = 415$ | 2.9518 |
| 550 | $\frac{1170 \times 1000}{2000} = 585$ | 2.0940 |
| 450 | $\frac{830 \times 1000}{2000} = 415$ | 2.9518 |

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.0940 + 2.9518 + 2.0940 + 2.9518 = 10.0916$$
The degrees of freedom are $v = (2 - 1)(2 - 1) = 1$. From the table, $\chi^2_{0.05}(1) = 3.84$.
As $\chi^2 > \chi^2_{0.05}(1)$, $H_0$ is rejected. That is, the nature of the area is related to voting preference in this election.

More challenges:

1. A die was thrown 132 times and the following frequencies were observed.

| Outcome | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| Frequency | 15 | 20 | 25 | 15 | 29 | 28 | 132 |

Test the hypothesis that the die is unbiased.
2. The following mistakes per page were observed in a book.

| No. of mistakes | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| No. of pages | 21 | 90 | 19 | 5 | 2 |

Fit a Poisson distribution and test the goodness of fit.
3. A die is thrown 60 times with the following results.

| Face | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Frequency | 8 | 7 | 12 | 8 | 14 | 11 |

Test at the 5% level of significance if the die is honest.
4. In an experiment on pea-breeding, Mendel obtained the following frequencies of seeds: 315 round and yellow; 101 wrinkled and yellow; 108 round and green; 32 wrinkled and green. Total 556. Theory predicts that the frequencies should be in the proportion 9 : 3 : 3 : 1 respectively. Does the experimental result support the theory?
5. A bird watcher sitting in a park has spotted a number of birds belonging to 6 categories. The exact classification is given below:

| Category | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Frequency | 6 | 7 | 13 | 7 | 6 | 5 |

Test at the 5% level of significance whether or not the data are compatible with the assumption that this park is visited by birds belonging to these six categories in the proportion 1 : 1 : 2 : 3 : 1 : 1.
6. The following table gives the number of accounting clerks committing and not committing errors among trained and untrained clerks working in an organisation.

| | No. of clerks committing errors | No. of clerks not committing errors | Total |
|---|---|---|---|
| Trained | 70 | 530 | 600 |
| Untrained | 155 | 745 | 900 |
| Total | 225 | 1275 | 1500 |

7. The following table shows the performances of students in Mathematics and Physics. Test the hypothesis that the performance in Mathematics is independent of performance in Physics.

| Grades in Physics \ Grades in Mathematics | High | Medium | Low |
|---|---|---|---|
| High | 56 | 1 | … |
| Medium | 47 | 163 | … |
| Low | 14 | 42 | … |

8. A random sample of students was selected and asked their opinion about 'autonomous colleges'. The result is given below.

| Class | Favouring | Opposing | Total |
|---|---|---|---|
| F.Y. B.Tech | 120 | 80 | 200 |
| S.Y. B.Tech | 130 | 70 | 200 |
| T.Y. B.Tech | 70 | 30 | 100 |
| Fourth Y. B.Tech | 80 | 20 | 100 |
| Total | 400 | 200 | 600 |

Test the hypothesis at the 5% level of significance that opinions are independent of the class groupings.
Answer: $\chi^2 = 12.75$
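The test for independence of attributes used in the challenges above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original notes: the function name `chi_square_independence` is hypothetical, and it works with unrounded expected frequencies, so its result can differ slightly from hand computations that round each $E_{ij}$ to a whole number. The data are the rural and urban poll counts from Challenge 08.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an m x n table.

    Expected frequencies are E_ij = (row total * column total) / N,
    kept unrounded.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows: rural, urban. Columns: votes for A, votes for B.
votes = [[620, 380],
         [550, 450]]

statistic, df = chi_square_independence(votes)
critical = 3.84  # chi-square table value at alpha = 0.05, 1 d.f.

print(f"chi-square = {statistic:.4f}, df = {df}")
print("reject H0" if statistic > critical else "accept H0")
```

The same function accepts any rectangular table of counts, so it can also be used to check the larger tables in the exercises.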
