A1 QuestionFinalExam
DURATION : 3 HOURS
INSTRUCTIONS TO CANDIDATES:
1. This examination paper consists of SEVEN (7) questions. Answer ALL questions.
2. All answers to a new question should start on a new page.
3. All the calculations and assumptions must be clearly stated.
4. All calculations must be given to FOUR (4) decimal places and use appropriate statistical
notation.
EXAMINATION REQUIREMENTS:
1. Statistical Tables & Formulae 2.0
APPENDIX:
1. None
This examination paper consists of EIGHT (8) printed pages including the front page.
CONFIDENTIAL 2122II/BUM2413
An experiment was conducted to compare two methods (namely slide method and digital
method) used by medical professionals in obtaining medical images for diagnosis purposes. A
group of medical professionals were required to retrieve an image from a library of slides (slide
method) and from a computer database (digital method). The time (in seconds) taken by each
medical professional to retrieve an image using each method was recorded. A statistical
analysis of the recorded paired data at the 4% significance level was conducted, and the
Microsoft Excel output is given in Figure 1.
Figure 1
i) Are the samples taken from two different populations?
[1 Mark]
ii) Is there enough evidence to suggest that retrieving an image using the digital method is
faster than using the slide method by more than 30 seconds? Use the P-value approach.
[5 Marks]
iii) Conduct an appropriate analysis to test whether the time taken by the medical
professionals to retrieve an image using the digital method has a standard deviation of
3 seconds. Use a 10% level of significance.
[8 Marks]
QUESTION 2 [6 MARKS]
A production manager of a computer supplier company claims that circuit board Model C20-
S produced by the company has a failure rate of 30% on a thermal cycling test. The thermal
cycling test is known to cause failures in boards with weak circuit connections. A 97%
confidence interval for the true percentage of circuit board failures is given as
(0.1814, 0.2986).
i) Calculate the estimation error made in estimating the true percentage for the failure of
circuit boards using the given confidence interval.
[2 Marks]
ii) Use the confidence interval approach to test the production manager’s claim.
[4 Marks]
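The arithmetic behind both parts of Question 2 can be sketched as follows. This is a minimal illustration, assuming the quoted interval is symmetric about the sample proportion (as in the usual large-sample interval for a proportion). The variable names are illustrative only.

```python
# Endpoints of the 97% confidence interval quoted in Question 2.
lower, upper = 0.1814, 0.2986

# Assumption: the interval is symmetric, so the point estimate is the
# midpoint and the estimation error (margin of error) is the half-width.
point_estimate = (lower + upper) / 2
margin_of_error = (upper - lower) / 2

# Confidence interval approach: check whether the claimed rate lies inside.
claimed_rate = 0.30
claim_inside_interval = lower <= claimed_rate <= upper

print(f"point estimate  = {point_estimate:.4f}")
print(f"margin of error = {margin_of_error:.4f}")
print(f"claimed rate inside interval: {claim_inside_interval}")
```

Whether the claimed value falls inside or outside the interval is what drives the decision in part ii).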
The effectiveness of the pesticides produced by the selected companies has been compared
by the type of machine used during production (machine A, B and C) and the total amount
of the chemical added to the solutions (50ml, 100ml and 150ml). The life span (in minutes) of
the insects was recorded for each machine type and solution amount used. The ANOVA
result at the 2% significance level is summarised in Figure 2.
ANOVA
Source of Variation   SS         df   MS        F         P-value    F crit
Sample                12.6667    2    6.3333    2.3425    0.124685   4.9001
Columns               62.8889    2    31.4444   11.6301   0.000572   4.9001
Interaction           20.4444    4    5.1111    1.8904    0.155925   R
Within                P          18   Q
Total                 144.6667   26
Figure 2
Based on Figure 2,
i) identify the number of replicates and treatments. State all the treatments involved in
this study.
[4 Marks]
iii) is there any interaction effect between the types of machines used and the total amount
of chemical added in the solutions?
[5 Marks]
iv) based on your answer in iii), do we need to test for the marginal effect? Give a reason.
[2 Marks]
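The unknown Within-row entries P and Q in Figure 2 are tied to the other entries by the usual two-way ANOVA identities (the sums of squares add up to the total, and each mean square is SS divided by df). The sketch below applies those identities to the printed values. The variable names are illustrative. The critical value R has to come from an F table and is not computed here.

```python
# Printed entries from Figure 2.
ss_total = 144.6667
ss_sample, ss_columns, ss_interaction = 12.6667, 62.8889, 20.4444
ms_interaction = 5.1111
df_within = 18

# P: the within (error) sum of squares is what remains of the total.
ss_within = ss_total - (ss_sample + ss_columns + ss_interaction)

# Q: mean square = sum of squares / degrees of freedom.
ms_within = ss_within / df_within

# Consistency check: the interaction F ratio should reproduce the table.
f_interaction = ms_interaction / ms_within

print(f"P (SS within) = {ss_within:.4f}")
print(f"Q (MS within) = {ms_within:.4f}")
print(f"F interaction = {f_interaction:.4f}")
```

Recomputing the interaction F ratio from Q is a quick way to confirm that the reconstructed entries agree with the rest of the table.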
A researcher is interested in studying the relevance of lockdown in curbing the spread of
COVID-19 in Malaysia by looking at the relationship between the number of MySejahtera
check-ins and the number of new cases reported. The data from the Ministry of Health
Malaysia for seven randomly selected days are given in Table 1.
Table 1
Day   Number of MySejahtera check-ins (in millions)   Number of new cases
1     19.8                                            2461
2     19.9                                            1870
3     14.6                                            3731
4     17.5                                            4521
5     14.5                                            5725
6     18.9                                            1229
7     15.1                                            2464
ii) Use the following partial information to calculate the correlation coefficient. Then,
interpret its value.
$\sum x = 120.3$, $\sum y = 22001$, $\sum x^2 = 2102.93$, $\sum y^2 = 84{,}270{,}585$,
$S_{xx} = 35.4886$, $S_{yy} = 15{,}121{,}442$
[6 Marks]
iii) Estimate the regression model parameters and write the estimated linear regression
model.
[6 Marks]
vi) Based on your answer in v), would you recommend lockdown if COVID-19 cases rise
again? Justify your answer.
[2 Marks]
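The correlation coefficient asked for in part ii) can be cross-checked directly from the Table 1 data. The sketch below uses the standard computational formulas for the corrected sums of squares and cross-products. It treats check-ins as x and new cases as y, which is an assumption about the intended roles of the two variables, and the variable names are illustrative.

```python
import math

# Table 1: check-ins (in millions) and new cases for the seven sampled days.
check_ins = [19.8, 19.9, 14.6, 17.5, 14.5, 18.9, 15.1]
new_cases = [2461, 1870, 3731, 4521, 5725, 1229, 2464]
n = len(check_ins)

sum_x = sum(check_ins)
sum_y = sum(new_cases)

# Corrected sums of squares and cross-products.
s_xx = sum(x * x for x in check_ins) - sum_x ** 2 / n
s_yy = sum(y * y for y in new_cases) - sum_y ** 2 / n
s_xy = sum(x * y for x, y in zip(check_ins, new_cases)) - sum_x * sum_y / n

# Pearson correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"S_xx = {s_xx:.4f}, S_yy = {s_yy:.4f}, S_xy = {s_xy:.4f}")
print(f"r = {r:.4f}")
```

The computed S_xx and S_yy can be compared with the partial information given in the question. The sign of r indicates the direction of the linear relationship.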
An analyst from the service provider company, Uper, is interested in studying the distance
(in km) covered by an Uper driver based on the driver's age and number of years of driving
experience. A statistical analysis at the 4% level of significance is conducted using
Microsoft Excel and the output is shown in Figure 3.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.6924
R Square            0.4794
Adjusted R Square   0.5492
Standard Error      4517.3675
Observations        11

ANOVA
             df   SS            MS         F        Significance F
Regression   2    150316757.6   75158379   3.6830   0.0335
Residual     8    163252870     20406609
Total        10   313569627.6

             Coefficients   Standard Error   t Stat    P-value    Lower 96.0%   Upper 96.0%
Intercept    37985.042      4348.1380        8.7359    2.31E-05   27336.5169    48633.5662
age          -5.2979        205.6174         -0.0258   0.9801     -508.8520     498.2561
experience   -1669.4665     961.7823         -1.7358   0.02081    -4024.8569    685.9239
Figure 3
iii) Use the P-value approach to test whether there is a significant relationship between
the distance covered and the driver’s age and number of years of driving experience.
[5 Marks]
iv) Based on Figure 3, are both independent variables significant enough to be considered as
recruitment criteria for Uper drivers?
[3 Marks]
v) State the predicted linear regression model. Then, use the model to predict the distance
covered by a 33-year-old driver with 5 years of driving experience.
[3 Marks]
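The prediction in part v) amounts to substituting the driver's values into the fitted equation built from the Coefficients column of Figure 3. The sketch below does this under the assumption that both predictors are kept in the model. The function name `predict_distance` is hypothetical.

```python
# Coefficients as printed in Figure 3.
INTERCEPT = 37985.042
B_AGE = -5.2979
B_EXPERIENCE = -1669.4665


def predict_distance(age: float, experience: float) -> float:
    """Predicted distance (km) from the fitted two-predictor linear model."""
    return INTERCEPT + B_AGE * age + B_EXPERIENCE * experience


# A 33-year-old driver with 5 years of driving experience.
predicted = predict_distance(age=33, experience=5)
print(f"predicted distance = {predicted:.4f} km")
```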
Let X be the number of defective bottles in a carton of 24 bottles from a water bottle
manufacturing operation. A total of 40 cartons are inspected and the number of defective
bottles in each carton is recorded in Table 2.
Table 2
Number of defects   0        1        2        3
Frequency           3        14       M        7
Probability         0.0080   0.0960   0.3840   0.5120
ii) Can we conclude that the number of defective bottles recorded fits a Binomial
distribution with n = 3 and p = 0.2 at the 1% significance level?
[10 Marks]
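A goodness-of-fit test of this kind starts from the hypothesised probabilities and the expected frequencies they imply. The sketch below computes the Binomial(n = 3, p = 0.2) probabilities and the expected counts for the 40 inspected cartons. It stops short of the test statistic because the observed frequency M is not given in Table 2. The variable names are illustrative.

```python
from math import comb

n_trials, p = 3, 0.2   # hypothesised binomial parameters from part ii)
cartons = 40           # number of cartons inspected

# Binomial probabilities P(X = k) for k = 0, 1, 2, 3.
pmf = [comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
       for k in range(n_trials + 1)]

# Expected frequencies under the hypothesised distribution.
expected = [cartons * prob for prob in pmf]

for k, (prob, exp_count) in enumerate(zip(pmf, expected)):
    print(f"k = {k}: P(X = k) = {prob:.4f}, expected = {exp_count:.4f}")
```

The computed probabilities can be compared with the Probability row of Table 2. A common rule of thumb is to merge categories whose expected frequency is below 5 before computing the chi-square statistic.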
A study is conducted to determine the relationship between the results (pass or fail) in a
mathematics course among UMP students and their education background (STPM,
Matriculation and Diploma program). A random sample of 75 UMP students was observed,
in which the ratio of passed to failed students is 2:1. Table 3 shows the distribution of
passed and failed students from the three different education backgrounds.
Table 3
Education background   Results of mathematics course at UMP
                       Pass   Fail
STPM                   X      5
Matriculation          20     Y
Diploma                12     10
ii) Would you conclude that the results of UMP students in the mathematics course are
dependent on their education background at $\alpha = 0.025$?
[10 Marks]
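The chi-square test of independence compares the observed counts with the counts expected if results and education background were unrelated. The sketch below assumes that the 2:1 ratio applies to the 75 sampled students, so that X and Y follow from the column totals. It then computes the expected counts and the test statistic. The variable names are illustrative, and the critical value still has to be read from a chi-square table.

```python
total = 75
passed = total * 2 // 3            # assumption: 2:1 pass-to-fail ratio
failed = total - passed

# Unknown cells recovered from the column totals (an assumption, see above).
x = passed - 20 - 12               # STPM, pass
y = failed - 5 - 10                # Matriculation, fail

# Rows: STPM, Matriculation, Diploma. Columns: pass, fail.
observed = [[x, 5], [20, y], [12, 10]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected count = (row total * column total) / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"X = {x}, Y = {y}")
print(f"chi-square = {chi_square:.4f} with {dof} degrees of freedom")
```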