Review Question Stat
Introduction
1. What is statistics?
Statistics is the scientific application of mathematical principles to the collection,
analysis, and presentation of numerical data.
2. Define the terms population, sample, parameter and statistic. Explain the
relationships among them.
Population:- A population is the set of all the individuals that an investigator wishes to study and
draw conclusions about.
Sample:- A sample is a subset of the population that is actually being studied.
Parameter:- Measures used to describe the population are called parameters.
Statistic:- Measures computed from sample data are called statistics.
3. All of the items or individuals about which you want to draw a conclusion are known
as: ( )
a. a population
b. a parameter.
c. a statistic
d. a sample
4. A numerical measure that describes a characteristic of a sample is known as: ( )
a. a statistic.
b. a census.
c. a parameter.
d. the scientific method.
5. The general manager of a large corporation randomly selected 150 employees to
determine the adequacy of the company's medical plan. When the selected
employees were asked whether or not the medical plan was adequate, 54% responded
"yes, the medical plan is adequate". The value 54% is: ( )
a. a parameter.
b. a statistic.
c. a sample of employees.
d. a population of interest.
6. The branch of statistics that uses sample data to draw conclusions about an entire
population is known as: ( )
a. Inferential statistics.
b. Descriptive statistics.
c. Experimentation.
d. Primary sources.
7. Responses to the question, “How tall are you?” represent what type of variable? ( )
a. numerical continuous
b. numerical discrete
c. nominal
d. categorical
8. A student’s class designation, shown as Freshman, Sophomore, Junior, or Senior
represents which scale of measurement? ( )
a. Interval Scale
b. Ordinal Scale
c. Ratio Scale
d. Nominal Scale
9. An investigator into the life expectancy of IV drug abusers divides a sample of patients
into HIV-positive and HIV-negative groups. What type of data does this division
constitute? ( )
a. nominal
b. ordinal
c. interval
d. ratio
e. continuous
In order to determine whether a new gene therapy will benefit colon cancer patients, a
random sample of patients is given either the new gene therapy, conventional therapy, or
a placebo. The number of months of survival was measured to determine therapy success.
10. The independent variable was: ( )
a. the type of therapy
b. the number of months survival
c. gene therapy
d. colon cancer
11. The dependent variable was: ( )
a. the type of therapy
b. the number of months survival
c. gene therapy
d. colon cancer
12. Statistical methods are classified into two major categories: descriptive and
inferential. Describe the general purpose for the statistical methods in each
category.
13. A medical researcher is testing the effectiveness of a new drug for treating
Parkinson's disease. Ten subjects with the disease are given the new drug and 10 are
given a placebo. Improvement in symptomology is measured. What would be the
roles of descriptive and inferential statistics in the analysis of these data?
14. Classify the following variables according to whether they are nominal, ordinal,
discrete quantitative, or continuous quantitative.
(a) Systolic blood pressure, measured to the nearest mmHg, in a series of patients
admitted to hospital with myocardial infarction.
(b) Blood cholesterol level, measured to the nearest 0.1 mmol/l, in a series of men
attending a health promotion clinic.
(h) Number of sexual partners in the past month in a series of patients attending a
clinic for sexually transmitted diseases.
Descriptive statistics I
1. The median of the values 3.4, 4.7, 1.9, 7.6, and 6.5 is 4.7. ( )
2. The standardized normal distribution has a mean of 0 and a standard deviation of 1. ( )
3. In a symmetrical distribution, the three measures of central tendency (i.e., the mean,
median, and mode) are equal. ( )
4. If the mean of a numerical data set exceeds the median, the data are considered to be
left-skewed. ( )
5. The coefficient of variation (CV) measures the variability in a data set relative to the
size of the median. ( )
6. Theoretically speaking, the total area under any normal curve is always equal to 1.00. ( )
7. Which of the following is not a measure of central tendency? ( )
a. Mean
b. Median
c. Mode
d. Standard deviation
e. a., b., and c. are all measures of center.
8. Consider the following data: 1, 7, 3, 3, 6, 4. The mean and median for these data are: ( )
A. mean = 4, median = 3
B. mean = 4.8, median = 3.5
C. mean = 4, median = 3.3
D. mean = 4.8, median = 3
E. mean = 4, median = 3.5
9. Which measure of central tendency can be used for both numerical and categorical
variables? ( )
A. Mean.
B. Quartiles.
C. Mode.
D. Median.
10. Which of the following measures is resistant to outliers? ( )
A. mean
B. median
C. standard deviation
D. variance
E. range
11. The population mean is _____; the sample mean is _____. The population standard
deviation is _____; the sample standard deviation is _____. ( )
12. If the variance of a distribution is 9, the standard deviation is: ( )
A. 3
B. 6
C. 9
D. 81
E. Impossible to determine without knowing n.
13. We have five sets of data that have been plotted with a dot representing an
observation.
14. Suppose you and your classmates take a very difficult exam. Your teacher is
feeling kind and decides to add 10 points to everyone's exam score. How do the sample
mean and sample standard deviation change after the 10 points are added? ( )
a. The sample mean and sample standard deviation each increase by 10.
b. Only the sample mean increases by 10.
c. Only the sample standard deviation increases by 10.
d. Neither the sample mean nor the sample standard deviation increases by 10.
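The effect of adding a constant can be checked numerically. The sketch below uses a small set of hypothetical scores (invented here, not taken from the question) and Python's standard `statistics` module to compare the sample mean and sample standard deviation before and after 10 points are added.

```python
from statistics import mean, stdev

# Hypothetical exam scores, invented purely for illustration.
scores = [42, 55, 61, 48, 70]
curved = [s + 10 for s in scores]  # every score shifted up by 10 points

# Change in each summary statistic caused by the shift.
mean_shift = mean(curved) - mean(scores)
sd_shift = stdev(curved) - stdev(scores)

print(f"mean: {mean(scores):.1f} -> {mean(curved):.1f}")
print(f"sd:   {stdev(scores):.2f} -> {stdev(curved):.2f}")
```

Any other set of scores should show the same pattern, because a shift moves every observation and the mean by the same amount and leaves the deviations from the mean unchanged.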
15. The mean and standard deviation are good descriptions for ( )
A. Categorical variables.
B. Bimodal distributions.
C. Any quantitative variable.
D. Symmetric distributions with no outliers.
E. Skewed distributions.
16. A distribution of scores has a mean = 30, median = 20, and mode = 10. The
distribution: ( )
a. has a positive skew
b. has a negative skew
c. is normal
d. is bimodal
17. Which of the following is not a property of the normal distribution? ( )
A. Its range is from negative infinity to positive infinity.
B. It is bell-shaped.
C. It is slightly skewed left.
D. All central tendency measures (i.e., mean, median, mode) are of equal numerical
value
18. According to the empirical rule, if the data form a "bell-shaped" normal distribution,
approximately what percent of the observations will be contained within two
standard deviations around the mean? ( )
A. 99.7%
B. 68%
C. 75%
D. 95%
19. Among women aged 18 to 34 in a community, weight is normally distributed with a
mean of 52 kg and a standard deviation of 7.5 kg. What percentage of women will
have a weight over 59.5 kg? ( )
A. 2%
B. 5%
C. 10%
D. 16%
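A tail area such as the one in question 19 can be checked with the normal CDF. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Weight of women aged 18 to 34: normal with mean 52 kg and SD 7.5 kg.
weights = NormalDist(mu=52, sigma=7.5)

# 59.5 kg lies exactly one standard deviation above the mean.
z = (59.5 - 52) / 7.5

# Upper-tail probability P(weight > 59.5).
tail = 1 - weights.cdf(59.5)

print(f"z = {z:.2f}, P(weight > 59.5 kg) = {tail:.4f}")
```

The result agrees with the empirical rule: about 68% of values fall within one standard deviation of the mean, leaving roughly 16% in each tail.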
20. The area under the standardized normal curve from 0 to 1.96 would be: ( )
A. the same as the area from 0 to -1.96.
B. found by using Table E.2.
C. equal to 0.4750.
D. all of the above.
2) Approximately what percent of young women are taller than 61.3 inches?
Descriptive statistics II
(A) histogram (B) stem plot (C) scatterplot (D) boxplot (E) bar chart
Estimation
5. When we construct a confidence interval estimate for the mean and the population
standard deviation is known, we can use the Z distribution.
6. For any population, which of the following statements is TRUE about the sample mean
X̄ of a simple random sample as n, the sample size, gets large? ( )
A. The value of X̄ will approach 0 and the distribution of X̄ will approach N(0,1).
B. The value of X̄ will approach μ and the distribution of X̄ will be approximately normal.
C. The value of X̄ will become large and the distribution of X̄ will be approximately
normal.
D. The value of X̄ will approach μ and the distribution of X̄ will be the same as that of X.
7. If other factors are held constant, increasing the level of confidence from 95% to 99%
will cause the width of the confidence interval to: ( )
a. increase
b. decrease
c. not change
d. there is no consistent relation between interval width and level of confidence
8. Suppose we want to estimate the average weight of an adult male in Dekalb County,
Georgia. We draw a random sample of 1,000 men from a population of 1,000,000
men and weigh them. We find that the average man in our sample weighs 180
pounds, and the standard deviation of the sample is 30 pounds. What is the 95%
confidence interval? ( )
(A) 180 ± 1.86 (B) 180 ± 3.0 (C) 180 ± 5.88 (D) 180 ± 30
(E) None of the above.
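The margin of error in question 8 can be computed as follows. This sketch assumes a large-sample z interval and ignores the finite population correction, which is negligible here because the sample is only 0.1% of the population.

```python
from math import sqrt
from statistics import NormalDist

# Sample summary from question 8.
n, xbar, s = 1000, 180.0, 30.0

# Two-sided 95% critical value of the standard normal (about 1.96).
z = NormalDist().inv_cdf(0.975)

# Margin of error: z * s / sqrt(n).
margin = z * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"95% CI: {xbar:.0f} ± {margin:.2f}  ->  ({lower:.2f}, {upper:.2f})")
```

With n = 1000 the t and z critical values are nearly identical, so using z in place of t changes the margin only in the third decimal place.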
9. Suppose a simple random sample of 150 students is drawn from a population of 3000
college students. Among sampled students, the average IQ score is 115 with a
standard deviation of 10. What is the 95% confidence interval for the students' IQ
score?
10. Use the given degree of confidence and sample data to find the confidence interval
for the population mean:
Heights of women: 95% confidence; n = 50, x̄ = 63.4 in., s = 2.4 in.
t tests
3. power
4. A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis
about a population. ( )
5. The shape of t distribution is symmetrical about zero. ( )
6. As the sample size n gets larger, the t distribution gets closer to the standard normal
distribution. ( )
7. A statistically significant difference might not be clinically significant (it may be
unimportant). ( )
8. When using the P-value to make a decision, if the P-value > 0.05, we should reject the null
hypothesis. ( )
9. The greater the P-value, the greater the evidence against the null hypothesis. ( )
10. Type II error occurs when a researcher rejects a null hypothesis that is actually true. ( )
11. There are many different statistical tests. The choice of which test to use depends on
several factors, excluding ( )
A. The type of data
B. The distribution of the data
C. The type of study design
D. The parameter of the population.
12. The probability of making a Type II error in hypothesis testing is ( )
a) α
b) β
c) 1−α
d) 1−β
13. Which of the following statements about hypotheses is correct? ( )
a) In the null hypothesis, we state that there is no change, no difference, or no relationship
in the population.
b) In the null hypothesis, we state that there is a change, a difference, or a relationship for
the general population.
c) In the alternative hypothesis, we state that there is no change, no difference, or no
relationship in the population.
d) All of the above are wrong.
14. When we use a paired t-test, the alternative hypothesis can be stated as ( )
A. μ1 = μ2
B. μ1 ≠ μ2
C. μd = 0
D. μd ≠ 0
15. Two hundred obese patients are randomized to either a diet program or an exercise
program to lose weight. Weights of each patient are measured both before the
program starts and after 3 months. An appropriate statistical procedure to determine
whether or not the mean weight change (after-before) was statistically different for
the two programs is: ( )
A. Paired t-test
B. One way analysis of variance
C. Chi square test
D. 2 sample unpaired t-test
16. An independent measures experiment uses two samples with n = 8 in each group to
compare two experimental treatments. The t-statistic from this experiment will have
degrees of freedom equal to ( )
a. 7
b. 14
c. 15
d. 16
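17. When we analyze numerical data using a paired t-test, which formula is the right one? ( )
A. $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
B. $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2/n_1 + s_p^2/n_2}}$
C. $t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$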
D. $t = \dfrac{\bar{x}_1 - \bar{x}_2}{(n_1 - 1) + (n_2 - 1) + 1}$
18. A researcher is studying whether diet pills really work. The researcher gets two
groups of people. The first group of 20 people is given the diet pill to help suppress
their appetite. The second group of 15 people is given a placebo. Both groups are
then instructed to try to lose weight. The researcher hypothesizes that the people who
were given the diet pill will lose more weight. The diet pill group lost a mean of 4.78
pounds (with a standard deviation of 3.26) during the one month experiment. The
members of the placebo group, on the other hand, lost a mean of 3.61 pounds (with a
standard deviation of 3.47). Which type of hypothesis test should be conducted in
order to assess whether people using the diet pills lost more weight?
Conduct the appropriate hypothesis test.
STEP 1: State your hypotheses (include both H0 and H1). Set α = .05, two-tailed.
STEP 2: Find the critical value.
19. Captopril is a drug designed to lower systolic blood pressure. When subjects were
tested with this drug, their systolic blood pressure readings (in mm of mercury) were
measured before and after the drug was taken, with the results given in the
accompanying table. Use a 0.05 significance level to test the claim that captopril is
effective in lowering systolic blood pressure.
Subject   1    2    3    4    5    6    7    8    9    10   11   12
Before    200  174  198  170  179  182  193  209  185  155  169  210
After     191  170  177  167  159  151  176  183  159  145  146  177
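One way to carry out the test in question 19 is a one-tailed paired t-test on the before-minus-after differences. The sketch below reads each column of the table as one three-digit reading (for example, subject 1: before 200, after 191) and uses only the standard library. The critical value 1.796 is taken from a t table (one-tailed, α = 0.05, df = 11) and is hard-coded as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Systolic blood pressure (mm Hg) for the 12 subjects.
before = [200, 174, 198, 170, 179, 182, 193, 209, 185, 155, 169, 210]
after = [191, 170, 177, 167, 159, 151, 176, 183, 159, 145, 146, 177]

# Paired differences; positive values mean pressure dropped after the drug.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

d_bar = mean(diffs)   # mean difference
s_d = stdev(diffs)    # sample SD of the differences

# Test statistic for H0: mu_d = 0 against H1: mu_d > 0.
t_stat = d_bar / (s_d / sqrt(n))

T_CRIT = 1.796        # from a t table: one-tailed, alpha = 0.05, df = 11
reject_h0 = t_stat > T_CRIT

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The test is one-tailed because the claim is directional (the drug lowers pressure); a two-tailed test would use a larger critical value.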
20. Patients recovering from an appendix operation normally spend an average of 6.3
days in the hospital. The distribution of recovery times is normal with a σ = 1.2 days. The
hospital is trying a new recovery program that is designed to lessen the time patients
spend in the hospital. The first 10 appendix patients in this new program were released
from the hospital in an average of 5.5 days. On the basis of these data, can the hospital
conclude that the new program significantly reduces recovery time? Test at the .05
level of significance with a one-tailed test.
STEP 1: State your hypotheses (include both H0 and H1).
STEP 2: Set up the criteria for making a decision. That is, find the critical value.
STEP 3: Summarize the data into the appropriate test-statistic.
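Steps 2 and 3 of question 20 can be sketched as a one-sample z test, which applies because the population standard deviation σ is given. The variable names are illustrative; the numbers come from the question.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in question 20.
mu0, sigma = 6.3, 1.2   # usual mean stay and known population SD (days)
n, xbar = 10, 5.5       # sample size and sample mean under the new program

# STEP 2: lower-tail critical value for a one-tailed test at alpha = 0.05.
z_crit = NormalDist().inv_cdf(0.05)

# STEP 3: test statistic z = (xbar - mu0) / (sigma / sqrt(n)).
z_stat = (xbar - mu0) / (sigma / sqrt(n))

# One-tailed P-value and decision for H1: mu < 6.3.
p_value = NormalDist().cdf(z_stat)
reject_h0 = z_stat < z_crit

print(f"z = {z_stat:.3f}, critical z = {z_crit:.3f}, P = {p_value:.4f}, reject H0: {reject_h0}")
```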
21. An experiment was planned to compare the mean time (in days) to recover from a
common cold for persons given a daily dose of 4 milligrams (mg) of vitamin C versus
those who were not given a vitamin supplement. Suppose that 35 adults were
randomly selected for each treatment category and that the mean recovery time and
standard deviations for the two groups were as follows:
Suppose your research objective is to determine whether the use of vitamin C changes the
mean time required to recover from a common cold and its complications.
Give the null and alternative hypotheses for the test.
State your test statistic and conduct the statistical test of the null hypothesis and state
your conclusions (α=.05).
ANOVA
1. What is analysis of variance (F test)? What is the difference between t test and
ANOVA?
7. A study is performed to compare the mean level of vitamin A that is measured in the
sera among children from four different countries. If you wanted to test the
hypothesis that there are no differences in mean levels among the four countries, the
statistical procedure you might use is: ( )
A. Paired t-test
B. One way analysis of variance
C. Chi square test
D. Unpaired t-test
8. In order to calculate the F test statistic for a one-way ANOVA experiment, one would
perform which of the following operations? ( )
A. SS_between / SS_within
B. MS_between / MS_within
C. MS_within / MS_between
D. SS_within / SS_between
9. If there is no treatment effect, the F ratio is near ( )
a. zero b. ten c. infinity d. one
10. In a one-way ANOVA, if the computed F statistic exceeds the critical F value we
may: ( )
a. retain (fail to reject) the null hypothesis because a mistake has been made.
b. retain (fail to reject) the null hypothesis since there is no evidence of a difference.
c. reject the null hypothesis since there is evidence of a treatment effect.
d. reject the null hypothesis since there is evidence all the means differ.
11. The accompanying table is a summary table for an analysis of variance. Some of the
information has been left out. Fill in the missing material and then continue with the
following questions.
Source            SS         df         MS         F
Between-groups    ________   3          ________   6.25
Within-groups     ________   16         2.114
Total             ________   ________
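The blanks in a one-way ANOVA summary table follow from two identities: MS = SS / df and F = MS_between / MS_within. The sketch below applies them to the values given in question 11; the helper name `complete_anova_table` is hypothetical.

```python
def complete_anova_table(f_ratio, ms_within, df_between, df_within):
    """Recover the missing one-way ANOVA entries from F, MS_within, and the dfs."""
    ms_between = f_ratio * ms_within        # since F = MS_between / MS_within
    ss_between = ms_between * df_between    # since MS = SS / df
    ss_within = ms_within * df_within
    return {
        "ss_between": ss_between,
        "ms_between": ms_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,  # sums of squares are additive
        "df_total": df_between + df_within,  # degrees of freedom are additive
    }

# Known entries from the table in question 11.
table = complete_anova_table(f_ratio=6.25, ms_within=2.114, df_between=3, df_within=16)

for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The same identities can be run in the other direction when different cells are given, for example obtaining SS_within as SS_total minus SS_between.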
12. In a study on the effects of stress on illness, a researcher tallied the number of colds
people contracted during a six-month period as a function of the amount of stress
they reported during the same time period. There were three stress levels: minimal,
moderate, and high stress, with four participants in each group. An ANOVA
summary table is below. Fill in the missing data.
Summary table
Source            SS       df       MS       F
Between groups    22.17    ____     ____     ____
Within groups     ____     ____     1.64     -
Total             36.92    ____     -        -
13. A study was conducted to test the question as to whether cigarette smoking is
associated with reduced serum-testosterone levels in men aged 35 to 45. The study
involved the following four groups:
a. Nonsmokers who had never smoked
b. Former smokers who had quit for at least six months prior to the study
c. Light smokers, defined as those who smoked 10 or fewer cigarettes per day
d. Heavy smokers, defined as those who smoked 30 or more cigarettes per day
Each group consisted of 10 men and Table 1 shows raw data, where serum-testosterone
levels were measured in mg/dL
Conduct the appropriate hypothesis test.
Step 1: State the hypotheses (include both H0 and H1), and select an alpha level.
Step 4: Make a decision about H0 (reject or fail to reject?), and state a conclusion.
Chi-square tests
3. Observed categories must have whole numbers, but expected categories can have
decimals.
4. As the degrees of freedom increase (and especially when the degrees of freedom are
more than 90), the graph of the chi-square distribution looks more and more ( ).
A. symmetrical
B. skewed right
C. skewed left
D. asymmetrical
5. The null hypothesis of a chi-square test for a 2×2 table may be ( )
A. π1= π2
B. π1≠π2
C. p1= p2
D. p1≠p2
E. μ1=μ2
6. The test statistic for categorical data is ( )
A. χ2
B. p
C. Sp
D. π
E. μ
7. Under which conditions should the corrected χ2 test be used? ( )
A. N≥40 and Ti≥5
B. N≥40 or Ti≥5
C. N<40 and 1≤Ti<5
D. N<40 and Ti<5
E. N≥40 and 1≤Ti<5
8. Which treatment is better? A randomized controlled trial was designed to compare the
effectiveness of splinting against surgery in the treatment of carpal tunnel syndrome.
Results are given in the table below (based on data from “Splinting vs. Surgery in the
Treatment of Carpal Tunnel Syndrome,” Journal of the American Medical
Association). Using a 0.05 significance level, test the claim that success is
independent of the type of treatment. What do the results suggest about treating
carpal tunnel syndrome?
Successful Unsuccessful
Splint treatment 60 23
Surgery treatment 67 6
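The test of independence in question 8 can be sketched with the Pearson chi-square statistic, computed here without a continuity correction (an assumption; every expected count exceeds 5 and N = 156). For 1 degree of freedom the upper-tail P-value equals erfc(sqrt(χ2 / 2)), which avoids any third-party library. The function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic and P-value for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Rows: splint, surgery. Columns: successful, unsuccessful.
observed = [[60, 23], [67, 6]]
chi2, p_value = chi_square_2x2(observed)

print(f"chi-square = {chi2:.3f}, df = 1, P = {p_value:.4f}")
```

The statistic can be compared with the critical value 3.84 (α = 0.05, 1 df), or the P-value can be compared with 0.05 directly.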
9. Suppose we want to test whether two drugs have the same effective rate in treating
acute lower respiratory infection (Table 1). Give the detailed procedure of the
hypothesis test and state the final decision (critical value of χ2 at α = 0.05 with 1
degree of freedom: 3.84).
Table 1 Comparison of the effective rate between drug A and drug B
Treatment    Effect    Non-effect
Drug A       68        6
Drug B       52        11
10. Psychological and social factors can influence the survival of patients with serious
diseases. One study examined the relationship between survival of patients with
coronary heart disease and pet ownership. Each of 92 patients was classified as
having a pet or not and by whether they survived for one year. The researchers
suspect that having a pet might be connected to the patient status. Here are the data:
                  Pet Ownership
Patient Status    No     Yes
Alive             28     50
Dead              11     3
Total             39     53
State the hypotheses for a χ2 test of this problem, find the χ2 test statistic, its degrees of
freedom, and the P-value. State your conclusion in terms of the original problem.
11. In the following example, a researcher attempts to determine if a drug has an effect on
a particular disease. Counts of individuals are given in the table, with the diagnosis
(disease: present or absent) before treatment given in the rows, and the diagnosis
after treatment in the columns. The test requires the same subjects to be included in
the before-and-after measurements (matched pairs).
                   After: present    After: absent    Row total
Before: present    101               121              222
Before: absent     59                33               92
Column total       160               154              314
b. Find the χ2 test statistic, its degrees of freedom, and the P-value.
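For matched-pairs counts like those in question 11, a common choice is McNemar's test, which uses only the two discordant cells (here 121 and 59). The question does not name the test, so treating it as McNemar's is an assumption. The sketch computes the statistic with and without the continuity correction, both with 1 degree of freedom.

```python
from math import erfc, sqrt

# Discordant pairs from the table.
b = 121  # disease present before, absent after
c = 59   # disease absent before, present after

# McNemar statistic without continuity correction.
chi2 = (b - c) ** 2 / (b + c)

# McNemar statistic with continuity correction.
chi2_cc = (abs(b - c) - 1) ** 2 / (b + c)

# Upper-tail P-value for a chi-square variable with 1 df.
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f} (corrected: {chi2_cc:.2f}), df = 1, P = {p_value:.2e}")
```

The concordant cells (101 and 33) do not enter the statistic, because subjects whose diagnosis did not change carry no information about a before-versus-after difference.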
Nonparametric tests
4. Nonparametric tests do not require that samples come from populations with normal
distributions or any other particular distributions. Consequently, nonparametric tests
of hypotheses are often called distribution-free tests.
5. The Kruskal-Wallis test, the nonparametric equivalent of the one-way analysis of
variance, compares ordinal or skewed data with more than two independent groups.
6. Nonparametric tests are also referred to as ____ free tests.
a. distribution
b. measurement
c. definition
d. parameter
7. The requirements of a parametric test are ( )
a. Normal distribution
b. Equal variances
c. Independent observation
d. All of the above
e. None of the above
8. In a rank sum test for a completely randomized design (number of groups = 2), when the
null hypothesis is rejected, the final decision may be ( )
a. There is no difference between the distributions of the two groups
b. The difference between the distributions of the two groups is very big
c. The difference between the distributions of the two groups is statistically
significant
d. The difference between the means of the two populations is statistically significant
e. The difference between the variances of two populations is statistically significant.
9. When data come from a paired design and do not follow a normal distribution, the
suitable method is: ( )
A. Wilcoxon signed-rank test
B. Wilcoxon rank-sum test
C. t test
D. H test
E. M test
21. The following table and figure give the relation between pairs of data values (x, y).
Now it has been calculated that r = -0.956 and P = 0.003. Please describe the
characteristics of the relationship between x and y.