
Hypothesis Tests

Posted on 24 December 2023

ENGI 3423 Single Sample Hypothesis Tests Page 11-01

Classical hypothesis tests are close relatives of the classical confidence intervals.

Some general statements will be introduced after the first example.

Example 11.01

The lifetime X of a particular brand of filaments is known to be normally distributed. A

random sample of six filaments is tested to destruction. Those six filaments are found to

last for an average of 1,007 hours with a sample standard deviation of 6.2 hours.

Is there sufficient evidence to conclude, at a level of significance of 5%, that the true

mean lifetime of this brand of filaments is not 1,000 hours?

Repeat this question with a level of significance of 1%.

Test the null hypothesis Ho : $\mu = 1000$ against the alternative hypothesis Ha : $\mu \neq 1000$.

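Distribution: $X \sim N(\mu, \sigma^2)$

Data: $n = 6,\ \bar{x} = 1007,\ s = 6.2$

If Ho is true, then $\bar{X} \sim N\left(1000, \frac{\sigma^2}{6}\right)$, so that $Z = \frac{\bar{X} - 1000}{\sigma/\sqrt{6}} \sim N(0, 1)$.

But σ is not known, so use

$T = \frac{\bar{X} - 1000}{S/\sqrt{6}} \sim t_5$

$t_{.025,\,5} \approx 2.57058$

[Note: “S” is upper case because it is a random quantity.

$\nu = n - 1 = 5$ is the number of degrees of freedom for the t distribution.]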


Method 1

Reject Ho in favour of Ha iff $\bar{x} > c_U$ or $\bar{x} < c_L$, where

$P[\bar{X} > c_U \text{ or } \bar{X} < c_L \mid \text{Ho true}] = 5\%$

$c_L, c_U = 1000 \pm 2.57\ldots \times \frac{6.2}{\sqrt{6}} = 1000 \pm 6.51$, so that $(c_L, c_U) = (993.5, 1006.5)$ to 1 d.p.

$\bar{x} = 1007 > c_U$

Therefore REJECT Ho at a level of significance of α = .05 .

[This result is equivalent to the classical two-sided confidence interval of example 10.04.]

Method 2

$\bar{x} = 1007$

$t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}} = \frac{1007 - 1000}{6.2/\sqrt{6}} = \frac{7}{2.531\ldots} = 2.77$ (2 d.p.)

Reject Ho in favour of Ha iff $|t_{obs}| > t_{.025,\,5}$

$2.77 > 2.57$. Therefore REJECT Ho at a level of significance of α = .05 .

α = 1%, Method 1

$c_L, c_U = \mu_o \pm t_{.005,\,5} \frac{s}{\sqrt{n}} = 1000 \pm 4.03 \times \frac{6.2}{\sqrt{6}} = 1000 \pm 10.20$

$(c_L, c_U) = (989.8, 1010.2)$ but $\bar{x} = 1007$

$c_L < \bar{x} < c_U$. Do NOT reject Ho .

α = 1%, Method 2

$\bar{x} = 1007 \Rightarrow t_{obs} = 2.77$

$t_{.005,\,5} = 4.03 \Rightarrow |t_{obs}| < t_{.005,\,5}$

Therefore do NOT reject Ho .
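The arithmetic of Example 11.01 can be checked with a short script. This is only a sketch: the critical values are the table values quoted above, and the variable names are illustrative choices, not part of the original notes.

```python
import math

# Data from Example 11.01
n, xbar, s, mu0 = 6, 1007.0, 6.2, 1000.0

# Two-tailed critical values t_{alpha/2, 5}, as quoted in the notes
t_crit = {0.05: 2.57058, 0.01: 4.03}

se = s / math.sqrt(n)        # estimated standard error of the sample mean
t_obs = (xbar - mu0) / se    # Method 2: observed t statistic

decisions = {}
for alpha, t_c in t_crit.items():
    # Method 1: boundaries of the rejection region in x-bar space
    c_lower, c_upper = mu0 - t_c * se, mu0 + t_c * se
    reject_x = xbar < c_lower or xbar > c_upper
    # Method 2: compare |t_obs| with the critical value
    reject_t = abs(t_obs) > t_c
    assert reject_x == reject_t   # the two methods must agree
    decisions[alpha] = reject_t
    print(f"alpha={alpha}: c=({c_lower:.1f}, {c_upper:.1f}), "
          f"t_obs={t_obs:.2f}, reject Ho: {reject_t}")
```

Both methods are the same comparison expressed in different units, which is why the script can assert that they agree.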

Interpretation:

If Ho is true, then the p-value (the probability that $\bar{X}$ is further away from μ = 1000 than $\bar{x} = 1007$) is between 1% and 5%. The level of significance α is an upper bound to the probability of committing a type I error: P[reject Ho | Ho true] ≤ α .

Decision Tree: [from page 9.19]

P[type I error] = P[reject Ho | $\mu = \mu_o$ (Ho true)] ≤ α

P[type II error] = P[accept Ho | $\mu = \mu_1$ (Ho false)] = $\beta(\mu_1)$

$1 - \beta$ = power of the test.


General method for two-tailed tests:

State hypotheses: Ho : $\mu = \mu_o$ vs. Ha : $\mu \neq \mu_o$

The burden of proof is on Ha.

Choose the level of significance α .

State your assumptions (for example, the random quantity X is nearly normal).

Find $\bar{x}$ (the test statistic).

If σ is unknown, then estimate it using s .

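Case 1: σ is unknown and n is small

Working in $\bar{x}$ space:

Find $c_L, c_U = \mu_o \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

Iff $\bar{x} < \mu_o - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$ or $\bar{x} > \mu_o + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$ then reject Ho in favour of Ha.

Working in t space:

Find $t_{\alpha/2,\,n-1}$ and $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$

Iff $|t_{obs}| > t_{\alpha/2,\,n-1}$ then reject Ho in favour of Ha.

Case 2: n is large (> 30) is the same as Case 1 except that $t_{\alpha/2,\,n-1}$ is replaced by $t_{\alpha/2,\,\infty} = z_{\alpha/2}$ .

Common values: $z_{.025}$ = 1.95996 , $z_{.005}$ = 2.57583 .

Case 3: σ is known is the same as Case 2 except that s is replaced by σ .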
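The general two-tailed procedure can be sketched as one small function. The function names and the sample figures in the usage lines (a sample mean of 50.8 against a hypothesised mean of 50, with s = 4 and n = 100) are made up purely for illustration. The critical value is passed in, so the same code serves Case 1 (a t table value) and Cases 2 and 3 (a z value).

```python
import math
from statistics import NormalDist

def two_tailed_test(xbar, mu0, s, n, crit):
    """Test Ho: mu = mu0 against Ha: mu != mu0.

    crit is t_{alpha/2, n-1} from tables (Case 1) or z_{alpha/2} (Cases 2, 3).
    Returns the observed statistic, the (c_L, c_U) boundaries and the decision.
    """
    se = s / math.sqrt(n)
    t_obs = (xbar - mu0) / se
    bounds = (mu0 - crit * se, mu0 + crit * se)
    return t_obs, bounds, abs(t_obs) > crit

def z_crit(alpha):
    """Two-tailed critical value z_{alpha/2} of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical large-sample data (Case 2), for illustration only
t_obs, bounds_5, reject_5 = two_tailed_test(50.8, 50.0, 4.0, 100, z_crit(0.05))
_, bounds_1, reject_1 = two_tailed_test(50.8, 50.0, 4.0, 100, z_crit(0.01))
print(f"t_obs = {t_obs:.2f}, reject at 5%: {reject_5}, reject at 1%: {reject_1}")
```

With these made-up figures the observed statistic is about 2.0, so Ho is rejected at the 5% level but not at the 1% level.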


Example 11.02

A manufacturer claims that replacement machinery fills paper bags with exactly one

kilogramme of sugar each, on average. A random sample of 400 bags of sugar is

weighed, producing a sample mean mass of 996.5 grammes and a sample standard

deviation of 25.1 grammes. At a level of significance of .01, is there sufficient evidence

to doubt the manufacturer’s claim?

n400,x996.5,s25.1

Test Ho :

 = 1000 vs. Ha :

  1000 at

 = .01

[Reason for selecting a two-sided alternative hypothesis rather than one-sided:

Before we have any data to examine, if the manufacturer’s claim is false, then we

have no pre-conceptions as to whether the true value of μ is greater than or less

than 1000. We are seeking only evidence that μ is different from 1000. We are

not seeking evidence, a priori, for a decrease.]

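Method 1

$c_L, c_U = \mu_o \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 1000 \pm t_{.005,\,399} \times \frac{25.1}{\sqrt{400}} = 1000 \pm 2.60 \times 1.255 = 1000 \pm 3.263$

$(c_L, c_U) = (996.7, 1003.3)$ (1 d.p.)

$\bar{x} = 996.5 < c_L$

Therefore reject Ho . YES, $\mu \neq 1000$.

Method 2

$t_{.005,\,399} \approx t_{.005,\,200} = 2.60$

$t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}} = \frac{996.5 - 1000}{25.1/\sqrt{400}} = -2.79$

$|t_{obs}| = 2.79 > 2.60$

Therefore reject Ho . YES, $\mu \neq 1000$.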


p-value (Method 3):

Find $z_{obs} = \frac{\bar{x} - \mu_o}{\sigma/\sqrt{n}}$ or $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$

Find p = P[ | Z | > | zobs | ] or p = P[ | T | > | tobs | ]

Compare p to α .

Example 11.02 (continued, using method 3):

$t_{obs} = -2.79$ (2 d.p.)

Using $t_{\alpha,\,399} \approx z_\alpha$ ,

P[ | Z | > 2.79] = 2 Ф(−2.79) = 2 × .00264 = .00528 < .01000 = α .

Therefore reject Ho . YES, $\mu \neq 1000$.

Note:

Tables are not usually provided for P[T < t obs] ,

but the values can be obtained from software, such as the Excel file at

/~ggeorge/3423/demos/ .

t .005, 399 = → cL = , cU =

t obs = − → p = P[ | T | > t obs ] = .005543

The corresponding, more precise, confidence interval allows us to claim that

“we are 99% sure that < μ < ”.
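Where no spreadsheet is to hand, the tail probabilities can also be approximated in a few lines of Python. The sketch below is not the Excel demo mentioned in the note: it uses `statistics.NormalDist` for the normal approximation and, as a simple stand-in for a proper t distribution routine, integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P[T > t] for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Data from Example 11.02
n, xbar, s, mu0 = 400, 996.5, 25.1, 1000.0
t_obs = (xbar - mu0) / (s / math.sqrt(n))

p_z = 2 * NormalDist().cdf(-abs(t_obs))      # normal approximation
p_t = 2 * t_upper_tail(abs(t_obs), n - 1)    # t distribution, 399 d.o.f.
print(f"t_obs = {t_obs:.3f}, p (normal) = {p_z:.5f}, p (t) = {p_t:.6f}")
```

The normal approximation here uses the unrounded value of t obs, so it differs slightly in the last digit from the table-based figure above. Both p-values are below α = .01, so the decision is unchanged.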


General Method (upper-tailed tests):

State hypotheses: Ho : $\mu = \mu_o$ vs. Ha : $\mu > \mu_o$

The burden of proof is on Ha.

Choose the level of significance α .

State your assumptions (for example, the random quantity X is nearly normal).

Find $\bar{x}$ (the test statistic).

If σ is unknown, then estimate it using s .

Method 1:

Evaluate $c = \mu_o + t_{\alpha,\,n-1} \frac{s}{\sqrt{n}}$

Reject Ho iff $\bar{x} > c$ .

Method 2:

Evaluate $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$

Reject Ho iff $t_{obs} > t_{\alpha,\,n-1}$ .

Method 3:

Evaluate $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$ and p = P[ T > tobs]

Reject Ho iff p < α .

Let us explore the meaning of α , the probability of committing a Type I error, in the case when the alternative hypothesis is one (upper) tailed, Ha : $\mu > \mu_o$ :

α is actually an upper bound to P[Type I error], the “worst case scenario”, which occurs when the null hypothesis is just barely true.
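A small numerical illustration may help here. The sketch below assumes, purely for illustration, a known σ = 25, n = 100, a hypothesised mean of 1000 and α = .05 (none of these figures come from the notes), and treats the null hypothesis as "μ is at most 1000". It evaluates the probability of rejecting Ho for several true means that are consistent with Ho.

```python
import math
from statistics import NormalDist

# Illustrative values only (assumed, not taken from the notes)
mu0, sigma, n, alpha = 1000.0, 25.0, 100, 0.05

se = sigma / math.sqrt(n)
# Upper-tailed test: reject Ho iff the sample mean exceeds c
c = mu0 + NormalDist().inv_cdf(1 - alpha) * se

def prob_reject(mu):
    """P[sample mean > c] when the true mean is mu."""
    return 1 - NormalDist(mu, se).cdf(c)

# True means at or below mu0: Ho is true in every one of these cases
type_one = {mu: prob_reject(mu) for mu in (1000.0, 999.0, 997.5, 995.0)}
for mu, p in type_one.items():
    print(f"true mean {mu}: P[type I error] = {p:.4f}")
```

The probability of a type I error equals α only when the true mean sits exactly on the boundary, and it falls away quickly as the true mean moves further inside the null region.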


General Method (lower-tailed tests):

State hypotheses: Ho : $\mu = \mu_o$ vs. Ha : $\mu < \mu_o$

The burden of proof is on Ha.

Choose the level of significance α .

State your assumptions (for example, the random quantity X is nearly normal).

Find $\bar{x}$ (the test statistic).

If σ is unknown, then estimate it using s .

Method 1:

Evaluate $c = \mu_o - t_{\alpha,\,n-1} \frac{s}{\sqrt{n}}$

Reject Ho iff $\bar{x} < c$ .

Method 2:

Evaluate $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$

Reject Ho iff $t_{obs} < -t_{\alpha,\,n-1}$ .

Method 3:

Evaluate $t_{obs} = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$ and p = P[ T < tobs]

Reject Ho iff p < α .


Example 11.03

An opinion poll of 100 randomly selected customers produces 58 customers who state a

preference for brand A. Does a majority of the population of customers prefer brand A?

From the random sample of 100 customers, how many must state a preference for brand

A in order for the inference “a majority of the population of customers prefers brand A”

to be valid?

Ho: p = .5 (or less)

Ha: p > .5

Choose α = .05

Assume that the sample is random, so that, to a good approximation,

$\hat{P} \sim N\left(p, \frac{pq}{n}\right)$

$n = 100,\ x = 58 \Rightarrow \hat{p} = .58,\ \hat{q} = 1 - \hat{p} = .42$

If Ho is true, then

$\frac{pq}{n} = \frac{.5 \times .5}{100} = .0025$

Use method 1 (because of the second part of the question).

$c = p + z_\alpha \sqrt{\frac{pq}{n}} = .5 + 1.64 \sqrt{.0025} = .582$

$\hat{p} = .58 < c$, therefore do NOT reject Ho.

There is insufficient evidence for a majority.

$c = .582 \Rightarrow x = 58.2$ . Therefore

xmin = 59
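The two parts of Example 11.03 can be reproduced with the normal approximation as follows. This is a sketch using Python's standard library; it uses the unrounded value of the 5% critical value, so the boundary c differs from the hand calculation only beyond the third decimal place.

```python
import math
from statistics import NormalDist

# Data from Example 11.03
n, x, p0, alpha = 100, 58, 0.5, 0.05

z_alpha = NormalDist().inv_cdf(1 - alpha)          # upper-tailed critical value
c = p0 + z_alpha * math.sqrt(p0 * (1 - p0) / n)    # boundary for the sample proportion

p_hat = x / n
reject = p_hat > c                 # is 58 out of 100 enough evidence?
x_min = math.floor(n * c) + 1      # smallest count that exceeds the boundary

print(f"c = {c:.3f}, p_hat = {p_hat}, reject Ho: {reject}, x_min = {x_min}")
```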

ENGI 3423 Two Sample Hypothesis Tests Page 11-10

Two sample z test

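From the central limit theorem, we know that, for sufficiently large sample sizes from two independent populations of means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$, the sample means are distributed as

$\bar{X}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right)$ , $\bar{X}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$ , with

$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$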

Example 11.04

A large corporation wishes to determine the effectiveness of a new training technique. A

random sample of 64 employees is tested after undergoing the new training technique and

obtains a mean test score of 62.1 with a standard deviation of 5.12 . Another random

sample of 100 employees, serving as a control group, is tested after undergoing the old

training methods. The control group has a sample mean test score of 58.3 with a

standard deviation of 6.30 .

(a) Use a two-sided confidence interval to determine whether the new training

technique has led to a significant change in test scores.

(b) Use an appropriate hypothesis test to determine whether the new training

technique has led to a significant increase in test scores.

(a)

$n_1 = 64,\ \bar{x}_1 = 62.1,\ s_1 = 5.12$

$n_2 = 100,\ \bar{x}_2 = 58.3,\ s_2 = 6.30$

Two different groups of employees; may assume independence.

Both sample sizes are large (>> 30) ⇒ normal. Choose α = 1%.

$\bar{x}_1 - \bar{x}_2 = 62.1 - 58.3 = 3.8$

$\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = \frac{5.12^2}{64} + \frac{6.30^2}{100} = 0.8065$

The 99% CI for μ1 − μ2 has its boundaries at

$3.8 \pm 2.575 \sqrt{0.8065} = 3.8 \pm 2.31 = (1.49, 6.11)$

The CI does not include 0.

Therefore YES, the new training technique has led to a significant change in test scores.

[Note that if t .005, 162 = is used instead of z .005 , then the CI would be

3.8 ± instead of 3.8 ± , leading to no change to 1 d.p.!

It is usually valid to replace t by z when ν > 100.]
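The confidence interval of part (a) can be reproduced as below. This is a sketch based on the normal critical value; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from Example 11.04
n1, x1, s1 = 64, 62.1, 5.12      # new training technique
n2, x2, s2 = 100, 58.3, 6.30     # control group
alpha = 0.01

variance = s1**2 / n1 + s2**2 / n2             # estimated V[X1bar - X2bar]
z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{.005}
half_width = z * math.sqrt(variance)

diff = x1 - x2
lower, upper = diff - half_width, diff + half_width
print(f"99% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
print("significant change" if lower > 0 or upper < 0 else "no significant change")
```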


(b)

$V[\bar{X}_1 - \bar{X}_2] = \frac{5.12^2}{64} + \frac{6.30^2}{100} = 0.8065$, so that $\bar{X}_1 - \bar{X}_2 \sim N(\mu_1 - \mu_2,\ 0.8065)$

Seeking evidence for an increase.

Therefore use an upper-tailed test. [Again choose α = 1%].

Test Ho : $\mu_1 - \mu_2 = 0$ vs. Ha : $\mu_1 - \mu_2 > 0$ .

Ho true ⇒ $\bar{X}_1 - \bar{X}_2 \sim N(0,\ 0.8065)$

Method 1:

$c = 0 + z_\alpha \sqrt{0.8065} = 0 + 2.326 \times \sqrt{0.8065} = 2.089$

$\bar{x}_1 - \bar{x}_2 = 3.8 > c$

Therefore reject Ho in favour of Ha : $\mu_1 - \mu_2 > 0$.

[Expressed crudely, “we are 99% sure that the training process has increased test scores.”]

Method 2:

$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{3.8 - 0}{\sqrt{0.8065}} = 4.23$

$t_\alpha \approx z_\alpha = 2.326$

$t_{obs} > z_\alpha$

Therefore reject Ho in favour of Ha : $\mu_1 - \mu_2 > 0$.

Method 3:

$t_{obs} = 4.23$

P[ Z > tobs] = Ф(−4.23) < .0003 (from Table A.3)

OR, using /~ggeorge/3423/demos/ with 63+99 = 162 degrees of freedom, P[ T > tobs] is smaller than any reasonable α .

Therefore reject Ho in favour of Ha : $\mu_1 - \mu_2 > 0$.


General Method (Method 2 illustrated here):

Establish the null hypothesis Ho : $\mu_1 - \mu_2 = \Delta_o$ (often $\Delta_o = 0$ )

Select the appropriate alternative hypothesis Ha .

Select the level of significance α , which leads to the boundaries of the rejection region for z (assuming either σ known or large n or both):

zc          α = 5%     α = 1%

1 - tail    1.64485    2.32634

2 - tail    1.95996    2.57583

Find $z = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_o}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Compare z to zc .


Two sample t test:

If n1 and/or n2 is/are small (< 30) and the population variances are both equal to an unknown number ($\sigma_1^2 = \sigma_2^2 = \sigma^2$) and the random quantities X1 and X2 are independent and have normal (or nearly normal) distributions, then a t test may be used.

The separate sample variances $s_1^2$ and $s_2^2$ are both point estimates of the same unknown population parameter $\sigma^2$. A better point estimate of $\sigma^2$ is a weighted average of these two estimates, with the weights given by the numbers of degrees of freedom. Thus both sample variances are replaced by the pooled sample variance

$s_P^2 = \frac{\nu_1 s_1^2 + \nu_2 s_2^2}{\nu_1 + \nu_2}$ where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ .

In the hypothesis test,

$z = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_o}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ is replaced by $t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_o}{s_P \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ ,

which has $\nu = \nu_1 + \nu_2$ degrees of freedom.

Example 11.05

An investigator wants to know which of two electric toasters has the greater ability to

resist the abnormally high electrical currents that occur during an unprotected power

surge. Random samples of six toasters from factory A and five toasters from factory B

were subjected to a destructive test, in which each toaster was subjected to increasing

currents until it failed. The distribution of currents at failure (measured in amperes) is

known to be approximately normal for both products, with a common (but unknown)

population variance. The results are as follows:

Factory A: 20 28 24 26 23 26

Factory B: 21 18 19 17 22

(a) State the hypotheses that are to be tested.

(b) State the assumptions that you are making.

(a) Ho : $\mu_A - \mu_B = 0$ (no difference between toasters)

Ha : $\mu_A - \mu_B \neq 0$ (significant difference between toasters)

[In advance of examining the data, we have no preconceptions of which toaster

might be better.]

(b) Given in the question:

$X_A \sim N(\mu_A, \sigma^2)$

$X_B \sim N(\mu_B, \sigma^2)$

Assumption: XA, XB are independent.

(c) The summary statistics are

nA = 6 , $\bar{x}_A$ = 24.5 , sA = 2.81 ...

nB = 5 , $\bar{x}_B$ = 19.4 , sB = 2.07 ...

$\nu_A = n_A - 1 = 5$ and $\nu_B = n_B - 1 = 4$ ⇒ $\nu = 5 + 4 = 9$

$s_P^2 = \frac{\nu_A s_A^2 + \nu_B s_B^2}{\nu_A + \nu_B} = \frac{5\,(2.81\ldots)^2 + 4\,(2.07\ldots)^2}{5 + 4} = 6.300$

standard error = $s_P \sqrt{\frac{1}{n_A} + \frac{1}{n_B}} = \sqrt{6.300 \left(\frac{1}{6} + \frac{1}{5}\right)} = 1.519\ldots$

$t_{obs} = \frac{(\bar{x}_A - \bar{x}_B) - \Delta_o}{s_P \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}} = \frac{(24.5 - 19.4) - 0}{1.519\ldots} = 3.356$

With α = .01 , $t_{\alpha/2,\,\nu} = t_{.005,\,9} = 3.250$

$|t_{obs}| > t_{\alpha/2,\,\nu}$ , therefore reject Ho in favour of Ha : $\mu_A - \mu_B \neq 0$.

From the data, we can conclude, with a high level of confidence, that toaster A is more robust.
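The pooled calculation can be checked directly from the raw data of Example 11.05. In this sketch the critical value 3.250 is the standard table value for a two-tailed test at α = .01 with 9 degrees of freedom, entered by hand because the Python standard library has no t distribution.

```python
import math
from statistics import mean, variance

# Currents at failure (amperes) from Example 11.05
factory_a = [20, 28, 24, 26, 23, 26]
factory_b = [21, 18, 19, 17, 22]

nu_a, nu_b = len(factory_a) - 1, len(factory_b) - 1

# Pooled sample variance: weights are the degrees of freedom
sp2 = (nu_a * variance(factory_a) + nu_b * variance(factory_b)) / (nu_a + nu_b)
se = math.sqrt(sp2 * (1 / len(factory_a) + 1 / len(factory_b)))

t_obs = (mean(factory_a) - mean(factory_b)) / se
t_crit = 3.250                      # t_{.005, 9}, from standard tables
reject = abs(t_obs) > t_crit

print(f"sP^2 = {sp2:.3f}, s.e. = {se:.4f}, t_obs = {t_obs:.3f}, reject Ho: {reject}")
```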


Paired t test

Example 11.06

Nine volunteers are tested before and after a training programme. Based on the data

below, can you conclude that the programme has improved test scores?

Volunteer: 1 2 3 4 5 6 7 8 9

After training: 75 66 69 45 54 85 58 91 62

Before training: 72 65 64 39 51 85 52 92 58

Let XA = score after training and XB = score before training.

Test Ho : $\mu_A - \mu_B = 0$ vs. Ha : $\mu_A - \mu_B > 0$

Choose α = .01 .

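INCORRECT METHOD:

nA = nB = 9 ⇒ $\nu_A = \nu_B = 8$ ⇒ $\nu = 16$

$\bar{x}_A$ = 67.2 ... , sA = 14.69 ...

$\bar{x}_B$ = 64.2 ... , sB = 16.82 ...

$s_P^2 = \frac{\nu_A s_A^2 + \nu_B s_B^2}{\nu_A + \nu_B} = \frac{8\,(14.69\ldots)^2 + 8\,(16.82\ldots)^2}{8 + 8} = 249.4\ldots$

s.e. = $s_P \sqrt{\frac{1}{n_A} + \frac{1}{n_B}} = 7.445$

$t = \frac{(\bar{x}_A - \bar{x}_B) - \Delta_o}{\text{s.e.}} = \frac{(67.2\ldots - 64.2\ldots) - 0}{7.445} = 0.403$

Compare with $t_{\alpha,\,\nu} = t_{.010,\,16} = 2.583$

0.403 < 2.583 . Therefore do not reject Ho : no increase in test scores !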


The error is that the two test scores are NOT independent. [They are highly correlated.]

The correct method is to take account of the fact that XA and XB are paired, by examining the differences D = XA − XB .

Volunteer: 1 2 3 4 5 6 7 8 9

After training xA: 75 66 69 45 54 85 58 91 62

Before training xB: 72 65 64 39 51 85 52 92 58

Difference d 3 1 5 6 3 0 6 −1 4

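Test Ho : $\mu_D = 0$ vs. Ha : $\mu_D > 0$ with α = .01 .

Summary statistics:

n = 9 ⇒ $\nu = 8$ , $\bar{d} = 3$ , sD = 2.5495 ...

$t = \frac{\bar{d} - \mu_{Do}}{s_D/\sqrt{n}} = \frac{3 - 0}{2.5495\ldots/\sqrt{9}} = 3.530$

Compare with $t_{\alpha,\,\nu} = t_{.010,\,8} = 2.896$

3.530 > 2.896 . Therefore reject Ho .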

At a 1% level of significance, we conclude that the training has, indeed, increased the test

scores.
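Both calculations in Example 11.06 can be reproduced from the raw scores. The sketch below is a plain Python alternative to the spreadsheet, not a copy of it. The critical value 2.583 is the one quoted for the unpaired test, and 2.896 is the standard table value for an upper-tailed test at α = .01 with 8 degrees of freedom.

```python
import math
from statistics import mean, stdev, variance

after = [75, 66, 69, 45, 54, 85, 58, 91, 62]
before = [72, 65, 64, 39, 51, 85, 52, 92, 58]
n = len(after)

# Incorrect (unpaired) method: pooled variance, 2n - 2 = 16 degrees of freedom
sp2 = (variance(after) + variance(before)) / 2
t_unpaired = (mean(after) - mean(before)) / math.sqrt(sp2 * 2 / n)
reject_unpaired = t_unpaired > 2.583      # t_{.010, 16}

# Correct (paired) method: work with the differences, n - 1 = 8 degrees of freedom
d = [a - b for a, b in zip(after, before)]
t_paired = mean(d) / (stdev(d) / math.sqrt(n))
reject_paired = t_paired > 2.896          # t_{.010, 8}

print(f"unpaired: t = {t_unpaired:.3f}, reject Ho: {reject_unpaired}")
print(f"paired:   t = {t_paired:.3f}, reject Ho: {reject_paired}")
```

The same mean difference of 3 points gives very different statistics, because the spread of the differences is far smaller than the spread of the individual scores.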

An Excel spreadsheet file for both methods is available at

/~ggeorge/3423/demos/ .


When should we use a paired two sample t test?

When samples of equal size n are taken from two populations, the unpaired two sample t test will have $\nu = 2n - 2$ degrees of freedom, but the paired two sample t test will have only $\nu = n - 1$ degrees of freedom. The power of the unpaired test to distinguish between null and alternative hypotheses is greater, especially for small sample sizes.

The paired test is valid even if the two populations are strongly correlated, whereas the

unpaired test is based on the assumption that the two populations are independent (or at

least uncorrelated).

We should use the paired t test if there is reason to believe that the two populations from

which the samples come may be correlated, or if the variance within the samples is high.

If the samples are pairs of observations of two different effects on the same set of

individuals, then independence between the populations is unlikely and one should use

the paired t test.

Otherwise, (and especially if the sample size is very small), use the unpaired t test.

Note (not examinable):

The correlation ρ is a measure of the linear dependence of a pair of random quantities.

Independence ⇒ ρ = 0

The relationship between the t statistics for the unpaired and paired two sample t tests is

$T_{unpair} = T_{pair} \sqrt{1 - \rho}$

The unpaired t test can therefore be used only if the random quantities are uncorrelated.

And, upon replacing the unknown underlying true correlation ρ by the observed sample correlation coefficient r, the two observed values of t are related by

$t_{unpair} = t_{pair} \sqrt{1 - \frac{2 r s_A s_B}{s_A^2 + s_B^2}}$

where sA and sB are the two observed standard deviations from samples A and B respectively.

In Example 11.06, r = .996, $t_{unpair}$ = 0.403 and $t_{pair}$ = 3.530, and one can verify that these values satisfy this relation.


Inferences on Differences in Population Proportions

[not examinable (except for bonus)]

We have seen that the sample proportion $\hat{P}$ is distributed approximately as

$\hat{P} \sim N\left(p, \frac{pq}{n}\right)$ ,

where n is the sample size, p is the population proportion and q = 1 − p .

This approximation holds provided that np (the expected number of successes) and nq

(the expected number of failures) are both sufficiently large (both numbers greater than

10 is usually sufficient).

We have also seen that for any two random quantities X, Y :

E[ X ± Y ] = E[ X ] ± E[ Y ] and

for any two uncorrelated random quantities X, Y : V[ X ± Y ] = V[ X ] + V[ Y ].

For two independent large random samples, it then follows that

$\hat{P}_1 - \hat{P}_2 \sim N\left(p_1 - p_2,\ \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}\right)$

⇒ a (1 − α) × 100% confidence interval estimate for p1 − p2 is

$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$

A special case arises in hypothesis tests whenever the null hypothesis is Ho : p1 = p2 .

In this case the two sample proportions are point estimates of the same unknown population proportion p .

The pooled estimate of p is

$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{\text{Total number of successes}}{\text{Total sample size}}$

and the standard error becomes

$s = \sqrt{\hat{p} \hat{q} \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ .

Compare $z = \frac{\hat{p}_1 - \hat{p}_2}{s}$ to $\pm z_{\alpha/2}$ (two tailed test), or $-z_\alpha$ (lower tailed test) or $z_\alpha$ (upper tailed test).


Example 11.07

A random sample of 100 customers produces 42 customers who like brand A (as opposed

to not liking brand A). Another random sample of 225 customers produces 81 customers

who like brand B.

(a) Find a standard 95% confidence interval for the difference in population proportions pA − pB .

(b) Is there sufficient evidence to conclude, at a level of significance of five per cent, that brand A is more popular than brand B?

(a) xA = 42 , nA = 100 ⇒ $\hat{p}_A$ = .42

xB = 81 , nB = 225 ⇒ $\hat{p}_B$ = .36

$V[\hat{P}_A - \hat{P}_B] \approx s^2 = \frac{\hat{p}_A \hat{q}_A}{n_A} + \frac{\hat{p}_B \hat{q}_B}{n_B} = \frac{.42 \times .58}{100} + \frac{.36 \times .64}{225} = .002436 + .001024 = .003460$

The 95% confidence interval estimate is

$(\hat{p}_A - \hat{p}_B) \pm z_{.025} \times \text{s.e.} = (.42 - .36) \pm 1.960 \sqrt{.003460} = .06 \pm .115$

= [ −5.5% , +17.5% ] (1 d.p.)

(b) The 95% confidence interval estimate includes pA − pB = 0

⇒ insufficient evidence to conclude that pA ≠ pB

But the effect for which evidence is being sought is pA − pB > 0, (not pA ≠ pB).

Conduct an hypothesis test

Ho : pA − pB = 0 vs.

Ha : pA − pB > 0

Pooled sample proportion $\hat{p} = \frac{x_A + x_B}{n_A + n_B} = \frac{42 + 81}{100 + 225} = \frac{123}{325} = .3784\ldots$

Standard error $s = \sqrt{\hat{p} \hat{q} \left(\frac{1}{n_A} + \frac{1}{n_B}\right)} = \sqrt{.37\ldots \times .62\ldots \times \left(\frac{1}{100} + \frac{1}{225}\right)} = 0.05829$

$z = \frac{\hat{p}_A - \hat{p}_B}{s} = \frac{.42 - .36}{.058\ldots} = 1.029$

$z_\alpha = z_{.050} = 1.645$

$z < z_\alpha$

Therefore do not reject Ho : pA = pB

There is insufficient evidence (at a level of significance of 5%) that brand A is more popular than brand B.
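Both parts of Example 11.07 can be checked numerically. The sketch below follows the working above: the unpooled standard error is used for the confidence interval and the pooled standard error for the hypothesis test. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Data from Example 11.07
x_a, n_a = 42, 100
x_b, n_b = 81, 225
p_a, p_b = x_a / n_a, x_b / n_b
std = NormalDist()

# (a) 95% confidence interval for pA - pB (unpooled variance)
var_unpooled = p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b
half_width = std.inv_cdf(0.975) * math.sqrt(var_unpooled)
ci = (p_a - p_b - half_width, p_a - p_b + half_width)

# (b) Upper-tailed test of Ho: pA = pB using the pooled proportion
p_pool = (x_a + x_b) / (n_a + n_b)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z_obs = (p_a - p_b) / se_pool
reject = z_obs > std.inv_cdf(0.95)

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"pooled p = {p_pool:.4f}, s = {se_pool:.5f}, z = {z_obs:.3f}, reject Ho: {reject}")
```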

Example 11.08 (not examinable except for bonus)

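A manager wishes to find a 95% confidence interval for the difference in the proportions of successful sales attempts between sales teams A and B. Random samples of n sales attempts are examined for each team. How large must the sample sizes n be in order to ensure that the confidence interval has a width of less than .10 ? [In other words, find the minimum sample size nmin to estimate pA − pB to within five percentage points either way nineteen times out of twenty.]

The confidence interval estimate for pA − pB is

$(\hat{p}_A - \hat{p}_B) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_A \hat{q}_A}{n_A} + \frac{\hat{p}_B \hat{q}_B}{n_B}}$ , so that the half-width of the interval is $\frac{w}{2} = z_{\alpha/2} \sqrt{\frac{\hat{p}_A \hat{q}_A}{n_A} + \frac{\hat{p}_B \hat{q}_B}{n_B}}$ .

Maximum width occurs when $\hat{p}_A = \hat{q}_A = \hat{p}_B = \hat{q}_B = \frac{1}{2}$ .

nA = nB = n ⇒

$\frac{w}{2} \le z_{\alpha/2} \sqrt{\frac{\frac{1}{2} \times \frac{1}{2}}{n} + \frac{\frac{1}{2} \times \frac{1}{2}}{n}} = z_{\alpha/2} \sqrt{\frac{1}{2n}} \ \Rightarrow\ n \ge 2 \left(\frac{z_{\alpha/2}}{w}\right)^2$

$n \ge 2\,(1.95996 / 0.10)^2 \approx 768.3$

Therefore nmin = 769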
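The sample size bound can be evaluated in a couple of lines. This is a sketch under the same worst-case assumption (all four sample proportions equal to one half); the function name and arguments are illustrative.

```python
import math
from statistics import NormalDist

def min_sample_size(width, confidence=0.95):
    """Smallest common n for which the worst-case CI width for pA - pB is below `width`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(2 * (z / width) ** 2)

def worst_case_width(n, confidence=0.95):
    """Full CI width when every sample proportion equals 1/2 and nA = nB = n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * math.sqrt(0.25 / n + 0.25 / n)

n_min = min_sample_size(0.10)
print(n_min, round(worst_case_width(n_min), 5), round(worst_case_width(n_min - 1), 5))
```

Checking the width at n_min and at n_min − 1 confirms that 769 is the first sample size for which the worst-case width drops below .10.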
