EXPERIMENTAL AND QUASI-EXPERIMENTAL DESIGNS FOR GENERALIZED CAUSAL INFERENCE

William R. Shadish
The University of Memphis

Thomas D. Cook
Northwestern University

Donald T. Campbell

HOUGHTON MIFFLIN COMPANY
Boston New York
Experiments and Generalized Causal Inference

Ex·per·i·ment (ĭk-spĕr′ə-mənt): [Middle English from Old French from Latin experimentum, from experiri, to try; see per- in Indo-European Roots.] n. Abbr. exp., expt. 1. a. A test under controlled conditions that is made to demonstrate a known truth, examine the validity of a hypothesis, or determine the efficacy of something previously untried. b. The process of conducting such a test; experimentation. 2. An innovative act or procedure: "Democracy is only an experiment in government" (William Ralph Inge).

Cause (kôz): [Middle English from Old French from Latin causa, reason, purpose.] n. 1. a. The producer of an effect, result, or consequence. b. The one, such as a person, an event, or a condition, that is responsible for an action or a result. v. 1. To be the cause of or reason for; result in. 2. To bring about or compel by authority or force.
To many historians and philosophers, the increased emphasis on experimentation in the 16th and 17th centuries marked the emergence of modern science from its roots in natural philosophy (Hacking, 1983). Drake (1981) cites Galileo's 1612 treatise Bodies That Stay Atop Water, or Move in It as ushering in modern experimental science, but earlier claims can be made favoring William Gilbert's 1600 study On the Loadstone and Magnetic Bodies, Leonardo da Vinci's (1452-1519) many investigations, and perhaps even the 5th-century B.C. philosopher Empedocles, who used various empirical demonstrations to argue against Parmenides (Jones, 1969a, 1969b). In the everyday sense of the term, humans have been experimenting with different ways of doing things from the earliest moments of their history. Such experimenting is as natural a part of our life as trying a new recipe or a different way of starting campfires.
However, the scientific revolution of the 17th century departed in three ways from the common use of observation in natural philosophy at that time. First, it increasingly used observation to correct errors in theory. Throughout history, natural philosophers often used observation in their theories, usually to win philosophical arguments by finding observations that supported their theories. However, they still subordinated the use of observation to the practice of deriving theories from "first principles," starting points that humans know to be true by our nature or by divine revelation (e.g., the assumed properties of the four basic elements of fire, water, earth, and air in Aristotelian natural philosophy). According to some accounts, this subordination of evidence to theory degenerated in the 17th century: "The Aristotelian principle of appealing to experience had degenerated among philosophers into dependence on reasoning supported by casual examples and the refutation of opponents by pointing to apparent exceptions not carefully examined" (Drake, 1981, p. xxi). When some 17th-century scholars then began to use observation to correct apparent errors in theoretical and religious first principles, they came into conflict with religious or philosophical authorities, as in the case of the Inquisition's demands that Galileo recant his account of the earth revolving around the sun. Given such hazards, the fact that the new experimental science tipped the balance toward observation and away from dogma is remarkable. By the time Galileo died, the role of systematic observation was firmly entrenched as a central feature of science, and it has remained so ever since (Harré, 1981).

Second, before the 17th century, appeals to experience were usually based on passive observation of ongoing systems rather than on observation of what happens after a system is deliberately changed. After the scientific revolution in the 17th century, the word experiment (terms in boldface in this book are defined in the Glossary) came to connote taking a deliberate action followed by systematic observation of what occurred afterward. As Hacking (1983) noted of Francis Bacon: "He taught that not only must we observe nature in the raw, but that we must also 'twist the lion's tail', that is, manipulate our world in order to learn its secrets" (p. 149). Although passive observation reveals much about the world, active manipulation is required to discover some of the world's regularities and possibilities (Greenwood, 1989). As a mundane example, stainless steel does not occur naturally; humans must manipulate it into existence. Experimental science came to be concerned with observing the effects of such manipulations.

Third, early experimenters realized the desirability of controlling extraneous influences that might limit or bias observation. So telescopes were carried to higher points at which the air was clearer, the glass for microscopes was ground ever more accurately, and scientists constructed laboratories in which it was possible to use walls to keep out potentially biasing ether waves and to use (eventually sterilized) test tubes to keep out dust or bacteria. At first, these controls were developed for astronomy, chemistry, and physics, the natural sciences in which interest in science first bloomed. But when scientists started to use experiments in areas such as public health or education, in which extraneous influences are harder to control (e.g., Lind, 1753), they found that the controls used in natural science in the laboratory worked poorly in these new applications. So they developed new methods of dealing with extraneous influence, such as random assignment (Fisher, 1925) or adding a nonrandomized control group (Coover & Angell, 1907). As theoretical and observational experience accumulated across these settings and topics, more sources of bias were identified and more methods were developed to cope with them (Dehue, 2000).

Today, the key feature common to all experiments is still to deliberately vary something so as to discover what happens to something else later, that is, to discover the effects of presumed causes. As laypersons we do this, for example, to assess what happens to our blood pressure if we exercise more, to our weight if we diet less, or to our behavior if we read a self-help book. However, scientific experimentation has developed increasingly specialized substance, language, and tools, including the practice of field experimentation in the social sciences that is the primary focus of this book.
This chapter begins to explore these matters by (1) discussing the nature of causation that experiments test, (2) explaining the specialized terminology (e.g., randomized experiments, quasi-experiments) that describes social experiments, (3) introducing the problem of how to generalize causal connections from individual experiments, and (4) briefly situating the experiment within a larger literature on the nature of science.
EXPERIMENTS AND CAUSATION

A sensible discussion of experiments requires both a vocabulary for talking about causation and an understanding of key concepts that underlie that vocabulary.

Defining Cause, Effect, and Causal Relationships

Most people intuitively recognize causal relationships in their daily lives. For instance, you may say that another automobile's hitting yours was a cause of the damage to your car; that the number of hours you spent studying was a cause of your test grades; or that the amount of food a friend eats was a cause of his weight. You may even point to more complicated causal relationships, noting that a low test grade was demoralizing, which reduced subsequent studying, which caused even lower grades. Here the same variable (low grade) can be both a cause and an effect, and there can be a reciprocal relationship between two variables (low grades and not studying) that cause each other.

Despite this intuitive familiarity with causal relationships, a precise definition of cause and effect has eluded philosophers for centuries.1 Indeed, the definitions of terms such as cause and effect depend partly on each other and on the causal relationship in which both are embedded. So the 17th-century philosopher John Locke said: "That which produces any simple or complex idea, we denote by the general name cause, and that which is produced, effect" (1975, p. 324) and also: "A cause is that which makes any other thing, either simple idea, substance, or mode, begin to be; and an effect is that, which had its beginning from some other thing" (p. 325). Since then, other philosophers and scientists have given us useful definitions of the three key ideas (cause, effect, and causal relationship) that are more specific and that better illuminate how experiments work. We would not defend any of these as the true or correct definition, given that the latter has eluded philosophers for millennia; but we do claim that these ideas help to clarify the scientific practice of probing causes.

1. Our analysis reflects the use of the word causation in ordinary language, not the more detailed discussions of cause by philosophers. Readers interested in such detail may consult a host of works that we reference in this chapter, including Cook and Campbell (1979).
Cause

Consider the cause of a forest fire. We know that fires start in different ways: a match tossed from a car, a lightning strike, or a smoldering campfire, for example. None of these causes is necessary because a forest fire can start even when, say, a match is not present. Also, none of them is sufficient to start the fire. After all, a match must stay "hot" long enough to start combustion; it must contact combustible material such as dry leaves; there must be oxygen for combustion to occur; and the weather must be dry enough so that the leaves are dry and the match is not doused by rain. So the match is part of a constellation of conditions without which a fire will not result, although some of these conditions can usually be taken for granted, such as the availability of oxygen. A lighted match is, therefore, what Mackie (1974) called an inus condition: "an insufficient but nonredundant part of an unnecessary but sufficient condition" (p. 62; italics in original). It is insufficient because a match cannot start a fire without the other conditions. It is nonredundant only if it adds something fire-promoting that is uniquely different from what the other factors in the constellation (e.g., oxygen, dry leaves) contribute to starting a fire; after all, it would be harder to say whether the match caused the fire if someone else simultaneously tried starting it with a cigarette lighter. It is part of a sufficient condition to start a fire in combination with the full constellation of factors. But that condition is not necessary because there are other sets of conditions that can also start fires.

A research example of an inus condition concerns a new potential treatment for cancer. In the late 1990s, a team of researchers in Boston headed by Dr. Judah Folkman reported that a new drug called Endostatin shrank tumors by limiting their blood supply (Folkman, 1996). Other respected researchers could not replicate the effect even when using drugs shipped to them from Folkman's lab. Scientists eventually replicated the results after they had traveled to Folkman's lab to learn how to properly manufacture, transport, store, and handle the drug and how to inject it in the right location at the right depth and angle. One observer labeled these contingencies the "in-our-hands" phenomenon, meaning "even we don't know which details are important, so it might take you some time to work it out" (Rowe, 1999, p. 732). Endostatin was an inus condition. It was an insufficient cause by itself, and its effectiveness required it to be embedded in a larger set of conditions that were not even fully understood by the original investigators.

Most causes are more accurately called inus conditions. Many factors are usually required for an effect to occur, but we rarely know all of them and how they relate to each other. This is one reason that the causal relationships we discuss in this book are not deterministic but only increase the probability that an effect will occur (Eells, 1991; Holland, 1994). It also explains why a given causal relationship will occur under some conditions but not universally across time, space, human populations, or other kinds of treatments and outcomes that are more or less related to those studied. To different degrees, all causal relationships are context dependent, so the generalization of experimental effects is always at issue. That is why we return to such generalizations throughout this book.
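The logic of an inus condition can be checked mechanically in a toy model. The sketch below is only an illustration under stated assumptions: the four factors, the `fire` rule, and all helper names are hypothetical simplifications of the forest fire example, not a model proposed by Mackie or by this book. It assumes exactly two sufficient constellations (match plus dry leaves plus oxygen, or lightning plus dry leaves plus oxygen) and tests whether the match is insufficient, unnecessary, and nonredundant within its own constellation.

```python
from itertools import product

# Hypothetical factors for the toy forest fire model.
FACTORS = ["match", "lightning", "dry_leaves", "oxygen"]


def fire(match, lightning, dry_leaves, oxygen):
    """Assumed rule: either of two constellations is sufficient for a fire."""
    by_match = match and dry_leaves and oxygen
    by_lightning = lightning and dry_leaves and oxygen
    return by_match or by_lightning


def is_sufficient(factor):
    """True if the factor alone, with every other factor absent, yields a fire."""
    setting = {f: (f == factor) for f in FACTORS}
    return fire(**setting)


def is_necessary(factor):
    """True if no combination of factors yields a fire while this factor is absent."""
    for values in product([False, True], repeat=len(FACTORS)):
        setting = dict(zip(FACTORS, values))
        if not setting[factor] and fire(**setting):
            return False
    return True


def is_nonredundant_in(factor, constellation):
    """True if the constellation yields a fire and removing the factor prevents it."""
    full = {f: (f in constellation) for f in FACTORS}
    without = dict(full, **{factor: False})
    return fire(**full) and not fire(**without)


match_constellation = {"match", "dry_leaves", "oxygen"}
match_is_inus = (
    not is_sufficient("match")
    and not is_necessary("match")
    and is_nonredundant_in("match", match_constellation)
)
print("match is an inus condition in this toy model:", match_is_inus)
```

In this toy model the match stops being nonredundant once lightning is added to the constellation, because the fire would then start either way; this parallels the cigarette lighter case in the text. Oxygen, by contrast, comes out as necessary here only because both assumed constellations happen to include it.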
Effect

We can better understand what an effect is through a counterfactual model that goes back at least to the 18th-century philosopher David Hume (Lewis, 1973, p. 556). A counterfactual is something that is contrary to fact. In an experiment, we observe what did happen when people received a treatment. The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment. An effect is the difference between what did happen and what would have happened.

We cannot actually observe a counterfactual.
Consider phenylketonuria (PKU), a genetically-based metabolic disease that causes mental retardation unless treated during the first few weeks of life. PKU is the absence of an enzyme that would otherwise prevent a buildup of phenylalanine, a substance toxic to the nervous system. When a restricted phenylalanine diet is begun early and maintained, retardation is prevented. In this example, the cause could be thought of as the underlying genetic defect, as the enzymatic disorder, or as the diet. Each implies a different counterfactual. For example, if we say that a restricted phenylalanine diet caused a decrease in PKU-based mental retardation in infants who are phenylketonuric at birth, the counterfactual is whatever would have happened had these same infants not received a restricted phenylalanine diet. The same logic applies to the genetic or enzymatic version of the cause. But it is impossible for these same infants simultaneously to both have and not have the diet, the genetic disorder, or the enzyme deficiency.
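The definition of an effect as a difference between two outcomes, only one of which can ever be observed for a given person, can be made concrete with a small simulation. The sketch below is a hypothetical illustration, not an analysis of real PKU data: the score scale, the assumed benefit of 30 points, and every variable name are invented for the example. Because it is a simulation, both outcomes can be written down for each infant, which is exactly what reality never allows.

```python
import random

random.seed(0)

ASSUMED_DIET_BENEFIT = 30  # hypothetical effect size on a made-up score scale


def simulate_infant():
    """Return both potential outcomes for one simulated infant."""
    score_without_diet = random.randint(60, 80)
    score_with_diet = score_without_diet + ASSUMED_DIET_BENEFIT
    return {"no_diet": score_without_diet, "diet": score_with_diet}


infants = [simulate_infant() for _ in range(5)]

# The effect for each infant: what happens with the diet minus what would
# have happened to the same infant without it.
true_effects = [infant["diet"] - infant["no_diet"] for infant in infants]

# In practice each infant either receives the diet or does not, so only one
# of the two outcomes is observed and the other stays unknown.
received_diet = [True, False, True, False, True]
observed = [
    {
        "received_diet": got_diet,
        "outcome": infant["diet"] if got_diet else infant["no_diet"],
        "counterfactual": None,  # never observable for the same infant
    }
    for infant, got_diet in zip(infants, received_diet)
]

for row in observed:
    print(row)
```

The list `true_effects` exists only because the simulation invents both outcomes; the `observed` records are all a researcher would actually have.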
So a central task for all cause-probing research is to create reasonable approximations to this physically impossible counterfactual. For instance, if it were ethical to do so, we might contrast phenylketonuric infants who were given the diet with other phenylketonuric infants who were not given the diet but who were similar in many ways to those who were (e.g., similar race, gender, age, socioeconomic status, health status). Or we might (if it were ethical) contrast infants who were not on the diet for the first 3 months of their lives with those same infants after they were put on the diet starting in the 4th month. Neither of these approximations is a true counterfactual. In the first case, the individual infants in the treatment condition are different from those in the comparison condition; in the second case, the identities are the same, but time has passed and many changes other than the treatment have occurred to the infants (including permanent damage done by phenylalanine during the first 3 months of life). So two central tasks in experimental design are creating a high-quality but necessarily imperfect source of counterfactual inference and understanding how this source differs from the treatment condition.

This counterfactual reasoning is fundamentally qualitative because causal inference, even in experiments, is fundamentally qualitative (Campbell, 1975; Shadish, 1995a; Shadish & Cook, 1999). However, some of these points have been formalized by statisticians into a special case that is sometimes called Rubin's Causal Model (Holland, 1986; Rubin, 1974, 1977, 1978, 1986). This book is not about statistics, so we do not describe that model in detail (West, Biesanz, & Pitts [2000] do so and relate it to the Campbell tradition). A primary emphasis of Rubin's model is the analysis of cause in experiments, and its basic premises are consistent with those of this book.2 Rubin's model has also been widely used to analyze causal inference in case-control studies in public health and medicine (Holland & Rubin, 1988), in path analysis in sociology (Holland, 1986), and in a paradox that Lord (1967) introduced into psychology (Holland & Rubin, 1983); and it has generated many statistical innovations that we cover later in this book. It is new enough that critiques of it are just now beginning to appear (e.g., Dawid, 2000; Pearl, 2000). What is clear, however, is that Rubin's is a very general model with obvious and subtle implications. Both it and the critiques of it are required material for advanced students and scholars of cause-probing methods.

2. However, Rubin's model is not intended to say much about the matters of causal generalization that we address in this book.

Causal Relationship

How do we know if cause and effect are related? In a classic analysis formalized by the 19th-century philosopher John Stuart Mill, a causal relationship exists if (1) the cause preceded the effect, (2) the cause was related to the effect, and (3) we can find no plausible alternative explanation for the effect other than the cause. These three characteristics mirror what happens in experiments in which (1) we manipulate the presumed cause and observe an outcome afterward; (2) we see whether variation in the cause is related to variation in the effect; and (3) we use various methods during the experiment to reduce the plausibility of other explanations for the effect, along with ancillary methods to explore the plausibility of those we cannot rule out (most of this book is about methods for doing this).
Hence experiments are well-suited to studying causal relationships. No other scientific method regularly matches the characteristics of causal relationships so well. Mill's analysis also points to the weakness of other methods. In many correlational studies, for example, it is impossible to know which of two variables came first, so defending a causal relationship between them is precarious. Understanding this logic of causal relationships and how its key terms, such as cause and effect, are defined helps researchers to critique cause-probing studies.
Causation, Correlation, and Confounds
A well-known maxim in research is: Correlation does not prove causation. This is so because we may not know which variable came first nor whether alternative explanations for the presumed effect exist. For example, suppose income and education are correlated. Do you have to have a high income before you can pay for education, or do you first have to get a good education before you can get a better paying job? Each possibility may be true, and so both need investigation. But until those investigations are completed and evaluated by the scholarly community, a simple correlation does not indicate which variable came first. Correlations also do little to rule out alternative explanations for a relationship between two variables such as education and income. That relationship may not be causal at all but rather due to a third variable (often called a confound), such as intelligence or family socioeconomic status, that causes both high education and high income. For example, if high intelligence causes success in education and on the job, then intelligent people would have correlated education and incomes, not because education causes income (or vice versa) but because both would be caused by intelligence. Thus a central task in the study of experiments is identifying the different kinds of confounds that can operate in a particular research area and understanding the strengths and weaknesses associated with various ways of dealing with them.
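The intelligence example can be illustrated with simulated data in which education has, by construction, no effect on income. The sketch below is a hypothetical demonstration: the variable names, the sample size, the standard-normal scales, and the choice of linear adjustment are all assumptions made for the illustration, and adjusting for a measured confound is only one of the ways of dealing with confounds, usable only when the confound has actually been measured.

```python
import random
import statistics

random.seed(1)


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Part of ys left over after a simple linear regression on xs."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


n = 5000
# Assumed data-generating process: the confound drives both variables,
# and education never appears in the equation for income.
intelligence = [random.gauss(0, 1) for _ in range(n)]
education = [iq + random.gauss(0, 1) for iq in intelligence]
income = [iq + random.gauss(0, 1) for iq in intelligence]

r_raw = pearson(education, income)
r_adjusted = pearson(
    residuals(education, intelligence), residuals(income, intelligence)
)

print(f"correlation of education and income: {r_raw:.2f}")
print(f"after removing the part explained by the confound: {r_adjusted:.2f}")
```

Under these assumptions the raw correlation is clearly positive even though neither variable causes the other, and it shrinks toward zero once the shared dependence on the simulated confound is removed.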
Manipulable and Nonmanipulable Causes

In the intuitive understanding of experimentation that most people have, it makes sense to say, "Let's see what happens if we require welfare recipients to work"; but it makes no sense to say, "Let's see what happens if I change this adult male into a three-year-old girl." And so it is also in scientific experiments. Experiments explore the effects of things that can be manipulated, such as the dose of a medicine, the amount of a welfare check, the kind or amount of psychotherapy, or the number of children in a classroom. Nonmanipulable events (e.g., the explosion of a supernova) or attributes (e.g., people's ages, their raw genetic material, or their biological sex) cannot be causes in experiments because we cannot deliberately vary them to see what then happens. Consequently, most scientists and philosophers agree that it is much harder to discover the effects of nonmanipulable causes.
To be clear, we are not arguing that all causes must be manipulable, only that experimental causes must be so. Many variables that we correctly think of as causes are not directly manipulable. Thus it is well established that a genetic defect causes PKU even though that defect is not directly manipulable. We can investigate such causes indirectly in nonexperimental studies or even in experiments by manipulating biological processes that prevent the gene from exerting its influence, as through the use of diet to inhibit the gene's biological consequences. Both the nonmanipulable gene and the manipulable diet can be viewed as causes: both covary with PKU-based retardation, both precede the retardation, and it is possible to explore other explanations for the gene's and the diet's effects on cognitive functioning. However, investigating the manipulable diet as a cause has two important advantages over considering the nonmanipulable genetic problem as a cause. First, only the diet provides a direct action to solve the problem; and second, we will see that studying manipulable agents allows a higher quality source of counterfactual inference through such methods as random assignment. When individuals with the nonmanipulable genetic problem are compared with persons without it, the latter are likely to be different from the former in many ways other than the genetic defect. So the counterfactual inference about what would have happened to those with the PKU genetic defect is much more difficult to make.

Nevertheless, nonmanipulable causes should be studied using whatever means are available and seem useful. This is true because such causes eventually help us to find manipulable agents that can then be used to ameliorate the problem at hand. The PKU example illustrates this. Medical researchers did not discover how to treat PKU effectively by first trying different diets with retarded children. They first discovered the nonmanipulable biological features of retarded children affected with PKU, finding abnormally high levels of phenylalanine and its associated metabolic and genetic problems in those children. Those findings pointed in certain ameliorative directions and away from others, leading scientists to experiment with treatments they thought might be effective and practical. Thus the new diet resulted from a sequence of studies with different immediate purposes, with different forms, and with varying degrees of uncertainty reduction. Some were experimental, but others were not.

Further, analogue experiments can sometimes be done on nonmanipulable causes, that is, experiments that manipulate an agent that is similar to the cause of interest. Thus we cannot change a person's race, but we can chemically induce skin pigmentation changes in volunteer individuals, though such analogues do not match the reality of being Black every day and everywhere for an entire life. Similarly, past events, which are normally nonmanipulable, sometimes constitute a natural experiment that may even have been randomized, as when the 1970 Vietnam-era draft lottery was used to investigate a variety of outcomes (e.g., Angrist, Imbens, & Rubin, 1996a; Notz, Staw, & Cook, 1971).

Although experimenting on manipulable causes makes the job of discovering their effects easier, experiments are far from perfect means of investigating causes.
Sometimes experiments modify the conditions in which testing occurs in a way that reduces the fit between those conditions and the situation to which the results are to be generalized. Also, knowledge of the effects of manipulable causes tells nothing about how and why those effects occur. Nor do experiments answer many other questions relevant to the real world: for example, which questions are worth asking, how strong the need for treatment is, how a cause is distributed through society, whether the treatment is implemented with theoretical fidelity, and what value should be attached to the experimental results.

In addition, in experiments, we first manipulate a treatment and only then observe its effects; but in some other studies we first observe an effect, such as AIDS, and then search for its cause, whether manipulable or not. Experiments cannot help us with that search. Scriven (1976) likens such searches to detective work in which a crime has been committed (e.g., a robbery), the detectives observe a particular pattern of evidence surrounding the crime (e.g., the robber wore a baseball cap and a distinct jacket and used a certain kind of gun), and then the detectives search for criminals whose known method of operating (their modus operandi or m.o.) includes this pattern. A criminal whose m.o. fits that pattern of evidence then becomes a suspect to be investigated further. Epidemiologists use a similar method, the case-control design (Ahlbom & Norell, 1990), in which they observe a particular health outcome (e.g., an increase in brain tumors) that is not seen in another group and then attempt to identify associated causes (e.g., increased cell phone use). Experiments do not aspire to answer all the kinds of questions, not even all the types of causal questions, that social scientists ask.
Causal Description and Causal Explanation

The unique strength of experimentation is in describing the consequences attributable to deliberately varying a treatment. We call this causal description. In contrast, experiments do less well in clarifying the mechanisms through which and the conditions under which that causal relationship holds, which is what we call causal explanation. For example, most children very quickly learn the descriptive causal relationship between flicking a light switch and obtaining illumination in a room. However, few children (or even adults) can fully explain why that light goes on. To do so, they would have to decompose the treatment (the act of flicking a light switch) into its causally efficacious features (e.g., closing an insulated circuit) and its nonessential features (e.g., whether the switch is thrown by hand or a motion detector). They would have to do the same for the effect (either incandescent or fluorescent light can be produced, but light will still be produced whether the light fixture is recessed or not). For full explanation, they would then have to show how the causally efficacious parts of the treatment influence the causally affected parts of the outcome through identified mediating processes (e.g., the passage of electricity through the circuit, the excitation of photons).3 Clearly, the cause of the light going on is a complex cluster of many factors. For those philosophers who equate cause with identifying that constellation of variables that necessarily, inevitably, and infallibly results in the effect (Beauchamp, 1974), talk of cause is not warranted until everything of relevance is known. For them, there is no causal description without causal explanation. Whatever the philosophic merits of their position, though, it is not practical to expect much current social science to achieve such complete knowledge.

3. However, the full explanation a physicist would offer might be quite different from this electrician's explanation, perhaps invoking the behavior of subparticles. This difference indicates just how complicated is the notion of explanation and how it can quickly become quite complex once one shifts levels of analysis.

The practical importance of causal explanation is brought home when the switch fails to make the light go on and when replacing the light bulb (another easily learned manipulation) fails to solve the problem. Explanatory knowledge then offers clues about how to fix the problem, for example, by detecting and repairing a short circuit. Or if we wanted to create illumination in a place without lights and we had explanatory knowledge, we would know exactly which features of the cause-and-effect relationship are essential to create light and which are irrelevant. Our explanation might tell us that there must be a source of electricity but that that source could take several different molar forms, such as a battery, a generator, a windmill, or a solar array. There must also be a switch mechanism to close a circuit, but this could also take many forms, including the touching of two bare wires or even a motion detector that trips the switch when someone enters the room. So causal explanation is an important route to the generalization of causal descriptions because it tells us which features of the causal relationship are essential to transfer to other situations.

This benefit of causal explanation helps elucidate its priority and prestige in all sciences and helps explain why, once a novel and important causal relationship is discovered, the bulk of basic scientific effort turns toward explaining why and how it happens. Usually, this involves decomposing the cause into its causally effective parts, decomposing the effect into its causally affected parts, and identifying the processes through which the effective causal parts influence the causally affected outcome parts.

These examples also show the close parallel between descriptive and explanatory causation and molar and molecular causation.4 Descriptive causation usually concerns simple bivariate relationships between molar treatments and molar outcomes, molar here referring to a package that consists of many different parts. For instance, we may find that psychotherapy decreases depression, a simple descriptive causal relationship between a molar treatment package and a molar outcome. However, psychotherapy consists of such parts as verbal interactions, placebo-generating procedures, setting characteristics, time constraints, and payment for services. Similarly, many depression measures consist of items pertaining to the physiological, cognitive, and affective aspects of depression. Explanatory causation breaks these molar causes and effects into their molecular parts so as to learn, say, that the verbal interactions and the placebo features of therapy both cause changes in the cognitive symptoms of depression, but that payment for services does not do so even though it is part of the molar treatment package.

4. By molar, we mean something taken as a whole rather than in parts. An analogy is to physics, in which molar might refer to the properties or motions of masses, as distinguished from those of molecules or atoms that make up those masses.
If experiments are less able to provide this highly-prized explanatory causal knowledge, why are experiments so central to science, especially to basic social science, in which theory and explanation are often the coin of the realm? The answer is that the dichotomy between descriptive and explanatory causation is less clear in scientific practice than in abstract discussions about causation. First, many causal explanations consist of chains of descriptive causal links in which one event causes the next. Experiments help to test the links in each chain. Second, experiments help distinguish between the validity of competing explanatory theories, for example, by testing competing mediating links proposed by those theories. Third, some experiments test whether a descriptive causal relationship varies in strength or direction under Condition A versus Condition B (then the condition is a moderator variable that explains the conditions under which the effect holds). Fourth, some experiments add quantitative or qualitative observations of the links in the explanatory chain (mediator variables) to generate and study explanations for the descriptive causal relationship.

Experiments are also prized in applied areas of social science, in which the identification of practical solutions to social problems has as great or even greater priority than explanations of those solutions. After all, explanation is not always required for identifying practical solutions. Lewontin (1997) makes this point about the Human Genome Project, a coordinated multibillion-dollar research program to map the human genome that it is hoped eventually will clarify the genetic causes of diseases. Lewontin is skeptical about aspects of this search:

What is involved here is the difference between explanation and intervention. Many disorders can be explained by the failure of the organism to make a normal protein, a failure that is the consequence of a gene mutation. But intervention requires that the normal protein be provided at the right place in the right cells, at the right time and in the right amount, or else that an alternative way be found to provide normal cellular function. What is worse, it might even be necessary to keep the abnormal protein away from the cells at critical moments. None of these objectives is served by knowing the DNA sequence of the defective gene. (Lewontin, 1997, p. 29)

Practical applications are not immediately revealed by theoretical advance. Instead, to reveal them may take decades of follow-up work, including tests of simple descriptive causal relationships. The same point is illustrated by the cancer drug Endostatin, discussed earlier. Scientists knew the action of the drug occurred through cutting off tumor blood supplies; but to successfully use the drug to treat cancers in mice required administering it at the right place, angle, and depth, and those details were not part of the usual scientific explanation of the drug's effects.

In the end, then, causal descriptions and causal explanations are in delicate balance in experiments. What experiments do best is to improve causal descriptions; they do less well at explaining causal relationships. But most experiments can be designed to provide better explanations than is typically the case today. Further, in focusing on causal descriptions, experiments often investigate molar events that may be less strongly related to outcomes than are more molecular mediating processes, especially those processes that are closer to the outcome in the explanatory chain. However, many causal descriptions are still dependable and strong enough to be useful, to be worth making the building blocks around which important policies and theories are created. Just consider the dependability of such causal statements as that school desegregation causes white flight, or that outgroup threat causes ingroup cohesion, or that psychotherapy improves mental health, or that diet reduces the retardation due to PKU. Such dependable causal relationships are useful to policymakers, practitioners, and scientists.

MODERN DESCRIPTIONS OF EXPERIMENTS

Some of the terms used in describing modern experimentation (see Table 1.1) are unique, clearly defined, and consistently used; others are blurred and inconsistently used. The common attribute in all experiments is control of treatment (though control can take many different forms). So Mosteller (1990, p. 225) writes, "In an experiment the investigator controls the application of the treatment"; and Yaremko, Harari, Harrison, and Lynn (1986, p. 72) write, "one or more independent variables are manipulated to observe their effects on one or more dependent variables." However, over time many different experimental subtypes have developed in response to the needs and histories of different sciences (Winston, 1990; Winston & Blais).

TABLE 1.1 The Vocabulary of Experiments

Experiment: A study in which an intervention is deliberately introduced to observe its effects.

Randomized Experiment: An experiment in which units are assigned to receive the treatment or an alternative condition by a random process such as the toss of a coin or a table of random numbers.

Quasi-Experiment: An experiment in which units are not assigned to conditions randomly.

Natural Experiment: Not really an experiment because the cause usually cannot be manipulated; a study that contrasts a naturally occurring event such as an earthquake with a comparison condition.

Correlational Study: Usually synonymous with nonexperimental or observational study; a study that simply observes the size and direction of a relationship among variables.

Randomized Experiment

The most clearly described variant is the randomized experiment, widely credited to Sir Ronald Fisher (1925, 1926). It was first used in agriculture but later spread to other topic areas because it promised control over extraneous sources of variation without requiring the physical isolation of the laboratory. Its distinguishing feature is clear and important: the various treatments being contrasted (including no treatment at all) are assigned to experimental units [5] by chance, for example, by coin toss or use of a table of random numbers. If implemented correctly, random assignment creates two or more groups of units that are probabilistically similar to each other on the average. [6] Hence, any outcome differences that are observed between those groups at the end of a study are likely to be due to treatment, not to differences between the groups that already existed at the start of the study. Further, when certain assumptions are met, the randomized experiment yields an estimate of the size of a treatment effect that has desirable statistical properties, along with estimates of the probability that the true effect falls within a defined confidence interval. These features of experiments are so highly prized that in a research area such as medicine the randomized experiment is often referred to as the gold standard for treatment outcome research. [7]

Closely related to the randomized experiment is a more ambiguous and inconsistently used term, true experiment. Some authors use it synonymously with randomized experiment (Rosenthal & Rosnow, 1991). Others use it more generally to refer to any study in which an independent variable is deliberately manipulated (Yaremko et al., 1986) and a dependent variable is assessed. We shall not use the term at all given its ambiguity and given that the modifier true seems to imply restricted claims to a single correct experimental method.
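
The claim that random assignment produces groups that are "probabilistically similar to each other on the average" can be illustrated with a small simulation. The sketch below is not from the original text: the baseline score, the noise levels, the sample size, and the true effect of 5 points are all invented for illustration, and the function name is hypothetical. It assigns each unit by a simulated coin toss, then reports how far apart the two groups are on a pre-existing characteristic and what the simple difference in mean outcomes estimates.

```python
import random
import statistics


def simulate_randomized_experiment(n_units=2000, true_effect=5.0, seed=0):
    """Assign units to conditions by a simulated coin toss and compare the groups.

    All numeric settings here are illustrative assumptions, not values
    taken from the text.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_units):
        # Hypothetical pre-existing characteristic of the unit (e.g., a pretest score).
        baseline = rng.gauss(50.0, 10.0)
        # The coin toss, not the unit or an administrator, decides the condition.
        in_treatment = rng.random() < 0.5
        outcome = baseline + (true_effect if in_treatment else 0.0) + rng.gauss(0.0, 5.0)
        (treated if in_treatment else control).append((baseline, outcome))

    # Difference between groups on the characteristic that existed before treatment.
    baseline_gap = (statistics.mean(b for b, _ in treated)
                    - statistics.mean(b for b, _ in control))
    # Difference in mean outcomes, the usual estimate of the treatment effect.
    estimated_effect = (statistics.mean(y for _, y in treated)
                        - statistics.mean(y for _, y in control))
    return baseline_gap, estimated_effect


if __name__ == "__main__":
    gap, effect = simulate_randomized_experiment()
    print(f"baseline difference between groups: {gap:.2f}")
    print(f"estimated treatment effect: {effect:.2f}")
```

With a sample of this size the baseline difference should be small relative to the spread of the baseline scores, and the estimated effect should fall near the value built into the simulation. Both will still vary from one random assignment to the next, which is why the similarity is probabilistic rather than exact.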

Quasi-Experiment

Much of this book focuses on a class of designs that Campbell and Stanley (1963) popularized as quasi-experiments. [8] Quasi-experiments share with all other experiments a similar purpose (to test descriptive causal hypotheses about manipulable causes) as well as many structural details, such as the frequent presence of control groups and pretest measures, to support a counterfactual inference about what would have happened in the absence of treatment. But, by definition, quasi-experiments lack random assignment. Assignment to conditions is by means of self-selection, by which units choose treatment for themselves, or by means of administrator selection, by which teachers, bureaucrats, legislators, therapists, physicians, or others decide which persons should get which treatment. However, researchers who use quasi-experiments may still have considerable control over selecting and scheduling measures, over how nonrandom assignment is executed, over the kinds of comparison groups with which treatment groups are compared, and over some aspects of how treatment is scheduled. As Campbell and Stanley note:

There are many natural social settings in which the research person can introduce something like experimental design into his scheduling of data collection procedures (e.g., the when and to whom of measurement), even though he lacks the full control over the scheduling of experimental stimuli (the when and to whom of exposure and the ability to randomize exposures) which makes a true experiment possible. Collectively, such situations can be regarded as quasi-experimental designs. (Campbell & Stanley, 1963, p. 34)

5. Units can be people, animals, time periods, institutions, or almost anything else. Typically in field experimentation they are people or some aggregate of people, such as classrooms or work sites. In addition, a little thought shows that random assignment of units to treatments is the same as assignment of treatments to units, so these phrases are frequently used interchangeably.

6. The word probabilistically is crucial, as is explained in more detail in a later chapter.

7. Although the term randomized experiment is used this way consistently across many fields and in this book, statisticians sometimes use the closely related term random experiment in a different way to indicate experiments for which the outcome cannot be predicted with certainty (e.g., Hogg & Tanis, 1988).

8. Campbell (1957) first called these compromise designs but changed terminology very quickly; Rosenbaum (1995a) and Cochran (1965) refer to these as observational studies, a term we avoid because many people use it to refer to correlational or nonexperimental studies, as well. Greenberg and Shroder (1997) use quasi-experiment to refer to studies that randomly assign groups (e.g., communities) to conditions, but we would consider these group-randomized experiments (Murray, 1998).

In quasi-experiments, the cause is manipulable and occurs before the effect is measured. However, quasi-experimental design features usually create less compelling support for counterfactual inferences. For example, quasi-experimental control groups may differ from the treatment condition in many systematic (nonrandom) ways other than the presence of the treatment. Many of these ways could be alternative explanations for the observed effect, and so researchers have to worry about ruling them out in order to get a more valid estimate of the treatment effect. By contrast, with random assignment the researcher does not have to think as much about all these alternative explanations. If correctly done, random assignment makes most of the alternatives less likely as causes of the observed treatment effect at the start of the study.

In quasi-experiments, the researcher has to enumerate alternative explanations one by one, decide which are plausible, and then use logic, design, and measurement to assess whether each one is operating in a way that might explain any observed effect. The difficulties are that these alternative explanations are never completely enumerable in advance, that some of them are particular to the context being studied, and that the methods needed to eliminate them from contention will vary from alternative to alternative and from study to study.

For example, suppose two nonrandomly formed groups of children are studied, a volunteer treatment group that gets a new reading program and a control group of nonvolunteers who do not get it. If the treatment group does better, is it because of treatment or because the cognitive development of the volunteers was increasing more rapidly even before treatment began? (In a randomized experiment, maturation rates would have been probabilistically equal in both groups.) To assess this alternative, the researcher might add multiple pretests to reveal maturational trend before the treatment, and then compare that trend with the trend after treatment.
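
The logic of adding multiple pretests can be made concrete with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the group means are invented, the function names are hypothetical, and projecting the pretest trend forward as a straight line is one simple way to "compare that trend with the trend after treatment", not a method prescribed by the text.

```python
def pretest_slope(pretests):
    """Average change per measurement wave across the pretest scores."""
    steps = [later - earlier for earlier, later in zip(pretests, pretests[1:])]
    return sum(steps) / len(steps)


def departure_from_trend(pretests, posttest):
    """Posttest minus the value projected by extending the pretest trend one wave."""
    projected = pretests[-1] + pretest_slope(pretests)
    return posttest - projected


# Hypothetical group means on a reading test, invented for illustration only.
volunteers_pre = [40.0, 44.0, 48.0]     # volunteers were already gaining 4 points per wave
nonvolunteers_pre = [40.0, 42.0, 44.0]  # nonvolunteers were gaining 2 points per wave
volunteers_post = 52.0
nonvolunteers_post = 46.0

# With only the last pretest and the posttest, volunteers appear to gain 2 points more.
naive_gain_gap = ((volunteers_post - volunteers_pre[-1])
                  - (nonvolunteers_post - nonvolunteers_pre[-1]))

# With the pretest trend taken into account, neither group departs from its own trajectory.
volunteer_departure = departure_from_trend(volunteers_pre, volunteers_post)
nonvolunteer_departure = departure_from_trend(nonvolunteers_pre, nonvolunteers_post)

print(naive_gain_gap)          # 2.0
print(volunteer_departure)     # 0.0
print(nonvolunteer_departure)  # 0.0
```

In these made-up numbers the apparent 2-point advantage of the volunteers is fully accounted for by their faster maturation before treatment began, which is exactly the alternative explanation the extra pretests are meant to expose.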

Another alternative explanation might be that the nonrandom control group included more disadvantaged children who had less access to books in their homes or who had parents who read to them less often. (In a randomized experiment, both groups would have had similar proportions of such children.) To assess this alternative, the experimenter may measure the number of books at home, parental time spent reading to children, and perhaps trips to libraries. Then the researcher would see if these variables differed across treatment and control groups in the hypothesized direction that could explain the observed treatment effect. Obviously, as the number of plausible alternative explanations increases, the design of the quasi-experiment becomes more intellectually demanding and complex, especially because we are never certain we have identified all the alternative explanations. The efforts of the quasi-experimenter start to look like attempts to bandage a wound that would have been less severe if random assignment had been used.

The ruling out of alternative hypotheses is closely related to a falsificationist logic popularized by Popper (1959). Popper noted how hard it is to be sure that a general conclusion (e.g., all swans are white) is correct based on a limited set of observations (e.g., all the swans I've seen were white). After all, future observations may change (e.g., some day I may see a black swan). So confirmation is logically difficult. By contrast, observing a disconfirming instance (e.g., a black swan) is sufficient, in Popper's view, to falsify the general conclusion that all swans are white. Accordingly, Popper urged scientists to try deliberately to falsify the conclusions they wish to draw rather than only to seek information corroborating them. Conclusions that withstand falsification are retained in scientific books or journals and treated as plausible until better evidence comes along. Quasi-experimentation is falsificationist in that it requires experimenters to identify a causal claim and then to generate and examine plausible alternative explanations that might falsify the claim.

However, such falsification can never be as definitive as Popper hoped. Kuhn (1962) pointed out that falsification depends on two assumptions that can never be fully tested. The first is that the causal claim is perfectly specified. But that is never the case. So many features of both the claim and the test of the claim are debatable: for example, which outcome is of interest, how it is measured, the conditions of treatment, who needs treatment, and all the many other decisions that researchers must make in testing causal relationships. As a result, disconfirmation often leads theorists to respecify part of their causal theories. For example, they might now specify novel conditions that must hold for their theory to be true and that were derived from the apparently disconfirming observations. Second, falsification requires measures that are perfectly valid reflections of the theory being tested. However, most philosophers maintain that all observation is theory-laden. It is laden both with intellectual nuances specific to the partially unique scientific understandings of the theory held by the individual or group devising the test and also with the experimenters' extrascientific wishes, hopes, aspirations, and broadly shared cultural assumptions and understandings. If measures are not independent of theories, how can they provide independent theory tests, including tests of causal theories? If the possibility of theory-neutral observations is denied, with them disappears the possibility of definitive knowledge both of what seems to confirm a causal claim and of what seems to disconfirm it.

Nonetheless, a fallibilist version of falsification is possible. It argues that studies of causal hypotheses can still usefully improve understanding of general trends despite ignorance of all the contingencies that might pertain to those trends. It argues that causal studies are useful even if we have to respecify the initial hypothesis repeatedly to accommodate new contingencies and new understandings. After all, those respecifications are usually minor in scope; they rarely involve wholesale overthrowing of general trends in favor of completely opposite trends.

Fallibilist falsification also assumes that theory-neutral observation is impossible but that observations can approach a more factlike status when they have been repeatedly made across different theoretical conceptions of a construct, across multiple kinds of measurements, and at multiple times. It also assumes that observations are imbued with multiple theories, not just one, and that different operational procedures do not share the same multiple theories. As a result, observations that repeatedly occur despite different theories being built into them have a special factlike status even if they can never be fully justified as completely theory-neutral facts. In summary, then, fallible falsification is more than just seeing whether observations disconfirm a prediction. It involves discovering and judging the worth of ancillary assumptions about the restricted specificity of the causal hypothesis under test and also about the heterogeneity of theories, viewpoints, settings, and times built into the measures of the cause and effect and of any contingencies modifying their relationship.

It is neither feasible nor desirable to rule out all possible alternative interpretations of a causal relationship. Instead, only plausible alternatives constitute the major focus. This serves partly to keep matters tractable because the number of possible alternatives is endless. It also recognizes that many alternatives have no serious empirical or experiential support and so do not warrant special attention. However, the lack of support can sometimes be deceiving. For example, the cause of stomach ulcers was long thought to be a combination of lifestyle (e.g., stress) and excess acid production. Few scientists seriously thought that ulcers were caused by a pathogen (e.g., virus, germ, bacteria) because it was assumed that an acid-filled stomach would destroy all living organisms. However, in 1982 Australian researchers Barry Marshall and Robin Warren discovered spiral-shaped bacteria, later named Helicobacter pylori (H. pylori), in ulcer patients' stomachs. With this discovery, the previously possible but implausible became plausible. By 1994, a U.S. National Institutes of Health Consensus Development Conference concluded that H. pylori was the major cause of most peptic ulcers. So labeling rival hypotheses as plausible depends not just on what is logically possible but on social consensus, experience, and empirical data.

Because such factors are often context specific, different substantive areas develop their own lore about which alternatives are important enough to need to be controlled, even developing their own methods for doing so. In early psychology, for example, a control group with pretest observations was invented to control for the plausible alternative explanation that, by giving practice in answering test content, pretests would produce gains in performance even in the absence of a treatment effect (Coover & Angell, 1907). Thus the focus on plausibility is a two-edged sword: it reduces the range of alternatives to be considered in quasi-experimental work, yet it also leaves the resulting causal inference vulnerable to the discovery that an implausible-seeming alternative may later emerge as a likely causal agent.

Natural Experiment

The term natural experiment describes a naturally-occurring contrast between a treatment and a comparison condition (Fagan, 1990; Meyer, 1995; Zeisel). Often the treatments are not even potentially manipulable, as when researchers retrospectively examined whether earthquakes in California caused drops in property values (Brunette, 1995; Murdoch, Singh, & Thayer, 1993). Yet plausible causal inferences about the effects of earthquakes are easy to construct and defend. After all, the earthquakes occurred before the observations on property values, and it is easy to see whether earthquakes are related to property values. A useful source of counterfactual inference can be constructed by examining property values in the same locale before the earthquake or by studying similar locales that did not experience an earthquake during the same time. If property values dropped right after the earthquake in the earthquake condition but not in the comparison condition, it is difficult to find an alternative explanation for that drop.

Natural experiments have recently gained a high profile in economics. Prior to the 1990s economists had great faith in their ability to produce valid causal inferences through statistical adjustments for initial nonequivalence between treatment and control groups. But two studies on the effects of job training programs showed that those adjustments produced estimates that were not close to those generated from a randomized experiment and were unstable across tests of the model's sensitivity (Fraker & Maynard, 1987; LaLonde, 1986). Hence, in their search for alternative methods, many economists came to do natural experiments, such as the economic study of the effects that occurred in the Miami job market when many prisoners were released from Cuban jails and allowed to come to the United States (Card, 1990). They assume that the release of prisoners (or the timing of an earthquake) is independent of the ongoing processes that usually affect unemployment rates (or housing values). Later we explore the validity of this assumption; of its desirability there can be little question.
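
The two sources of counterfactual inference described for the earthquake example, the same locale before the event and a similar locale without the event, can be combined in a single comparison. The sketch below uses what is commonly called a difference-in-differences calculation; that label and the function name are not used in the text, and the property values are invented purely for illustration.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              comparison_before, comparison_after):
    """Change in the treated locale minus the change in the comparison locale."""
    treated_change = mean(treated_after) - mean(treated_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return treated_change - comparison_change


# Hypothetical property values (in thousands of dollars), invented for illustration.
quake_before = [300.0, 320.0, 310.0]
quake_after = [270.0, 290.0, 280.0]
similar_before = [305.0, 315.0, 310.0]
similar_after = [308.0, 318.0, 313.0]

estimate = difference_in_differences(quake_before, quake_after,
                                     similar_before, similar_after)
print(estimate)  # -33.0
```

The calculation is only as credible as the assumption noted above: that the timing of the event is independent of the ongoing processes that would otherwise have moved property values in the two locales differently.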

Nonexperimental Designs

The terms correlational design, passive observational design, and nonexperimental design refer to situations in which a presumed cause and effect are identified and measured but in which other structural features of experiments are missing. Random assignment is not part of the design, nor are such design elements as pretests and control groups from which researchers might construct a useful counterfactual inference. Instead, reliance is placed on measuring alternative explanations individually and then statistically controlling for them. In cross-sectional studies in which all the data are gathered on the respondents at one time, the researcher may not even know if the cause precedes the effect. When these studies are used for causal purposes, the missing design features can be problematic unless much is already known about which alternative interpretations are plausible, unless those that are plausible can be validly measured, and unless the substantive model used for statistical adjustment is well-specified. These are difficult conditions to meet in the real world of research practice, and therefore many commentators doubt the potential of such designs to support strong causal inferences in most cases.

EXPERIMENTS AND THE GENERALIZATION OF CAUSAL CONNECTIONS

The strength of experimentation is its ability to illuminate causal inference. The weakness of experimentation is doubt about the extent to which that causal relationship generalizes. We hope that an innovative feature of this book is its focus on generalization. Here we introduce the general issues that are expanded in later chapters.

Most Experiments Are Highly Local But Have General Aspirations

Most experiments are highly localized and particularistic. They are almost always conducted in a restricted range of settings, often just one, with a particular version of one type of treatment rather than, say, a sample of all possible versions. Usually they have several measures, each with theoretical assumptions that are different from those present in other measures, but far from a complete set of all possible measures. Each experiment nearly always uses a convenient sample of people rather than one that reflects a well-described population; and it will inevitably be conducted at a particular point in time that rapidly becomes history.

Yet readers of experimental results are rarely concerned with what happened in that particular, past, local study. Rather, they usually aim to learn either about theoretical constructs of interest or about a larger policy. Theorists often want to connect experimental results to theories with broad conceptual applicability, which requires generalization at the linguistic level of constructs rather than at the level of the operations used to represent these constructs in a given experiment. They nearly always want to generalize to more people and settings than are represented in a single experiment. Indeed, the value assigned to a substantive theory usually depends on how broad a range of phenomena the theory covers. Similarly, policymakers may be interested in whether a causal relationship would hold (probabilistically) across the many sites at which it would be implemented as a policy, an inference that requires generalization beyond the original experimental study context. Indeed, all human beings probably value the perceptual and cognitive stability that is fostered by generalizations. Otherwise, the world might appear as a buzzing cacophony of isolated instances requiring constant cognitive processing that would overwhelm our limited capacities.

In defining generalization as a problem, we do not assume that more broadly applicable results are always more desirable (Greenwood, 1989). For example, physicists who use particle accelerators to discover new elements may not expect that it would be desirable to introduce such elements into the world. Similarly, social scientists sometimes aim to demonstrate that an effect is possible and to understand its mechanisms without expecting that the effect can be produced more generally. For instance, when a "sleeper effect" occurs in an attitude change study involving persuasive communications, the implication is that change is manifest after a time delay but not immediately so. The circumstances under which this effect occurs turn out to be quite limited and unlikely to be of any general interest other than to show that the theory predicting it (and many other ancillary theories) may not be wrong (Cook, Gruder, Hennigan & Flay). Experiments that demonstrate limited generalization may be just as valuable as those that demonstrate broad generalization.

Nonetheless, a conflict seems to exist between the localized nature of the causal knowledge that individual experiments provide and the more generalized causal goals that research aspires to attain. Cronbach and his colleagues (Cronbach et al., 1980; Cronbach, 1982) have made this argument most forcefully and their works have contributed much to our thinking about causal generalization. Cronbach noted that each experiment consists of units that receive the experiences being contrasted, of the treatments themselves, of observations made on the units, and of the settings in which the study is conducted. Taking the first letter from each of these four words, he defined the acronym utos to refer to the "instances on which data are collected" (Cronbach, 1982, p. 78), that is, to the actual people, treatments, measures, and settings that were sampled in the experiment. He then defined two problems of generalization: (1) generalizing to the "domain about which [the] question is asked" (p. 79), which he called UTOS; and (2) generalizing to "units, treatments, variables, and settings not directly observed" (p. 83), which he called *UTOS. [9]

9. We oversimplify Cronbach's presentation here for pedagogical reasons. For example, Cronbach only used capital S, not small s, so that his system referred only to utoS, not utos. He offered diverse and not always consistent definitions of UTOS and *UTOS, in particular. And he does not use the word generalization in the broad way we do.

Our theory of causal generalization, outlined below and presented in more detail in Chapters 11 through 13, melds Cronbach's thinking with our own ideas about generalization from previous works (Cook, 1990, 1991; Cook & Campbell, 1979), creating a theory that is different in modest ways from both of these predecessors. Our theory is influenced by Cronbach's work in two ways. First, we follow him by describing experiments consistently throughout this book as consisting of the elements of units, treatments, observations, and settings, [10] though we frequently substitute persons for units given that most field experimentation is conducted with humans as participants. We also often substitute outcome for observations given the centrality of observations about outcome when examining causal relationships. Second, we acknowledge that researchers are often interested in two kinds of generalization about each of these five elements, and that these two types are inspired by, but not identical to, the two kinds of generalization that Cronbach defined. We call these construct validity generalizations (inferences about the constructs that research operations represent) and external validity generalizations (inferences about whether the causal relationship holds over variation in persons, settings, treatment, and measurement variables).

Construct Validity: Causal Generalization as Representation

The first causal generalization problem concerns how to go from the particular units, treatments, observations, and settings on which data are collected to the higher order constructs these instances represent. These constructs are almost always couched in terms that are more abstract than the particular instances sampled in an experiment. The labels may pertain to the individual elements of the experiment (e.g., is the outcome measured by a given test best described as intelligence or as achievement?). Or the labels may pertain to the nature of relationships among elements, including causal relationships, as when cancer treatments are classified as cytotoxic or cytostatic depending on whether they kill tumor cells directly or delay tumor growth by modulating their environment.

Consider a randomized experiment by Fortin and Kirouac (1976). The treatment was a brief educational course administered by several nurses, who gave a tour of their hospital and covered some basic facts about surgery with individuals who were to have elective abdominal or thoracic surgery 15 to 20 days later in a single Montreal hospital. Ten specific outcome measures were used after the surgery, such as an activities of daily living scale and a count of the analgesics used to control pain. Now compare this study with its likely target constructs: whether patient education (the target cause) promotes physical recovery (the target effect) among surgical patients (the target population of units) in hospitals (the target universe of settings). Another example occurs in basic research, in which the question frequently arises as to whether the actual manipulations and measures used in an experiment really tap into the specific cause and effect constructs specified by the theory. One way to dismiss an empirical challenge to a theory is simply to make the case that the data do not really represent the concepts as they are specified in the theory.

10. We occasionally refer to time as a separate feature of experiments, following Campbell (1957) and Cook and Campbell (1979), because time can cut across the other factors independently. Cronbach did not include time in his notational system, instead incorporating time into treatment (e.g., the scheduling of treatment), observations (e.g., when measures are administered), or setting (e.g., the historical context of the experiment).
initial understandingto change
Empirical
resnlts
often
force researchers
to aleads
the reconceptuahzation
of whaithe
domain
under
study
is. Sometimes
about what
has been
studied.
Thus the
planned causalmore restricted
inference
agent
in the
Fortin and
Kirouac
(I976),study-patie,nt
education-might
need
toed as
informational
patient education
if the information
component
ofb! respecifi
the treatment
proved to be
causally
related
to recovery
from surgery
but the
tourto thinklead researchers
sometimes
of the
hospital
did not.
Conversely
data can
than those
withthat are
more
general
and categories
in terms
o?,"rg.,
constructs
program.
Thus
the creative
analyst
of
patient educa-a research
which they
began
of interventions
thattion studies
mlght
surmise
that
the treatment
is a subclass
"perceived
control"
or that
recovery
from surgery
can befunction
by increasing
;'
coping."
Subsequent
readers of the
study cantreated as
a subclas
of
control
is re-claiming
that
perceived
even add their
own
interpietations,
perhaps
construct.
There
is aof the even
more
general
self-efficacy
case
ally
just
a special
intendedthe researcher
sobtie
interplay
over time
among
the original
categories
the study
as it was
actually
conducted,
the study
results,
and
subse-to
represeni,
thinking
i interpretations.
This interplay
can
change
the researcher's
at a
more conceptual
level, as
canwhat the
siudy
particulars
actually
achieved
occur'
the first
problemfromreaders.
But whatever
reconceptualizations
feedback
from a sam-How can
we
generalize
is always
the same:
of causal
generaltzation
with
them to the
particular tar-and
the data
patterns associated
ple of instances
get constructs
they represent?Extrapolationas
Generalization
Causal
Validity:
External
The second problem of generalization is to infer whether a causal relationship holds over variations in persons, settings, treatments, and outcomes. For example, someone reading the results of an experiment on the effects of a kindergarten Head Start program on the subsequent grammar school reading test scores of poor African American children in Memphis during the 1980s may want to know if a program with partially overlapping cognitive and social development goals would be as effective in improving the mathematics test scores of poor Hispanic children in Dallas if this program were to be implemented.

This example again reminds us that generalization is not a synonym for broader application. Here, generalization is from one city to another city and from one kind of clientele to another kind, but there is no presumption that Dallas is somehow broader than Memphis or that Hispanic children constitute a broader
population than African American children. Of course, some generalizations are from narrow to broad. For example, a researcher who randomly samples experimental participants from a national population may generalize (probabilistically) from the sample to all the other unstudied members of that same population. Indeed, that is the rationale for choosing random selection in the first place. Similarly, when policymakers consider whether Head Start should be continued on a national basis, they are not so interested in what happened in Memphis. They are more interested in what would happen on the average across the United States, as its many local programs still differ from each other despite efforts in the 1990s to standardize much of what happens to Head Start children and parents. But generalization can also go from the broad to the narrow. Cronbach (1982) gives the example of an experiment that studied differences between the performances of groups of students attending private and public schools. In this case, the concern of individual parents is to know which type of school is better for their particular child, not for the whole group. Whether from narrow to broad, broad to narrow, or across units at about the same level of aggregation, all these examples of external validity questions share the same need: to infer the extent to which the effect holds over variations in persons, settings, treatments, or outcomes.

Approaches to Making Causal Generalizations

Whichever way the causal generalization issue
is framed, experiments do not seem at first glance to be very useful. Almost invariably, a given experiment uses a limited set of operations to represent units, treatments, outcomes, and settings. This high degree of localization is not unique to the experiment; it also characterizes case studies, performance monitoring systems, and opportunistically administered marketing questionnaires given to, say, a haphazard sample of respondents at local shopping centers (Shadish, 1995b). Even when questionnaires are administered to nationally representative samples, they are ideal for representing that particular population of persons but have little relevance to citizens outside of that nation. Moreover, responses may also vary by the setting in which the interview took place (a doorstep, a living room, or a work site), by the time of day at which it was administered, by how each question was framed, or by the particular race, age, and gender combination of interviewers. But the fact that the experiment is not alone in its vulnerability to generalization issues does not make it any less a problem. So what is it that justifies any belief that an experiment can achieve a better fit between the sampling particulars of a study and more general inferences to constructs or over variations in persons, settings, treatments, and outcomes?
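
The contrast between a haphazard sample and a representative one can be made concrete with a small simulation. The Python sketch below is illustrative only: the two sites, their effect sizes, the sample sizes, and every function name are hypothetical assumptions introduced for the example, not data or procedures from any study cited in this chapter. It builds a simulated population in which the treatment effect differs by site, then runs a randomized experiment on (a) a random sample of the whole population and (b) a haphazard sample drawn from one site only.

```python
import random
import statistics


def make_population(n=10_000, seed=0):
    """Simulated population (hypothetical): units alternate between two
    sites whose true treatment effects differ (2.0 vs. 6.0)."""
    rng = random.Random(seed)
    population = []
    for i in range(n):
        site = "site_a" if i % 2 == 0 else "site_b"
        effect = 2.0 if site == "site_a" else 6.0
        baseline = rng.gauss(50.0, 3.0)  # outcome without treatment
        population.append({"site": site, "baseline": baseline, "effect": effect})
    return population


def run_experiment(units, rng):
    """Randomly ASSIGN the given units to treatment or control and
    return the difference in mean outcomes (the effect estimate)."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    treated, control = shuffled[:half], shuffled[half:]
    y_treated = [u["baseline"] + u["effect"] for u in treated]
    y_control = [u["baseline"] for u in control]
    return statistics.mean(y_treated) - statistics.mean(y_control)


population = make_population()
rng = random.Random(1)

# Average effect over the whole (simulated) population.
true_avg = statistics.mean([u["effect"] for u in population])

# (a) Random SELECTION from the whole population, then random assignment.
random_sample = rng.sample(population, 2000)
est_random = run_experiment(random_sample, rng)

# (b) Haphazard sample: whoever is at hand at one site, then random assignment.
haphazard_sample = [u for u in population if u["site"] == "site_a"][:2000]
est_haphazard = run_experiment(haphazard_sample, rng)

print(f"population average effect: {true_avg:.2f}")
print(f"estimate from random sample: {est_random:.2f}")
print(f"estimate from one-site sample: {est_haphazard:.2f}")
```

Both experiments use random assignment, so each estimate is a fair estimate of the effect among the units actually studied. Under these assumptions, only the randomly selected sample yields an estimate that can be expected to lie near the population average; the one-site estimate describes that site and says little about the other.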
Sampling and Causal Generalization

The method most often recommended for achieving this close fit is the use of formal probability sampling of instances of units, treatments, observations, or settings (Rossi, Wright, & Anderson, 1983). This presupposes that we have clearly delineated populations of each and that we can sample with known probability from within each of these populations. In effect, this entails the random selection of instances, to be carefully distinguished from random assignment discussed earlier in this chapter. Random selection involves selecting cases by chance to represent that population, whereas random assignment involves assigning cases to multiple conditions.

In cause-probing research that is not experimental, random samples of individuals are often used. Large-scale longitudinal surveys such as the Panel Study of Income Dynamics or the National Longitudinal Survey are used to represent the population of the United States, or certain age brackets within it, and measures of causes and effects are then related to each other using time lags in measurement and statistical controls for group nonequivalence. All this is done in hopes of approximating what a randomized experiment achieves. However, cases of random selection from a broad population followed by random assignment from within this population are much rarer (see Chapter 12 for examples). Also rare are studies of random selection followed by a quality quasi-experiment. Such experiments require a high level of resources and a degree of logistical control that is rarely feasible, so many researchers prefer to rely on an implicit set of nonstatistical heuristics for generalization that we hope to make more explicit and systematic in this book.

Random selection occurs even more rarely with treatments, outcomes, and settings than with people. Consider the outcomes observed in an experiment. How often are they randomly sampled? We grant that the domain sampling model of classical test theory (Nunnally & Bernstein, 1994) assumes that the items used to measure a construct have been randomly sampled from a domain of all possible items. However, in actual experimental practice few researchers ever randomly sample items when constructing measures. Nor do they do so when choosing manipulations or settings. For instance, many settings will not agree to be sampled, and some of the settings that agree to be randomly sampled will almost certainly not agree to be randomly assigned to conditions. For treatments, no definitive list of possible treatments usually exists, as is most obvious in areas in which treatments are being discovered and developed rapidly, such as in AIDS research. In general, then, random sampling is always desirable, but it is only rarely and contingently feasible.

However, formal sampling methods are not the only option. Two informal, purposive sampling methods are sometimes useful: purposive sampling of heterogeneous instances and purposive sampling of typical instances. In the former case, the aim is to include instances chosen deliberately to reflect diversity on presumptively important dimensions, even though the sample is not formally random. In the latter case, the aim is to explicate
the kinds of units, treatments, observations, and settings to which one most wants to generalize and then to select at least one instance of each class that is impressionistically similar to the class mode. Although these purposive sampling methods are more practical than formal probability sampling, they are not backed by a statistical logic that justifies formal generalizations. Nonetheless, they are probably the most commonly used of all sampling methods for facilitating generalizations. A task we set ourselves in this book is to explicate such methods and to describe how they can be used more often than is the case today.

However, sampling methods of any kind are insufficient to solve either problem of generalization.
Formal probability sampling requires specifying a target population from which sampling then takes place, but defining such populations is difficult for some targets of generalization such as treatments. Purposive sampling of heterogeneous instances is differentially feasible for different elements in a study; it is often more feasible to make measures diverse than it is to obtain diverse settings, for example. Purposive sampling of typical instances is often feasible when target modes, medians, or means are known, but it leaves questions about generalizations to a wider range than is typical. Besides, as Cronbach points out, most challenges to the causal generalization of an experiment typically emerge after a study is done. In such cases, sampling is relevant only if the instances in the original study were sampled diversely enough to promote responsible reanalyses of the data to see if a treatment effect holds across most or all of the targets about which generalization has been challenged. But packing so many sources of variation into a single experimental study is rarely practical and will almost certainly conflict with other goals of the experiment. Formal sampling methods usually offer only a limited solution to causal generalization problems. A theory of generalized causal inference needs additional tools.

A Grounded Theory of Causal Generalization

Practicing scientists
routinely make causal generalizations in their research, and they almost never use formal probability sampling when they do. In this book, we present a theory of causal generalization that is grounded in the actual practice of science (Matt, Cook, & Shadish, 2000). Although this theory was originally developed from ideas that were grounded in the construct and external validity literatures (Cook, 1990, 1991), we have since found that these ideas are common in a diverse literature about scientific generalizations (e.g., Abelson, 1995; Campbell & Fiske, 1959; Cronbach & Meehl, 1955; Davis, 1994; Locke, 1986; Medin, 1989; Messick, 1989, 1995; Rubins, 1994; Willner, 1991; Wilson, Hayward, Tunis, Bass, & Guyatt, 1995). We provide more details about this grounded theory in Chapters 11 through 13, but in brief it suggests that scientists make causal generalizations in their work by using five closely related principles:

1. Surface Similarity. They assess the apparent similarities between study operations and the prototypical characteristics of the target of generalization.
2. Ruling Out Irrelevancies. They identify those things that are irrelevant because they do not change a generalization.
3. Making Discriminations. They clarify key discriminations that limit generalization.
4. Interpolation and Extrapolation. They make interpolations to unsampled values within the range of the sampled instances and, much more difficult, they explore extrapolations beyond the sampled range.
5. Causal Explanation. They develop and test explanatory theories about the pattern of effects, causes, and mediational processes that are essential to the transfer of a causal relationship.

In this book, we want to show how scientists can and do use these five principles to draw generalized conclusions about a causal connection. Sometimes the conclusion is about the higher order constructs to use in describing an obtained connection at the sample level. In this sense, these five principles have analogues or parallels both in the construct validity literature (e.g., with construct content, with convergent and discriminant validity, and with the need for theoretical rationales for constructs) and in the cognitive science and philosophy literatures that study how people decide whether instances fall into a category (e.g., concerning the roles that prototypical characteristics and surface versus deep similarity play in determining category membership). But at other times, the conclusion about generalization refers to whether a connection holds broadly or narrowly over variations in persons, settings, treatments, or outcomes. Here, too, the principles have analogues or parallels that we can recognize from scientific theory and practice, as in the study of dose-response relationships (a form of interpolation-extrapolation) or the appeal to explanatory mechanisms in generalizing from animals to humans (a form of causal explanation).

Scientists use these five principles almost constantly during all phases of research. For example, when they read a published study and wonder if some variation on the study's particulars would work in their lab, they think about similarities of the published study to what they propose to do. When they conceptualize the new study, they anticipate how the instances they plan to study will match the prototypical features of the constructs about which they are curious. They may design their study on the assumption that certain variations will be irrelevant to it but that others will point to key discriminations over which the causal relationship does not hold or the very character of the constructs changes. They may include measures of key theoretical mechanisms to clarify how the intervention works. During data analysis, they test all these hypotheses and adjust their construct descriptions to match better what the data suggest happened in the study. The introduction section of their articles tries to convince the reader that the study bears on specific constructs, and the discussion sometimes speculates about how results might extrapolate to different units, treatments, outcomes, and settings.

Further, practicing scientists do all this not just with single studies that they read or conduct but also with multiple studies. They nearly always think about how their own studies
fit into a larger literature about both the constructs being measured and the variables that may or may not bound or explain a causal connection, often documenting this fit in the introduction to their study. And they apply all five principles when they conduct reviews of the literature, in which they make inferences about the kinds of generalizations that a body of research can support.

Throughout this book, and especially in Chapters 11 to 13, we provide more details about this grounded theory of causal generalization and about the scientific practices that it suggests. Adopting this grounded theory of generalization does not imply a rejection of formal probability sampling. Indeed, we recommend such sampling unambiguously when it is feasible, along with purposive sampling schemes to aid generalization when formal random selection methods cannot be implemented. But we also show that sampling is just one method that practicing scientists use to make causal generalizations, along with practical logic, application of diverse statistical methods, and use of features of design other than sampling.

EXPERIMENTS AND METASCIENCE

Extensive
philosophical debate sometimes surrounds experimentation. Here we briefly summarize some key features of these debates, and then we discuss some implications of these debates for experimentation. However, there is a sense in which all this philosophical debate is incidental to the practice of experimentation. Experimentation is as old as humanity itself, so it preceded humanity's philosophical efforts to understand causation and generalization by thousands of years. Even over just the past 400 years of scientific experimentation, we can see some constancy of experimental concept and method, whereas diverse philosophical conceptions of the experiment have come and gone. As Hacking (1983) said, "Experimentation has a life of its own" (p. 150). It has been one of science's most powerful methods for discovering descriptive causal relationships, and it has done so well in so many ways that its place in science is probably assured forever. To justify its practice today, a scientist need not resort to sophisticated philosophical reasoning about experimentation.

Nonetheless, it does help scientists to understand these philosophical debates. For example, previous distinctions in this chapter between molar and molecular causation, descriptive and explanatory cause, or probabilistic and deterministic causal inferences all help both philosophers and scientists to understand better both the purpose and the results of experiments (e.g., Bunge, 1959; Eells, 1991; Hart & Honore, 1985; Humphreys, 1989; Mackie, 1974; Salmon, 1984, 1989; Sobel, 1993; P. A. White, 1990). Here we focus on a different and broader set of critiques of science itself, not only from philosophy but also from the history, sociology, and psychology of science (see useful general reviews by Bechtel, 1988; H. I. Brown, 1977; Oldroyd, 1986). Some of these works have been explicitly about the nature of experimentation, seeking to create a justified role for it (e.g.,
Bhaskar, 1975; Campbell, 1982, 1988; Danziger, 1990; S. Drake, 1981; Gergen, 1973; Gholson, Shadish, Neimeyer, & Houts, 1989; Gooding, Pinch, & Schaffer, 1989b; Greenwood, 1989; Hacking, 1983; Latour, 1987; Latour & Woolgar, 1979; Morawski, 1988; Orne, 1962; R. Rosenthal, 1966; Shadish & Fuller, 1994; Shapin, 1994). These critiques help scientists to see some limits of experimentation in both science and society.

The Kuhnian Critique

Kuhn (1962) described scientific revolutions as different and partly incommensurable paradigms that abruptly succeeded each other in time and in which the gradual accumulation of scientific knowledge was a chimera. Hanson (1958), Polanyi (1958), Popper (1959), Toulmin (1961), Feyerabend (1975), and Quine (1951, 1969) contributed to the critical momentum, in part by exposing the gross mistakes in logical positivism's attempt to build a philosophy of science based on reconstructing a successful science such as physics. All these critiques denied any firm foundations for scientific knowledge (so, by extension, experiments do not provide firm causal knowledge). The logical positivists hoped to achieve foundations on which to build knowledge by tying all theory tightly to theory-free observation through predicate logic. But this left out important scientific concepts that could not be tied tightly to observation; and it failed to recognize that all observations are impregnated with substantive and methodological theory, making it impossible to conduct theory-free tests.¹¹

11. However, Holton (1986) reminds us not to overstate the reliance of positivists on empirical data: "Even the father of positivism, Auguste Comte, had written . . . that without a theory of some sort by which to link phenomena to some principles 'it would not only be impossible to combine the isolated observations and draw any useful conclusions, we would not even be able to remember them, and, for the most part, the fact would not be noticed by our eyes'" (p. 32). Similarly, Uebel (1992) provides a more detailed historical analysis of the protocol sentence debate in logical positivism, showing some surprisingly nonstereotypical positions held by key players.

The impossibility of theory-neutral observation (often referred to as the Quine-Duhem thesis) implies that the results of any single test (and so any single experiment) are inevitably ambiguous. They could be disputed, for example, on grounds that the theoretical assumptions built into the outcome measure were wrong or that the study made a faulty assumption about how high a treatment dose was required to be effective. Some of these assumptions are small, easily detected, and correctable, such as when a voltmeter gives the wrong reading because the impedance of the voltage source was much higher than that of the meter (Wilson, 1952). But other assumptions are more paradigmlike, impregnating a theory so completely that other parts of the theory make no sense without them (e.g., the assumption that the earth is the center of the universe in pre-Galilean astronomy). Because the number of assumptions involved in any scientific test is very large, researchers can easily find some assumptions to fault or can even posit new assumptions
(Mitroff & Fitzgerald, 1977). In this way, substantive theories are less testable than their authors originally conceived. How can a theory be tested if it is made of clay rather than granite?

For reasons we clarify later, this critique is more true of single studies and less true of programs of research. But even in the latter case, undetected constant biases can result in flawed inferences about cause and its generalization. As a result, no experiment is ever fully certain, and extrascientific beliefs and preferences always have room to influence the many discretionary judgments involved in all scientific belief.

Modern Social Psychological Critiques

Sociologists
working within traditions variously called social constructivism, epistemological relativism, and the strong program (e.g., Barnes, 1974; Bloor, 1976; Collins, 1981; Knorr-Cetina, 1981; Latour & Woolgar, 1979; Mulkay, 1979) have shown those extrascientific processes at work in science. Their empirical studies show that scientists often fail to adhere to norms commonly proposed as part of good science (e.g., objectivity, neutrality, sharing of information). They have also shown how that which comes to be reported as scientific knowledge is partly determined by social and psychological forces and partly by issues of economic and political power both within science and in the larger society, issues that are rarely mentioned in published research reports. The most extreme among these sociologists attributes all scientific knowledge to such extrascientific processes, claiming that "the natural world has a small or nonexistent role in the construction of scientific knowledge" (Collins, 1981, p. 3).

Collins does not deny ontological realism, that real entities exist in the world. Rather, he denies epistemological (scientific) realism, that whatever external reality may exist can constrain our scientific theories. For example, if atoms really exist, do they affect our scientific theories at all? If our theory postulates an atom, is it describing a real entity that exists roughly as we describe it? Epistemological relativists such as Collins respond negatively to both questions, believing that the most important influences in science are social, psychological, economic, and political, and that these might even be the only influences on scientific theories. This view is not widely endorsed outside a small group of sociologists, but it is a useful counterweight to naive assumptions that scientific studies somehow directly reveal nature to us (an assumption we call naive realism). The results of all studies, including experiments, are profoundly subject to these extrascientific influences, from their conception to reports of their results.

Science and Trust

A standard image of the scientist is as a skeptic, a person who only trusts results that have been personally verified. Indeed, the scientific revolution of the 17th century claimed
that trust, particularly trust in authority and dogma, was antithetical to good science. Every authoritative assertion, every dogma, was to be open to question, and the job of science was to do that questioning.

That image is partly wrong. Any single scientific study is an exercise in trust (Pinch, 1986; Shapin, 1994). Studies trust the vast majority of already developed methods, findings, and concepts that they use when they test a new hypothesis. For example, statistical theories and methods are usually taken on faith rather than personally verified, as are measurement instruments. The ratio of trust to skepticism in any given study is more like 99% trust to 1% skepticism than the opposite. Even in lifelong programs of research, the single scientist trusts much more than he or she ever doubts. Indeed, thoroughgoing skepticism is probably impossible for the individual scientist, to judge from what we know of the psychology of science (Gholson et al., 1989; Shadish & Fuller, 1994). Finally, skepticism is not even an accurate characterization of past scientific revolutions; Shapin (1994) shows that the role of "gentlemanly trust" in 17th-century England was central to the establishment of experimental science. Trust pervades science, despite its rhetoric of skepticism.

offor the equivocality
is a
greater appreciation
criticisms
of these
The net result
naturewindow that
reveals
is not a clear
experiment
The
knowledge.
all scientific
knowl-yield hypothetical and
fallible
experiments
contrary,
To the
to us.
directly
unstated theoret-with
many
and
imbued
on context
dependent
edge that
is often
to thoseare
partly relative
results
experimental
Consequently
ical assumprions.
or con-with new assumptions
and might
well change
and contexts
assumptions
constructivists
epistemological
are
all scientists
In this sense,
texts.
relativistsStrong
they are
strong or
weak
relativists.
is whether
The difference
our s
influence
position that only
extrascientific
Collins's
share
'Weak
world and
the worlds of
ideol-that both
the ontological
relativists believe
of scien-play a role
in the construction
hopes, and wishes
og5
interests, values,
would
probablyourselves,
including
scientists,
Most
practicing
tiiic knowledge.
relativists.l2realists but
weak epistemological
", Lrrtological
themselves
describe
to us, it
is through a very cloudednature
reveal
that experiments
To the extent
(Campbell,
1988).windowpane
As re-needed.
badly
were
of experiments
views
to naive
Such counterweights
probablywas
in science
role of the
experiment
the central
as 30
years ago,
cently
been raisedthat have
of other
philosophical issues
to a host
we could
exrend this
discussion
If space permitred,
1.2.
that the experiment
isversus confirmation,
incorrect assertions
as
its role in discovery
about the experiment,
such
mistakes that arepositivism
or
pragmatism, and the various
as logical
philosophy such
specific
tied to some
1985; Shadish,Campbell,
(e.g.,
ell, 1982,1988;
Cook,
1991; Cook 6<
Campb
i., such discussions
made
frequently
1.995a.
I30
|
1. EXPERTMENTS AND GENERALTZED CAUSAL
INFERENCEtaken more for
granted than is the case today. For example, Campbell and Stanley (1963) described themselves as:

committed to the experiment: as the only means for settling disputes regarding educational practice, as the only way of verifying educational improvements, and as the only way of establishing a cumulative tradition in which improvements can be introduced without the danger of a faddish discard of old wisdom in favor of inferior novelties. (p. 2)

Indeed, Hacking (1983) points out that "'experimental method' used to be just another name for scientific method" (p. 149); and experimentation was then a more fertile ground for examples illustrating basic philosophical issues than it was a source of contention itself.

Not so today. We now understand better that the experiment
is a profoundly human endeavor, affected by all the same human foibles as any other human endeavor, though with well-developed procedures for partial control of some of the limitations that have been identified to date. Some of these limitations are common to all science, of course. For example, scientists tend to notice evidence that confirms their preferred hypotheses and to overlook contradictory evidence. They make routine cognitive errors of judgment and have limited capacity to process large amounts of information. They react to peer pressures to agree with accepted dogma and to social role pressures in their relationships to students, participants, and other scientists. They are partly motivated by sociological and economic rewards for their work (sadly, sometimes to the point of fraud), and they display all-too-human psychological needs and irrationalities about their work. Other limitations have unique relevance to experimentation. For example, if causal results are ambiguous, as in many weaker quasi-experiments, experimenters may attribute causation or causal generalization based on study features that have little to do with orthodox logic or method. They may fail to pursue all the alternative causal explanations because of a lack of energy, a need to achieve closure, or a bias toward accepting evidence that confirms their preferred hypothesis. Each experiment is also a social situation, full of social roles (e.g., participant, experimenter, assistant) and social expectations (e.g., that people should provide true information) but with a uniqueness (e.g., that the experimenter does not always tell the truth) that can lead to problems when social cues are misread or deliberately thwarted by either party. Fortunately, these limits are not insurmountable, as formal training can help overcome some of them (Lehman, Lempert, & Nisbett, 1988). Still, the relationship between scientific results and the world that science studies is neither simple nor fully trustworthy.

These social and psychological analyses have taken some of the luster from the experiment as a centerpiece of science. The experiment may have a life of its own, but it is no longer life on a pedestal. Among scientists, belief in the experiment as the only means to settle disputes about causation is gone, though it is still the preferred method in many circumstances. Gone, too, is the belief that the power experimental methods often displayed in the laboratory would transfer easily to applications in field settings. As a result of highly publicized science-related
events such as the tragic results of the Chernobyl nuclear disaster, the disputes over certainty levels of DNA testing in the O.J. Simpson trials, and the failure to find a cure for most cancers after decades of highly publicized and funded effort, the general public now better understands the limits of science.

Yet we should not take these critiques too far. Those who argue against theory-free tests often seem to suggest that every experiment will come out just as the experimenter wishes. This expectation is totally contrary to the experience of researchers, who find instead that experimentation is often frustrating and disappointing for the theories they loved so much. Laboratory results may not speak for themselves, but they certainly do not speak for one's hopes and wishes. We find much to value in the laboratory scientist's belief in "stubborn facts" with a life span that is greater than the fluctuating theories with which one tries to explain them. Thus many basic results about gravity are the same, whether they are contained within a framework developed by Newton or by Einstein; and no successor theory to Einstein's would be plausible unless it could account for most of the stubborn factlike findings about falling bodies. There may not be pure facts, but some observations are clearly worth treating as if they were facts.

of science-Hanson,
Polanyi, Kuhn, and
Feyerabendthe
role of theory
in science as to make experi-included-have
so exaggerated
that wereexperiments
But exploratory
irrelevant.
seem almost
mental
evidence
tangential todiscoveries
experimental
and unexpected
unguided by
formal theory
scientificthe source of
great
repeatedly been
motivations have
initial research
the
replicable re-dependable,
provided many stubborn,
have
Experiments
advances.
physicists feel that theirExperimental
subject of theory.
the
become
sults that
then
honest,theoretical counterparts
speculative
keep their more
help
data
laboratory
stubbornOf course, these
role in science.
giving experiments an indispensable
presumptions and trust
in many well-facts often involve both
commonsense
in ques-of the science
core of belief
up the shared
theories that
make
established
prove areundependable,
to be
facts sometimes
these stubborn
tion. And of course,
focal the-laden with a dominant
so
artifacts, or
are
as experimental
reinterpreted
But this is not the case withthat theory
is replaced.
disappear once
ory that they
rel-over
dependable
base, which
remains reasonably
great
bulk of the
factual
the
of
periods
atively

A WORLD WITHOUT EXPERIMENTS OR CAUSES?

To borrow a thought experiment from MacIntyre (1981), imagine that the slates of science and philosophy were wiped clean and that we had to construct our understanding of the world anew. As part of that reconstruction, would we reinvent the notion of a manipulable cause? We think so, largely because of the practical utility that dependable manipulanda have for our ability to survive and prosper. Would we reinvent the experiment as a method for investigating such causes? Again yes, because humans will always be trying to better know how well these manipulable causes work. Over time, they will refine how they conduct those experiments and so will again be drawn to problems of counterfactual inference, of cause preceding effect, of alternative explanations, and of all of the other features of causation that we have discussed in this chapter. In the end, we would probably end up with the experiment or something very much like it. This book is one more step in that ongoing process of refining experiments. It is about improving the yield from experiments that take place in complex field settings, both the quality of causal inferences they yield and our ability to generalize these inferences to constructs and over variations in persons, settings, treatments, and outcomes.

A Critical Assessment of Our Assumptions

Assumption (ə-sŭmp′shən): [Middle English assumpcion, from Latin assūmptiō, assūmptiōn-, adoption, from assūmptus, past participle of assūmere, to adopt; see assume.] n. 1. The act of taking to or upon oneself: assumption of an obligation. 2. The act of taking over: assumption of command. 3. The act of taking for granted: assumption of a false theory. 4. Something taken for granted or accepted as true without proof; a supposition: a valid assumption. 5. Presumption; arrogance.

This book covers five central topics across its 13 chapters. The first topic (Chapter 1) deals with our general understanding of descriptive causation and experimentation. The second (Chapters 2 and 3) deals with the types of validity and the specific validity threats associated with this understanding. The third (Chapters 4 through 7) deals with quasi-experiments and illustrates how combining design features can facilitate better causal inference. The fourth (Chapters 8 through 10) concerns randomized experiments and stresses the factors that impede and promote their implementation. The fifth (Chapters 11 through 13) deals with causal generalization, both theoretically and as concerns the conduct of individual studies and programs of research. The purpose of this last chapter is to critically assess some of the assumptions that have gone into these five topics, especially the assumptions that critics have found objectionable or that we anticipate they will find objectionable.
We organize the discussion around each of the five topics and then briefly justify why we did not deal more extensively with nonexperimental methods for assessing causation.

We do not delude ourselves that we can be the best explicators of our own assumptions. Our critics can do that task better. But we want to be as comprehensive and as explicit as we can. This is in part because we are convinced of the advantages of falsification as a major component of any epistemology for the social sciences, and forcing out one's assumptions and confronting them is one part of falsification. But it is also because we would like to stimulate critical debate about these assumptions so that we can learn from those who would challenge our thinking. If there were to be a future book that carried the tradition emanating from Campbell and Stanley via Cook and Campbell to this book even further forward, then that future book would probably be all the better for building upon all the justified criticisms coming from those who do not agree with us, either on particulars or on the whole approach we have taken to the analysis of descriptive causation and its generalization. We would like this chapter not only to model the attempt to be critical about the assumptions all scholars must inevitably make but also to encourage others to think about these assumptions and how they might be addressed in future empirical or theoretical work.

EXPERIMENTATION AND CAUSATION

Causal Arrows and Pretzels

Experiments test the influence of one or at most a small subset of descriptive causes. If statistical interactions are involved, they tend to be among very few treatments or between a single treatment and a limited set of moderator variables. Many researchers believe that the causal knowledge that results from this typical experimental structure fails to map the many causal forces that simultaneously affect any given outcome in complex and nonlinear ways (e.g., Cronbach et al., 1980; Magnusson, 2000). These critics assert that experiments prioritize on arrows connecting A to B when they should instead seek to describe an explanatory pretzel or set of intersecting pretzels, as it were. They also believe that most causal relationships vary across units, settings, and times, and so they doubt whether there are any constant bivariate causal relationships (e.g., Cronbach & Snow, 1977). Those that do appear to be dependable in the data may simply reflect statistically underpowered tests of moderators or mediators that failed to reveal the true underlying complex causal relationships. True variation in effect sizes might also be obscured because the relevant substantive theory is underspecified, or the outcome measures are partially invalid, or the treatment contrast is attenuated, or causally implicated variables are truncated in how they are sampled (McClelland & Judd, 1993).

As valid as these objections are, they do not invalidate the case for experiments. The purpose of experiments is not to completely explain some phenomenon; it is to identify whether a particular variable or small set of variables makes a marginal difference in some outcome over and above all the other forces affecting that outcome. Moreover, ontological doubts such as the preceding have not stopped believers in more complex causal theories from acting as though many causal relationships can be usefully characterized as dependable main effects or as very simple nonlinearities that are also dependable enough to be useful. In this connection, consider some examples from education in the United States, where objections to experimentation are probably the most prevalent and virulent. Few educational researchers seem to object to the following substantive conclusions of the form that A dependably causes B: small schools are better than large ones; time-on-task raises achievement; summer school raises test scores; school desegregation hardly affects achievement but does increase White flight; and assigning and grading homework raises achievement. The critics also do not seem to object to other conclusions involving very simple causal contingencies: reducing class size increases achievement, but only if the amount of change is "sizable" and to a level under 20; or Catholic schools are superior to public ones, but only in the inner city and not in the suburbs and then most noticeably in graduation rates rather than in achievement test scores.

The primary justification for such oversimplifications, and for the use of the experiments that test them, is that some moderators of effects are of minor relevance to policy and theory even if they marginally improve explanation. The most important contingencies are usually those that modify the sign of a causal relationship rather than its magnitude. Sign changes imply that a treatment is beneficial in some circumstances but might be harmful in others. This is quite different from identifying circumstances that influence just how positive an effect might be. Policy-makers are often willing to advocate an overall change, even if they suspect it has different-sized positive effects for different groups, as long as the effects are rarely negative. But if some groups will be positively affected and others negatively, political actors are loath to prescribe different treatments for different groups because rivalries and jealousies often ensue. Theoreticians also probably pay more attention to causal relationships that differ in causal sign because this result implies that one can identify the boundary conditions that impel such a disparate data pattern.

Of course, we do not advocate ignoring all causal contingencies. For example, physicians routinely prescribe one of several possible interventions for a given diagnosis. The exact choice may depend on the diagnosis, test results, patient preferences, insurance resources, and the availability of treatments in the patient's area. However, the costs of such a contingent system are high. In part to limit the number of relevant contingencies, physicians specialize, and within their own specialty they undergo extensive training to enable them to make these contingent decisions. Even then, substantial judgment is still required to cover the many situations in which causal contingencies are ambiguous or in dispute. In many other policy domains it would also be costly to implement the financial, management, and cultural changes that a truly contingent system would require even if the requisite knowledge were available. Taking such a contingent approach to its logical extremes would entail in education, for example, that individual tutoring become the order of the day. Students and instructors would have to be carefully matched for overlap in teaching and learning skills and in the curriculum supports they would need.

Within limits, some moderators can be studied experimentally, either by measuring the moderator so it can be tested during analysis or by deliberately