- ...history.
- What is "ecological" about the aggregate
data from which individual behavior is to be inferred? The name has
been used at least since the late 1800s and stems from the word ecology,
the science of the interrelationship of living things and their
environments. Statistical measures taken at the level of the
environment, such as summaries of geographic areas or other
aggregate units, are widely known as ecological data. Ecological
inference is the process of using ecological data to learn about the
behavior of individuals within these aggregates.
- ...module.
- Gauss
is available from Aptech Systems, Inc.; 23804 S.E. Kent-Kangley
Road; Maple Valley, Washington 98038; (206) 432-7855;
sales@aptech.com.
- ...(p. 414).
- In 1919, the possibility of what
has since come to be known as the "gender gap" was a central issue
for academics and a nontrivial concern for political leaders seeking
reelection: Not only were women about to have the vote for the first
time nationwide; because women made up slightly over fifty percent
of the population, they were about to have most of the
votes.
- ...women.
- That is, given these aggregate
numbers, a minimum of 0% of females in precinct 1 and 20% in
precinct 2 (for an average of 10%) could have opposed the
referendum, whereas a maximum of 40% of males in each precinct could
have opposed it. Chapter
provides easy graphical
methods of making calculations like these.
- ...earnest.
- Other early works
that recognized the ecological inference problem include Allport
(1924), Bernstein (1932), Gehlke and Biehl (1934), Thorndike (1939),
Deming and Stephan (1940), and Yule and Kendall (1950). Robinson
(1950) cited several of these studies as well as Ogburn and Goltra.
Scholars writing even earlier than Ogburn and Goltra (1919) made
ecological inferences, even though they did not recognize the
problems with doing so. In fact, even the works usually cited as
the first statistical works of any kind, which incidentally
concerned political topics, included ecological inferences (see
Graunt, 1662, and Petty, 1690, 1691). See Achen and Shively (1995)
for other details of the history of ecological inference research.
- ...times.
- This is a vast underestimate, as it depends on data
from the Social Science Citation Index, which did not even
begin publishing (or counting) until six years after Robinson's
article appeared.
- ...level.
- There are even several
largely independent lines of research that give conditions under
which aggregate data are no worse than individual-level data for
certain purposes. In political science, see Kramer (1983); in
epidemiology, see Morgenstern (1982); in psychology, see Epstein
(1986); in economics, see Grunfeld and Griliches (1960), Fromm and
Schink (1973), Aigner and Goldfeld (1974), and Shin (1987); and in
input-output analysis, a field within economics, see Malinvaud
(1955) and Venezia (1978).
- ...voters.
- In
this book, I use "African American" and "black" interchangeably and, when
appropriate or for expository simplicity, often define "white" as non-black or
occasionally as a residual category such as non-black and non-Hispanic.
- ...white.
- In some states, precincts must be aggregated to a somewhat
higher geographical level to match electoral and census data.
- ...example.)
- The litigation
based on the Voting Rights Act is vast; see Grofman, Handley, and Niemi (1992)
for a review.
- ...1994).
- Most epidemiological questions require relatively certain
answers and thus, in most cases, large-scale, randomized experiments on
individuals. Because each such experiment can cost hundreds of millions of
dollars, a valid method of ecological inference would probably be of primary
use in this field for helping scholars (and funding agencies) choose which
experiments to conduct.
- ...level.
- I had a small role in this case as a consultant to the
state of Ohio and therefore witnessed the following story firsthand.
My primary task in the case was to evaluate the relative fairness of
the state's redistricting plan to the political parties, using
methods developed in King and Browning (1987), King (1989b), and
Gelman and King (1990, 1994a, b).
- ...bounds.
- That is, although the row total is 55,054, the
total number of people in the upper left cell of Table
cannot exceed 19,896, or it would contradict its column marginal.
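
The reasoning in this note can be sketched in a few lines of Python. This is only an illustration of the general deterministic bounds on a cell of a two-way table, not code from the book: the function name `cell_bounds` is invented here, and the grand total of 80,000 is a hypothetical placeholder, since the note gives only the row total (55,054) and the column marginal (19,896).

```python
def cell_bounds(row_total, col_total, grand_total):
    """Return (lower, upper) bounds on a cell count given its marginals.

    The cell cannot exceed either of its marginals, and it cannot be so
    small that the rest of its row would overflow the remaining columns.
    """
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


row_total = 55_054    # row total quoted in the note
col_total = 19_896    # column marginal quoted in the note
grand_total = 80_000  # hypothetical table total, for illustration only

lower, upper = cell_bounds(row_total, col_total, grand_total)
print(f"cell count lies between {lower:,} and {upper:,}")
```

The upper bound depends only on the two marginals, so it matches the 19,896 ceiling described above whatever the table total is; the lower bound does depend on the assumed total.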
- ...answer.
- This estimate of the
number of times authors in the ecological inference literature have
made themselves vulnerable to being wrong is based on counting data
sets original to this literature. Individual cross-tabulations that
were used to study the method of bounds are excluded since no
uncertainty, and thus no vulnerability, exists. I obviously also
exclude studies that use data sets previously introduced to this
literature. The data sets and the studies in which they were
first used are as follows: race and illiteracy from the 1930 U.S.
Census (Robinson, 1950); race by domestic service from community
area data (Goodman, 1959; used originally to study bounds by Duncan
and Davis, 1953); infant mortality by race and by urbanicity in
U.S. states (Duncan et al., 1961: 71-72); 1964-1966 voter
transitions in British constituencies (Hawkes, 1969); a voter
transition between Democratic primaries in Florida (Irwin and
Meeter, 1969); a 1961 German survey (Stokes, 1969); voter transition
in England from Butler and Stokes (1969) data (Miller, 1972); survey
of first-year university students (Hannan and Burstein, 1974); vote
for Labour by worker category (Crewe and Payne, 1976); voter
transition in England compared to a poll (McCarthy and Ryan, 1977);
voter transition February to October 1974 in England compared to a
poll (Upton, 1978); voter transition from a general election in 1983
to an election to the European parliament in 1984 compared to an ITN
poll (Brown and Payne, 1986); one comparison based on twenty-four
observations from Lee County, South Carolina, comparing registration
and turnout by race (Loewen and Grofman, 1989); two comparisons of a
survey to Swedish election data (Ersson and Wörlund, 1990); twenty
comparisons of aggregate electoral data in California and nationally
compared to exit polls, comparisons using census data, and official
data on registration and voter turnout (Freedman et al., 1991);
eight voter transition studies in Denmark compared to survey data
(Thomsen et al., 1991); race and registration data from Matthews and
Prothro (1966) (Alt, 1993); race and literacy from the 1910 U.S.
Census (Palmquist, 1994); housing tenure transitions from 1971 to
1981 in England from census data (Cleave, Brown, and Payne, 1995).
If you know of any work that belongs on this list that I missed, I
would appreciate hearing from you.
- ...methods.
- Surveys are also very
underused in this literature, perhaps in part because many scholars
came to this field because of their skepticism of public opinion
polls.
- ...procedure.)
- The 3,262 evaluations of the model in this
section are from the same data set and, as such, are obviously
related. However, each comparison between the truth and an estimate
provides a separate instance in which the model is vulnerable to being
wrong. These model evaluations simulate the usual situation in which
the ecological analyst has no definite prior knowledge about whether
the parameters of interest are dependent, unrelated, or all
identical.
- ...model.
- As an
analogy, consider how much information could be added to the usual
linear regression if we knew for certain a different narrow range
within which each observation's
must fall.