AP-GfK Poll Methodology

Overview
The Associated Press and GfK are dedicated to producing unbiased, reliable polls and providing full details of their methodology so the public can assess the credibility of the surveys.

The Associated Press-GfK Poll, which began in 2008, follows practices endorsed by the American Association for Public Opinion Research and the Council of American Survey Research Organizations, which represent the polling industry and set standards for it. Data presented with each AP-GfK Poll includes the dates it was conducted; the size of each sample; the proportion of people who responded to the survey; the design effect and the average length of time a questionnaire takes to complete. It also includes the margin of sampling error, for which AP-GfK Polls use a 95 percent confidence level and incorporate the design effect.

In an effort to best use evolving technology and help advance survey knowledge, we constantly examine and experiment with our methodology. We contribute to peer-reviewed journals and make presentations at professional conferences. Please address any questions to Linda McPetrie at GfK (Linda.McPetrie@gfk.com) or Jennifer Agiesta at the Associated Press (jagiesta@ap.org).

Samples

The survey was conducted using the web-enabled KnowledgePanel®, a probability-based panel designed to be representative of the U.S. population. At inception, participants were chosen scientifically by a random selection of telephone numbers; since 2009, they have been chosen through address-based sampling using the post office’s delivery sequence file. Persons in these households are then invited to join and participate in the web-enabled KnowledgePanel®. For those who agree to participate but do not already have Internet access, GfK provides a laptop and ISP connection at no cost. People who already have computers and Internet service are permitted to participate using their own equipment. Panelists then receive unique log-in information for accessing surveys online and are sent emails throughout each month inviting them to participate in research.

Prior to October 2013, AP-GfK polls were conducted via telephone using random digit dial landline and cell phone samples. Beginning with the October 2013 survey, polls are completed online via KnowledgePanel®.

Interviewing and Procedures

Respondents fill out the questionnaire on their own, at their own pace.

Each poll has a consistent screening procedure — that is, the same introduction and rules of eligibility, such as requiring that subjects be at least 18 years old; the same tracking questions like whether the country is heading in the right direction; and the same demographic questions about each person surveyed.
Because the topics of polls vary, the average length to complete surveys also varies. Each questionnaire is pretested before actual interviewing occurs.

The English version of the questionnaire is translated into Spanish. Respondents can select the language they prefer to be interviewed in and this is recorded in the dataset.

Most polls are completed over five days, typically starting Thursday and finishing Monday. Over the course of the five-day field period, respondents who have not yet completed the survey are sent reminder emails. A total of two reminders are sent throughout the field period.

Sampling Error and Other Survey Error

Probability samples are subject to some degree of sampling error. That is, data obtained from a sample can be expected to vary, within a known margin of error, from data that would be obtained from a survey of the entire target population. Each survey will have its margin of sampling error posted based on the effective sample size or design effect.

Sampling error is just one source of potential error in surveys. Errors can arise from question wording, the order in which questions are asked, low response rates or nonresponse by subjects being contacted, and other potential sources.

Weighting

Weights, the adjustments made to ensure a poll’s subjects are comparable by age, race and other demographic qualities to the overall population, are computed as follows.

KnowledgePanel® members receive a base weight that reflects their selection probability based on the recruit method.
Base weights become a multiplier entering the demographic balancing through rim weighting, adjusted by:

1) Age by sex, where age categories are 18-29, 30-49, 50-64, 65 and up;
2) Race as black and all other races;
3) Census region;
4) Phone service;
5) Hispanic and non-Hispanic;
6) Educational attainment as high school graduate or less, some college or technical school, and four-year college graduate or higher.

These benchmarks all come from the most recent Current Population Survey, with the exception of telephone service, which comes from Media Research & Intelligence’s fall wave of The American Consumer Survey.

The data are simultaneously adjusted to these marginal distributions through an iterative proportional fitting or raking procedure. Final estimates for the polls are based on these weights and processes. The distribution of final weights for polls is analyzed, and decisions about the range of the weights and trimming or capping are sometimes made to make sure no single subject’s characteristics mar the data. Trimming typically involves one percent of the largest weights.
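The raking step described above can be illustrated with a minimal sketch. It is not GfK's production code: the function name `rake`, the data layout, the convergence tolerance and the example targets are all assumptions made for illustration, and the sketch assumes every target category has at least one respondent.

```python
def rake(rows, base_weights, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting over categorical margins (illustrative).

    rows:         list of dicts, e.g. {"sex": "F", "educ": "hs"}
    base_weights: starting weight for each row (same order as rows)
    targets:      {variable: {category: population share}}
    """
    weights = list(base_weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {cat: 0.0 for cat in shares}
            for row, w in zip(rows, weights):
                current[row[var]] += w
            # Factor that moves each category's weighted share to its target.
            factors = {cat: shares[cat] * total / current[cat] for cat in shares}
            for i, row in enumerate(rows):
                f = factors[row[var]]
                max_change = max(max_change, abs(f - 1.0))
                weights[i] *= f
        # Stop once a full pass over all margins barely changes any weight.
        if max_change < tol:
            break
    return weights


# Hypothetical respondents and targets, for illustration only.
rows = [
    {"sex": "F", "educ": "college"},
    {"sex": "F", "educ": "hs"},
    {"sex": "M", "educ": "college"},
    {"sex": "M", "educ": "hs"},
    {"sex": "M", "educ": "hs"},
]
targets = {
    "sex": {"F": 0.52, "M": 0.48},
    "educ": {"college": 0.30, "hs": 0.70},
}
final_weights = rake(rows, [1.0] * len(rows), targets)
```

In this sketch the base weights enter as the starting values, which mirrors the description of base weights acting as a multiplier going into the demographic balancing.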

Once weights are final, the effective, or weighted, sample size – rather than the actual sample size – is used to compute the survey’s margin of sampling error.
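As a rough illustration of how an effective sample size feeds into the margin of sampling error, the sketch below uses Kish's approximation and the standard formula for a proportion at a 95 percent confidence level. The choice of Kish's formula, the function names and the default of p = 0.5 are assumptions for illustration; the exact computation used for the polls is not specified here.

```python
import math


def effective_sample_size(weights):
    # Kish approximation: (sum of weights)^2 / (sum of squared weights).
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def design_effect(weights):
    # Ratio of the actual sample size to the effective sample size.
    return len(weights) / effective_sample_size(weights)


def margin_of_error(weights, p=0.5, z=1.96):
    # Half-width of the confidence interval, in percentage points,
    # computed from the effective rather than the actual sample size.
    n_eff = effective_sample_size(weights)
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n_eff)
```

With 1,000 equally weighted respondents the effective and actual sample sizes coincide and the margin is about 3.1 percentage points; unequal weights shrink the effective sample size and widen the margin.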

Reading notes

Estimates reported from our polls have percentage points rounded to the nearest whole number. As a result, percentages in a given table column may not total exactly 100 percent. In questions permitting multiple responses, columns may total significantly more than 100 percent, depending on the number of different responses offered by each respondent.
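A small worked example of the rounding effect, using made-up percentages (the figures and the helper name `rounded_column` are hypothetical):

```python
def rounded_column(percentages):
    """Round each percentage to the nearest whole number and total the column."""
    # Note: Python's round() sends exact .5 values to the nearest even number,
    # which may differ from the rounding rule used in published tables.
    rounded = [round(p) for p in percentages]
    return rounded, sum(rounded)


# Three answers that total exactly 100 percent before rounding.
rounded, total = rounded_column([33.4, 33.3, 33.3])
```

Here each value rounds down to 33, so the published column would total 99 rather than 100.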

Prior to October 2013, the AP-GfK polls were conducted via telephone. The telephone methodology follows:

Samples

Since its inception, the AP-GfK Poll has used a dual-frame design — that is, two separate samples of landline and cell phone numbers that are subsequently combined into one. The random digit dial, or RDD, method is used to reach subjects in each sample.

Before 2010, the final sample was composed of 80 percent landline and 20 percent cell numbers. Due to the continued increase in households using only cells, the frame allocation was changed in 2010 to 70 percent landline and 30 percent cell numbers, and as of August 2012 it is 60 percent landline and 40 percent cell numbers.

The increased number of cell phone users has allowed improved coverage of people who no longer use landline numbers, and of people with both landline and cell phones who only answer one of those phones. It also allows the use of probability sampling to make projections from the data to the entire U.S. population.
Both samples are produced by Survey Sampling International LLC of Shelton, Conn.

 

RDD Landline
SSI produces an EPSEM sample – that is, an equal probability of selection method
sample aimed at ensuring that each person in the sample has an equal chance of being
called. Phone numbers of businesses are called to include those that share a telephone
with a household. A link to the details of this selection and the sample frame can be
found at:

RDD Landline Sample Methodology.pdf

RDD Cell
SSI produces an equal probability cell number sample. The link below provides details
to this process.

Wireless Sample Methodology.pdf

The AP-GfK Poll is based on a nationally representative RDD sample of at least 1,000
adults, ages 18 and older, living in the 50 states. The landline RDD sample of
households called is stratified, or divided, by census region with targets set for the
number of complete calls per region. Cell numbers are not geographically stratified, other
than in selecting the initial sample.

 

Interviewing and Procedures
All interviews were conducted by Interviewing Service of America, under GfK supervision, from their Van Nuys, CA facilities.

All interviews are conducted by live callers. They use computer-assisted telephone
interviewing, or CATI, software, CfMC Survent, to administer the survey. Each poll has a
consistent screening procedure — that is, the same introduction and rules of eligibility, such
as requiring that subjects be at least 18 years old; the same tracking questions like whether
the country is heading in the right direction; and the same demographic questions about
each person surveyed.

Because the topics of polls vary, the average length to complete surveys also varies. Each
questionnaire is pretested among the U.S. population before actual interviewing occurs.
The English version of the questionnaire is translated into Spanish. Respondents can select
the language they prefer to be interviewed in and this is recorded in the dataset.

Experiments have shown that leaving voice messages has done no harm and in some
ways has helped the survey process. In early 2010, leaving a voice message on the first
call attempt became standard practice.

Most polls are completed over five days, typically starting Thursday evenings and
finishing Monday evenings. Callers are required to try reaching phone numbers each day
and at various times of day. Unless a number is resolved to a final status, callers must
try it up to eight times before it is no longer dialed. The
interviewer lets each number ring up to eight times before hanging up and trying the next
telephone number in the sample.

Interviewers are monitored daily by quality control monitors at the phone center and by
the supervisory or quality assurance staffs in GfK partner phone centers. In addition, the
survey is monitored by the GfK field manager and by the research team for the first
couple of days in field. All interviewers who work on the AP-GfK Poll are individually
approved by GfK. They are scored on their professionalism, survey flow, survey
control, survey quality and tone of voice, and ability to administer an unbiased survey.

 

Frame-specific procedures
When dialing landline numbers, we randomly select an adult in each household to
interview. Initially, we ask an adult in the household the number of adults 18 or older
that live in the household. If the answer is one, that person is interviewed. If the answer
is two adults, then the CATI software randomly selects either the adult on the phone or
the other adult in the household.

If the other adult is not available, a call back is scheduled. When three or more adults are
in the household, which occurs about 15 percent of the time, the CATI system randomly
requests the adult with either the last or next birthday. If this is the person on the phone
the interview begins. Otherwise the person selected comes to the phone if available or a
call back is scheduled.
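The selection rules above can be summarized in a short sketch. This is only an illustration of the logic as described, not the CATI software's actual implementation; the function name and the returned labels are hypothetical.

```python
import random


def select_respondent(num_adults, rng=random):
    """Return which adult in a landline household should be interviewed."""
    if num_adults < 1:
        raise ValueError("a household must contain at least one adult")
    if num_adults == 1:
        # A single adult is interviewed directly.
        return "adult on phone"
    if num_adults == 2:
        # Random pick between the adult on the phone and the other adult.
        return rng.choice(["adult on phone", "other adult"])
    # Three or more adults: randomly ask for the last or the next birthday.
    return rng.choice(["adult with last birthday", "adult with next birthday"])
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible for testing. If the selected adult is not the person on the phone and is unavailable, a call back would be scheduled, which this sketch does not model.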

For details and rationale see “More Research on a Hybrid Respondent Selection Method”
by Paul Lavrakas, Trevor Tompson, Robert Benford and Christopher Fleury, a work in
progress presented most recently at the 2009 Midwest Association for Public Opinion
Research in Chicago and at the 2010 WAPOR Conference in Chicago.

For cell phones, we assume that adults we reach are the only owners of that number so
we do not ask for other adults. We ask respondents whether it is currently safe for them to
answer questions (for example, whether they are driving a car) and schedule call backs as
needed. We offer a $5 reimbursement for the minutes our call will use. Thus, additional
questions are needed to screen respondents and to collect the information required to pay the
reimbursement.

 

Weighting
Weights, the adjustments made to ensure a poll’s subjects are comparable by age, race
and other demographic qualities to the overall population, are computed in two stages.
First, an initial weight, or pre-weight, is computed to make sure all subjects in the sample
have an equal chance of being called. This is followed by demographic balancing using a
rim weighting procedure. The landline sample takes into account the number of adults 18
or older in the household as well as the number of lines that can connect to any one adult.

Further, since separate samples of people with landlines and cell phones are used, a
multiplicity adjustment is needed to account for the overlap in the two samples because
some households have both landlines and cells:

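Landline Pre-weight Dual Frame:

$$\text{Landline pre-weight} = \frac{\#\,\text{Adults}^{1}}{\#\,\text{Landlines}^{1}} \times \lambda_{\text{dual}}$$

Cell Phone Pre-weight Dual Frame:

$$\text{Cell phone pre-weight} = \frac{1}{\#\,\text{Cells}^{1}} \times (1-\lambda)_{\text{dual}}$$

1 Capped at 2 for all pre-weights.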

Pre-weights as computed above for the dual frame approach use the mixing parameter λ,
where λ is equal to the proportion of frame overlap. Frame overlap is about 60%
landline; therefore, λ is set at 0.6.
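The pre-weight formulas can be sketched as follows. The function names are hypothetical, and the sketch applies λ to every case exactly as the formulas are written; whether the adjustment is restricted to respondents with both kinds of phone is not stated here, so that detail is left as an open assumption.

```python
LAMBDA = 0.6  # mixing parameter stated in the text
CAP = 2       # counts are capped at 2 for all pre-weights


def landline_preweight(num_adults, num_landlines, lam=LAMBDA, cap=CAP):
    # (# adults / # landlines) x lambda, with both counts capped.
    adults = min(num_adults, cap)
    lines = min(num_landlines, cap)
    return adults / lines * lam


def cell_preweight(num_cells, lam=LAMBDA, cap=CAP):
    # (1 / # cells) x (1 - lambda), with the count capped.
    cells = min(num_cells, cap)
    return 1.0 / cells * (1.0 - lam)
```

For example, a household with three adults and one landline is treated as having two adults, giving a landline pre-weight of 2 / 1 x 0.6 = 1.2, while a respondent with one cell phone gets a cell pre-weight of 1 / 1 x 0.4 = 0.4.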

Pre-weights become a multiplier entering the demographic balancing through rim
weighting, adjusted by:

1) Age by sex as determined by the Current Population Survey (CPS) where age
categories are 18-29, 30-49, 50-64, 65 and up;

2) Race per CPS with race as black and all other races;

3) Census region by phone service per Media Research & Intelligence’s fall wave
for the current year minus one (see footnote 2). MRI is a member of the GfK Group;

4) Hispanic and non-Hispanic per CPS;

5) Educational attainment as high school graduate or less, some college or technical
school, and four-year college graduate or higher.

The data are simultaneously adjusted to these marginal distributions through an iterative
proportional fitting or raking procedure. Final estimates for the polls are based on these
weights and processes. The distribution of final weights for polls is analyzed and
decisions about the range of the weights and trimming or capping are sometimes made to
make sure no single subject’s characteristics mar the data.

2 For example, data collected and weighted in 2009 would be based on Fall 2008 MRI data.

Once weights are final, the effective, or weighted, sample size – rather than the actual
sample size – is used to compute the survey’s margin of sampling error.

Sampling Error and Other Survey Error
Probability samples are subject to some degree of sampling error. That is, data obtained
from a sample can be expected to vary, within a known margin of error, from data that
would be obtained from a survey of the entire target population. Each survey will have its
margin of sampling error posted based on the effective sample size.

Sampling error is just one source of potential error in surveys. Errors can arise from
question wording, the order in which questions are asked, low response rates or nonresponse
by subjects being contacted, and other potential sources.

Reading notes
Estimates reported from our polls have percentage points rounded to the nearest whole
number. As a result, percentages in a given table column may not total exactly 100
percent. In questions permitting multiple responses, columns may total significantly more
than 100 percent, depending on the number of different responses offered by each
respondent.

 
