Why Phone and Web Survey Results Aren't the Same

by Jenny Marlar

Researchers have many methods they can choose from when conducting surveys -- telephone, face-to-face, interactive voice response (IVR), web, and paper and pencil. Deciding which method best fits their study's objectives is one of the most important choices they will make and is crucial to constructing a research design.

As they weigh this decision, researchers often consider which mode provides the best coverage of the target population, the quality and type of contact information on the available sample frame(s), projected response rates and sample size requirements, the types of questions, and timing and budget constraints.

Because of recent challenges and innovations in the survey research industry, they may also consider trying or switching to new data collection methods that may offer better coverage, more timely data collection, improved response rates or lower costs than the methods they have used before.

Trying a new method or switching from one method to another is no small decision, particularly if the researchers have established data trends. Data collection modes can affect substantive survey results, and transitioning to a new data collection method can present difficulties for researchers. This can happen for three main reasons: 1) differences in the way respondents answer questions by mode; 2) differences in who is covered by each mode; and 3) differences in who is willing to respond by each mode.

The survey research community has researched mode differences extensively. The following examples highlight a few differences Gallup has found in its own research. For a more in-depth review of mode issues, readers may wish to consider the sources at the end of this article.

Respondents Answer Questions Differently by Mode

Respondents process information differently when questions are interviewer-administered (aural) versus self-administered (visual). In an interviewer-administered survey, respondents have the questions read aloud to them, and they must hear and remember the question and response options to be able to produce an answer. The interviewer can help engage respondents but also controls the survey pace, and respondents may be less forthcoming when giving an interviewer a response on a sensitive topic.

In a self-administered survey, such as web and mail, respondents control the survey pace and are able to see all of the questions and response options on the screen or paper.

Because of these differences, research has consistently shown that respondents who answer questions read aloud by an interviewer tend to give more extreme, positive responses to attitudinal items, and more socially desirable responses, than when the same questions are administered to the same population via web or mail.

Gallup's Findings

In a Gallup experiment conducted in 2014, Gallup Panel members were randomly assigned to answer a survey by either web or phone. The survey included various attitudinal and behavioral questions related to customer engagement and well-being. Only panel members who could respond to web and phone surveys were eligible to be included in the experiment, which allowed us to focus on measurement differences and control for coverage differences. Both samples were post-stratified to control for differential nonresponse between treatment groups.
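The post-stratification step mentioned above can be illustrated with a minimal sketch. It assumes a single weighting variable and known population shares; the age cells, target shares and responses are hypothetical toy values, not Gallup's data or its actual weighting targets, which the article does not describe.

```python
from collections import Counter


def poststratify(sample_cells, population_shares):
    """Return one weight per respondent so that the weighted share of
    each cell matches its known population share.

    sample_cells: list of cell labels (e.g., age group), one per respondent.
    population_shares: dict mapping cell label -> population proportion.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    weights = []
    for cell in sample_cells:
        sample_share = counts[cell] / n
        weights.append(population_shares[cell] / sample_share)
    return weights


def weighted_percent(flags, weights):
    """Weighted percentage of respondents whose flag is True."""
    total = sum(weights)
    hits = sum(w for f, w in zip(flags, weights) if f)
    return 100.0 * hits / total


# Hypothetical toy sample: younger respondents are underrepresented.
cells = ["18-44"] * 2 + ["45+"] * 6
strongly_agree = [True, True, True, True, False, False, False, False]
targets = {"18-44": 0.5, "45+": 0.5}  # assumed population shares

weights = poststratify(cells, targets)
unweighted = 100.0 * sum(strongly_agree) / len(strongly_agree)
weighted = weighted_percent(strongly_agree, weights)
print(f"unweighted: {unweighted:.1f}%, weighted: {weighted:.1f}%")
```

In this toy sample the underrepresented group agrees more often, so weighting raises the estimate. Applying the same targets to both the phone and web groups is one way to keep differential nonresponse from being mistaken for a mode effect.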

All 11 attitudinal questions related to customer engagement scored more positively on phone than on web. Respondents rated each item on a five-point agreement scale. The phone-minus-web difference in the share choosing "strongly agree" ranged from 4 to 16 percentage points.

The results for a selection of questions are presented in the first table below. Arguably, some results, such as overall satisfaction, may not be meaningfully different. But differences for some items, such as "always treats me fairly," are larger and may prompt those interpreting results to reach different conclusions or recommend a different course of action, depending on the method used.

Differences in Customer Engagement Results by Phone and Web

Question	Phone (% 5)	Web (% 5)	Difference (pct. pts.)
Overall satisfaction	50	46	+4
Perfect for people like me	51	40	+11
Proud to be a customer	48	37	+11
Always delivers on what they promise	52	40	+12
Always treats me fairly	62	46	+16
GALLUP
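Whether a gap in the table above is "meaningfully different" depends partly on sample size, which the article does not report. The sketch below applies a standard two-proportion z-test to two rows of the table under an assumed 500 respondents per mode. That figure is purely hypothetical, and the test ignores the extra variance that weighting introduces, so it is an illustration rather than a reanalysis of Gallup's data.

```python
import math


def two_proportion_z(p1, p2, n1, n2):
    """z statistic for the difference between two independent proportions,
    using the pooled estimate of the standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical sample size per mode; not reported in the article.
N_PER_MODE = 500

# (phone % 5, web % 5) taken from the table above.
rows = {
    "Overall satisfaction": (50, 46),
    "Always treats me fairly": (62, 46),
}

z_scores = {}
for question, (phone, web) in rows.items():
    z = two_proportion_z(phone / 100, web / 100, N_PER_MODE, N_PER_MODE)
    z_scores[question] = z
    print(f"{question}: diff = {phone - web:+d} pts, z = {z:.2f}")
```

Under that assumed sample size, the 4-point gap falls below the conventional 1.96 threshold while the 16-point gap is well above it. A different sample size would change both z values.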

Similar differences were found on five key attitudinal questions related to well-being. These items were also asked on a five-point agreement scale. As with the customer engagement questions, the share choosing the top response ("5") was higher on phone than on web for every item.

Differences in Well-Being Results by Phone and Web

Question	Phone (% 5)	Web (% 5)	Difference (pct. pts.)
Received recognition for helping improve city	19	15	+4
In the last seven days you have worried about money	46	40	+6
Physical health is near perfect	51	40	+11
City or area where you live is perfect place for you	59	47	+12
Someone always encourages you to be healthy	78	66	+12
GALLUP

Those Who Respond Are Demographically Different

Differences in answers by survey mode also occur because of coverage. For example, approximately 97% of U.S. adults have either a landline or a cellular phone, while 77% of U.S. households have internet access. For customer or employee lists, other differences may exist. For example, a company may choose to keep email addresses up-to-date for all customers in a database, while not collecting home telephone numbers for all customers on the list or not updating them regularly.

Differences can also occur because some groups of respondents are easier to reach and persuade to complete the survey using different modes. For example, a survey sent by mail may be more likely than an email to gain one respondent's attention and to carry a sense of importance and legitimacy. Another respondent may never check the mailbox but check email several times a day.

Gallup's Findings

Gallup examined the effect of population differences and mode on employee engagement (Q¹²) scores using its proprietary measurement. The analysis was based on more than 1 million records in the Gallup database and included surveys conducted via paper and pencil, web, and IVR. IVR respondents were given a paper help sheet, which included the questions and response categories and minimized the response differences between aural and visual processing (described previously).

Gallup analyzed the Q¹² results by mode and found significant differences in the raw scores. However, these differences could all be attributed to respondent demographics. After controlling for demographic differences between respondents, such as length of service and level of responsibility, Gallup found essentially no difference in the percentage of items scoring a "5" on a 1-to-5 scale (at most 1 percentage point between any two modes). There were slight differences in the GrandMean scores, even after adjustments were made, but these were not substantial.

Differences in Employee Engagement Results, After Controlling for Demographic Differences

Measure	IVR versus paper and pencil	IVR versus web	Paper and pencil versus web
Difference in % of items scoring a 5	+1 pct. pt.	0 pct. pts.	-1 pct. pt.
Difference in adjusted Q¹² GrandMean score	+.03	-.09	-.12
GALLUP
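To see how a raw mode difference can disappear once demographics are controlled, consider the sketch below. It uses direct standardization, one simple adjustment technique; the article does not say which method Gallup used. All numbers and tenure groups are invented for illustration.

```python
def raw_rate(groups):
    """Overall % scoring 5 for a mode, pooling all of its respondents."""
    n = sum(g["n"] for g in groups.values())
    fives = sum(g["n"] * g["pct5"] / 100 for g in groups.values())
    return 100 * fives / n


def standardized_rate(groups, reference_mix):
    """% scoring 5 the mode would show if its respondents had the
    reference demographic mix (direct standardization)."""
    return sum(reference_mix[name] * g["pct5"] for name, g in groups.items())


# Invented data: within each tenure group the two modes score identically,
# but web respondents skew toward shorter tenure.
web = {
    "under 3 years": {"n": 800, "pct5": 40},
    "3+ years": {"n": 200, "pct5": 30},
}
paper = {
    "under 3 years": {"n": 200, "pct5": 40},
    "3+ years": {"n": 800, "pct5": 30},
}
reference_mix = {"under 3 years": 0.5, "3+ years": 0.5}  # assumed common mix

raw_gap = raw_rate(web) - raw_rate(paper)
adjusted_gap = (standardized_rate(web, reference_mix)
                - standardized_rate(paper, reference_mix))
print(f"raw gap: {raw_gap:+.1f} pts, adjusted gap: {adjusted_gap:+.1f} pts")
```

In this invented case the raw gap comes entirely from who answered in each mode, not from how they answered, which is the pattern the article describes for the Q¹² results.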

Conclusions

It is important to be aware of the effect that mode can have on survey-based results, and one should be cautious when comparing results of work conducted using different modes. Researchers who are considering mixing modes or contemplating a transition to a new method should be especially mindful of mode effects. Switching to a new mode can significantly affect the trends of an existing survey, and it is advisable to field both methods in parallel before making a switch, to understand the impact of the change on trends.

Which mode reflects the true or more accurate estimate? This can often be a complex determination.

Researchers must consider several different factors, including the coverage, nonresponse and measurement bias introduced by each mode. Methodological designs often involve making trade-offs to minimize these biases, but they still exist in some form. It is important to consider which mode most successfully minimizes all sources of potential bias. In some cases, such as the first example in the sections above, both surveys covered the same population and had nearly identical response rates, yet differences persisted. This is frequently observed in attitudinal items, but is far less common in non-sensitive behavioral questions.

For most attitudinal questions, respondents have never considered how they feel about a topic until they are asked. These attitudes may not have a "truth" and can be fluid over time, even from one hour to the next. An answer given by phone and an answer given by web can each reflect the respondent's true feeling at the time the question was asked, based on how it was asked.

See the sources below for more information about mode issues.

Dillman, D. A., et al. (1996). Understanding differences in people's answers to telephone and mail surveys. In M. T. Braverman & J. K. Slater (Eds.), Advances in Survey Research (pp. 45-62). San Francisco: Jossey-Bass.
de Leeuw, E. (2005). To mix or not to mix data collection modes in surveys. Journal of Official Statistics, 21(2), 233-255.
Dillman, D. A. (1991). The design and administration of mail surveys. Annual Review of Sociology, 17, 225-249.
Dillman, D. A., et al. (2009). Response rate and measurement differences in mixed-mode surveys using mail, telephone, interactive voice response and the internet. Social Science Research, 38(1), 1-18.
Blumberg, S. J., & Luke, J. V. (2017, May). Wireless substitution: Early release of estimates from the National Health Interview Survey, July-December 2016. National Center for Health Statistics. Available from: https://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless201705.pdf
Ryan, C., & Lewis, J. (2017, September). Computer and internet use in the United States: 2015. American Community Survey Reports. Available from: https://www.census.gov/content/dam/Census/library/publications/2017/acs/acs-37.pdf

Jennifer Marlar is a Methodologist and Director of Research for the Gallup Panel.