week 3


Respond to the following in a minimum of 175 words:

Post a survey that you find on the internet. Try to find one that is relatively brief: 10 questions or fewer. Refer to the table presented on page 127 of Methods in Behavioral Research (2009) to help you evaluate your survey on the following points:

  • negative wording
  • complexity (note: good questions are simple and straightforward)
  • double-barreled
  • loaded
  • grammatically incorrect

After evaluating your survey, discuss the importance of writing good survey questions. How can poorly-written questions yield biased results?


Week Three Homework Exercise

Answer the following questions covering material from Ch. 6 & 7 of Methods in Behavioral Research:

1. What is reactivity? Explain how reactivity impacts measurement.

2. What are the key features of an experimental design, or ‘true experiment’? How does this compare to case studies?

3. What is survey research and when is it most useful?

4. What issues should be considered when constructing surveys? What are the implications of double-barreled, loaded, and negative questions?

5. What are some survey administration methods? When is each of these methods most appropriate?

6. Define interview bias and provide an example.

7. What is the difference between probability and non-probability sampling techniques?

8. A researcher attends an art reception in a major metropolitan city. She decides to approach people over the age of 50 and ask them to fill out a brief survey about purchasing artwork. Is this a probability or a non-probability sampling technique? What type of sampling procedure is this—simple random, stratified random, cluster, haphazard, purposive, or quota?

9. What is the relationship between sample size and survey results? What are some techniques to evaluate potential sampling bias?





  • Compare quantitative and qualitative methods of describing behavior.
  • Describe naturalistic observation and discuss methodological issues such as participation and concealment.
  • Describe systematic observation and discuss methodological issues such as the use of equipment, reactivity, reliability, and sampling.
  • Describe the features of a case study.
  • Describe archival research and the sources of archival data: statistical records, survey archives, and written records.

All scientific research requires careful observation. In this chapter, we will explore a variety of observational methods including naturalistic observation, systematic observation, case studies, and archival research. Because so much research involves surveys using questionnaires or interviews, we cover the topic of survey research separately in Chapter 7. Before we describe these methods in detail, it will be helpful to understand the distinction between quantitative and qualitative methods of describing behavior.


Observational methods can be broadly classified as primarily quantitative or qualitative. Qualitative research focuses on people behaving in natural settings and describing their world in their own words; quantitative research tends to focus on specific behaviors that can be easily quantified (i.e., counted). Qualitative researchers emphasize collecting in-depth information on a relatively few individuals or within a very limited setting; quantitative investigations generally include larger samples. The conclusions of qualitative research are based on interpretations drawn by the investigator; conclusions in quantitative research are based upon statistical analysis of data.

To more concretely understand the distinction, imagine that you are interested in describing the ways in which the lives of teenagers are affected by working. You might take a quantitative approach by developing a questionnaire that you would ask a sample of teenagers to complete. You could ask about the number of hours they work, the type of work they do, their levels of stress, their school grades, and their use of drugs. After assigning numerical values to the responses, you could subject the data to a quantitative, statistical analysis. A quantitative description of the results would focus on such things as the percentage of teenagers who work and the way this percentage varies by age. Some of the results of this type of survey are described in Chapter 7.

Suppose, instead, that you take a qualitative approach to describing behavior. You might conduct a series of focus groups in which you gather together groups of 8 to 10 teenagers and engage them in a discussion about their perceptions and experiences with the world of work. You would ask them to tell you about the topic using their own words and their own ways of thinking about the world. To record the focus group discussions, you might use a video or audio recorder and have a transcript prepared later, or you might have observers take detailed notes during the discussions. A qualitative description of the findings would focus on the themes that emerge from the discussions and the manner in which the teenagers conceptualized the issues. Such description is qualitative because it is expressed in nonnumerical terms using language and images.

Other methods, both qualitative and quantitative, could also be used to study teenage employment. For example, a quantitative study could examine data collected from the state Department of Economic Development; a qualitative researcher might work in a fast-food restaurant as a management trainee. Keep in mind the distinction between quantitative and qualitative approaches to describing behavior as you read about other specific observational methods discussed in this chapter. Both approaches are valuable and provide us with different ways of understanding behavior.


Naturalistic observation is sometimes called field work or simply field observation (see Lofland, Snow, Anderson, & Lofland, 2006). In a naturalistic observation study, the researcher makes observations of individuals in their natural environments (the field). This research approach has roots in anthropology and the study of animal behavior and is currently widely used in the social sciences to study many phenomena in all types of social and organizational settings. Thus, you may encounter naturalistic observation studies that focus on employees in a business organization, members of a sports team, patrons of a bar, students and teachers in a school, or prairie dogs in a colony in Arizona.

Sylvia Scribner’s (1997) research on “practical thinking” is a good example of naturalistic observation research in psychology. Scribner studied ways that people in a variety of occupations make decisions and solve problems. She describes the process of this research: “… my colleagues and I have driven around on a 3 a.m. milk route, helped cashiers total their receipts and watched machine operators logging in their production for the day … we made detailed records of how people were going about performing their jobs. We collected copies of all written materials they read or produced—everything from notes scribbled on brown paper bags to computer printouts. We photographed devices in their working environment that required them to process other types of symbolic information—thermometers, gauges, scales, measurement instruments of all kinds” (Scribner, 1997, p. 223). One aspect of thinking that Scribner studied was the way that workers make mathematical calculations. She found that milk truck drivers and other workers make complex calculations that depend on their acquired knowledge. For example, a delivery invoice might require the driver to multiply 32 quarts of milk by $.68 per quart. To arrive at the answer, drivers use knowledge acquired on the job about how many quarts are in a case and the cost of a case; thus, they multiply 2 cases of milk by $10.88 per case. In general, the workers that Scribner observed employed complex but very efficient strategies to solve problems at work. More important, the strategies used could often not be predicted from formal models of problem solving. The Scribner research had a particular emphasis on people making decisions in their everyday environment. Scribner has since expanded her research to several different occupations and many types of decisions.
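The drivers' shortcut can be checked with a few lines of arithmetic. The sketch below assumes 16 quarts per case, a figure that is not stated above but is implied by the numbers given (32 quarts make 2 cases, and $10.88 per case is 16 times $.68). The function names are hypothetical and serve only to illustrate the two routes to the same total.

```python
from decimal import Decimal

# Assumption: 16 quarts per case, implied by "32 quarts" = "2 cases".
QUARTS_PER_CASE = 16
PRICE_PER_QUART = Decimal("0.68")


def cost_by_quart(quarts):
    """Formal approach: multiply the number of quarts by the unit price."""
    return quarts * PRICE_PER_QUART


def cost_by_case(quarts):
    """Drivers' approach: price whole cases first, then any loose quarts."""
    cases, loose_quarts = divmod(quarts, QUARTS_PER_CASE)
    price_per_case = QUARTS_PER_CASE * PRICE_PER_QUART  # 10.88
    return cases * price_per_case + loose_quarts * PRICE_PER_QUART


print(cost_by_quart(32))  # 21.76
print(cost_by_case(32))   # 21.76
```

Both routes give $21.76; the case-based route simply replaces one awkward multiplication with an easier one that relies on knowledge acquired on the job.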

Other naturalistic research may examine a narrower range of behaviors. For example, Graham and her colleagues observed instances of aggression that occurred in bars in a large city late on weekend nights (Graham, Tremblay, Wells, Pernanen, Purcell, & Jelley, 2006). Both the Scribner and the Graham studies are instances of naturalistic research because the observations were made in natural settings and the researchers did not attempt to influence what occurred in the settings.

Description and Interpretation of Data

The goal of naturalistic observation is to provide a complete and accurate picture of what occurred in the setting, rather than to test hypotheses formed prior to the study. To achieve this goal, the researcher must keep detailed field notes—that is, write or dictate on a regular basis (at least once each day) everything that has happened. Field researchers rely on a variety of techniques to gather information, depending on the particular setting. In the Graham et al. (2006) study in bars, the observers were alert to any behaviors that might lead to an incident of aggression. They carefully watched and listened to what happened. They immediately made notes on what they observed; these were later given to a research coordinator. In other studies, the observers might interview key “informants” to provide inside information about the setting, talk to people about their lives, and examine documents produced in the setting, such as newspapers, newsletters, or memos. In addition to taking detailed field notes, researchers conducting naturalistic observation usually use audio or video recordings.

The researcher’s first goal is to describe the setting, events, and persons observed. The second, equally important goal is to analyze what was observed. The researcher must interpret what occurred, essentially generating hypotheses that help explain the data and make them understandable. Such an analysis is done by building a coherent structure to describe the observations. The final report, although sensitive to the chronological order of events, is usually organized around the structure developed by the researcher. Specific examples of events that occurred during observation are used to support the researcher’s interpretations.

A good naturalistic observation report will support the analysis by using multiple confirmations. For example, similar events may occur several times, similar information may be reported by two or more people, and several different events may occur that all support the same conclusion.

The data in naturalistic observation studies are primarily qualitative in nature; that is, they are the descriptions of the observations themselves rather than quantitative statistical summaries. Such qualitative descriptions are often richer and closer to the phenomenon being studied than are statistical representations. However, it is often useful to also gather quantitative data. Depending on the setting, data might be gathered on income, family size, education levels, age, or gender of individuals in the setting. Such data can be reported and interpreted along with qualitative data gathered from interviews and direct observations.

Participation and Concealment

Two related issues facing the researcher are whether to be a participant or non-participant in the social setting and whether to conceal his or her purposes from the other people in the setting. Do you become an active participant in the group or do you observe from the outside? Do you conceal your purposes or even your presence, or do you openly let people know what you are doing?

A nonparticipant observer is an outsider who does not become an active part of the setting. In contrast, a participant observer assumes an active, insider role. Because participant observation allows the researcher to observe the setting from the inside, he or she may be able to experience events in the same way as natural participants. Friendships and other experiences of the participant observer may yield valuable data. A potential problem with participant observation, however, is that the observer may lose the objectivity necessary to conduct scientific observation. Remaining objective may be especially difficult when the researcher already belongs to the group being studied or is a dissatisfied former member of the group. Remember that naturalistic observation requires accurate description and objective interpretation with no prior hypotheses. If a researcher has some prior reason to either criticize people in the setting or give a glowing report of a particular group, the observations will likely be biased and the conclusions will lack objectivity.

Should the researcher remain concealed or be open about the research purposes? Concealed observation may be preferable because the presence of the observer may influence and alter the behavior of those being observed. Imagine how a nonconcealed observer might alter the behavior of high school students in many situations at a school. Thus, concealed observation is less reactive than nonconcealed observation because people are not aware that their behaviors are being observed and recorded. Still, nonconcealed observation may be preferable from an ethical viewpoint: Consider the invasion of privacy when researchers hid under beds in dormitory rooms to discover what college students talk about (Henle & Hubbell, 1938)! Also, people often quickly become used to the observer and behave naturally in the observer’s presence. This fact allows documentary filmmakers to record very private aspects of people’s lives, as was done in the 2009 British documentary Love, Life, and Death in a Day. For the death segments, the filmmaker, Sue Bourne, contacted funeral homes to find families willing to be filmed throughout their grieving over the death of a loved one.

The decision of whether to conceal one’s purpose or presence depends on both ethical concerns and the nature of the particular group and setting being studied. Sometimes a participant observer is nonconcealed to certain members of the group, who give the researcher permission to be part of the group as a concealed observer. Often a concealed observer decides to say nothing directly about his or her purposes but will completely disclose the goals of the research if asked by anyone. Nonparticipant observers are also not concealed when they gain permission to “hang out” in a setting or use interview techniques to gather information. In actuality, then, there are degrees of participation and concealment: A nonparticipant observer may not become a member of the group, for example, but may over time become accepted as a friend or simply part of the ongoing activities of the group. In sum, researchers who use naturalistic observation to study behavior must carefully determine what their role in the setting will be.

You may be wondering about informed consent in naturalistic observation. Recall from Chapter 3 that observation in public places when anonymity is not threatened is considered exempt research. In these cases, informed consent may not be necessary. Moreover, in nonconcealed observation, informed consent may be given verbally or in written form. Nevertheless, researchers must be sensitive to ethical issues when conducting naturalistic observation. Of particular interest is whether the observations are made in a public place with no clear expectations that behaviors are private. For example, should a neighborhood bar be considered public or private?

Limits of Naturalistic Observation

Naturalistic observation obviously cannot be used to study all issues or phenomena. The approach is most useful when investigating complex social settings both to understand the settings and to develop theories based on the observations. It is less useful for studying well-defined hypotheses under precisely specified conditions or phenomena that are not directly observable by a researcher in a natural setting (e.g., color perception, mood, response time on a cognitive task).

Field research is also very difficult to do. Unlike a typical laboratory experiment, field research data collection cannot always be scheduled at a convenient time and place. In fact, field research can be extremely time-consuming, often placing the researcher in an unfamiliar setting for extended periods. In the Graham et al. (2006) investigation of aggression in bars, observers spent over 1,300 nights in 118 different bars (74 male–female pairs of observers were required to accomplish this feat).

Also, in more carefully controlled settings such as laboratory research, the procedures are well defined and the same for each participant, and the data analysis is planned in advance. In naturalistic observation research, however, there is an ever-changing pattern of events, some important and some unimportant; the researcher must record them all and remain flexible in order to adjust to them as research progresses. Finally, the process of analysis that follows the completion of the research is not simple (imagine the task of sorting through the field notes of every incident of aggression that occurred on over 1,300 nights). The researcher must repeatedly sort through the data to develop hypotheses to explain the data and then make sure all data are consistent with the hypotheses. Although naturalistic observation research is a difficult and challenging scientific procedure, it yields invaluable knowledge when done well.


Systematic observation refers to the careful observation of one or more specific behaviors in a particular setting. This research approach is much less global than naturalistic observation research. The researcher is interested in only a few very specific behaviors, the observations are quantifiable, and the researcher frequently has developed prior hypotheses about the behaviors. We will focus on systematic observation in naturalistic settings; these techniques may also be applied in laboratory settings.

For example, Bakeman and Brownlee (1980; also see Bakeman, 2000) were interested in the social behavior of young children. Three-year-olds were videotaped in a room in a “free play” situation. Each child was taped for 100 minutes; observers viewed the videotapes and coded each child’s behavior every 15 seconds, using the following coding system:

Unoccupied: Child is not doing anything in particular or is simply watching other children.

Solitary play: Child plays alone with toys but is not interested in or affected by the activities of other children.

Together: Child is with other children but is not occupied with any particular activity.

Parallel play: Child plays beside other children with similar toys but does not play with the others.

Group play: Child plays with other children, including sharing toys or participating in organized play activities as part of a group of children.

Bakeman and Brownlee were particularly interested in the sequence or order in which the different behaviors were engaged in by the children. They found, for example, that the children rarely went from being unoccupied to engaging in parallel play. However, they frequently went from parallel to group play, indicating that parallel play is a transition state in which children decide whether to interact in a group situation.
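To make the idea of sequence analysis concrete, the following minimal sketch counts how often one coded state is followed by a different one. At one code every 15 seconds, 100 minutes of tape yields 400 codes per child. The short code list and the function name are hypothetical illustrations, not data or procedures from the Bakeman and Brownlee study.

```python
from collections import Counter

# The five categories from the coding system described above.
CODES = {"unoccupied", "solitary", "together", "parallel", "group"}


def count_transitions(codes):
    """Count how often one coded state is followed by a different state."""
    transitions = Counter()
    for current, following in zip(codes, codes[1:]):
        if current != following:  # ignore intervals where the state is unchanged
            transitions[(current, following)] += 1
    return transitions


# Hypothetical codes for one child, one per 15-second interval (not real data).
example = [
    "unoccupied", "solitary", "solitary", "parallel", "group",
    "group", "together", "parallel", "group",
]
assert set(example) <= CODES

counts = count_transitions(example)
for (before, after), n in counts.most_common():
    print(f"{before} -> {after}: {n}")
```

In this made-up sequence, the move from parallel to group play occurs twice and the move from unoccupied to parallel play never occurs, which is the kind of pattern a sequence analysis is designed to reveal.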

Coding Systems

Numerous behaviors can be studied using systematic observation. The researcher must decide which behaviors are of interest, choose a setting in which the behaviors can be observed, and most important, develop a coding system, such as the one described, to measure the behaviors. Rhoades and Stocker (2006) describe the use of the Marital Interaction Video Coding System. Couples are recorded for 10 minutes as they discuss an area of conflict; they then discuss a positive aspect of their relationship for 5 minutes. The video is later coded for hostility and affection displayed during each 5 minutes of the interaction. To code hostility, the observers rated the frequency of behaviors such as “blames other” and “provokes partner.” Affection behaviors that were coded included “expresses concern” and “agrees with partner.”

Methodological Issues

We should briefly mention several methodological issues in systematic observation.

Equipment. The first issue concerns equipment. You can directly observe behavior and code it at the same time; for example, you could directly observe and record the behavior of children in a classroom or couples interacting on campus using paper-and-pencil measures. However, it is becoming more common to use video and audio recording equipment to make such observations because recordings provide a permanent record of the behavior observed that can be coded later.

An interesting method for audio recording uses the Electronically Activated Recorder (EAR), which was used to compare the sociability behaviors of Americans and Mexicans (Ramirez-Esparza, Mehl, Alvarez-Bermúdez, & Pennebaker, 2009). The EAR is a small audio recorder that a subject wears throughout the day. It is set to turn on periodically to record sounds in the subject’s environment. The study examined the frequency of sociable behaviors. Previous research had found that Americans score higher than Mexicans on self-report measures of sociability, contradicting stereotypes that Mexicans are generally more sociable. Coders applied the Social Environment of Sound Inventory to code the sounds as alone, talking with others in a public environment, or on the phone. When sociability was measured this way, the Mexican subjects were in fact more sociable than the Americans.

Reactivity. A second issue is reactivity, the possibility that the presence of the observer will affect people’s behaviors (see Chapter 5). Reactivity can be reduced by concealed observation. Using small cameras and microphones can make the observation unobtrusive, even in situations in which the participant has been informed of the recording. Also, reactivity can be reduced by allowing time for people to become used to the observer and equipment.

Reliability. Recall from Chapter 5 that reliability refers to the degree to which a measurement reflects a true score rather than measurement error. Reliable measures are stable, consistent, and precise. When conducting systematic observation, two or more raters are usually used to code behavior. Reliability is indicated by a high agreement among the raters. Very high levels of agreement are reported in virtually all published research using systematic observation (generally 80% agreement or higher). For some large-scale research programs in which many observers will be employed over a period of years, observers are first trained using videotapes, and their observations during training are checked for agreement with results from previous observers.
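One simple way to express agreement between two raters is the percentage of intervals on which they assigned the same code. The sketch below is only an illustration under that assumption; the text does not specify how agreement was computed in any particular study, and the function name and rating lists are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Return the percentage of observations on which two raters agree."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same number of intervals.")
    if not rater_a:
        raise ValueError("At least one coded interval is required.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical codes from two raters for the same 10 intervals (not real data).
rater_a = ["parallel", "group", "group", "solitary", "unoccupied",
           "together", "parallel", "group", "solitary", "together"]
rater_b = ["parallel", "group", "group", "solitary", "unoccupied",
           "together", "parallel", "group", "solitary", "parallel"]

print(percent_agreement(rater_a, rater_b))  # 90.0
```

Here the raters disagree on only the last interval, so agreement is 90%, above the 80% level mentioned above.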

Sampling. For many research questions, samples of behavior taken over an extended period provide more accurate and useful data than single, short observations. Consider a study on the behaviors of nursing home residents and staff during meals (Stabell, Eide, Solheim, Solberg, & Rustoen, 2004). The researchers were interested in the frequency of different resident behaviors such as independent eating, socially engaged eating, and dependent eating in which help is needed. The staff behaviors included supporting the behaviors of the residents (e.g., assisting, socializing). The researchers could have made observations during a single meal or two meals during a single day. However, such data might be distorted by short-term trends: the particular meal being served, an illness, or a recent event such as a death among the residents. The researchers instead sampled behaviors during breakfast and lunch over a period of 6 weeks. Each person was randomly chosen to be observed for a 3-minute period during both meals on 10 of the days of the study. A major finding was that the staff members were most frequently engaged in supporting dependent behavior with little time spent supporting independent behaviors such as socializing. Interestingly, part-time nursing student staff were more likely to support independence.
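A sampling schedule of this kind could be drawn up along the following lines. This is a rough sketch, not the procedure the researchers used: it assumes 42 possible observation days (6 weeks of 7 days, which the text does not confirm), and the function name, resident labels, and seed are hypothetical.

```python
import random


def observation_days(people, n_study_days=42, n_observed=10, seed=0):
    """Randomly pick the days on which each person will be observed.

    Assumption: every day of the 6-week period is eligible (42 days).
    """
    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    all_days = range(1, n_study_days + 1)
    return {
        person: sorted(rng.sample(all_days, n_observed))
        for person in people
    }


# Hypothetical resident labels (not real participants).
schedule = observation_days(["resident_1", "resident_2", "resident_3"])
for person, days in schedule.items():
    print(person, days)
```

Spreading each person's 10 observation days at random across the study period is what guards against the short-term distortions described above.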


A case study is an observational method that provides a description of an individual. This individual is usually a person, but it may also be a setting such as a business, school, or neighborhood. A naturalistic observation study is sometimes called a case study, and in fact the naturalistic observation and case study approaches sometimes overlap. We have included case studies as a separate category in this chapter because case studies do not necessarily involve naturalistic observation. Instead, the case study may be a description of a patient by a clinical psychologist or a historical account of an event such as a model school that failed. A psychobiography is a type of case study in which a researcher applies psychological theory to explain the life of an individual, usually an important historical figure (Schultz, 2005). Thus, case studies may use such techniques as library research and telephone interviews with persons familiar with the case but no direct observation at all (Yin, 2014).

Depending on the purpose of the investigation, the case study may present the individual’s history, symptoms, characteristic behaviors, reactions to situations, or responses to treatment. Typically, a case study is done when an individual possesses a particularly rare, unusual, or noteworthy condition. One famous case study involved a man with an amazing ability to recall information (Luria, 1968). The man, called “S.,” could remember long lists and passages with ease, apparently using mental imagery for his memory abilities. Luria also described some of the drawbacks of S.’s ability. For example, he frequently had difficulty concentrating because mental images would spontaneously appear and interfere with his thinking.

Another case study example concerns language development; it was provided by “Genie,” a child who was kept isolated in her room, tied to a chair, and never spoken to until she was discovered at the age of 13½ (Curtiss, 1977). Genie, of course, lacked any language skills. Her case provided psychologists and linguists with the opportunity to attempt to teach her language skills and discover which skills could be learned. Apparently, Genie was able to acquire some rudimentary language skills, such as forming childlike sentences, but she never developed full language abilities.

Individuals with particular types of brain damage can allow researchers to test hypotheses (Stone, Cosmides, Tooby, Kroll, & Knight, 2002). The individual in their study, R.M., had extensive limbic system damage. The researchers were interested in studying the ability to detect cheaters in social exchange relationships. Social exchange is at the core of our relationships: One person provides goods or services for another person in exchange for some other resource. Stone et al. were seeking evidence that social exchange can evolve in a species only when there is a biological mechanism for detecting cheaters; that is, those who do not reciprocate by fulfilling their end of the bargain. R.M. completed two types of reasoning problems. One type involved detecting violations of social exchange rules (e.g., you must fulfill a requirement if you receive a particular benefit); the other type focused on nonsocial precautionary action rules (e.g., you must take this precaution if you engage in a particular hazardous behavior). Individuals with no brain injury do equally well on both types of measures. However, R.M. performed very poorly on the social exchange problems but did well on the precautionary problems, as well as other general measures of cognitive ability. This finding supports the hypothesis that our ability to engage in social exchange relationships is grounded in the development of a biological mechanism that differs from general cognitive abilities.

Case studies are valuable in informing us of conditions that are rare or unusual and thus providing unique data about some psychological phenomenon, such as memory, language, or social exchange. Insights gained through a case study may also lead to the development of hypotheses that can be tested using other methods.


Archival research involves using previously compiled information to answer research questions. The researcher does not actually collect the original data. Instead, he or she analyzes existing data such as statistics that are part of public records (e.g., number of divorce petitions filed), reports of anthropologists, the content of letters to the editor, or information contained in databases. Judd, Smith, and Kidder (1991) distinguish among three types of archival research data: statistical records, survey archives, and written records.

Statistical Records

Statistical records are collected by many public and private organizations. The U.S. Census Bureau maintains the most extensive set of statistical records available, but state and local agencies also maintain such records. In a study using public records, Bushman, Wang, and Anderson (2005) examined the relationship between temperature and aggression. They used temperature data in Minneapolis that were recorded in 3-hour periods in 1987 and 1988; data on assaults were available through police records. They found that higher temperature is related to more aggression; however, this effect was limited to data recorded between 9:00 p.m. and 3:00 a.m.

There are also numerous less obvious sources of statistical records, including public health statistics, test score records kept by testing organizations such as the Educational Testing Service, and even sports organizations. Major League Baseball is known for the extensive records that are kept on virtually every aspect of every game and every player. Abel and Kruger (2010) took advantage of this fact to investigate the relationship between positive emotions and longevity. They began with photographs of 230 major league players published in 1952. The photographs were then rated for smile intensity to provide a measure of emotional positivity. The longevity of players who had died by the end of 2009 was then examined in relation to smile intensity. The results indicated that these two variables are indeed related. Further, ratings of attractiveness were unrelated to longevity.

Survey Archives

Survey archives consist of data from surveys that are stored on computers and available to researchers who wish to analyze them. Major polling organizations make many of their surveys available. Also, many universities are part of the Inter-university Consortium for Political and Social Research (ICPSR; http://www.icpsr.umich.edu/), which makes survey archive data available. One very useful data set is the General Social Survey (GSS; see their website at http://www3.norc.org/GSS+Website/), a series of surveys funded by the National Science Foundation. Each survey includes over 200 questions covering a range of topics such as attitudes, life satisfaction, health, religion, education, age, gender, and race.

Survey archives are now becoming available online at sites that enable researchers to analyze the data online. Survey archives are extremely important because most researchers do not have the financial resources to conduct surveys of randomly selected national samples; the archives allow them to access such samples to test their ideas. A study by Robinson and Martin (2009) illustrates how the GSS can be used to test hypotheses. The study examined whether Internet users differed from nonusers in their social attitudes. Clearly, the findings would have implications for interpreting the results of surveys conducted via the Internet. The results showed that although Internet users were somewhat more optimistic, there were no systematic differences between those who use and do not use the Internet.

Written and Mass Communication Records

Written records are documents such as diaries and letters that have been preserved by historical societies, ethnographies of other cultures written by anthropologists, and public documents as diverse as speeches by politicians or discussion board messages left by Internet users. Mass communication records include books, magazine articles, movies, television programs, and newspapers.

An example of archival research using such records is a study of 487 anti-smoking ads that was conducted by Rhodes, Roskos-Ewoldsen, Eno, and Monahan (2009). They found that there was an increasing number of ads attacking the tobacco industry over time and that many of the ads emphasized the negative health impact of smoking. However, few ads attacked claims for the benefits of smoking such as stress reduction or preventing weight gain.

Content analysis is the systematic analysis of existing documents. Like systematic observation, content analysis requires researchers to devise coding systems that raters can use to quantify the information in the documents. Sometimes the coding is quite simple and straightforward; for example, it is easy to code whether the addresses of the applicants on marriage license applications are the same or different. More often, the researcher must define categories in order to code the information. In the study of smoking ads, researchers had to define categories to describe the ads, for example, attacks tobacco companies or causes cancer. Similar procedures would be used in studies examining archival documents such as speeches, magazine articles, television shows, and reader comments on articles published on the Internet.
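To make the idea of a coding system concrete, here is a minimal sketch in Python. It assumes two categories modeled on the smoking-ad example, and it uses simple keyword matching as a stand-in for trained human raters. The category names, keywords, and ad texts are all hypothetical and are not taken from the Rhodes et al. study.

```python
from collections import Counter

# Hypothetical coding scheme: category -> keywords a rater might look for.
CODING_SCHEME = {
    "attacks_tobacco_companies": ["industry", "companies", "executives"],
    "causes_cancer": ["cancer", "tumor"],
}

def code_document(text, scheme=CODING_SCHEME):
    """Return the set of categories whose keywords appear in the text."""
    lowered = text.lower()
    return {
        category
        for category, keywords in scheme.items()
        if any(keyword in lowered for keyword in keywords)
    }

def tally(documents, scheme=CODING_SCHEME):
    """Count how many documents were assigned to each category."""
    counts = Counter()
    for document in documents:
        counts.update(code_document(document, scheme))
    return counts

# Invented ad texts, for illustration only.
ads = [
    "Tobacco companies lied to you for decades.",
    "Smoking causes lung cancer.",
    "The industry hides that cigarettes cause cancer.",
]

counts = tally(ads)
print(counts["attacks_tobacco_companies"])  # 2
print(counts["causes_cancer"])              # 2
```

In practice the categories would be defined in a written codebook and applied by two or more raters so that their agreement can be checked; the keyword lists here only illustrate how explicit category definitions turn documents into counts.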

The use of archival data allows researchers to study interesting questions, some of which could not be studied in any other way. Archival data are a valuable supplement to more traditional data collection methods. There are at least two major problems with the use of archival data, however. First, the desired records may be difficult to obtain: They may be placed in long-forgotten storage places, or they may have been destroyed. Second, we can never be completely sure of the accuracy of information collected by someone else.

This chapter has provided a great deal of information about important qualitative and quantitative observational methods that can be used to study a variety of questions about behavior. In the next chapter, we will explore a very common way of finding out about human behavior—simply asking people to use self-reports to tell us about themselves.



Happiness, according to Aristotle, is the most desirable of all things. In the past few decades, many researchers have been studying predictors of happiness in an attempt to understand the construct.

Mehl, Vazire, Holleran, and Clark (2010) conducted a naturalistic observation on the topic of happiness using electronically activated recorders (devices that unobtrusively record snippets of sound at regular intervals, for a fixed amount of time). In this study, 79 undergraduate students wore the device for 4 days; 30-second recordings were made every 12.5 minutes. Each snippet was coded as having been taken while the participant was alone or with people. If the participant was with somebody, the recordings were also coded for “small talk” and “substantive talk.” Other measures administered were well-being and happiness.

First, acquire and read the article:

Mehl, M. R., Vazire, S., Holleran, S. E., & Clark, C. S. (2010). Eavesdropping on happiness: Well-being is related to having less small talk and more substantive conversations. Psychological Science, 21, 539–541. doi:10.1177/0956797610362675

Then, after reading the article, consider the following:

1. What is the research question for this study?

2. Is the basic approach in this study qualitative or quantitative?

3. Is this study an example of concealed or nonconcealed observation? What are the ethical issues present in this study?

4. Do you think that participants would be reactive to this data collection method?

5. How reliable were the coders? How did the authors assess their reliability?

6. How did the researchers operationally define small talk, substantive talk, well-being, and happiness? What do you think about the quality of these operational definitions?

7. Does this study suffer from the problem involving the direction of causation (p. 79)? How so?

8. Does this study suffer from the third-variable problem (p. 83)? How so?

9. Do you think that this study included any confounding variables? Provide examples.

10. Given the topic of this study, what other ways can you think of to conduct this study using an observational method?


Study Terms

Archival research (p. 126)

Case study (p. 125)

Coding system (p. 123)

Content analysis (p. 128)

Naturalistic observation (p. 119)

Participant observation (p. 121)

Psychobiography (p. 125)

Reactivity (p. 124)

Systematic observation (p. 123)

Review Questions

1. What is naturalistic observation? How does a researcher collect data when conducting naturalistic observation research?

2. Why are the data in naturalistic observation research primarily qualitative?

3. Distinguish between participant and nonparticipant observation; between concealed and nonconcealed observation.

4. What is systematic observation? Why are the data from systematic observation primarily quantitative?

5. What is a coding system? What are some important considerations when developing a coding system?

6. What is a case study? When are case studies used? What is a psychobiography?

7. What is archival research? What are the major sources of archival data?

8. What is content analysis?


1. Some questions are more readily answered using quantitative techniques, and others are best addressed through qualitative techniques or a combination of both approaches. Suppose you are interested in how a parent’s alcoholism affects the life of an adolescent. Develop a research question best answered using quantitative techniques and another research question better suited to qualitative techniques. A quantitative question is, “Are adolescents with alcoholic parents more likely to have criminal records?” and a qualitative question is, “What issues do alcoholic parents introduce in their adolescent’s peer relationships?”

2. Devise a simple coding system to do a content analysis of print advertisements in popular magazines. Begin by examining the ads to choose the content dimensions you wish to use (e.g., gender). Apply the system to an issue of a magazine and describe your findings.

3. Read each scenario below and determine whether a case study, naturalistic observation, systematic observation, or archival research was used.



  • Discuss reasons for conducting survey research.
  • Identify factors to consider when writing questions for interviews and questionnaires, including defining research objectives and question wording.
  • Describe different ways to construct questionnaire responses, including closed-ended questions, open-ended questions, and rating scales.
  • Compare the two ways to administer surveys: written questionnaires and oral interviews.
  • Define interviewer bias.
  • Describe a panel study.
  • Distinguish between probability and nonprobability sampling techniques.
  • Describe simple random sampling, stratified random sampling, and cluster sampling.
  • Describe haphazard sampling, purposive sampling, and quota sampling.
  • Describe the ways that samples are evaluated for potential bias, including sampling frame and response rate.


SURVEY RESEARCH EMPLOYS QUESTIONNAIRES AND INTERVIEWS TO ASK PEOPLE TO PROVIDE INFORMATION ABOUT THEMSELVES— their attitudes and beliefs, demographics (age, gender, income, marital status, and so on) and other facts, and past or intended future behaviors. In this chapter we will explore methods of designing and conducting surveys, including sampling techniques.


Surveys are a research tool that is used to ask people to tell us about themselves. They have become extremely important as society demands data about issues rather than only intuition and anecdotes.

Surveys are being conducted all the time. Just look at your daily newspaper, local TV news broadcast, or the Internet. The Centers for Disease Control and Prevention is reporting results of a survey of new mothers asking about breast-feeding. A college survey center is reporting the results of a telephone survey asking about political attitudes. If you look around your campus, you will find academic departments conducting surveys of seniors or recent graduates. If you make a major purchase, you will likely receive a request to complete a survey that asks about your satisfaction. If you visit the American Psychological Association website, you can read a report called Stress in America that presents the results of an online survey of over 1,300 adults that was conducted in 2010.

Surveys are clearly a common and important method of studying behavior. Every university needs data from graduates to help determine changes that should be made to the curriculum and student services. Auto companies want data from buyers to assess and improve product quality and customer satisfaction. Without collecting such data, we are totally dependent upon stories we might hear or letters that a graduate or customer might write. Other surveys can be important for making public policy decisions by lawmakers and public agencies. In research, many important variables—including attitudes, current emotional states, and self-reports of behaviors—are most easily studied using questionnaires or interviews.

We often think of survey data providing a snapshot of how people think and behave at a given point in time. However, the survey method is also an important way for researchers to study relationships among variables and ways that attitudes and behaviors change over time. For example, the Monitoring the Future project (http://monitoringthefuture.org) has been conducted every year since 1975—its purpose is to monitor the behaviors, attitudes, and values of American high school and college students. Each year, 50,000 8th, 10th, and 12th grade students participate in the survey. Figure 7.1 shows a typical finding: Each line on the graph represents the percentage of survey respondents who reported using marijuana in the past 12 months. Note the trend that shows the peak of marijuana popularity occurring in the late 1970s and the least reported use in the early 1990s. Recent years have seen a steady increase in use though.




FIGURE 7.1 Percentage of survey respondents who reported using marijuana in the past 12 months, over time

Adapted from Monitoring the Future, http://monitoringthefuture.org/data/10data/fig10_3.pdf

Survey research is also important as a complement to experimental research findings. Recall from Chapter 2 that Winograd and Soloway (1986) conducted experiments on the conditions that lead to forgetting where we place something. To study this topic using survey methods, Brown and Rahhal (1994) asked both younger and older adults about their actual experiences when they hid something and later forgot its location. They reported that older adults take longer than younger adults to find the object and that older adults hide objects from potential thieves, whereas younger people hide things from friends and relatives. Interestingly, most lost objects are eventually found, usually by accident in a location that had been searched previously. This research illustrates a point made in previous chapters that multiple methods are needed to understand any behavior.

An assumption that underlies the use of questionnaires and interviews is that people are willing and able to provide truthful and accurate answers. Researchers have addressed this issue by studying possible biases in the way people respond. A response set is a tendency to respond to all questions from a particular perspective rather than to provide answers that are directly related to the questions. Thus, response sets can affect the usefulness of data obtained from self-reports. The most common response set is called social desirability, or “faking good.” The social desirability response set leads the individual to answer in the most socially acceptable way—the way that “most people” are perceived to respond or the way that would reflect most favorably on the person. Thus, a social desirability response set might lead a person to underreport undesirable behaviors (e.g., alcohol or drug use) and overreport positive behaviors (e.g., amount of exercise). However, it should not be assumed that people consistently misrepresent themselves. If the researcher openly and honestly communicates the purposes and uses of the research, promises to provide feedback about the results, and assures confidentiality, then the participants can reasonably be expected to give honest responses.

We turn now to the major considerations in survey research: constructing the questions that are asked, choosing the methods for presenting the questions, and sampling the individuals taking part in the research.


A great deal of thought must be given to writing questions for questionnaires and interviews. This section describes some of the most important factors to consider when constructing questions.

Defining the Research Objectives

When constructing questions for a survey, the first thing the researcher must do is explicitly determine the research objectives: What is it that he or she wishes to know? The survey questions must be tied to the research questions that are being addressed. Too often, surveys get out of hand when researchers begin to ask any question that comes to mind about a topic without considering exactly what useful information will be gained by doing so. This process will usually require the researcher to decide on the type of questions to ask. There are three general types of survey questions (Judd, Smith, & Kidder, 1991).

Attitudes and beliefs Questions about attitudes and beliefs focus on the ways that people evaluate and think about issues. Should more money be spent on mental health services? Are you satisfied with the way that police responded to your call? How do you evaluate this instructor?

Facts and demographics Factual questions ask people to indicate things they know about themselves and their situation. In most studies, asking some demographic information is necessary to adequately describe your sample; thus, questions about age, gender, and ethnicity are typically asked. Depending on the topic of the study, questions on marital status, employment status, and number of children might be included. Obviously, if you are interested in making comparisons among groups, such as males and females, you must ask the relevant information about group membership. You may also need such information to adequately describe the sample. It is unwise and even unethical to ask people to respond to questions if you have no real reason to use the information, however.

Other factual information you might ask will depend on the topic of your survey. Each year, Consumer Reports magazine asks readers to tell them about the repairs that have been necessary on many of the products that the readers owned, such as cars and dishwashers. Factual questions about illnesses and other medical information would be asked in a survey of health and quality of life.

Behaviors Other survey questions can focus on past behaviors or intended future behaviors. How many days last week did you exercise for 20 minutes or longer? How many children do you plan to have? Have you ever been so depressed that you called in sick to work?

Question Wording

A great deal of care is necessary to write the very best questions for a survey. Cognitive psychologists have identified a number of potential problems with question wording (see Graesser, Kennedy, Wiemer-Hastings, & Ottati, 1999). Many of the problems stem from a difficulty with understanding the question, including (a) unfamiliar technical terms, (b) vague or imprecise terms, (c) ungrammatical sentence structure, (d) phrasing that overloads working memory, and (e) embedding the question with misleading information. Here is a question that illustrates some of the problems identified by Graesser et al.:

Did your mother, father, full-blooded sisters, full-blooded brothers, daughters, or sons ever have a heart attack or myocardial infarction?

This is an example of memory overload because of the length of the question and the need to keep track of all those relatives while reading the question. The respondent must also worry about two different diagnoses with regard to each relative. Further, the term myocardial infarction may be unfamiliar to most people. How do you write questions to avoid such problems? The following items are important to consider when you are writing questions.

Simplicity The questions asked in a survey should be relatively simple. People should be able to easily understand and respond to the questions. Avoid jargon and technical terms that people will not understand. Sometimes, however, you have to make the question a bit more complex—or longer—to make it easier to understand. Usually this occurs when you need to define a term or describe an issue prior to asking the question. Thus, before asking whether someone approves of Proposition J, you will probably want to provide a brief description of the content of this ballot measure. Likewise, if you want to know about the frequency of alcohol use in a population, asking, “Have you had a drink of alcohol in the past 30 days?” may generate a slightly different answer than “Have you had a drink of alcohol (meaning one full can of beer, shot of liquor, or glass of wine) in the past 30 days?” The latter case is probably closer to what you would be interested in knowing.

Double-barreled questions Avoid double-barreled questions that ask two things at once. A question such as, “Should senior citizens be given more money for recreation centers and food assistance programs?” is difficult to answer because it taps two potentially very different attitudes. If you are interested in both issues, ask two questions.

Loaded questions A loaded question is written to lead people to respond in one way. For example, the questions “Do you favor eliminating the wasteful excesses in the public school budget?” and “Do you favor reducing the public school budget?” will likely elicit different answers. Or consider that women are less likely to say they have been raped than forced to have unwanted sex (Hamby & Koss, 2003). Questions that include emotionally charged words such as rape, waste, immoral, ungodly, or dangerous influence the way that people respond and thus lead to biased conclusions; more neutral, behavior-based terminology is preferable.

Negative wording Avoid phrasing questions with negatives. This question is phrased negatively: “Do you feel that the city should not approve the proposed women’s shelter?” Agreement with this question means disagreement with the proposal. This phrasing can confuse people and result in inaccurate answers. A better format would be: “Do you believe that the city should approve the proposed women’s shelter?”

“Yea-saying” and “nay-saying” When you ask several questions about a topic, a respondent may employ a response set to agree or disagree with all the questions. Such a tendency is referred to as “yea-saying” or “nay-saying.” The problem here is that the respondent may in fact be expressing true agreement, but alternatively may simply be agreeing with anything you say. One way to detect this response set is to word the questions so that consistent agreement is unlikely. For example, a study of family communication patterns might ask people how much they agree with the following statements: “The members of my family spend a lot of time together” and “I spend most of my weekends with friends.” Similarly, a measure of loneliness could phrase some questions so that agreement means the respondent is lonely (“I feel isolated from others”) and others with the meaning reversed so that disagreement indicates loneliness (e.g., “I feel part of a group of friends”). Although it is possible that someone could legitimately agree with both items, consistently agreeing or disagreeing with a set of related questions phrased in both standard and reversed formats is an indicator that the individual is “yea-saying” or “nay-saying.”
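The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: responses are on a 1 to 5 scale where 5 means "strongly agree," the cutoffs for counting a response as agreement or disagreement are arbitrary, and the function names and respondent data are hypothetical.

```python
# Assumption: responses use a 1-5 scale where 5 = strongly agree.
SCALE_MIN, SCALE_MAX = 1, 5

def reverse_score(response):
    """Flip a reversed item so that a high score always means 'lonely'."""
    return SCALE_MAX + SCALE_MIN - response

def response_set(standard, reversed_items, agree_at=4, disagree_at=2):
    """Label a respondent who agrees (or disagrees) with every raw item.

    Agreeing with both standard and reversed items is contradictory,
    so uniform answers suggest a response set rather than a true attitude.
    """
    raw = list(standard) + list(reversed_items)
    if all(r >= agree_at for r in raw):
        return "yea-saying"
    if all(r <= disagree_at for r in raw):
        return "nay-saying"
    return None

def loneliness_score(standard, reversed_items):
    """Average the items after reverse-scoring the reversed ones."""
    scored = list(standard) + [reverse_score(r) for r in reversed_items]
    return sum(scored) / len(scored)

# Hypothetical respondents: (standard items, reversed items).
# Standard item example: "I feel isolated from others."
# Reversed item example: "I feel part of a group of friends."
consistent = ([5, 4], [1, 2])
agrees_with_all = ([5, 4], [5, 4])

print(response_set(*consistent))       # None
print(response_set(*agrees_with_all))  # yea-saying
print(loneliness_score(*consistent))   # 4.5
```

A flag from a check like this is only a prompt to look more closely, since, as noted above, a person could legitimately agree with a pair of items.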


TABLE 7.1 Question wording: What is the problem?


Graesser and his colleagues have developed a computer program called QUAID (Question Understanding Aid) that analyzes question wording. Researchers can try out their questions online at the QUAID website (http://mnemosyne.csl.psyc.memphis.edu/quaid/quaidindex.html). You can test your own analysis of question wording using the examples in Table 7.1.


Closed- Versus Open-Ended Questions

Questions may be either closed- or open-ended. With closed-ended questions, a limited number of response alternatives are given; with open-ended questions, respondents are free to answer in any way they like. Thus, you could ask a person, “What is the most important thing children should learn to prepare them for life?” followed by a list of answers from which to choose (a closed-ended question), or you could leave this question open-ended for the person to provide the answer.

Using closed-ended questions is a more structured approach; they are easier to code and the response alternatives are the same for everyone. Open-ended questions require time to categorize and code the responses and are therefore more costly to conduct and more difficult to interpret. Sometimes a respondent’s response cannot be categorized at all because the response does not make sense or the person could not think of an answer.

Still, an open-ended question can yield valuable insights into what people are thinking. Open-ended questions are most useful when the researcher needs to know what people are thinking and how they naturally view their world; closed-ended questions are more likely to be used when the dimensions of the variables are well defined.

Schwarz (1999) points out that the two approaches can sometimes lead to different conclusions. He cites the results of a survey question about preparing children for life. When “To think for themselves” was one alternative in a closed-ended list, 62% chose this option; however, only 5% gave this answer when the open-ended format was used. This finding points to the need to have a good understanding of the topic when asking closed-ended questions.

Number of Response Alternatives

With closed-ended questions, there are a fixed number of response alternatives. In public opinion surveys, a simple “yes or no” or “agree or disagree” dichotomy is often sufficient. In other research, it is often preferable to provide more quantitative distinctions—for example, a 5- or 7-point scale ranging from strongly agree to strongly disagree or very positive to very negative. Such a scale might appear as follows:

Strongly agree _____ _____ _____ _____ _____ _____ _____ Strongly disagree

Rating Scales

Rating scales such as the one shown above are very common in many areas of research. Rating scales ask people to provide “how much” judgments on any number of dimensions—amount of agreement, liking, or confidence, for example. Rating scales can have many different formats. The format that is used depends on factors such as the topic being investigated. Perhaps the best way to gain an understanding of the variety of formats is simply to look at a few examples. The simplest and most direct scale presents people with five or seven response alternatives with the endpoints on the scale labeled to define the extremes. The response choices might be lines to mark on a paper questionnaire, check boxes, or radio buttons in an online survey form. For example,

Students at the university should be required to pass a comprehensive examination to graduate.


How confident are you that the defendant is guilty of attempted murder?


Graphic rating scale A graphic rating scale requires a mark along a continuous 100-millimeter line that is anchored with descriptions at each end.


A ruler is then placed on the line to obtain the score on a scale that ranges from 0 to 100.

Semantic differential scale The semantic differential scale is a measure of the meaning of concepts that was developed by Osgood and his associates (Osgood, Suci, & Tannenbaum, 1957). Respondents are asked to rate any concept—persons, objects, behaviors, ideas—on a series of bipolar adjectives using 7-point scales, as follows:


Research on the semantic differential shows that virtually anything can be measured using this technique. Ratings of specific things (marijuana), places (the student center), people (the governor, accountants), ideas (death penalty, marriage equality), and behaviors (attending church, using public transit) can be obtained. A large body of research shows that the concepts are rated along three basic dimensions: the first and most important is evaluation (e.g., adjectives such as good–bad, wise–foolish, kind–cruel); the second is activity (active–passive, slow–fast, excitable–calm); and the third is potency (weak–strong, hard–soft, large–small).

Nonverbal scales for children Young children may not understand the types of scales we have just described, but they are able to give ratings. Think back to the example in Chapter 4 (page 75) that uses drawings of faces to aid in the assessment of the level of pain that a child is experiencing. Similar face scales can be used to ask children to make ratings of other things such as a toy.

Labeling Response Alternatives

The examples thus far have labeled only the endpoints on the rating scale. Respondents decide the meaning of the response alternatives that are not labeled. This is a reasonable approach, and people are usually able to use such scales without difficulty. Sometimes researchers need to provide labels to more clearly define the meaning of each alternative. Here is a fairly standard alternative to the agree-disagree scale shown above:


This type of scale assumes that the middle alternative is a “neutral” point half-way between the endpoints. Sometimes, however, a perfectly balanced scale may not be possible or desirable. Consider a scale asking a college professor to rate a student for a job or graduate program. This particular scale asks for comparative ratings of students:

In comparison with other graduates, how would you rate this student’s potential for success?


Notice that most of the alternatives ask people to make a rating within the top 25% of students. This is done because students who apply for such programs tend to be very bright and motivated, and so professors rate them favorably. The wording of the alternatives attempts to force the raters to make finer distinctions among generally very good students.

Labeling alternatives is particularly interesting when asking about the frequency of a behavior. For example, you might ask, “How often do you exercise for at least 20 minutes?” What kind of scale should you use to let people answer this question? You could list (1) never, (2) rarely, (3) sometimes, (4) frequently. These terms convey your meaning but they are vague. Here is another set of alternatives with greater specificity (Schwarz, Knauper, Oyserman, & Stich, 2008):

  • less than twice a week
  • about twice a week
  • about four times a week
  • about six times a week
  • at least once each day

A different scale might be:

  • less than once per month
  • about once a month
  • about once every 2 weeks
  • about once a week
  • more than once per week

Schwarz et al. (2008) call the first scale a high-frequency scale because most alternatives indicate a high frequency of exercise. The other scale is referred to as low frequency. Schwarz et al. point out that the labels should be chosen carefully because people may interpret the meaning of the scale differently, depending on the labels used. If you were actually asking the exercise question, you might decide on alternatives different from the ones described here. Moreover, your choice should be influenced by factors such as the population you are studying. If you are studying people who generally exercise a lot, you will be more likely to use a higher-frequency scale than you would if you were studying people who generally do not exercise a great deal.


Formatting the Questionnaire

The printed questionnaire should appear attractive and professional. It should be neatly typed and free of spelling errors. Respondents should find it easy to identify the questions and the response alternatives to the questions. Leave enough space between questions so people do not become confused when reading the questionnaire. If you have a particular scale format, such as a 5-point rating scale, use it consistently. Do not change from 5- to 4- to 7-point scales, for example.

It is also a good idea to carefully consider the sequence in which you will ask your questions. In general, it is best to ask the most interesting and important questions first to capture the attention of your respondents and motivate them to complete the survey. Roberson and Sundstrom (1990) obtained the highest return rates in an employee attitude survey when important questions were presented first and demographic questions were asked last. In addition, it is a good idea to group questions together when they address a similar theme or topic. Doing so will make your survey appear more professional, and your respondents will be more likely to take it seriously.

Refining Questions

Before actually administering the survey, it is a good idea to give the questions to a small group of people and have them think aloud while answering them. The participants might be chosen from the population being studied, or they could be friends or colleagues who can give reasonable responses to the questions. For the think-aloud procedure, you will need to ask the individuals to tell you how they interpret each question and how they respond to the response alternatives. This procedure can provide valuable information that you can use to improve the questions. (The importance of pilot studies such as this is discussed further in Chapter 9.)


There are two ways to administer surveys. One is to use a written questionnaire, either printed or online, wherein respondents read the questions and indicate their responses on a form. The other way is to use an interview format. An interviewer asks the questions and records the responses in a personal verbal interaction. Both questionnaires and interviews can be presented to respondents in several ways. Let’s examine the various methods of administering surveys.


With questionnaires, the questions are presented in written format and the respondents write their answers. There are several positive features of using questionnaires. First, they are generally less costly than interviews. They also allow the respondent to be completely anonymous as long as no identifying information (e.g., name, Social Security number, or driver’s license number) is asked. However, questionnaires require that the respondents be able to read and understand the questions. In addition, many people find it boring to sit by themselves reading questions and then providing answers; thus, a problem of motivation may arise. Questionnaires can be administered in person to groups or individuals, through the mail, on the Internet, and with other technologies.

Personal administration to groups or individuals Often researchers are able to distribute questionnaires to groups of individuals. This might be a college class, parents attending a school meeting, people attending a new employee orientation, or students waiting for an appointment with an advisor. An advantage of this approach is that you have a captive audience that is likely to complete the questionnaire once they start it. Also, the researcher is present so people can ask questions if necessary.

Mail surveys Surveys can be mailed to individuals at a home or business address. This is a very inexpensive way of contacting the people who were selected for the sample. However, the mail format is a drawback because of potentially low response rates: The questionnaire can easily be placed aside and forgotten among all the other tasks that people must attend to at home and work. Even if people start to fill out the questionnaire, something may happen to distract them, or they may become bored and simply throw the form in the trash. Some of the methods for increasing response rates are described later in this chapter. Another drawback is that no one is present to help if the person becomes confused or has a question about something.

Online surveys Online surveys are increasingly being used by academic researchers (Buchanan & Hvizdak, 2009). It is very easy to design a questionnaire for online administration using one of several online survey software services. Both open- and closed-ended questions can be included. After the questionnaire is completed, the responses are immediately available to the researcher. One of the first problems to consider is how to sample people: How does the researcher provide people with a link to the online survey? Major polling organizations have built databases of people interested in participating in surveys. Online survey software services have mailing lists that can be purchased. There are online special interest groups for people with a particular illness or of a particular age that may allow the researcher to post a recruitment message. One concern about online data collection is whether the results will be similar to what might be found using traditional methods. One particular issue is related to response rates (the percentage of people asked to complete a survey who actually complete it). One study found that online surveys had an 11% lower response rate than other strategies (Manfreda, Bosnjak, Berzelak, Haas, & Vehovar, 2008). This could directly impact the validity of the data generated by such a survey.

Relatedly, another problem with Internet data is the inherent ambiguity about the characteristics of the individuals providing information for the study. To meet ethical guidelines, the researcher will usually state that only persons 18 years of age or older are eligible; yet how is that controlled? People may also misrepresent their age, gender, or ethnicity. We simply do not know if this is a major problem. However, for most research topics it is unlikely that people will go to the trouble of misrepresenting themselves on the Internet to a greater extent than they would with any other method of collecting data. The ethical issues of Internet research are described in detail by Kraut et al. (2004), Buchanan and Hvizdak (2009), and Buchanan and Williams (2010).


The fact that an interview requires an interaction between people has important implications. First, people are often more likely to agree to answer questions for a real person than to answer a mailed questionnaire. Good interviewers become quite skilled in convincing people to participate. Thus, response rates tend to be higher when interviews are used. The interviewer and respondent often establish a rapport that helps motivate the person to answer all the questions and complete the survey. People are more likely to leave questions unanswered on a written questionnaire than in an interview. An important advantage of an interview is that the interviewer can clarify any problems the person might have in understanding questions. Further, an interviewer can ask follow-up questions if needed to help clarify answers.

One potential problem in interviews is called interviewer bias. This term describes all of the biases that can arise from the fact that the interviewer is a unique human being interacting with another human. Thus, one potential problem is that the interviewer could subtly bias the respondent’s answers by inadvertently showing approval or disapproval of certain answers. Interviewer characteristics such as race, sex, or age can influence responses, especially when asking about sensitive topics. Imagine how you might respond differently if a male or female interviewer is asking about your sexual history. Another problem is that interviewers may have expectations that could lead them to “see what they are looking for” in the respondents’ answers. Such expectations could bias their interpretations of responses or lead them to probe further for an answer from certain respondents but not from others—for example, when questioning Whites but not people from other groups or when testing boys but not girls. Careful screening and training of interviewers help to limit such biases.

We can now examine three methods of conducting interviews: face-to-face, telephone, and focus groups.

Face-to-face interviews Face-to-face interviews require that the interviewer and respondent meet to conduct the interview. Usually the interviewer travels to the person’s home or office, although sometimes the respondent goes to the interviewer’s office. Such interviews tend to be quite expensive and time-consuming. Therefore, they are most likely to be used when the sample size is fairly small and there are clear benefits to a face-to-face interaction.

Telephone interviews Almost all interviews for large-scale surveys are done via telephone. Telephone interviews are less expensive than face-to-face interviews, and they allow data to be collected relatively quickly because many interviewers can work on the same survey at once. Also, computerized telephone survey techniques lower the cost of telephone surveys by reducing labor and data analysis costs. With a computer-assisted telephone interview (CATI) system, the interviewer’s questions are prompted on the computer screen, and the data are entered directly into the computer for analysis.

Focus group interviews An interview strategy that is often used in industry is the focus group interview. A focus group is an interview with a group of about 6 to 10 individuals brought together for a period of usually 2–3 hours. Virtually any topic can be explored in a focus group. Often the group members are selected because they have a particular knowledge or interest in the topic. Because the focus group requires people to both spend time and incur some costs traveling to the focus group location, participants usually receive some sort of monetary or gift incentive.

The questions tend to be open-ended, and they are asked of the whole group. An advantage here is that group interaction is possible: People can respond to one another, and one comment can trigger a variety of responses. The interviewer must be skilled in working with groups both to facilitate communication and to deal with problems that may arise, such as one or two persons trying to dominate the discussion or hostility between group members.

The group discussion is usually recorded and may be transcribed. The tapes and transcripts are then analyzed to find themes and areas of group consensus and disagreement. Sometimes the transcripts are analyzed with a computer program to search for certain words and phrases. Researchers usually prefer to conduct at least two or three discussion groups on a given topic to make sure that the information gathered is not unique to one group of people. However, because each focus group is time-consuming and costly and provides a great deal of information, researchers do not conduct very many such groups on any one topic.


Surveys most frequently study people at one point in time. On many occasions, however, researchers wish to make comparisons over time. For example, local newspapers often hire firms to conduct an annual random survey of county residents. Because the questions are the same each year, it is possible to track changes over time in such variables as satisfaction with the area, attitudes toward the school system, and perceived major problems facing the county. Similarly, a large number of first-year students are surveyed each year at colleges throughout the United States to study changes in the composition, attitudes, and aspirations of this group (Pryor, Eagan, Blake, Hurtado, Berdan, & Case, 2012). First-year college students today, for instance, come from more ethnically diverse backgrounds than those in the 1970s (90.9% of respondents in 1971 were White, whereas 69.7% were in 2012). Political attitudes have also shifted over time among this group: Trends in opinions about paying taxes and abortion rights can be seen. Finally, the percentage of new students who think that their “emotional health” is above average or in the “top 10%” is at a 25-year low in 2012: In 1985, 64% of respondents reported good emotional health; in 2012, 52% of students did.

Another way to study changes over time is to conduct a panel study in which the same people are surveyed at two or more points in time. In a two-wave panel study, people are surveyed at two points in time; in a three-wave panel study, three surveys are conducted; and so on. Panel studies are particularly important when the research question addresses the relationship between one variable at “time 1” and another variable at some later “time 2.” For example, Chandra et al. (2008) examined the relationship between exposure to sexual content on television and teen pregnancy over time. Data were collected from over 2,000 teens over a 3-year period. Exposure to sexual content on television was assessed using a survey that asked the participants to report on their television viewing habits, along with their sexual knowledge, attitudes, and behavior. Participants were surveyed three times over the course of 3 years. Chandra and her colleagues found that higher levels of exposure to sexual content on television were, indeed, predictive of higher rates of teen pregnancy—as shown in Figure 7.2. Indeed, they reported that “high rates of exposure corresponded to twice the rate of observed pregnancies seen with low rates of exposure” (p. 1052).




Probability of pregnancy at “time 3” related to exposure to low, medium, or high levels of sexual content on television at “time 1”

Adapted from “Does watching sex on television predict teen pregnancy? Findings from a national longitudinal survey of youth,” by A. Chandra, S. C. Martino, R. L. Collins, M. N. Elliott, S. H. Berry, D. E. Kanouse, and A. Miu, 2008, Pediatrics, 122, pp. 1047–1054.


Most research projects involve sampling participants from a population of interest. The population is composed of all individuals of interest to the researcher. One population of interest in a large public opinion poll, for instance, might be all eligible voters in the United States. This implies that the population of interest does not include people under the age of 18, convicted prisoners, visitors from other countries, and anyone else not eligible to vote. You might conduct a survey in which your population consists of all students at your college or university. With enough time and money, a survey researcher could conceivably contact everyone in the population. The United States attempts to do this every 10 years with an official census of the entire population. With a relatively small population, you might find it easy to study the entire population.

In most cases, however, studying the entire population would be a massive undertaking. Fortunately, it can be avoided by selecting a sample from the population of interest. With proper sampling, we can use information obtained from the participants (or “respondents”) who were sampled to estimate characteristics of the population as a whole. Statistical theory allows us to infer what the population is like, based on data obtained from a sample (the logic underlying what is called statistical significance will be addressed in Chapter 13).

Confidence Intervals

When researchers make inferences about populations, they do so with a certain degree of confidence. Here is a statement that you might see when you read the results of a survey: “The results from the survey are accurate within ±3 percentage points, using a 95% level of confidence.” What does this tell you? Suppose you asked students to tell you whether they prefer to study at home or at school, and the survey results indicate that 61% prefer to study at home. Using the same degree of confidence, you would now know that the actual population value is probably between 58% and 64%. This is called a confidence interval—you can have 95% confidence that the true population value lies within this interval around the obtained sample result. Your best estimate of the population value is the sample value. However, because you have only a sample and not the entire population, your result may be in error. The confidence interval gives you information about the likely amount of the error. The formal term for this error is sampling error, although you are probably more familiar with the term margin of error. Recall the concept of measurement error discussed in Chapter 5. When you measure a single individual on a variable, the obtained score may deviate from the true score because of measurement error. Similarly, when you study one sample, the obtained result may deviate from the true population value because of sampling error.
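The arithmetic behind a statement such as "accurate within ±3 percentage points" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the common normal-approximation formula for a proportion, and the sample size of 1,000 is a hypothetical value chosen for the example (the text does not report how many students were surveyed).

```python
import math

def proportion_confidence_interval(p_hat, n, z=1.96):
    """Return (lower, upper, margin) for a sample proportion.

    Uses the normal-approximation interval; z = 1.96 corresponds
    to a 95% level of confidence.
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin, margin

# 61% prefer to study at home; n = 1,000 is an assumed sample size.
lower, upper, margin = proportion_confidence_interval(0.61, 1000)
print(f"margin of error: +/-{margin * 100:.1f} percentage points")
print(f"95% confidence interval: {lower * 100:.0f}% to {upper * 100:.0f}%")
```

With these assumed inputs the margin works out to about 3 percentage points, so the interval runs from roughly 58% to 64%, as in the example above.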

The surveys you often read about in newspapers and the previous example deal with percentages. What about questions that ask for more quantitative information? The logic in this instance is very much the same. For example, if you also ask students to report how many hours and minutes they studied during the previous day, you might find that the average amount of time was 76 minutes. A confidence interval could then be calculated based on the size of the sample; for example, the 95% confidence interval is 76 minutes plus or minus 10 minutes. It is highly likely that the true population value lies within the interval of 66 to 86 minutes. The topic of confidence intervals, including how to calculate them, is discussed again in Chapter 13.
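The same logic can be sketched for an average rather than a percentage. The standard deviation (51 minutes) and sample size (100 students) below are hypothetical values picked so that the margin comes out near the 10 minutes used in the example; the text itself reports neither. The sketch also assumes the normal approximation, which is reasonable only for fairly large samples.

```python
import math

def mean_confidence_interval(sample_mean, sample_sd, n, z=1.96):
    """Return (lower, upper) for a sample mean at roughly 95% confidence."""
    standard_error = sample_sd / math.sqrt(n)  # shrinks as n grows
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Mean study time of 76 minutes; SD and n are assumed for illustration.
low, high = mean_confidence_interval(76, 51, 100)
print(f"95% confidence interval: {low:.0f} to {high:.0f} minutes")
```

Because the standard error divides by the square root of the sample size, quadrupling the sample roughly halves the width of the interval.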

Sample Size

It is important to note that a larger sample size will reduce the size of the confidence interval. Although the size of the interval is determined by several factors, the most important is sample size. Larger samples are more likely to yield data that accurately reflect the true population value. This statement should make intuitive sense to you; a sample of 200 people from your school should yield more accurate data about your school than a sample of 25 people.


TABLE 7.2 Sample size and precision of population estimates (95% confidence level)


How large should the sample be? The sample size can be determined using a mathematical formula that takes into account the size of the confidence interval and the size of the population you are studying. Table 7.2 shows the sample size needed for a sample percentage to be accurate within plus or minus 3%, 5%, and 10%, given a 95% level of confidence. Note first that you need a larger sample size for increased accuracy. With a population size of 10,000, you need a sample of 370 for accuracy within ±5%; the needed sample size increases to 964 for accuracy within ±3%. Note that sample size is not a constant percentage of the population size. Many people believe that proper sampling requires a certain percentage of the population; these people often complain about survey results when they discover that a survey of an entire state was done with “only” 700 or 1,000 people. However, you can see in the table that the needed sample size does not change much, even as the population size increases from 5,000 to 100,000 or more. As Fowler (2014) notes, “a sample of 150 people will describe a population of 1,500 or 15 million with virtually the same degree of accuracy …” (p. 38).
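One widely used version of such a formula can be sketched as follows. It is an assumption that this particular formula was used for Table 7.2, although it does reproduce the two values quoted above for a population of 10,000. The sketch takes the most conservative case (a 50/50 split, p = 0.5) and rounds to the nearest whole person.

```python
def required_sample_size(population_size, margin_of_error, z=1.96, p=0.5):
    """Sample size for estimating a percentage in a finite population.

    z = 1.96 corresponds to 95% confidence; p = 0.5 is the most
    conservative assumption about the true percentage.
    """
    variance_term = z ** 2 * p * (1 - p)
    numerator = variance_term * population_size
    denominator = margin_of_error ** 2 * (population_size - 1) + variance_term
    return round(numerator / denominator)

for population in (5_000, 10_000, 100_000):
    sizes = [required_sample_size(population, e) for e in (0.03, 0.05, 0.10)]
    print(population, sizes)
```

Running the loop shows the pattern described in the text: tightening the margin of error raises the needed sample size sharply, while a twentyfold increase in population size changes it only modestly.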


There are two basic techniques for sampling individuals from a population: probability sampling and nonprobability sampling.

  • Probability sampling: Each member of the population has a specifiable probability of being chosen.
  • Nonprobability sampling: The probability of any particular member of the population being chosen is unknown.

Probability sampling is required when you want to make precise statements about a specific population on the basis of the results of your survey. Although nonprobability sampling is not as sophisticated as probability sampling, we shall see that nonprobability sampling is quite common and useful in many circumstances.

Probability Sampling

Simple random sampling With simple random sampling, every member of the population has an equal probability of being selected for the sample. If the population has 1,000 members, each has one chance out of a thousand of being selected. Suppose you want to sample students who attend your school. A list of all students would be needed; from that list, students would be chosen at random to form the sample.

When conducting telephone interviews, researchers commonly have a computer randomly generate a list of telephone numbers with the dialing prefixes used for residences in the city or area being studied. This will produce a random sample of the population because most residences have telephones (if many people do not have phones, the sample would be biased). Some companies will even provide researchers with a list of telephone numbers for a survey in which the phone numbers of businesses and numbers that phone companies do not use have been removed. You might note that this procedure results in a random sample of households rather than individuals. Survey researchers use other procedures when it is important to select one person at random from the household.
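Returning to the student-list example, drawing a simple random sample from a complete list might be sketched like this. The student identifiers, the sample size of 50, and the function name are hypothetical; the only library call used is Python's standard `random.sample`, which selects without replacement.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members at random, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical list of all 1,000 students at a school.
students = [f"student_{i:04d}" for i in range(1, 1001)]
sample = simple_random_sample(students, 50, seed=42)
print(sample[:5])
```

Every student on the list has the same chance of ending up in the sample, which is the defining property of the technique. Anyone missing from the list has no chance at all.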

Stratified random sampling A somewhat more complicated procedure is stratified random sampling. The population is divided into subgroups (also known as strata), and random sampling techniques are then used to select sample members from each stratum. Any number of dimensions could be used to divide the population, but the dimension (or dimensions) chosen should be relevant to the problem under study. For instance, a survey of sexual attitudes might stratify on the basis of age, gender, and amount of education because these factors are related to sexual attitudes. Stratification on the basis of height or hair color would be ridiculous for this survey.

Stratified random sampling has the advantage of a built-in assurance that the sample will accurately reflect the numerical composition of the various subgroups. This kind of accuracy is particularly important when some subgroups represent very small percentages of the population. For instance, if African Americans make up 5% of a city of 100,000, a simple random sample of 100 people might not include any African Americans; a stratified random sample would include five African Americans chosen randomly from the population. In practice, when it is important to represent a small group within a population, researchers will “oversample” that group to ensure that a representative sample of the group is surveyed; a large enough sample must be obtained to be able to make inferences about the population. Thus, if your campus has a distribution of students similar to the city described here and you need to compare attitudes of African Americans and Whites, you will need to sample a large percentage of the African American students and only a small percentage of the White students to obtain a reasonable number of respondents from each group.
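A proportional stratified sample for the city example might be sketched as follows. The stratum labels and resident ID numbers are placeholders, and the function is an illustrative assumption rather than a standard routine. It allocates the sample to each stratum in proportion to that stratum's share of the population, then samples randomly within it.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Randomly sample from each stratum in proportion to its size.

    strata maps a stratum name to the list of its members. Rounding
    means the pieces may not always sum exactly to total_sample_size.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# A city of 100,000 in which one group makes up 5% of residents.
strata = {
    "group_a": list(range(0, 5_000)),        # 5% of the city
    "group_b": list(range(5_000, 100_000)),  # remaining 95%
}
sample = stratified_sample(strata, 100, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```

The smaller group is guaranteed five respondents, which a simple random sample of 100 would not ensure. Oversampling would replace the proportional allocation with a deliberately larger count for the small stratum.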

Cluster sampling It might have occurred to you that obtaining a list of all members of a population might be difficult. What if officials at your school decide that you cannot have access to a list of all students? What if you want to study a population that has no list of members, such as people who work in county health care agencies? In such situations, a technique called cluster sampling can be used. Rather than randomly sampling from a list of individuals, the researcher can identify “clusters” of individuals and then sample from these clusters. After the clusters are chosen, all individuals in each cluster are included in the sample. For example, you might conduct the survey of students using cluster sampling by identifying all classes being taught—the classes are the clusters of students. You could then randomly sample from this list of classes and have all members of the chosen classes complete your survey (making sure, of course, that no one completes the survey twice).

Most often, use of cluster sampling requires a series of samples from larger to smaller clusters—a multistage approach. For example, a researcher interested in studying county health care agencies might first randomly determine a number of states to sample and then randomly sample counties from each state chosen. The researcher would then go to the health care agencies in each of these counties and study the people who work in them. Note that the main advantage of cluster sampling is that the researcher does not have to sample from lists of individuals to obtain a truly random sample of individuals.
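The single-stage classroom example can be sketched as below. The course codes and student names are invented for illustration. Classes are chosen at random, everyone enrolled in a chosen class is included, and a set is used so that a student enrolled in two chosen classes is counted only once.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose whole clusters and include all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    respondents = set()  # a set prevents anyone from being counted twice
    for cluster_name in chosen:
        respondents.update(clusters[cluster_name])
    return chosen, respondents

# Hypothetical class rosters; some students appear in more than one class.
classes = {
    "PSY101": ["ana", "ben", "cho"],
    "BIO110": ["ben", "dev", "eli"],
    "ENG205": ["fay", "gus"],
    "MTH150": ["cho", "hal", "ivy"],
}
chosen_classes, respondents = cluster_sample(classes, 2, seed=7)
print(chosen_classes, sorted(respondents))
```

Note that the random draw is made over the list of classes, not over a list of students, so no master list of individuals is ever needed. A multistage design would repeat the same step at each level (for example, states, then counties).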

Nonprobability Sampling

In contrast to probability sampling, where the probability of every member is knowable, in nonprobability sampling, the probability of being selected is not known. Nonprobability sampling techniques are quite arbitrary. A population may be defined, but little effort is expended to ensure that the sample accurately represents the population. However, among other things, nonprobability samples are cheap and convenient. Three types of nonprobability sampling are haphazard sampling, purposive sampling, and quota sampling.

Haphazard sampling One common form of nonprobability sampling is haphazard sampling or “convenience” sampling. Haphazard sampling could be called a “take-them-where-you-find-them” method of obtaining participants. Thus, you would select a sample of students from your school in any way that is convenient. You might stand in front of the student union at 9 a.m., ask people who sit around you in your classes to participate, or visit a couple of fraternity and sorority houses. Unfortunately, such procedures are likely to introduce biases into the sample so that the sample may not be an accurate representation of the population of all students. Thus, if you selected your sample from students walking by the student union at 11 a.m., your sample excludes students who do not frequent this location, and it may also eliminate afternoon and evening students. At many colleges, this sample would differ from the population of all students by being younger, working fewer hours, and being more likely to belong to a fraternity or sorority. Sample biases such as these limit your ability to use your sample data to estimate the actual population values. Your results may not generalize to your intended population but instead may describe only the biased sample that you obtained.

Purposive sampling A second form of nonprobability sampling is purposive sampling. The purpose is to obtain a sample of people who meet some predetermined criterion. Sometimes at a large movie complex, you may see researchers asking customers to fill out a questionnaire about one or more movies. They are always doing purposive sampling. Instead of sampling anyone walking toward the theater, they take a look at each person to make sure that they fit some criterion—under the age of 30 or an adult with one or more children, for example. This is a good way to limit the sample to a certain group of people. However, it is not a probability sample.

Quota sampling A third form of nonprobability sampling is quota sampling. A researcher who uses this technique chooses a sample that reflects the numerical composition of various subgroups in the population. Thus, quota sampling is similar to the stratified sampling procedure previously described; however, random sampling does not occur when you use quota sampling. To illustrate, suppose you want to ensure that your sample of students includes 19% first-year students, 23% sophomores, 26% juniors, 22% seniors, and 10% graduate students because these are the percentages of the classes in the total population. A quota sampling technique would make sure you have these percentages, but you would still collect your data using haphazard techniques. If you did not get enough graduate students in front of the student union, perhaps you could go to a graduate class to complete the sample. Although quota sampling is a bit more sophisticated than haphazard sampling, the problem remains that no restrictions are placed on how individuals in the various subgroups are chosen. The sample does reflect the numerical composition of the whole population of interest, but respondents within each subgroup are selected in a haphazard manner. These techniques are summarized in Table 7.3.
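The contrast with stratified sampling can be made concrete with a short sketch. The quotas below assume a total sample of 100 students, so the percentages in the example become counts; the stream of people encountered is invented. The key point is that people are accepted in whatever order they happen to show up, with no random selection within any subgroup.

```python
def quota_sample(encountered, quotas):
    """Accept people in the order encountered until every quota is full."""
    remaining = dict(quotas)
    sample = []
    for person, group in encountered:
        if remaining.get(group, 0) > 0:
            sample.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample, remaining

# Quotas for a hypothetical sample of 100, matching the percentages above.
quotas = {"first-year": 19, "sophomore": 23, "junior": 26,
          "senior": 22, "graduate": 10}

# Invented stream of passersby; in practice, whoever walks by.
groups = list(quotas)
encountered = [(f"person_{i}", groups[i % 5]) for i in range(200)]

sample, remaining = quota_sample(encountered, quotas)
print(len(sample), remaining)
```

The resulting sample matches the population's class composition, yet who fills each quota depends entirely on where and when the researcher collected data. Any quota still unfilled at the end signals that the researcher needs to look elsewhere, as in the graduate-class example.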


Samples should be representative of the population from which they are drawn. A completely unbiased sample is one that is highly representative of the population. How do you create a completely unbiased sample? First, you would randomly sample from a population that contains all individuals in the population. Second, you would contact and obtain completed responses from all individuals selected to be in the sample. Such standards are rarely achieved. Even if random sampling is used, bias can be introduced from two sources: the sampling frame used and poor response rates. Moreover, even though nonprobability samples have more potential sources of bias than probability samples, there are many reasons (summarized in Table 7.3) why they are used and should be evaluated positively.


TABLE 7.3 Advantages and disadvantages of sampling techniques



Sampling Frame

The sampling frame is the actual population of individuals (or clusters) from which a random sample will be drawn. Rarely will this perfectly coincide with the population of interest—some biases will be introduced. If you define your population as “residents of my city,” the sampling frame may be a list of telephone numbers that you will use to contact residents between 5 p.m. and 9 p.m. This sampling frame excludes persons who do not have telephones or whose schedule prevents them from being at home when you are making calls. Also, if you are using the telephone directory to obtain numbers, you will exclude persons who have unlisted numbers. As another example, suppose you want to know what doctors think about the portrayal of the medical profession on television. A reasonable sampling frame would be all doctors listed in your telephone directory. Immediately you can see that you have limited your sample to a particular geographical area. More important, you have also limited the sample to doctors who have private practices—doctors who work only in clinics and hospitals have been excluded. When evaluating the results of the survey, you need to consider how well the sampling frame matches the population of interest. Often the biases introduced are quite minor; however, they could be consequential to the results of a study.

Response Rate

The response rate in a survey is simply the percentage of people in the sample who actually completed the survey. Thus, if you mail 1,000 questionnaires to a random sample of adults in your community and 500 are completed and returned to you, the response rate is 50%. Response rate is important because it indicates how much bias there might be in the final sample of respondents. Nonrespondents may differ from respondents in any number of ways, including age, income, marital status, and education. The lower the response rate, the greater the likelihood that such biases may distort the findings and in turn limit the ability to generalize the findings to the population of interest.

In general, mail surveys have lower response rates than telephone surveys. With both methods, however, steps can be taken to maximize response rates. With mail surveys, an explanatory postcard or letter can be sent a week or so prior to mailing the survey. Follow-up reminders and even second mailings of the questionnaire are often effective in increasing response rates. It often helps to have a personally stamped return envelope rather than a business reply envelope. Even the look of the cover page of the questionnaire can be important (Dillman, 2000).

With telephone surveys, respondents who are not home can be called again and people who cannot be interviewed today can be scheduled for a call at a more convenient time. Sometimes an incentive may be necessary to increase response rates. Such incentives can include cash, a gift, or a gift certificate for agreeing to participate. A crisp dollar bill “thank you” can be included with a mailed questionnaire. Other incentives include a chance to win a prize drawing or a promise to contribute money to a charity. Finally, researchers should attempt to convince people that the survey’s purposes are important and their participation will be a valuable contribution.


Much of the research in psychology uses nonprobability sampling techniques to obtain participants for either surveys or experiments. The advantage of these techniques is that the investigator can obtain research participants without spending a great deal of money or time on selecting the sample. For example, it is common practice to select participants from students in introductory psychology classes. Often, these students are asked to participate in studies being conducted by faculty and their students; the introductory psychology students can choose which studies they wish to participate in.

Even in studies that do not use college students, the sample is often based on convenience rather than concern for obtaining a random sample. One of our colleagues studies children, but they are almost always from one particular elementary school. You can guess that this is because our colleague has established a good relationship with the teachers and administrators; thus, obtaining permission to conduct the research is fairly easy. Even though the sample is somewhat biased because it includes only children from one neighborhood that has certain social and economic characteristics, the advantages outweigh the sample concerns for the researcher.

Why aren’t researchers more worried about obtaining random samples from the “general population” for their research? Most psychological research is focused on studying the relationships between variables even though the sample may be biased (e.g., the sample will have more college students, be younger, etc., than the general U.S. population). But to put this in perspective, remember that even a random sample of the general population of U.S. residents tells us nothing about citizens of other countries. So, our research findings provide important information even though the data cannot be strictly generalized beyond the population defined by the sample that was used. For example, the findings of Brown and Rahhal (1994) regarding experiences of younger and older adults when they hid an object but later forgot the location are meaningful even though the actual sample consisted of current students (younger adults) and alumni (older adults) of a particular university who received a mailed questionnaire. In Chapter 14, we will emphasize that generalization in science is dependent upon replicating the results. We do not need better samples of younger and older adults; instead, we should look for replications of the findings using multiple samples and multiple methods. The results of many studies can then be synthesized to gain greater insight into the findings (cf. Albright & Malloy, 2000).

These issues will be explored further in Chapter 14. For now, it is also important to recognize that some nonprobability samples are more representative than others. Introductory psychology students are fairly representative of college students in general, and most college student samples are fairly representative of young adults. There are not many obvious biases, particularly if you are studying basic psychological processes. Other samples might be much less representative of an intended population. Not long ago, a public affairs program on a local public television station asked viewers to dial a telephone number or send email to vote for or against a gun control measure being considered by the legislature; the following evening, the program announced that almost 90% of the respondents opposed the measure. The sampling problems here are obvious: Groups opposed to gun control could immediately contact members to urge them to vote, and there were no limits on how many times someone could respond. In fact, the show received about 100 times more votes than it usually receives when it does such surveys. It is likely, then, that this sample was not at all representative of the population of the city or even viewers of the program.

When local news programs, 24-hour news channels, or websites ask viewers to vote on a topic, the resulting samples are not representative of the population to which they are often trying to generalize. First, their viewers may be different from the U.S. population in meaningful ways (e.g., more Fox News viewers are conservative, more MSNBC viewers are liberal). Second, these programs and websites often ask about hot-button topics, things that people care passionately about, because that is what drives viewers and visitors to tune in. Questions about abortion, taxes, and wars tend to drive certain types of viewers to these informal “polls.” The results, whatever they may be, are biased because the sample consists primarily of people who have chosen to watch the program or visit the website, and they have chosen to vote because they are deeply interested in a topic.
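
The self-selection problem described above can be illustrated with a small simulation. The sketch below is only an illustration, not a model of any real poll: the population size, the true share of opponents, the response probabilities, and the number of repeat votes are all made-up assumptions, and the function name `simulate_polls` is hypothetical. It compares a simple random sample with a call-in "poll" in which one side is assumed to be more motivated to respond and is allowed to vote more than once.

```python
import random


def simulate_polls(pop_size=100_000, true_oppose=0.40, n=2_000, seed=0):
    """Compare a simple random sample with a self-selected call-in poll.

    All numbers here are illustrative assumptions, not data from any study.
    Returns (true share opposed, random-sample estimate, call-in estimate).
    """
    rng = random.Random(seed)

    # Hypothetical population: True means the person opposes the measure.
    population = [rng.random() < true_oppose for _ in range(pop_size)]
    true_share = sum(population) / pop_size

    # Simple random sampling: every member has an equal chance of selection.
    srs = rng.sample(population, n)
    srs_share = sum(srs) / n

    # Call-in poll: respondents select themselves. We assume opponents are
    # ten times more likely to respond and cast three votes each, because
    # nothing limits how many times someone can respond.
    votes = []
    for opposes in population:
        p_respond = 0.05 if opposes else 0.005
        if rng.random() < p_respond:
            repeats = 3 if opposes else 1
            votes.extend([opposes] * repeats)
    callin_share = sum(votes) / len(votes)

    return true_share, srs_share, callin_share


if __name__ == "__main__":
    true_share, srs_share, callin_share = simulate_polls()
    print(f"True share opposed:     {true_share:.2f}")
    print(f"Simple random sample:   {srs_share:.2f}")
    print(f"Self-selected call-in:  {callin_share:.2f}")
```

Under these assumed rates, the random-sample estimate stays close to the true share, while the call-in figure lands far above it. The size of the gap depends entirely on the assumed response probabilities and repeat votes, which is exactly why the result of such a poll says little about the population.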

You now have a great deal of information about methods for asking people about themselves. If you engage in this type of research, you will often need to design your own questions by following the guidelines described in this chapter and consulting sources such as Groves et al. (2009) and Fowler (2014). However, you can also adapt questions and entire questionnaires that have been used in previous research. Consider using previously developed questions, particularly if they have proven useful in other studies (make sure you do not violate any copyrights, however). A variety of measures of social, political, and occupational attitudes developed by others have been compiled by Robinson and his colleagues (Robinson, Athanasiou, & Head, 1969; Robinson, Rusk, & Head, 1968; Robinson, Shaver, & Wrightsman, 1991, 1999).

We noted in Chapter 4 that both nonexperimental and experimental research methods are necessary to fully understand behavior. The previous chapters have focused on nonexperimental approaches. In the next chapter, we begin a detailed description of experimental research design.


Every year, hundreds of thousands of U.S. college students travel to Florida, Mexico, or similar sunny locales for spring break. For the most part, everybody involved (students, their universities, their parents, and the communities they travel to) realizes that spring break can also be a dangerous time for college students: Students consume more alcohol during spring break, and the risks associated with overconsumption are more prevalent.

In a survey study conducted by Patrick, Morgan, Maggs, and Lefkowitz (2011), male and female college students completed a survey related to their perceptions of their friends’ “understandings” of spring break behaviors. That is, students were surveyed to see if their friends would “have their back” during spring break.

First, acquire and read the following article:

Patrick, M. E., Morgan, N., Maggs, J. L., & Lefkowitz, E. S. (2011). “I got your back”: Friends’ understandings regarding college student Spring Break behavior. Journal of Youth and Adolescence, 40, 108–120. doi:10.1007/s10964-010-9515-8

Then, after reading the article, consider the following:

1. What kinds of questions were included in the survey? Identify examples of each.

2. How and when was the survey administered? What are the potential problems with their administration strategy?

3. What was the nature of the sampling strategy? What was the final sample size?

4. What was the response rate for the survey?

5. Describe the demographic profile of the sample.

6. Do you think that these findings generalize to all college students? Why or why not?

7. Describe at least one finding that you found particularly interesting or surprising.

Study Terms

Closed-ended questions (p. 138)

Cluster sampling (p. 151)

Computer-assisted telephone interview (CATI) (p. 145)

Confidence interval (p. 148)

Face-to-face interview (p. 145)

Focus group (p. 145)

Graphic rating scale (p. 140)

Haphazard (convenience) sampling (p. 151)

Interviewer bias (p. 145)

Mail survey (p. 143)

Nonprobability sampling (p. 149)

Online survey (p. 144)

Open-ended questions (p. 138)

Panel study (p. 146)

Population (p. 147)

Probability sampling (p. 149)

Purposive sampling (p. 152)

Quota sampling (p. 152)

Random sample (p. 150)

Rating scale (p. 139)

Response rate (p. 154)

Response set (p. 134)

Sampling (p. 147)

Sampling error (p. 148)

Sampling frame (p. 154)

Semantic differential scale (p. 140)

Simple random sampling (p. 150)

Social desirability (p. 134)

Stratified random sampling (p. 150)

Survey research (p. 133)

Telephone interview (p. 145)

Yea-saying and nay-saying (p. 137)