Picking a Qualitative Method with Rigor and Reproducibility
- Dan Dohan
- Jan 24, 2023
Updated: Apr 14
MCL Guidance
Three cheers for NIH’s enshrinement of rigor and reproducibility (R&R) as principles of research quality. First, who doesn’t love a little alliteration? Second, like all good poetry, R&R are terse, thorough, and thought-provoking. Third, R&R have proved useful beyond their intended domain of bench and clinical science. This post shares some personal reflections on the role of R&R in NIH grant review and examines how these principles can help guide investigators in their selection of qualitative methods for studies of medical culture.
NIH devised (2014) and instituted (2015) R&R in the wake of widespread attention to challenges in the bench sciences, even while recognizing that this framework would have impact beyond the basic science lab. My personal perspective comes from 7 years of service on the Societal and Ethical Issues in Research (SEIR) study section, the NIH peer-review group that evaluated proposals involving ethics in biomedical research. SEIR was highly interdisciplinary: anthropologists, general internists, geneticists, lawyers, oncologists, pharmacoepidemiologists, philosophers, sociologists, and trialists all had seats at the same table. Proposals featured methodologies ranging from laboratory experiments to forensic anthropology to case law review and randomized trials. Qualitative social sciences were the modal empirical approach.
My service on SEIR (2012-19) coincided with the implementation of R&R. I saw first-hand how these criteria shaped discussion. My recollection is that in the pre-R&R period, SEIR reviews of qualitative research often featured a focus on representativeness and generalizability, key standards of scientific rigor in quantitative social science research. The group understood that these criteria were not the last word in evaluating a qualitative proposal, but they provided a common-ground starting point on the interdisciplinary committee.
R&R moved the starting line. When the group was at its best, rather than jump to an assessment using these quantitative standards, it tried to articulate more fundamental questions about the proposed study. What puzzle did the investigator propose to solve? How well did existing evidence justify its importance? Did they propose a research approach that could solve it? Here are some of the lessons I learned from those discussions.
For investigators whose question related to health decision-making, interviews with individuals typically provided a rigorous and reproducible approach. Individual recollections, reflections, and thoughts help us understand how a decision was reached. How do patients think through and choose to pursue (or choose to defer) genetic health assessments, and to what extent do they appreciate the implications of those choices? What about testing for specific risks of serious illness such as kidney disease? These decisions are complex and deeply embedded in social context. Interviewers can probe for that contextual depth. Ask the person to walk through the decision and how they made it. Probe to flesh out the sequence of events that delivered the person to the decision point. No matter the contextual richness – and even in situations in which a research participant might struggle to specify a precise moment when they actively decided something – a decision point was reached that both reflected and shifted individual circumstances.
When examining collective understandings of a health issue, a focus group or similar strategy provides rigorous and reproducible insights into sense-making. For example, investigators often proposed expert gatherings to develop or critique guidelines surrounding the use of new technologies, such as genetic variant classification. When law enforcement started cracking cold cases using genetic data, investigators used focus groups to assess public attitudes and expert panels to propose best practices. And when researchers wanted to identify best practices for the ethical tailoring of materials to inform families about genetics risks, they used human centered design workshops to get feedback.
In some circumstances, talking to people, whether individually or in a group, may not yield rigorous and reproducible insights. When the research question turns on what people are doing and how they are doing it—rather than how individuals perceive a given activity—interviews may yield data of limited rigor and reproducibility. Sometimes actively engaged people turn out to be poor informants about their own activities. They have a front row seat to the activity in question but are too busy in the doing to have the bandwidth or motivation to reflect on what they are doing. Retrospectively, they may assign or discover a motivation for what they did, but those reflections may shift over time. These are situations where observing behavior may be the best move.
One situation in which observation yields data of high rigor and reproducibility is when seeking to understand the implementation and impact of a new program, such as recruiting to clinical trials based on genetic markers. Watching what happens in this kind of situation generates fantastic data. Watching different people in different circumstances strengthens data quality. Observational rigor may also be warranted when an investigator is examining unintended consequences. For example, might the inclusion of genetic data in the electronic health record end up codifying perceptions of racial identity? Compared to interviews, observation is a relatively slow and difficult method, but for particular types of questions its insights are invaluable.
In the health sciences, most qualitative social science research can be classified as an individual interview, group interview, or observation. Individual interviews come in many forms and flavors — structured, semi-structured, life history, cognitive, etc. — but fundamentally they all surface individual perspectives and experiences and are thus justified when the study question places those issues at stake.
Group interviews may differ in focus, structure, and goals. Focus groups may be open-ended and unstructured, with the goal of surfacing inductive insights to explore and elaborate in group discussion. Human-centered design often includes group interviews that are structured to provide specific feedback or insight into research instruments or ideas. Deliberative democracy follows an even more structured set of procedures to develop group-endorsed recommendations on research topics. Despite their differences, these approaches all make sense when group dynamics are central to a research puzzle or when collaborative discussion can shed light on it.
In general, observation – whether it be direct observation, participant-observation, or video recording – is less common in the health sciences than are interviews. The scientific premise of a study that uses observation is that research participants may have difficulty verbalizing insights that are scientifically significant. Other approaches can also be used in this way, and even though they are not considered observational methods, they might be considered kissing cousins. Passive audio recording, in which researchers receive permission to place a recording device unattended in a clinic exam room, is one example. Community engaged and community participatory research, which are often undertaken with an eye towards addressing and advancing health equity and justice, may be another example. During community engagement, researchers and participants interact in ways that can surface difficult-to-verbalize insights. Community engaged research also has the potential to create conditions of trust between researchers and participants that foster more authentic interactions. The end result is improved rigor and reproducibility. Investigators who embrace community engagement with an eye towards equity and justice may also consider highlighting its scientific advantages. In many situations, community engagement can provide scientific insights akin to those generated by participant-observation.
Over the years, I’ve turned to rigor and reproducibility countless times to explain my enthusiasm (or lack thereof) for grants under review. Used in checkbox fashion (sample: check, variable construction: check, statistical analysis: check, etc.), R&R are just empty words. Properly applied, though, R&R can help operationalize critical habits of mind. They help us interrogate whether an investigator has identified an appropriate real-world scientific puzzle that can be studied with existing methods. They help us assess whether the investigator has articulated a study premise that holds promise to shed light on that puzzle. They help us evaluate how fully a particular study design and its procedures will deliver on that promise. I’ve found this habit of mind to be surprisingly agnostic with respect to method and paradigm. R&R can be particularly helpful in guiding thoughtful selection of qualitative research methods.