At Fors Marsh Group (FMG), we pride ourselves on pairing traditional research methods with new computational advances to ensure we are getting the fullest possible perspective. We always seek the “right way” for any given research question; recent projects have included everything from focus groups to big data analytics—and even physiological data. But the data landscape is changing before our eyes—how can researchers stay balanced?
A Shifting Data Landscape
In this “age of information,” our society generates a staggering 2.5 quintillion bytes of data every day.
These data come from a wide variety of sources. They are generated both actively, as when users post on social media, and passively, through administrative or private data collection, as when credit card transaction information is captured instantaneously at the point of sale (POS).
This means that not only is the volume of data increasing, but the velocity and variety of data collection are increasing as well. Today’s researchers have access to data that are more extensive, diverse, and timely than ever before.
Why Human Understanding Still (and Always) Matters
With this breadth of information comes the risk of shallow understanding.
When using tried-and-true data sets that have been collected through traditional methods, such as surveys, targeted focus groups, and experimental tests, researchers have the benefit of a thorough understanding of what is (and is not) included in the data collection and analysis process. In other words, researchers know exactly how the data are collected, how the sample compares to the larger population, how the key variables are operationalized, and what conclusions they can safely draw from the results.
The volume, velocity, and variety of today’s new data sources mean that such foundational knowledge is often obscured. Data sets are long but “thin”: they contain many records but can lack meaning or generalizability. Relying on algorithm-driven data science analytics without a firm understanding of the underlying assumptions can lead to false generalizations and poor decisions.
At FMG, we often offset this risk by pairing our quantitative research with “thick” qualitative methods to ensure that we are getting a sense of the whole picture—not just what the numbers are telling us.
Three Guidelines for Implementing Thick Data
1. Understand potential data limitations and biases
Sometimes data biases fall along well-known cleavages, which can easily be corrected by reweighting the research sample to better reflect the population. For example, when we collected data for a study on seat belt usage for the National Highway Traffic Safety Administration, we found that young males—a demographic known for risk-taking behavior—were underrepresented in our data collection. We took this into account during analysis, so we did not misrepresent total seat belt usage based on the limitations of our sample.
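To make the reweighting idea concrete, here is a minimal sketch of simple post-stratification, one common way to reweight a sample. It is not the method used in the NHTSA study, which the text above does not describe in detail. The group labels, responses, and population shares are all hypothetical values chosen for illustration, and the function names are our own.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent: population share / sample share of their group."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of values, normalized by the total weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample of 10 respondents: 1 = wears a seat belt, 0 = does not.
groups = ["young_male"] * 2 + ["other"] * 8
belted = [0, 1] + [1, 1, 1, 1, 1, 1, 1, 0]

# Assumed population shares (illustrative only, not real figures).
population_shares = {"young_male": 0.25, "other": 0.75}

weights = poststratification_weights(groups, population_shares)
unweighted = sum(belted) / len(belted)
weighted = weighted_mean(belted, weights)

print(f"Unweighted seat belt usage: {unweighted:.3f}")
print(f"Reweighted seat belt usage: {weighted:.3f}")
```

In this made-up sample, young males are 20% of respondents but are assumed to be 25% of the population, so their answers are weighted up and the estimate of overall usage shifts accordingly. The correction only works when the underrepresented group is known and its population share is available, which is exactly what is often missing in newer data sources.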
But potential biases may be less obvious in new sources of data. For instance, one recent FMG study was tasked with calculating tipping income from credit card transaction data. However, our parallel qualitative inquiry into tipping behavior revealed substantial variation by form of transaction (cash versus credit). Without this additional “thick” information, we could have drawn biased conclusions by treating this “thin” data set as if it told the whole story.
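The arithmetic behind this kind of bias can be sketched in a few lines. The example below assumes, purely for illustration, that cash transactions carry a lower tip rate than credit transactions. The text above reports only that tipping varied by form of transaction, not the direction or size of the difference, so every number here is hypothetical, as is the function name.

```python
def blended_tip_rate(credit_rate, cash_rate, cash_share):
    """Sales-weighted average tip rate across credit and cash transactions."""
    if not 0.0 <= cash_share <= 1.0:
        raise ValueError("cash_share must be between 0 and 1")
    return (1.0 - cash_share) * credit_rate + cash_share * cash_rate


# Hypothetical inputs (not taken from any FMG study).
credit_rate = 0.18  # tip rate observed in the credit card data
cash_rate = 0.12    # tip rate suggested by qualitative work on cash tipping
cash_share = 0.30   # share of sales paid in cash

credit_only_estimate = credit_rate
blended_estimate = blended_tip_rate(credit_rate, cash_rate, cash_share)
overstatement = credit_only_estimate - blended_estimate

print(f"Credit-only estimate: {credit_only_estimate:.3f}")
print(f"Blended estimate:     {blended_estimate:.3f}")
print(f"Overstatement:        {overstatement:.3f}")
```

The point is not the specific figures but the structure: the credit card data alone cannot reveal `cash_rate` or `cash_share`, so the gap between the two estimates is invisible without outside information.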
2. Consider the results in context (and alternative interpretations)
Seemingly irrational behavior, or sudden “blips” in long-established trends, should not be discarded out of hand or held up as completely novel findings without context. Instead, they should be opportunities for deeper discovery.
For example, FMG is very proud of the success of our tobacco reduction campaigns, including our work supporting the FDA Real Cost campaign. However, we would be remiss if we focused only on these successes and overlooked the possibility of replacement (and not reduction) of tobacco use for many young adults. As such, some of our recent work has explicitly focused on outreach and research related to topics such as public understanding of the relative harm of e-cigarettes in comparison to traditional cigarettes.
3. Make sure next steps are grounded in experience
Finally, researchers need to recognize that they are not alone in their endeavors: it is important to obtain buy-in from the individuals who will ultimately be responsible for implementing any findings, recommendations, or next steps. Because most practitioners are not comfortable blindly following a report produced by a computer and a data scientist, researchers should include these practitioners throughout the process, not just in the final stages.
When working with the U.S. Department of Agriculture (USDA), FMG ensured that stakeholders were involved in delivering a successful final product that was rooted in their experience. We worked collaboratively with agency experts throughout the process—from research planning to interpretation of results to making decisions based on the research.
Researchers and data scientists alike need to ensure that we are meeting client and stakeholder needs by using new sources of data. However, we must temper the use of these sources with a thorough understanding of any potential biases, gaps, or limitations hidden within them. Only by linking the breadth of big data with thick understanding will we best serve our clients’ needs.