Foundations of Quantitative Research in Political Science
- For many years, the New York City police practiced stop-and-frisk, a policy under which police officers stopped citizens on the street and searched them for hidden weapons, drugs, or other items. Stop-and-frisk disproportionately targeted racial minorities and heightened the sense of racial injustice among New Yorkers. Despite calls to end stop-and-frisk, former New York City mayor Michael Bloomberg was convinced that the policy was needed to reduce crime. And Bloomberg was not alone: public officials in favor of stop-and-frisk would leverage data on crime rates to back up their claim that stop-and-frisk worked. Now we know that stop-and-frisk doesn't work and does more harm than good. How could policy makers be misled by the data to the point of maintaining a policy that does more harm than good? To understand how these mistakes can happen, we must master a crucial concept: confounding variables. The defense of stop-and-frisk was flawed because it failed to account for confounding variables. By the end of this video, you will be able to identify the criteria of confounding variables, propose a confounding variable when presented with a hypothesis, and justify your proposed confounding variable using the criteria of confounding variables. Suppose we collect data and find that countries that invest more in clean energy research pollute more than countries that invest less in clean energy research. In other words, there is a positive correlation between investment in clean energy research and carbon dioxide emissions. Should we conclude that investing in clean energy research makes countries pollute more? No, we shouldn't, because, as it turns out, richer countries can afford to invest more money in clean energy research than poorer countries. Richer countries can also afford to consume more energy from fossil fuels, which leads to more carbon dioxide emissions. So richer countries can both invest more in clean energy research and emit more carbon dioxide.
This means that wealth is a confounding variable. There are three criteria that allow us to identify a confounding variable. A confounding variable causes change in the dependent variable; it is correlated with the independent variable; and it is causally prior to the independent variable. Being causally prior means that a confounding variable comes before the independent variable, not after. Notice how, in our example, the variable "wealth" satisfies all three conditions. First, wealth causes change in the dependent variable. That is, being wealthier makes a country emit more carbon dioxide. Second, wealth is correlated with the independent variable. That is, wealthier countries can invest more in clean energy research. And finally, wealth is causally prior to investment in clean energy research. In other words, being wealthier comes before investing in clean energy research, not after. To be clear, pointing out a confounding variable may give us reasons to question a hypothesis. In our example, think about the following hypothesis: investing in clean energy research causes countries to pollute more. If we didn't know about confounding variables, we might have thought that the evidence was in favor of this hypothesis. Knowing that wealth is a confounding variable in this case is what makes us skeptical about the hypothesis. We can see from this example that when we propose a hypothesis, we must think about the confounding variables. If we didn't know that wealth causes both investment in clean energy research and carbon dioxide emissions, we could have been tricked into stopping investment in clean energy research, and that would make climate change even more disastrous. Thankfully, we know that wealth is a confounding variable in this example, but this is not always the case. Public officials may overlook confounding variables and implement bad policies. That is exactly what happened with stop-and-frisk.
When confronted with calls to end stop-and-frisk, New York City public officials would defend their stance by showing how crime had decreased after the police department implemented the policy. We can see in this graph that crime rates do indeed fall after stop-and-frisk starts. However, if we expand the time span of this graph, we can see that crime rates have been falling since the early 1990s - long before stop-and-frisk was implemented. In fact, crime was falling for reasons other than stop-and-frisk. One of the variables known to affect crime rates is the number of employment opportunities, which has increased since the early 1990s. We can see that employment is a confounding variable because it meets all the criteria. First, employment causes change in the dependent variable (employment causes crime rates to decline). Second, employment is correlated with the independent variable (when employment was low, there was no stop-and-frisk, and when employment was high, there was stop-and-frisk). And finally, employment is causally prior to the independent variable: the rise in employment came before stop-and-frisk. The last two steps to identify a confounding variable are crucial. In my experience as a TA, I have seen many students talk about confounding variables by looking only at the relationship between the confound and the dependent variable. Remember that for a variable to be a confound, it must also be correlated with the independent variable and it must come before the independent variable. So to sum up, the objective of this video is to teach you to identify the criteria of confounding variables, propose a confounding variable when presented with a hypothesis, and justify your proposed confounding variable using the criteria of confounding variables. People who don't master these skills are liable to be misled by data, just like the public officials who said that stop-and-frisk worked.
So mastering the concept and application of confounding variables is crucial for social scientists who want to contribute to a better world.
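As a supplement to the clean energy example in this video, here is a minimal sketch in Python of how a confounding variable can produce a misleading correlation. Everything in it is an assumption made for illustration: the data are simulated rather than real, the variable names and effect sizes are arbitrary, and the step that holds wealth constant (correlating what is left of each variable after regressing it on wealth, often called a partial correlation) is one common way to account for a confound, not a method taken from this video.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its least-squares linear fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(n_countries=2000, seed=0):
    """Simulated (not real) countries. Assumption: wealth raises both
    investment and emissions, and investment has NO effect on emissions."""
    rng = random.Random(seed)
    wealth = [rng.gauss(0, 1) for _ in range(n_countries)]
    # Wealth comes first, then it influences the other two variables.
    investment = [w + rng.gauss(0, 1) for w in wealth]
    emissions = [w + rng.gauss(0, 1) for w in wealth]
    return wealth, investment, emissions


wealth, investment, emissions = simulate()

# Raw correlation between the independent and dependent variables.
raw = pearson(investment, emissions)

# Correlation after holding wealth constant (partial correlation).
partial = pearson(residuals(investment, wealth), residuals(emissions, wealth))

print("wealth vs. emissions:    ", round(pearson(wealth, emissions), 2))
print("wealth vs. investment:   ", round(pearson(wealth, investment), 2))
print("investment vs. emissions:", round(raw, 2))
print("same, holding wealth constant:", round(partial, 2))
```

In this simulated world, investment and emissions should come out clearly positively correlated even though, by construction, investment has no effect on emissions. Once wealth is held constant, that correlation should shrink toward zero, which is the pattern we expect when a confounding variable is driving the relationship.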