Foundations of Quantitative Research in Political Science
- In a previous video, we defined what an experiment is and explained why many researchers consider experiments to be the gold standard of research designs. But we also briefly mentioned that researchers can't always use experimental designs in their research. So, how do we conduct research in these cases? Well, in this video, we'll discuss observational studies, and explain how researchers use them to evaluate their hypotheses. By the end of this video, you should be able to: define an observational study, identify when to use observational data rather than experimental data, explain how observational studies can mitigate the threats of confounding variables, and identify which types of variables are most difficult to control for using observational data.
Suppose we want to understand the relationship between education and democracy. Are countries with higher levels of education more likely to be democratic? Well, you may have heard that countries with higher levels of education tend to be more democratic, but there are many variables that could affect both education and democratization, such as a country's income or its culture. If we were almighty researchers with the power to manipulate the education level of countries, we could conduct an experiment to test the hypothesis that countries with higher education are more democratic. We would randomly select a large number of countries to be in the treatment group, and a large number of countries to be in the control group. And then we would increase the education level of all the countries in the treatment group. If the countries in the treatment group became more democratic than those in the control group after our intervention, we could be more confident that education level causes democratization. Remember that people like experiments, because if we randomly assigned countries to the treatment and control groups, we would be dealing with groups that are on average similar to each other in their incomes and culture.
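To make the balancing property of random assignment concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the 1,000 hypothetical units, the simulated income scores, and the names `incomes`, `treatment`, `control`, and `income_gap` do not come from the video.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical units with a simulated income score (arbitrary units).
# These values are invented for illustration; they are not real country data.
n_units = 1000
incomes = [random.gauss(50, 15) for _ in range(n_units)]

# Random assignment: shuffle the unit indices and split them in half.
indices = list(range(n_units))
random.shuffle(indices)
treatment = indices[: n_units // 2]
control = indices[n_units // 2 :]

# Compare average income across the two randomly formed groups.
mean_treatment = statistics.mean([incomes[i] for i in treatment])
mean_control = statistics.mean([incomes[i] for i in control])
income_gap = abs(mean_treatment - mean_control)

print(f"mean income, treatment group: {mean_treatment:.2f}")
print(f"mean income, control group:   {mean_control:.2f}")
print(f"absolute gap:                 {income_gap:.2f}")
```

With groups this large, the gap in average income should be small compared with the spread of incomes across units. Income was never used to form the groups, which is why the same logic extends to variables we cannot measure.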
So, we could be pretty confident that changes in democratization would be caused by the increase in education.
- As you probably noticed, it's impossible to manipulate the level of education of entire countries overnight. However, we can still mimic the experiment through an observational study. A study is observational if the independent variable is not under the control of the researcher. When we conduct an observational study, we have to think about as many confounding variables as we can, and use that information to account for them. In the education and democracy example, if we think that income and culture are possible confounding variables, we have to make sure that when comparing more and less educated countries, we also consider that these countries have different incomes and cultures. How do we do this?
Suppose we're classifying countries as either having high or low levels of education, and as being either a dictatorship or a democracy. If we want to know if a country with high education is more likely to be democratic, we can come up with a table like this. For the record, this table is entirely made up for teaching purposes. But if this is the data we have, we'll be led to believe that education goes hand in hand with democratization. As you can see, for countries with low education, 60% are dictatorships and 40% are democracies. But for those with high education, 70% are democratic. Feel free to pause this video to study the table for a moment to make sure you understand it. But notice that this table does not take into account all of the variables that could be confounding variables. One of them is income. It could be the case that income is what makes a country more educated and more democratic. So to start considering the level of income, we could move on to looking at two different tables. One table only has the rich countries, and the other only has poor countries. Something like this.
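The two income-specific tables are not reproduced in this transcript. As a rough sketch, they can be laid out in Python using the percentages narrated for poor and rich countries. The counts of 10 countries per group are an assumption chosen only to give whole numbers, and the names `counts`, `share_democratic`, and `shares` are hypothetical.

```python
# Hypothetical counts matching the percentages narrated in the video,
# whose own tables are made up for teaching purposes.
# Assumption: 10 countries in each income-by-education group.
counts = {
    "poor": {
        "low education": {"dictatorship": 7, "democracy": 3},
        "high education": {"dictatorship": 4, "democracy": 6},
    },
    "rich": {
        "low education": {"dictatorship": 5, "democracy": 5},
        "high education": {"dictatorship": 3, "democracy": 7},
    },
}


def share_democratic(cell):
    """Return the fraction of countries in a group that are democracies."""
    total = cell["dictatorship"] + cell["democracy"]
    return cell["democracy"] / total


# Compute the democratic share separately within each income level.
shares = {
    income: {edu: share_democratic(cell) for edu, cell in groups.items()}
    for income, groups in counts.items()
}

for income, groups in shares.items():
    print(f"{income} countries")
    for edu, share in groups.items():
        print(f"  {edu}: {share:.0%} democracies")
```

Each comparison is made inside a single income level, so differences in income between rich and poor countries cannot drive the gap between low and high education within a table.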
Now, we're only comparing between countries with similar levels of income. Notice that when we look at poor countries only, we still see that countries with high education are more likely to be democratic. For countries with low education, 70% are dictatorships, and 30% are democracies. But for those with high education, 60% are democratic. The same thing happens when we look only at rich countries. More education, more democracies. Democracies make up 50% of countries with low education, and 70% of countries with high education. This means that education goes hand in hand with democratization, even if we only make comparisons between countries with similar income levels. In other words, education is correlated with democratization even after controlling for income. Because we still see a relationship between education and democracy, even after controlling for income, we should be more confident that education brings democratization.
But income is not the only possible confounding variable here. We have also thought about culture. Can we control for all the variables that we can think of as confounding variables? We can, but doing this with tables would be too complicated. However, there are many statistical tools we can use. Let's now turn to how we would do this in a linear regression. When we try to test a hypothesis using regression analysis, we are testing whether our independent variable has an effect on our dependent variable. The amazing thing about linear regression is that it makes it easy for us to control for many confounding variables at once. For example, if we want to test if education brings democratization, and we have identified and measured important confounding variables such as income and culture, regression lets us control for these confounding variables, and isolate the effect of education.
If we believe that education, income and culture are the variables that might influence whether a country is democratic, we are saying the following: democracy equals education plus income plus culture. This equation is telling us that the level of democracy in a country is the result of a combination of the levels of education, income, and culture of the country. Linear regression can use this equation to estimate the effect of education on democracy, controlling for income and culture. When we include income and culture in the regression equation, we're making sure that we are studying the effect of education on democracy independent of the potential effects of countries' incomes and cultures on democracy. In other words, we are controlling for income and culture just like we did before with the two tables for rich and poor countries.
Yet, income and culture are not the only confounds that we need to worry about. We could easily come up with other possible confounds. Inequality might be a confound, population size might be a confound, and racial diversity might be a confound. Ideally, we want to control for all of these confounding variables. If we could identify every single confounding variable and control for all of them, our regression analysis would be very similar to an experiment, and we could therefore argue that our regression results estimate the causal effects of education on democracy. However, this is extremely challenging. For one, we might fail to identify all confounds, because we are not aware that they are confounds. We might also not be able to measure some of the confounding variables. Think about culture, how could we measure that? Lastly, there are confounds that are unobservable, which we cannot measure and therefore cannot include in our regression. Think about a dictator's willingness to let their country democratize. This is likely very important, but there is simply no way to know or measure this confound.
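To see what controlling for a confound does, and what happens when a confound is left out, here is a minimal simulation sketch. All of it is assumed for illustration: the data are simulated, the built-in effect of education on democracy is set to 1.0, income is the only confound, and the helper `ols` is a hand-rolled least-squares solver written to keep the example dependency-free. In practice a statistics package would be used instead.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible


def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot_row = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot_row] = A[pivot_row], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    return [A[r][k] / A[r][r] for r in range(k)]


# Simulated data (assumption): income raises both education and democracy.
n = 2000
income = [random.gauss(0, 1) for _ in range(n)]
education = [0.8 * inc + random.gauss(0, 1) for inc in income]
democracy = [
    1.0 * edu + 2.0 * inc + random.gauss(0, 1)
    for edu, inc in zip(education, income)
]

# Regression without the confound, as if income were unmeasured.
naive_effect = ols([education], democracy)[1]
# Regression that controls for income.
controlled_effect = ols([education, income], democracy)[1]

print(f"education coefficient, income omitted:    {naive_effect:.2f}")
print(f"education coefficient, income controlled: {controlled_effect:.2f}")
```

In this setup, the estimate that controls for income should sit close to the built-in effect of 1.0, while the estimate that omits income should be noticeably larger because it absorbs part of income's influence. An unobservable confound would leave us stuck with the second kind of estimate.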
For these reasons, simple linear regressions can be very useful, but also very limited in what they can tell us about causal effects.
- In this video, you have learned that we can still account for confounding variables in an observational study, but it's much harder to account for all possible confounding variables when we can't conduct an experiment. Think back to the hypothetical experiment in which we manipulated the level of education of several countries. If we could randomly assign a large number of countries to be in the treatment and control groups, we could be very confident that the treatment group was on average similar to the control group across every possible confound. In other words, we could be confident that the assignment to treatment was not correlated with other variables such as income, or culture, or even variables that we can't observe, can't measure, or didn't even think about.
In summary, the objective of this video is to teach you: how to define an observational study, how to identify when to use observational data rather than experimental data, how to explain how observational studies can mitigate the threats of confounding variables, and how to identify which types of variables are most difficult to control for using observational studies. We encourage you to use the quiz questions in this module to check your understanding.