Foundations of Quantitative Research in Political Science
- At 0:53, Leo says that the regression table on the screen is from a paper used for a homework assignment. Please ignore that. This video was taken from another course (Econ/Poli 5D).
- You can find the paper by clicking this link (you have to be connected to the UCSD network to download it).
In this video, I will show you how to read a regression table. Whenever we find a regression table in a scientific paper or in a book, or in just about any scientific publication that includes a regression table, it's going to follow a specific format for regression results. So when you encounter a regression table in a scientific paper, it won't look like regression output in Stata or R. And this is because publishers want to show the results of different regressions in one single table.
So this is an example of a regression table from the paper Global Competition and Brexit, which is the paper whose replication data you used to do your homework 3. This is a table that has the results of six different regressions, and you can see them here in the columns. Each column is a separate regression. In other words, each column is a model, okay? Every time you specify a regression with a given set of variables, you're specifying a different model. The dependent variable is shown at the top of each column. You can see that all six models there have the exact same dependent variable. Someone could come up with a regression table in which different models have different dependent variables; it just happens that in this case, leave share is the one dependent variable of every single regression model in the table.
And the independent variables are in the rows. So you see the list of variables: import shock, immigrant share, immigrant arrivals, EU accession immigrants, and so on and so forth. Right? Now, when it comes to what's your independent variable and what's your confounding variable, this is something that's in the eye of the beholder, right? This is a distinction that we make whenever we're interpreting a regression result. So think about what happens whenever you tell R or Stata to run a regression for you.
What goes on the right-hand side of the regression equation, what goes on the list of independent variables in the regression: mathematically, all of those variables are indistinguishable from one another, right? It's not like you can tell R or Stata, okay, run this regression, I want this to be my independent variable and this to be my confounding variable. Now, for the math, every variable that is not the dependent variable in a regression is an independent variable. But when we're interpreting results, we want to interpret variables differently, right? If your theory is interested in one particular variable, then you're going to treat that as your independent variable and the other variables that you add as your confounding variables. But you could also be interested in more than one independent variable in a regression. These are decisions that researchers are supposed to make whenever they're running and interpreting regressions.
In any case, in this regression table you can see that every variable that is being treated mathematically as an independent variable is shown in the rows. So you can see the entire list in there. And you can notice that some cells are empty, because some models don't include certain variables, right? For example, fiscal cuts, cancer treated in 62 days, public employment growth, and EU economic independence: none of these variables was included in model one. So the first column has blank spaces in those cells, because model one doesn't include any of these variables. Okay? Wherever there is a number, it's because that variable was included in that model.
And additional information about each model goes at the bottom. So you can see the R-squared, the number of observations, the NUTS 1 fixed effects.
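The layout described above (one column per model, one pair of rows per variable, blank cells where a model leaves a variable out) can be sketched in a few lines of Python. This is only a rough illustration: the function name `build_table` is made up for this sketch, and the numbers are hypothetical placeholders, not values from the paper.

```python
# Hypothetical numbers for illustration only; they are NOT taken from the paper.
# Each model maps a variable name to (estimated slope, standard error).
models = {
    "(1)": {"Import shock": (1.50, 0.50)},
    "(2)": {"Import shock": (1.20, 0.60), "Immigrant share": (-0.30, 0.40)},
}

LABEL_WIDTH = 18  # width of the variable-name column
CELL_WIDTH = 10   # width of each model column


def build_table(models):
    """Return the table as a list of text lines (header, then two lines per variable)."""
    # Collect variable names in order of first appearance across the models.
    variables = []
    for coefs in models.values():
        for name in coefs:
            if name not in variables:
                variables.append(name)

    header = " " * LABEL_WIDTH + "".join(f"{col:>{CELL_WIDTH}}" for col in models)
    lines = [header]
    for name in variables:
        est_row = f"{name:<{LABEL_WIDTH}}"
        se_row = " " * LABEL_WIDTH
        for coefs in models.values():
            if name in coefs:
                est, se = coefs[name]
                se_text = f"({se:.2f})"  # standard error goes in parentheses
                est_row += f"{est:>{CELL_WIDTH}.2f}"
                se_row += f"{se_text:>{CELL_WIDTH}}"
            else:
                # The model does not include this variable: leave the cell blank.
                est_row += " " * CELL_WIDTH
                se_row += " " * CELL_WIDTH
        lines.append(est_row)
        lines.append(se_row)
    return lines


table = build_table(models)
print("\n".join(table))
```

In the printed result, the "Immigrant share" row is blank under column (1), which is the same thing the empty cells in a published table are telling you.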
The fixed effects row is telling me that they're employing a technique called fixed effects, which is something that you do not have to learn for this course, but you might want to learn in your future learning of research design and statistics. "Model linear" is simply saying that this is a plain old linear regression. It's nothing different from the type of regression that we have learned so far. There are different methods to estimate models like these, but again, those different methods are not covered in this course. It's also something that you might want to learn at a more advanced stage.
So each cell is going to contain the estimated slope, beta hat, above and the standard error below, okay? Remember, we just talked about the standard error. When you encounter a regression table in a publication, you're going to see the standard error in parentheses or in brackets, right below the estimate of the slope. You can see that in this case we have a slope that is 9.3 and a standard error of 3.8. The slope is more than twice the size of the standard error, and therefore the estimate is statistically significant.
You can also tell that this result is statistically significant when you notice that right next to the slope there, 9.391, you get two stars. The stars indicate statistical significance in these regression tables. And notice that at the bottom left you have the legend explaining that the table will print three stars whenever the p-value is smaller than 0.01, two stars whenever the p-value is smaller than 0.05, and one star when the p-value is smaller than 0.1. So it has two stars, meaning that the p-value is smaller than 0.05 but not smaller than 0.01. Meaning that we can be 95 percent confident that the slope for import shock is greater than 0.
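The star legend just described can be written as a small Python function. This is a minimal sketch: the function name `significance_stars` is made up here, and the one-star cutoff of 0.1 is an assumption based on the usual convention, since a cutoff for one star has to be larger than the two-star cutoff of 0.05.

```python
def significance_stars(p_value):
    """Map a p-value to the stars printed next to a slope in a regression table.

    Thresholds follow the legend described in the lecture. The 0.10 cutoff for
    one star is an assumption (the common convention), not confirmed by the paper.
    """
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""  # not significant at any of the listed levels


# A p-value between 0.01 and 0.05 gets two stars.
print(significance_stars(0.03))
```

The checks run from the strictest threshold down, so each slope gets the largest number of stars it qualifies for.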
Two stars means that the p-value is smaller than 0.05. And we were already able to tell that at the beginning, when we first saw this table and noticed that the slope, 9.3, is more than twice the size of the standard error.
Some regression tables do not include stars; most of them do, okay? In most cases, they will include this visual shortcut so that you don't have to compare the slope and the size of the standard error, to make it easy for you to see where the slopes are significant. But in a few regression tables you won't see stars, and then the reader will have to divide the slopes by the standard errors to know if the slope is significant. So the reader will check: okay, is the slope twice the size of the standard error? And if the slope is negative, right, is the result less than minus two? For example, for EU accession immigrants the slope is minus 12.045 and the standard error is 5.8. If you divide those, you get a number that is going to be less than minus two. Therefore, the p-value is going to be smaller than 0.05. And voilà, you get the two stars right here, indicating that the p-value is smaller than 0.05.
Also, regression tables may omit the intercept, and this is the case in this regression table. You don't see the intercept. It doesn't mean that there wasn't an intercept when this regression was run. Okay? Each one of these models has an intercept. Every model has to have an intercept. Sometimes you might want to print a table and just not include that information, and that might be the case because the author doesn't think that the intercept is important. So in this case right here, there is an intercept for each model, but the authors aren't interested in them, so they just didn't report them. They are omitted in this regression table.
And you might encounter that when you find regression tables in publications.
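The check for tables without stars (divide the slope by the standard error and see whether the result is beyond plus or minus two) can be tried on the two slopes read off the table in this video. This is a rough sketch: the helper names are made up, and the p-value uses a normal approximation, which is an assumption; the paper's own p-values may be computed differently.

```python
import math


def slope_to_se_ratio(slope, std_error):
    """Divide the estimated slope by its standard error."""
    return slope / std_error


def approx_two_sided_p(ratio):
    """Approximate two-sided p-value, assuming the ratio is roughly standard normal."""
    return math.erfc(abs(ratio) / math.sqrt(2))


# Slopes and standard errors as read aloud in the lecture.
examples = {
    "Import shock": (9.3, 3.8),
    "EU accession immigrants": (-12.045, 5.8),
}

results = {}
for name, (slope, se) in examples.items():
    ratio = slope_to_se_ratio(slope, se)
    p_value = approx_two_sided_p(ratio)
    results[name] = (ratio, p_value)
    print(f"{name}: ratio = {ratio:.2f}, beyond +/-2: {abs(ratio) > 2}, approx p = {p_value:.3f}")
```

The ratios come out at about 2.45 and about minus 2.08, so both are beyond two in absolute value, which matches the two stars printed next to each of these slopes.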