Annual Faculty Evaluations: A Flawed Process

Phillip Ardoin • Appalachian State University

Click Here to Download the complete Department Chair Quarterly.

Completing the annual evaluation of faculty is one of the most important responsibilities of a Department Chair. The annual evaluation impacts faculty merit pay and plays a critical role in a faculty member’s tenure and promotion process. While the role of annual evaluations varies across campuses, for many departments one or more negative annual evaluations can significantly decrease the likelihood of a successful tenure and promotion vote for a junior faculty member. Moreover, consistent annual evaluations noting that a faculty member has met expectations all but guarantee a positive tenure and promotion vote or, at a minimum, provide the faculty member with substantial cause for an appeal if they are denied tenure and promotion.

Therefore, the annual evaluation process is often one of the most difficult and challenging responsibilities of a Department Chair. Annual evaluations require us to provide honest and direct feedback to our colleagues. While these conversations can be enjoyable and even provide opportunities for mentoring, they can often be very uncomfortable when colleagues have not met the expectations of your department. I know I have spent many sleepless nights dreading annual evaluation meetings with colleagues who did not meet our expectations.

However, I recognize the responsibilities of leadership often require challenging conversations and uncomfortable situations. The more notable issue that makes annual evaluations difficult, and the primary motivation for this blog post, is the significant flaws in the standard measures chairs must rely on to evaluate our colleagues each year.

The standard dimensions used for evaluating our colleagues are (1) research, (2) teaching, and (3) service. Unfortunately, the standard measures for evaluating each of these dimensions are seriously flawed.

How does a chair or department measure the quality, impact, or scholarly productivity of a faculty member's research? Common methods of evaluating the quality or impact of faculty research include number of publications in ranked journals, number of publications, number of times faculty publications have been cited by other scholars, and number of citations in ranked journals. Unfortunately, each of these measures is plagued by bias and measurement error. Research (Atchison, 2017; Maliniak, Powers, and Walter, 2013) has highlighted substantial gender biases in citations, and significant concerns with journal impact factors led more than 75 scholarly organizations to sign the 2012 San Francisco Declaration on Research Assessment (DORA) condemning the measure. Moreover, scholars publishing in areas that are developing or outside the mainstream are often significantly less likely to publish in the “top” journals or to receive multiple citations. For instance, my research on college student voting, which received little (or no) attention when published in 2015, has recently seen a spike in downloads and citations. The quality of my research on college voters did not improve; the political winds simply turned in my favor.

The most common methods of evaluating teaching are student evaluations of teaching (SETs) and peer evaluations by colleagues. Once again, each of these measures presents significant issues of bias and measurement error. The significant equity and measurement biases of SETs have been well documented and were most recently summarized by Kreitzer and Sweet-Cushman (2021). Considering the growing recognition of the significant problems with SETs, the University of California System has stopped using SETs as a way of evaluating teaching effectiveness, and more than 20 scholarly associations have urged colleges and universities to stop using the evaluations as a way of judging teaching effectiveness. As an alternative or addition to SETs, many universities use peer evaluations. While I am not familiar with the literature regarding the use of faculty peer evaluations of teaching, I have personally found them to be useless. The primary problem I have found during my seven years as department chair is that they lack variation. Specifically, faculty peer teaching evaluations indicate that all 36 of my colleagues consistently exceed expectations on all dimensions of teaching. As I have noted in our department discussions, one of the key characteristics of a variable is that it must vary. Peer evaluations of teaching in my department do not work as a variable for measuring teaching because they do not vary.

Service is critical to the functioning of any academic department, particularly those that believe in democratic governance. I am fortunate to chair a department that supports a culture of service. Faculty actively participate in thesis, curriculum, and even search committees, which are necessary for our department to function. Senior faculty also recognize that our interests will only be heard across campus if they are engaged in university committees. The measurement of service, as with research and teaching, is flawed. Most departments simply count the number of department, university, or disciplinary committees an individual serves on or chairs. Unfortunately, not all committees are created equal. For instance, some faculty search committees may have to review 50 applications, while others may review 150 applications depending on the line. These differences grow even larger with committee service outside of the department. I have served on several university committees that required nothing more than showing up for 2 or 3 brief meetings, while other committees have felt like a second job requiring weekly meetings, after hours and on weekends.

In summary, the annual evaluation is broken. As a discipline, we have documented and debated the fatal flaws of measuring research and teaching. Unfortunately, we have spent less time discussing alternatives and improved measures for evaluating faculty research, teaching, and service.



Ardoin, Phillip J., C.S. Bell, and M.M. Ragozzino (2015). “The Partisan Battle Over College Student Voting: An Analysis of Student Voting Behavior in Federal, State and Local Elections.” Social Science Quarterly (May).

Atchison, A. (2017). Negating the Gender Citation Advantage in Political Science. PS: Political Science & Politics, 50(2), 448-455.

Kreitzer, R.J., and Sweet-Cushman, J. (2021). Evaluating Student Evaluations of Teaching: A Review of Measurement and Equity Bias in SETs and Recommendations for Ethical Reform. Journal of Academic Ethics.

Van Noorden, Richard (May 16, 2013). “Scientists join journal editors to fight impact-factor abuse”. Nature News Blog.

Phillip Ardoin serves as Chair of the Department of Government and Justice Studies at Appalachian State University. He serves with Dr. Paul Gronke as Co-Editor of PS: Political Science and Politics, the journal of record for the American Political Science Association. His research interests address a broad array of issues within the field of American Politics. He is currently working on several research projects that range from an analysis of factors that influence partisan polarization in the N.C. General Assembly to an examination of the influence of college student voting on local elections throughout the United States and the attitudes of political elites regarding college student voting.

