The lead article in the June 2010 edition of the Journal of Political Economy is

Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors
Scott E. Carrell and James E. West

Student evaluations may not be a good signal of teaching quality because

“Professors can inflate grades or reduce academic content to elevate student evaluations.”

The authors argue that if a student takes Calculus I, say, their performance in Calculus II is a good signal of how well they learned the material in Calculus I.  So their study:

“uses a unique panel data set from the United States Air Force Academy (USAFA) in which students are randomly assigned to professors over a wide variety of standardized core courses. The random assignment of students to professors, along with a vast amount of data on both professors and students, allows us to examine how professor quality affects student achievement free from the usual problems of self-selection. Furthermore, performance in USAFA core courses is a consistent measure of student achievement because faculty members teaching the same course use an identical syllabus and give the same exams during a common testing period. Finally, USAFA students are required to take and are randomly assigned to numerous follow-on courses in mathematics, humanities, basic sciences, and engineering. Performance in these mandatory follow-on courses is arguably a more persistent measurement of student learning. Thus, a distinct advantage of our data is that even if a student has a particularly poor introductory course professor, he or she still is required to take the follow-on related curriculum.”

Their methodology:

“We start by estimating professor quality using teacher value-added in the contemporaneous course. We then estimate value-added for subsequent classes that require the introductory course as a prerequisite and examine how these two measures covary. That is, we estimate whether high- (low-) value-added professors in the introductory course are high- (low-) value-added professors for student achievement in follow-on related curriculum. Finally, we examine how these two measures of professor value-added (contemporaneous and follow-on achievement) correlate with professor observable attributes and student evaluations of professors. These analyses give us a unique opportunity to compare the relationship between value-added models (currently used to measure primary and secondary teacher quality) and student evaluations (currently used to measure postsecondary teacher quality).”
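
To make the idea of two value-added measures that "covary" concrete, here is a minimal sketch in Python. It is not the authors' estimator: it assumes that, under random assignment, a professor's value-added can be crudely approximated by the average score of his or her students minus the overall average. The function names (`value_added`, `pearson`), the professor labels, and all the scores are made up for illustration and are not data from the paper.

```python
from collections import defaultdict
from math import sqrt


def value_added(records):
    """Crude value-added: a professor's mean student score minus the grand mean.

    `records` is an iterable of (intro_professor_id, score) pairs.
    This simple difference in means is an illustrative assumption,
    not the model used in the paper.
    """
    by_prof = defaultdict(list)
    for prof, score in records:
        by_prof[prof].append(score)
    all_scores = [s for scores in by_prof.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    return {p: sum(s) / len(s) - grand_mean for p, s in by_prof.items()}


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy scores, keyed by the student's INTRODUCTORY professor.
intro = [("A", 80), ("A", 84), ("B", 70), ("B", 74), ("C", 75), ("C", 79)]
follow_on = [("A", 68), ("A", 72), ("B", 78), ("B", 82), ("C", 73), ("C", 77)]

va_now = value_added(intro)        # contemporaneous value-added
va_later = value_added(follow_on)  # follow-on value-added

profs = sorted(va_now)
r = pearson([va_now[p] for p in profs], [va_later[p] for p in profs])
print(va_now, va_later, r)
```

With these invented numbers the two measures move in opposite directions, so the correlation comes out strongly negative. That only illustrates what a negative relationship between contemporaneous and follow-on value-added would look like; it says nothing about the size of the effect in the actual study.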

Their findings:

“Results show that there are statistically significant and sizable differences in student achievement across introductory course professors in both contemporaneous and follow-on course achievement. However, our results indicate that professors who excel at promoting contemporaneous student achievement, on average, harm the subsequent performance of their students in more advanced classes. Academic rank, teaching experience, and terminal degree status of professors are negatively correlated with contemporaneous value-added but positively correlated with follow-on course value-added. Hence, students of less experienced instructors who do not possess a doctorate perform significantly better in the contemporaneous course but perform worse in the follow-on related curriculum.”

For example:

“As an illustration, the introductory calculus professor in our sample who ranks dead last in deep learning ranks sixth and seventh best in student evaluations and contemporaneous value-added, respectively.”

Required reading for all serious teachers, students, and deans. An ungated version is available.