In examining the wisdom of crowds, we are going to see that it depends on 2 things: talent, that is, good predictors, and diversity. We are going to see that diversity plays an equal role with talent; the two matter in equal parts. We are going to make this argument in a formal way using some pretty straightforward mathematics.

Before we get to that mathematics, I want to talk about cattle. Why cattle? Well, I like cattle. I used to own some—9 cattle in fact, not 10, but 9. Cattle are large creatures and you buy them by the pound. When I bought my cattle, I paid a price per pound, and when I sold them a few months later, after fattening them up on Iowa pasture land, I auctioned them off again at a price per pound. If you are going to trade in cattle, you have to be able to estimate their weight. It turns out that one of the most famous examples of the wisdom of crowds involves guessing the weight of cattle.

This example is due to Sir Francis Galton, and the story opens Jim Surowiecki’s wonderful book *The Wisdom of Crowds*. Surowiecki tells us that the great scientist Francis Galton collected some data from the 1906 West of England Fat Stock and Poultry Exhibition—catchy name, right? 787 people at this exhibition guessed the weight of a steer. Their average guess of that weight was 1197 pounds. The actual weight of the steer? 1198 pounds.


Amazing, right? It is totally amazing. That’s why Surowiecki’s book is called *The Wisdom of Crowds*! It is sort of incredible. But let’s not get too excited: Galton’s cattle contest is a one-shot case, a single example. By no means does it imply that a crowd is going to be that incredibly accurate in every case. In fact, crowds often make big mistakes. Just as individuals can err, so can crowds.

But the weight of the evidence, not just from cattle guessing and jelly bean jar contests, but from the trenches of the business and policy worlds, is that though Galton’s example has a big wow factor, it also has, pardon the pun, a grain of truth: groups tend to be more accurate at prediction than individuals. They are more likely to be wise, at least in predictive contexts. This generally accepted fact, that crowds are more accurate than individuals but can also be horribly wrong, resonates with my own experience as well.

For a decade, I have had students in my classes make predictions on everything from my weight (they are almost always within a pound—no, I don’t weigh 1197 pounds) to the number of floors in the tallest building in Rio (last year they were off by fewer than 2 floors) to the number of chairs in a coffee shop that was just opening (they were off by only 3). I also had them guess the height of the Saturn V rocket, where they were off by 1000 feet the first time, and the number of pizza places in Ann Arbor, where they were off by nearly a factor of 2. Later on, I taught them how to make better estimates—something we are going to do in this course—and they got within a few percentage points on both of those questions. They became a wise crowd.


So, the facts are in. Crowds are able to make more accurate predictions most of the time, but sometimes they can be way off. What we want to do is make sense of this. We want to understand when crowds can predict. When are they wise and when are they not? The key is that diversity plays a big role. That’s the point of this lecture: to explain how and why crowds can make accurate predictions, and to show the value of diversity.

I am going to present 2 main results, and they are both mathematical results. The first is going to show that the collective accuracy of a crowd depends in equal measure on the accuracy of its members, that’s talent, and on their diversity. More diversity is going to be better. The second result is going to be a corollary of the first, and it says that a diverse crowd will always be more accurate than its average member, not sometimes but always. The crowd is always more accurate than its average member. That’s sort of cool.

As a way to introduce the formal statistical terminology that I am going to need to state these mathematical results, I am going to present a fairly simple example. Suppose you have 3 people, Amy, Belle, and Carlos, who are making predictions regarding the number of new clients their firm is going to attract in the next year. We are going to work through this example so we can build some understanding of how the statistics work.

Here are our 3 people, Amy, Belle, and Carlos. Each one of them has made a prediction about the number of new clients. Amy predicts 12, Belle predicts 6, and Carlos predicts 15. What we want to do is figure out the crowd’s prediction, which is just the average of these 3 predictions. If we sum them up we get 33, so the crowd’s prediction is 33 divided by 3, which equals 11.

Let’s suppose that the actual number of new clients turns out to be 10. I am setting this up so the crowd is pretty accurate—it is only off by one—but not perfect: a smart crowd, but not an incredibly wise crowd. What we want is some way of measuring the accuracy of these individuals as well as the accuracy of the crowd. How do we do it?

This is a problem statisticians have thought about for a long time. They typically handle it by taking the difference between the prediction and the true value and squaring that amount. They call the result the squared error. In the case of Amy, we take 12 − 10 (10 being the true value), square it, and get 4.

Why do we square it? Here is the simple reason. Suppose one person had an error of plus 5 and another person had an error of minus 5. If we added those up we would get zero; we would get no error. Think of it this way. Suppose I am shooting arrows and I shoot one high and then one low. I can’t average those and then say “bull’s eye!” What I want is the distance from the center. What squaring does is give us that distance.

Let’s compute the errors for the other people as well. Let’s look at Belle. Belle predicts 6, the actual value is 10, so (6 − 10)^{2} is just going to be 16. What about Carlos? Carlos predicted 15; (15 − 10) is 5, and if we square that we get 25. Amy’s squared error is 4, Belle’s is 16, and Carlos’ is 25. If we average these squared errors, we sum 4 + 16 + 25, which is 45, and divide by 3 to get an average of 15. The average individual squared error in this case is 15. What is the crowd’s squared error? Well, this is easy. Remember, the crowd guessed 11, so 11 − 10 is 1 and 1^{2} is just 1. The crowd’s squared error is 1. What we get is that the crowd is more accurate than the individuals are on average. Notice, the crowd is also more accurate than anybody in the crowd.
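That arithmetic is easy to check for yourself. Here is a quick sketch in Python using the numbers from the example:

```python
# Predictions from the example: Amy, Belle, and Carlos
predictions = [12, 6, 15]
truth = 10  # actual number of new clients

# The crowd's prediction is the average of the individual predictions
crowd = sum(predictions) / len(predictions)  # 11.0

# Each person's squared error, and the average individual squared error
squared_errors = [(p - truth) ** 2 for p in predictions]  # [4, 16, 25]
average_error = sum(squared_errors) / len(squared_errors)  # 15.0

# The crowd's squared error
crowd_error = (crowd - truth) ** 2  # 1.0
```

The crowd’s squared error (1) comes out far smaller than the average individual squared error (15), just as the lecture says.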

That isn’t always going to be true—sometimes someone in the crowd can be more accurate than the crowd—but the second claim always will be: The crowd will be more accurate than the average member in it. I am going to come back to that point later. For the moment, I want to focus on this idea that the individuals make mistakes, the crowd makes mistakes, and yet the crowd is smarter than the individuals in it.

Next, here is what we need to do. I need some way to think about why the crowd is smarter. What is making the crowd good? One ingredient I already have is the accuracy of the people, their squared error; now I need some way of measuring the diversity of the crowd. How different are those people? Here again, we can go to statistics, which has a standard approach. Instead of computing the difference between the predictions and the true value, we look at the variation in the predictions: the difference between each prediction and the crowd’s prediction.

One quick aside, statisticians use this expression to compute what they call the variance of a data-generating process, some process that generates numbers. By variance, what they typically mean is how much noise or error is produced by the process. Here we are using it to mean diversity. Let’s think about this variance thing for a second.

Suppose I have got a machine that’s producing cookies that are supposed to weigh 6 ounces. Some are going to weigh a little more and some are going to weigh a little less. How much more or less? That’s the variation. What causes that variation? It can be caused by vibrations in the machine or clumping of the dough, all sorts of things.

In our case, we are getting variation in these predictions. The cause of the variation isn’t the shaking of machines or clumping of cookie dough; it is differences in how people think. When we think about this variation in the predictive context, we are going to call it diversity of predictions because that’s what it is. It is differences in how people predict. Remember, in our case, people’s average prediction was 11.

What we can do is figure out, in some sense, the diversity of those predictions. Well, let’s again just do some simple math. Amy’s prediction was 12 and the crowd’s prediction was 11, so we can say Amy’s contribution to diversity is (12 − 11)^{2}, which is just 1. Belle’s contribution to diversity is (6 − 11)^{2}, so that’s (−5)^{2}, which is 25. And Carlos’ contribution to diversity is (15 − 11)^{2}, which is 4^{2}, which is 16.

When I add those up I get 42. When I then divide by 3, I get 14: on average, the squared distance between a person’s prediction and the crowd’s prediction is 14. I am going to call this 14 the diversity of the predictions because it measures how different people are in the predictions that they make.


Let’s look at these 3 numbers that we have calculated. What have we calculated so far? We have calculated the average individual squared error (that’s 15), which is on average how far off people are. We have calculated the diversity of the predictions (that’s 14), and we have calculated the crowd’s squared error (that’s 1). Do you notice anything? That’s correct: The crowd’s squared error equals the average individual squared error minus the diversity of the predictions. I am going to call this the diversity prediction theorem.
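A minimal, self-contained check of the theorem on the example’s numbers, in Python:

```python
# Diversity prediction theorem, checked on the Amy/Belle/Carlos example
predictions = [12, 6, 15]
truth = 10
n = len(predictions)

crowd = sum(predictions) / n                                      # crowd's prediction: 11.0
crowd_error = (crowd - truth) ** 2                                # 1.0
average_error = sum((p - truth) ** 2 for p in predictions) / n    # 15.0
diversity = sum((p - crowd) ** 2 for p in predictions) / n        # 14.0

# Crowd's squared error = average individual squared error - diversity
assert crowd_error == average_error - diversity  # 1 = 15 - 14
```

Try changing the predictions to any numbers you like: the assertion still holds, because, as the lecture goes on to show, this is an identity, not a coincidence.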

This happens not only in this example; it is true in every example. It is a mathematical identity, like the Pythagorean theorem. Remember, take any right triangle: the hypotenuse squared equals the sum of the squares of the 2 sides. The diversity prediction theorem is just like that: It’s always true.

Now this is a really important result, and it is counterintuitive. What it tells us is that the crowd’s accuracy depends in equal measure on ability (that’s the average individual error) and on diversity. Because this is so central and also so counterintuitive, I want to flesh it out in more detail.

Let’s do this in a general case. Suppose we have some future value we are trying to figure out, some *x* that’s sitting out there, and we have people make predictions. This could be anything: the unemployment rate, the number of jelly beans in a jar, even the weight of a steer. Instead of 3 people, we are going to say we have *n* people who make predictions. I can label these people from person 1 to 2 to 3 up to *n*, and I can index their predictions as *x*_{i}, where *i* runs from 1 to *n*.

What is the crowd’s prediction? Well, that’s easy. That’s just the average of all the individual predictions: *C* = (*x*_{1} + *x*_{2} + *x*_{3} + … + *x*_{n})/*n*.

What we have then is this particular expression, and we want to write down how things work. Let’s unpack things a little bit. Here is the crowd’s error. The crowd’s error is their prediction minus *x*, squared—and remember, *x* is the truth. This is how far off the crowd is. Now I want to write down the average error of the individuals. Well, that is just each person’s prediction minus the truth, squared: (*x*_{i} − *x*)^{2}. Remember, with Amy, I took her prediction, which was 12, subtracted the truth, which was 10, and squared it. Here what I do is just take each (*x*_{i} − *x*)^{2} and average those squared errors over all *n* people.

Now all I am left with is figuring out some way to write diversity. How do I write diversity? What I do here is ask how far each person’s prediction, *x*_{i}, is from the crowd’s prediction *C*. I take each (*x*_{i} − *C*)^{2} and average those over all *n* people.

Here is the cool thing. Now I can write the following expression. I can write that the crowd error equals the average error minus the diversity. This is just always true. Let me show this with mathematics. This is what it looks like:

(*C* − *x*)^{2} = (1/*n*) Σ_{i} (*x*_{i} − *x*)^{2} − (1/*n*) Σ_{i} (*x*_{i} − *C*)^{2}

It looks a little bit frightening, right? The crowd error, that’s the (*C* − *x*)^{2}, equals the average error, which just averages up each individual’s squared error, minus the diversity of the predictions. What I have is this formal mathematical expression: crowd error equals average error minus diversity.

Let’s think for a minute about how unintuitive this is. I asked what is going to make the crowd smart. You might have said it is having smart people, and that’s captured here in average error. But it is also diversity, and the interesting thing is that diversity matters just as much as ability; it matters just as much as average error. What I want to do at this moment is take some time and walk through this result. Here is why: This is going to be by far the most math we do in this course, and it is going to look a little bit scary, so let me just put that right out there.

We are going to do this math for 2 really important reasons. The first is this. The primary reason for this course, at least for me to teach this course, is to speak math to metaphor. When people talk about diversity we often tell stories and speak in metaphors. I want to nail down the logic. I want to convince you why diversity is so important, so I am going to show you how it is done.

A lot of the later results I am going to ask you to take on faith; we are not going to do the math. But for at least one case I think it is important to do the proof. The second reason is that by working through the proof, by watching me do it, you are going to see the logic of why it works. You are going to learn something from the exercise about how we go about proving results like this. Let’s go at it. This is going to be fun, but a little bit tricky. Here we go!

Here is what I have. Here is my expression: the crowd’s error equals the average error minus the diversity. The first thing I am going to do is multiply both sides, everything, by *n*. That allows me to get rid of the 1/*n* in front of the average error and the diversity. The next thing I do is just expand terms. This is pretty easy, right? If I have (*C* − *x*)^{2}, that’s going to be *C*^{2} − 2*Cx* + *x*^{2}. The same is true for (*x*_{i} − *x*)^{2} and (*x*_{i} − *C*)^{2}.

Then, after I multiply all these things out, I can ask: is there some way to simplify? Here is an interesting trick. Notice I have summations of the *x*_{i}’s in the cross terms. What are those? If I am summing up all the *x*_{i}’s, that’s just *n* times their average, and their average is the crowd’s prediction, so the sum of the *x*_{i}’s equals *nC*. Once I substitute that in, terms start cancelling.

What this is saying is that these 2 expressions, even though they look very different, are in fact the exact same thing once we expand everything out and cancel some terms. What we have, then, is what we call an identity in mathematics. The crowd error is literally the exact same thing as the average individual error minus the diversity. It is a theorem. It is always true.
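For anyone who wants to see the cancellation step on paper, here is one way to write out the algebra the lecture just described, using the fact that the crowd’s prediction is the average, (1/n) Σ x_{i} = C:

```latex
\begin{aligned}
\underbrace{\frac{1}{n}\sum_{i=1}^{n}(x_i - x)^2}_{\text{average error}}
\;-\;
\underbrace{\frac{1}{n}\sum_{i=1}^{n}(x_i - C)^2}_{\text{diversity}}
&= \frac{1}{n}\sum_{i=1}^{n}\Bigl[(x_i^2 - 2 x_i x + x^2) - (x_i^2 - 2 x_i C + C^2)\Bigr] \\
&= \frac{1}{n}\sum_{i=1}^{n}\Bigl[-2 x_i x + x^2 + 2 x_i C - C^2\Bigr] \\
&= -2Cx + x^2 + 2C^2 - C^2
   \qquad\Bigl(\text{since } \tfrac{1}{n}\textstyle\sum_i x_i = C\Bigr) \\
&= C^2 - 2Cx + x^2 \\
&= (C - x)^2 \quad\text{(the crowd's squared error).}
\end{aligned}
```

The *x*_{i}^{2} terms cancel immediately, and the substitution Σ *x*_{i} = *nC* collapses everything that remains into the crowd’s squared error.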

Enough math. Do not worry if you did not follow it; you can go back and watch the proof again if you want. The point of doing this exercise was to show you there is no great mystery to the proof. It is really just a matter of multiplying things out and cancelling terms. But what is most important, now that you have seen the mathematics, is to recognize that this is always true. Always. The wisdom of crowds—by that I mean the ability of crowds to make accurate collective predictions—depends in equal measure on the crowd’s ability (their average individual squared error) and on the diversity of their predictions.

Let me give you a really interesting corollary here, and it’s this: the crowd error has to be less than the average individual error. Let’s look at this. What did I have before? I had crowd error equals average error minus diversity. Well, if it equals average error minus diversity, and diversity is positive at all, then the crowd error has to be less than the average error. That means if there is any diversity in the room at all, then the crowd is going to be better than the average person in it. It is just a mathematical fact. That’s an interesting thing that follows from what we found.

I am going to give this a formal name: the crowd beats the average law. It is our second main result, a quick corollary of the first. If the crowd’s squared error equals the average individual squared error minus the diversity of the predictions, then the following also has to be true: If the crowd has any diversity at all in its predictions, then the crowd’s error is strictly less than the average squared error of the people in the crowd. In other words, the crowd is better than the average person in it. It is really easy to see why it is true, right? The crowd’s error is the average individual error minus the diversity, so any positive diversity at all makes the crowd’s error smaller than the average individual error. That’s pretty cool. Crowds really are better than the people in them—well, at least on average.

As the diversity prediction theorem is a mathematical fact, it has got to be true in every single case, and it is. It has got to be true in Galton’s data, so let’s check. I am not worried about it or anything; it is just fun to see the logic in action. If we take his data, we get that the crowd’s squared error equals 0.6. The average individual squared error in that case was 2956. Wait a minute, you say, whoa, that’s huge. The steer only weighed 1198 pounds; how could the error be 2956? Remember, this is the squared error. If we take the square root of that number, we get 54.4, about 55 pounds. That isn’t bad. Cattle weigh about 5 times as much as people, and we can guess the weight of a person to within about 10 pounds, so 50-some pounds isn’t that far off. But wait a minute: if the crowd was off by only 0.6 and people were off by 2956 on average, then there must have been a lot of diversity. In fact, there was. The diversity was 2955.4. So that’s how crowd error can equal average error minus the diversity: 0.6 = 2956 − 2955.4.

If you want a wise crowd, this suggests we have 2 options. We could find brilliant people who all know the answer, so the individual error would be zero, the crowd error would be zero, and the diversity would be zero. Or we can find a bunch of fairly smart people who have moderate errors and who happen to be diverse, so you also get moderate diversity. If you take any example from one of these books on wise crowds, such as Surowiecki’s, from jelly bean guessing contests to cattle weight guessing to predicting the NFL draft, you are going to see it is almost always like Galton’s data: you get a wise crowd because you have moderately accurate people who happen to be diverse.

Let’s see why this works. Here is another way to think of it. We basically have this equation: crowd error equals average error minus diversity. How do we get small crowd error? In these books, it always looks like small equals big minus big. Why is that the case?

Let’s think about it. How does an example make it into a book called *The Wisdom of Crowds*? The only way is if the crowd error is small, if the crowd doesn’t make a big mistake. You need a small crowd error. What also has to be true? The average error has to be pretty big, because if the average error were not big, that would mean everybody got it right, and it wouldn’t be surprising; we could just say it was an easy question.

So to make it into a book like *The Wisdom of Crowds*, you have to have a small crowd error and a big average error. Guess what! Diversity then has to be big as well. The only examples that make the books about wise crowds look like this: high average individual error and high diversity. Therefore, you have *The Wisdom of Crowds* explained by diversity in most cases.

Let’s think about this intuition for a second. We have done a lot of math. We can see how each line follows from the next, but that does not mean we necessarily understand intuitively why this diversity prediction theorem works.

What I want to do is go back and look at another example of why diversity matters so much. Let’s look at 100 people guessing the weight of a steer. Let’s write down our theorem and suppose that each person is off by exactly 20 pounds. If each person is off by exactly 20 pounds, then we are going to get an average individual error of 20^{2}, which is 400. First, suppose there is no predictive diversity, so everybody guesses 20 pounds too high. That means the diversity is going to be zero, which means the crowd error is going to be 400 as well.

What we get is not a wise crowd. The crowd is no better than the average person in it because the diversity is zero. Let’s suppose instead that people are still off by 20 on average, but now we have a lot of diversity: half the people are 20 too high and half are 20 too low. That’s going to mean the diversity is also 400.

If the diversity is 400 and the average individual error is 400, then we are going to get a crowd error of zero: 0 = 400 − 400. Here is what is interesting. If we compare case 1 to case 2, we see that the people did not get any smarter, but the crowd got smarter. How did the crowd get smarter? It got smarter because it got more diverse. What we see in this simple example is that collective wisdom comes from diversity.
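The two steer-guessing cases can be sketched in a few lines of Python. The true weight of 1000 pounds here is an assumption for illustration only; all that matters is that every guess is off by exactly 20 pounds:

```python
def stats(guesses, truth):
    """Return (average individual squared error, diversity, crowd squared error)."""
    n = len(guesses)
    crowd = sum(guesses) / n
    average_error = sum((g - truth) ** 2 for g in guesses) / n
    diversity = sum((g - crowd) ** 2 for g in guesses) / n
    crowd_error = (crowd - truth) ** 2
    return average_error, diversity, crowd_error

truth = 1000  # assumed steer weight, for illustration

# Case 1: no diversity -- all 100 people guess 20 pounds too high
case1 = stats([truth + 20] * 100, truth)              # (400.0, 0.0, 400.0)

# Case 2: same individual accuracy, lots of diversity --
# half guess 20 too high, half guess 20 too low
case2 = stats([truth + 20] * 50 + [truth - 20] * 50, truth)  # (400.0, 400.0, 0.0)
```

The individuals are equally accurate in both cases (average error 400), but only the diverse crowd is perfectly accurate.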

Now that we have the core intuition, let’s drive it home. Suppose we have a crowd that isn’t wise; we have a big crowd error. A big crowd error means you have to have a big average individual error—the people can’t be getting it right. It also means that diversity has to be relatively small, because otherwise that diversity would cancel out the errors. Remember, at the beginning of the lecture I said sometimes crowds get things right, sometimes they get things wrong, sometimes crowds are mad. For a crowd to be mad, it has to be that big equals big minus small, because if the diversity were not small relative to the error, the crowd couldn’t possibly be mad. Wise crowds come from diversity; mad crowds come from a lack of diversity and a lack of talent.

Where does the diversity come from? We have done all this mathematics, we have seen the intuition for why diversity improves the ability of a crowd to make predictions, but we haven’t explored at all what causes this diversity of predictions. That’s a really important question, and one that we are going to meet up with again in the next 4 or 5 lectures.

To get us started on this question, I want to go back to some of the examples we have talked about of crowds making accurate predictions. First, remember I had my students predict the height of the tallest building in Rio, which was only 60 stories high. The individual guesses from my students ranged from around 30 up to 90 stories. I went back to my students and asked, how did you come up with these predictions? The students who predicted 90 floors basically said: Rio is the second largest city in Brazil, one of the largest cities in all of the Americas, and it is beautiful; there is a lot of money there, so it must have huge skyscrapers. Why wouldn’t there be tall buildings?

The people who guessed 30 floors did something different. They used different logic. They said Rio is a beach city; you don’t want huge skyscrapers in a beach city. It isn’t even the capital of Brazil, so the tallest building is probably going to be a hotel, and beach hotels tend to be about 30 stories high. There were other people who predicted short buildings and said, look, there is that *Christ the Redeemer* statue that sits on Corcovado Mountain, which is really huge, and you don’t want anything that detracts from that, so probably nothing above 40 floors.

This is kind of funny because people made different predictions based on having different models, different understandings of what Rio was like. They had different conceptual models of how the world works. Those different models—one based on capital and wealth, one on beach culture, and one on aesthetics—all led to different predictions.

In reality, all of these ideas probably contributed to the tallest building being only 60 floors. Rio does have lots of money and is a major city, but it is also a beach city, not a city of bankers, and that argues against anything being really, really tall. As for the Corcovado Mountain argument: it turns out the mountain is 2300 feet high, so a 1200-foot tower wouldn’t block the view. It remains true that *Christ the Redeemer* serves as the iconic image of Rio and no one would want some glass tower to detract from it, but that probably does not actually restrict building height. The truth, like the average prediction in this case, lies in between what people thought.

Let’s go back to Galton’s steer, because this one is more problematic. How in the heck do people look at a steer in a whole bunch of different ways? Here, the explanation comes less from diversity of mental models than from diversity of experiences. This is the West of England; each of these people probably had steer at home and probably knew the weights of those steer. So, if I have a steer at home that weighs 1300 pounds and the contest steer looks a little smaller, then I will guess a little less than 1300 pounds. If I have a steer at home that weighs 1050, maybe a little smaller than the contest steer, then maybe I guess a little bit north of 1050.

This tendency for people to base predictions on what they know has a formal name: base rate bias. We are influenced by how we start thinking about a problem. In the case of Galton’s steer, the crowd gets it right because the idiosyncratic errors of the individuals, which are equally likely to be high or low, cancel out; the guesses are diverse, and we get a wise crowd.

We have 2 primary reasons for predictions to be diverse: different models and different, idiosyncratic errors. In the case of guessing the tallest building in Rio, it is the different models that explain the wise crowd. With Galton’s steer, it seems to me at least, it is a classic case of errors cancelling: the guesses were diverse because people started from different base rates.

I want to finish up with an observation that I am going to return to in greater depth near the end of this course—about 16 or 17 lectures from now. This observation concerns disagreement. Let’s suppose you go to a meeting and you are asked to predict something, like maybe to invest in some business opportunity. It could be something as simple as the number of attendees you are going to get at some event. It could be the sales of a new product, it could be the price of a stock, or, if you are sitting on the Federal Reserve’s Open Market Committee, it could be how much unemployment is going to change in the next month or what you should do with the money supply.

Let me lay out 2 scenarios. Scenario 1: You go to this meeting and everybody agrees, you all think the same thing, you make the same predictions, you use similar logic. Scenario 2: Your predictions differ because people use different models or they have different sets of experiences.

In scenario 1, there are 2 possibilities: either we are all right, or we are all wrong. If it is a hard problem, it is probably not that likely that we are all right. The only way you should feel good about the outcome is if you feel it was an easy task. And if it was an easy task, then why did you have the meeting? Everybody could have gotten it right on their own. There was no reason to bring the group together.

In scenario 2, there is disagreement. Some people might think sales are going to be high; others might think sales are going to be low. We know from our corollary to the diversity prediction theorem, the crowd beats the average law, that the crowd is going to be more accurate than its average member. So when you leave this meeting you should feel good; you should feel like, wow, the crowd probably made a better prediction than a random person, including me, would have made alone. We did better.

In other words, if you go to a meeting and you are predicting something and people disagree: That’s good. It is good because it means there is diversity in the room. And that diversity in the room improves performance. This isn’t a metaphor; we just worked through the math. It is a mathematical fact.

What is the key lesson? The key lesson is this: within your organizations, you should include multiple, diverse models when you are making a forecast. In your daily life, you should do the same thing. You should open your mind to new and diverse ways of looking at the world. Not only is it going to be fun and mind-expanding, it is going to make you better at predicting what lies ahead.

Taught by Professor Scott E. Page, University of Michigan