So let me start by saying that the significant majority of people are absolutely *terrible* at working with statistics. Part of the problem is that people just don’t like math, and part of the problem is the cognitive biases that creep in when statistics get applied to real-world situations, but another part of the problem is that the education system just doesn’t teach you what you need to know.

For example, most intro-level probability courses (which are all your average person ever takes) will teach you about the mean, median, and mode. The mean of a distribution is what we more commonly refer to as the average. The mode is the most frequently occurring result, and the median is the result that lies right in the middle of the distribution.

But if I add 14 to the rolls of two 6-sided dice (abbreviated as 2d6 + 14) and compare this to adding the rolls of two 20-sided dice (abbreviated as 2d20), I will find that the two distributions have the same mean, median, and mode (all of which are 21). So your average person has *no idea* how to mathematically describe the difference between 2d6 + 14 and 2d20 (though the average Dungeons and Dragons player might). The statistical concept required to differentiate the two actions just isn’t taught at an introductory level.
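If you'd rather check the "all of which are 21" claim than take my word for it, here is a minimal Python sketch that enumerates every possible roll. The function names (`two_dice_distribution`, `mean`, `median`, `mode`) are made up for this illustration and don't come from any library.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides, bonus=0):
    """Exact probability of each total for two dice with `sides` faces, plus `bonus`."""
    counts = Counter(
        a + b + bonus for a, b in product(range(1, sides + 1), repeat=2)
    )
    rolls = sides * sides  # number of equally likely (a, b) pairs
    return {total: Fraction(n, rolls) for total, n in sorted(counts.items())}


def mean(dist):
    """Probability-weighted average of the outcomes."""
    return sum(total * p for total, p in dist.items())


def mode(dist):
    """The single most likely outcome."""
    return max(dist, key=dist.get)


def median(dist):
    """Smallest outcome at which the cumulative probability reaches one half."""
    cumulative = Fraction(0)
    for total in sorted(dist):
        cumulative += dist[total]
        if cumulative >= Fraction(1, 2):
            return total


for label, dist in [("2d6 + 14", two_dice_distribution(6, 14)),
                    ("2d20", two_dice_distribution(20))]:
    print(label, mean(dist), median(dist), mode(dist))
```

Both lines of output show 21 three times, so none of these three summary numbers can tell the two distributions apart.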
Furthermore, when you get into real statistics the median and the mode are mostly ignored. The mean (which, despite the terminology introduced in your high-school classes, is actually referred to as the average by everyone I’ve ever worked with) is used all the time, but nobody cares about the median and the mode. (Though now that I think about it, this may be because with a Gaussian or Normal distribution, mean = median = mode, and there’s no use giving someone the same information three times.)

What people *do* care about, in addition to the mean, is a number called the *standard deviation*. This is a number that represents the *spread* of outcomes in a distribution. While 2d6 + 14 and 2d20 have the same mean, median, and mode, they have very different standard deviations. This is because the possible outcomes of 2d20 range from 2 to 40, while the possible outcomes for 2d6 + 14 range only from 16 to 26. So while the two distributions have the same average, the same most likely outcome, and the same middle outcome, the possible outcomes in 2d6 + 14 are grouped much more closely together, while the possible outcomes in 2d20 are more spread out.

The standard deviation is more difficult to calculate than the mean, median, and mode (which is probably why it’s left out of introductory courses). Mathematically, it’s the square root of the difference between the average of “outcome squared” and the square of “average outcome.” (That difference, before you take the square root, is called the variance.) But you don’t actually need to be able to calculate standard deviations to understand what I want to say about discrimination. All you need to know is what a standard deviation represents, and that’s the spread of possibilities. The larger the standard deviation, the larger the spread.
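For anyone who does want to see the calculation, here is a rough Python sketch of that recipe: take the average of "outcome squared," subtract the square of "average outcome" (that difference is the variance), and take the square root. The `standard_deviation` helper is a name I made up for this example, and it only handles the two-dice case discussed here.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def standard_deviation(sides, bonus=0):
    """Standard deviation of the total of two dice with `sides` faces, plus `bonus`."""
    # Every equally likely total, one entry per (first die, second die) pair.
    totals = [a + b + bonus for a, b in product(range(1, sides + 1), repeat=2)]
    n = len(totals)
    mean_of_squares = Fraction(sum(t * t for t in totals), n)  # average of "outcome squared"
    square_of_mean = Fraction(sum(totals), n) ** 2             # square of "average outcome"
    variance = mean_of_squares - square_of_mean
    return sqrt(variance)


print(round(standard_deviation(6, 14), 2))  # 2d6 + 14 -> 2.42
print(round(standard_deviation(20), 2))     # 2d20     -> 8.15
```

So 2d20 has a standard deviation more than three times that of 2d6 + 14, even though both average 21. Notice also that the + 14 has no effect on the spread; it only shifts the whole distribution upward.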

This is easier to see graphically, so here you go: Two graphs representing the distribution of 2d6 + 14 versus 2d20.

Good choice of topic for an intro to discrimination, Zaq. I see I've come late to your posts on discrimination, but I'd still like to mention a bit of 'advice' (just my opinion, actually).

When giving examples of flawed thinking, such as the later examples in this post, I think you might do well to avoid using examples from the specific domain in which you are looking to challenge. Otherwise, the examples will tend to trigger just the kinds of flawed reactions you're hoping to illustrate, and the effectiveness of the example suffers as a result.

For example, your example of men being, on average, better at math. Now, you didn't actually come out and state that 'men are on average better at math', but by using that example repeatedly, to an unfamiliar observer who holds typical biases, it can sound a heck of a lot like you are presuming it to be true. When this presumption is made, then the attempt at illustrating an example becomes 'suspect' since it then would appear that you're coming at it from a biased viewpoint, and perhaps cherry-picking or poisoning the well, or whatever.

Personally, when I approach examples along the lines of what you were showing, I might bring up the specific example of how belief that 'men are better on average' is common and how that plays out in a typical discussion, but then I would switch that up to a more abstract or neutral example for demonstrating the general principle of how different statistical spreads can invalidate the typical flawed reasoning. Maybe something along the lines of different fruits being different sizes or something.

I rather like that idea. I shall have to steal it :P

Maybe next time it will help keep me from getting covered in straw...