STATISTICS II: NORMAL DISTRIBUTIONS
Objectives:
Normal distributions are a family of distributions that have the same general shape. They are symmetric, with scores more concentrated in the middle than in the tails. Normal distributions are sometimes described as bell shaped. Examples of normal distributions are shown in Figure 1. Notice that they differ in how spread out they are. The area under each curve is the same. The height of a normal distribution can be specified mathematically in terms of two parameters: the mean (m) and the standard deviation (σ).
The normal curve is not a single curve; rather, it is an infinite number of possible curves, all described by the same algebraic expression:
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - m)^2}{2\sigma^2}} $$
where m is the mean, σ is the standard deviation, π is the constant 3.14159..., and e is the base of natural logarithms, approximately equal to 2.718282; x can take on any value from -infinity to +infinity.
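The height of the curve at any point can be computed directly from the mean and standard deviation. The following is a minimal sketch, not a reference implementation; the function name normal_height and the parameter names mean and sd (standing in for m and σ) are chosen here for illustration.

```python
import math


def normal_height(x, mean, sd):
    """Height of the normal curve at x, given the mean and standard deviation."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    coefficient = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * sd ** 2)
    return coefficient * math.exp(exponent)


# Illustrative values: a curve with a mean of 50 and a standard deviation of 10.
peak = normal_height(50, mean=50, sd=10)   # height at the mean
left = normal_height(40, mean=50, sd=10)   # one standard deviation below
right = normal_height(60, mean=50, sd=10)  # one standard deviation above
print(peak, left, right)
```

The curve is highest at the mean, and points equally far above and below the mean have the same height, which is the symmetry described in these notes.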
Upon viewing this expression for the first time the initial reaction of the student is usually to panic. Don't. In general it is not necessary to "know" this formula to appreciate and use the normal curve. It is, however, useful to examine this expression for an understanding of how the normal curve operates.
A Family of Distributions
The normal curve is called a family of distributions. Each member of the family is determined by setting the parameters (m and σ) of the model to a particular value (number). Because the m parameter can take on any value, positive or negative, and the σ parameter can take on any positive value, the family of normal curves is quite large, consisting of an infinite number of members. This makes the normal curve a general-purpose model, able to describe a large number of naturally occurring phenomena, from test scores to the size of the stars.
Similarity of Members of the Family of Normal Curves
All the members of the family of normal curves, although different, have a number of properties in common. These properties include: shape, symmetry, tails approaching but never touching the X-axis, and area under the curve.
All members of the family of normal curves share the same bell shape, given the X-axis is scaled properly. Most of the area under the curve falls in the middle. The tails of the distribution (ends) approach the X-axis but never touch, with very little of the area under them.
All members of the family of normal curves are bilaterally symmetrical. That is, if any normal curve were drawn on a two-dimensional surface (a piece of paper), cut out, and folded along its center line, the two sides would match exactly. Human beings are approximately bilaterally symmetrical, with a right and a left side.
All members of the family of normal curves have tails that approach, but never touch, the X-axis. The implication of this property is that no matter how far one travels along the number line, in either the positive or negative direction, there will still be some area under any normal curve. Thus, in order to draw the entire normal curve one must have an infinitely long line. Because most of the area under any normal curve falls within a limited range of the number line, only that part of the line segment is drawn for a particular normal curve.
All members of the family of normal curves have a total area of one (1.00) under the curve, as do all probability models or models of frequency distributions. This property, in addition to the property of symmetry, implies that the area in each half of the distribution is .5, or one half.
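These two properties can be checked numerically. The sketch below approximates the area under a normal curve with the trapezoid rule, using NormalDist from Python's standard statistics module. The helper name area_under_curve, the example mean of 80 and standard deviation of 5, and the cutoff of eight standard deviations are assumptions made for illustration.

```python
from statistics import NormalDist


def area_under_curve(dist, lower, upper, steps=10_000):
    """Approximate the area under dist's curve between lower and upper (trapezoid rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (dist.pdf(lower) + dist.pdf(upper))
    for i in range(1, steps):
        total += dist.pdf(lower + i * width)
    return total * width


dist = NormalDist(mu=80, sigma=5)
# Eight standard deviations on each side of the mean capture essentially all of the area.
total_area = area_under_curve(dist, 80 - 8 * 5, 80 + 8 * 5)
left_half = area_under_curve(dist, 80 - 8 * 5, 80)
print(total_area, left_half)
```

The two printed values should come out very close to 1 and .5. They are not exactly those values, because the tails extend past any finite cutoff and the trapezoid rule is itself an approximation.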
Standard Normal Distribution
The standard normal distribution N(0,1) is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions N(m,σ) can be transformed to standard normal distributions by the procedure called standardization (sometimes loosely called normalization), i.e. by using the transformation:
$$ z = \frac{X - m}{\sigma} $$
where X is a score from the original normal distribution, m is the mean of the original normal distribution, and σ is the standard deviation of the original normal distribution. The standard normal distribution is sometimes called the z distribution. A z score always reflects the number of standard deviations a particular score is above or below the mean. For instance, if one scored a 70 on a test with a mean of 50 and a standard deviation of 10, then one scored 2 standard deviations above the mean. Converting the test scores to z scores, an X of 70 would be:
$$ z = \frac{70 - 50}{10} = 2 $$
So, a z score of 2 means the original score was 2 standard deviations above the mean. Note that the z distribution will only be a normal distribution if the original distribution (X) is normal.
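The conversion from a raw score to a z score is a one-line calculation. A minimal sketch follows; the function name z_score and the parameter names mean and sd are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


# The example from the text: a score of 70 on a test with mean 50 and standard deviation 10.
print(z_score(70, mean=50, sd=10))  # prints 2.0
```

Note that this only rescales the scores. It does not change the shape of the distribution, which is why the z scores are normally distributed only if the original scores are.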
What's so important about the normal distribution?
One reason the normal distribution is important is that many psychological and educational variables are distributed approximately normally. Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.
A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Fortunately, these tests work very well even if the distribution is only approximately normal. Some tests work well even with very wide deviations from normality.
Finally, if the mean and standard deviation of a normal distribution are known, it is easy to convert back and forth from raw scores to percentiles.
Converting between scores from a normal distribution and percentile ranks
If the mean and standard deviation of a normal distribution are known, it is relatively easy to figure out the percentile rank of a person obtaining a specific score.
Example 1
Assume scores on a test in Introductory Psychology are normally distributed with a mean of 80 and a standard deviation of 5. What is the percentile rank of a person who received a score of 70 on the test?
Mathematical statisticians have developed ways of determining the proportion of a distribution that is below a given number of standard deviations from the mean. A score of 70 is two standard deviations below the mean: the mean is 80, so 70 is 10 points below it (80-70 = 10), and since a standard deviation is 5 points, the distance between 80 and 70 is 2 standard deviations (10/5 = 2). It has been shown that only 2.3% of a normally distributed population will be less than or equal to a score two standard deviations below the mean.
In terms of the Introductory Psychology test example, this means that a person scoring 70 would have a percentile rank of 2.3.
This graph shows the distribution of scores on the test. The shaded area is 2.3% of the total area. The proportion of the area below 70 is equal to the proportion of the scores below 70.
What about a person scoring 75 on the test? The proportion of the area below 75 is the same as the proportion of scores below 75.
A score of 75 is one standard deviation below the mean because the mean is 80 and the standard deviation is 5. Mathematical statisticians have determined that 15.9% of the scores in a normal distribution are lower than a score one standard deviation below the mean. Therefore, the proportion of the scores below 75 is 0.159 and a person scoring 75 would have a percentile rank score of 15.9.
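The percentile ranks above can be reproduced without a printed table. The sketch below uses NormalDist from Python's standard statistics module, whose cdf method gives the proportion of the distribution at or below a value. The function name percentile_rank is an illustrative choice.

```python
from statistics import NormalDist


def percentile_rank(score, mean, sd):
    """Percentage of a normal distribution that falls at or below score."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(score)


# The Introductory Psychology test: mean 80, standard deviation 5.
for score in (70, 75):
    print(score, round(percentile_rank(score, mean=80, sd=5), 1))
# prints:
# 70 2.3
# 75 15.9
```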
Table 1 gives the proportion of the scores below various values of z. z is computed with the formula:
$$ z = \frac{X - m}{\sigma} $$
where z is the number of standard deviations (σ) that X is above the mean (m).
Table 1
When z is negative, it means that X is below the mean. Thus, a z of -2 means that X is -2 standard deviations above the mean, which is the same thing as being +2 standard deviations below the mean.
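A small table of this kind can be generated rather than looked up. The sketch below is an illustration only: the z values listed are an assumption and are not necessarily the rows of Table 1. It uses NormalDist from Python's standard statistics module, and the names STANDARD_NORMAL and z_table are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_table(z_values):
    """Return (z, proportion of scores below z) pairs for the standard normal distribution."""
    return [(z, STANDARD_NORMAL.cdf(z)) for z in z_values]


rows = z_table([-3, -2, -1, 0, 1, 2, 3])
print(" z    proportion below")
for z, proportion in rows:
    print(f"{z:+d}    {proportion:.4f}")
```

Because the curve is symmetric, the proportion below -z equals one minus the proportion below +z. This is why some printed tables list only positive values of z.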
To take another example, what is the percentile rank of a person receiving a score of 90 on the test?
The graph shows that most people scored below 90. Since 90 is 2 standard deviations above the mean [z = (90 - 80)/5 = 2], it can be determined from the table that a z score of 2 is equivalent to the 97.7th percentile: the proportion of people scoring below 90 is thus .977.
Table 2
What score on the Introductory Psychology test would it have taken to be in the 75th percentile? (Remember the test has a mean of 80 and a standard deviation of 5.) The answer is computed by reversing the steps in the previous problems.
First, determine how many standard deviations above the mean one would have to be to be in the 75th percentile. This can be found by using a z table and finding the z associated with .75. The value of z is .674. Thus, one must be .674 standard deviations above the mean to be in the 75th percentile.
Since the standard deviation is 5, one must be (5)(.674) = 3.37 points above the mean. Since the mean is 80, a score of 80 + 3.37 = 83.37 is necessary. Rounding off, a score of 83 is needed to be in the 75th percentile.
Since z = (X - m)/σ, a little algebra demonstrates that X = m + zσ. For the present example, X = 80 + (.674)(5) = 83.37, as just shown.
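The reverse lookup, from a percentile to a raw score, can be sketched in the same way. The inv_cdf method of NormalDist (Python's standard statistics module) returns the z associated with a given proportion, and X = m + zσ converts it back to a raw score. The function name score_at_percentile is illustrative.

```python
from statistics import NormalDist


def score_at_percentile(percentile, mean, sd):
    """Raw score below which `percentile` percent of a normal distribution falls."""
    z = NormalDist().inv_cdf(percentile / 100.0)  # z on the standard normal curve
    return mean + z * sd                          # X = m + z * sigma


# The 75th percentile on a test with mean 80 and standard deviation 5.
print(round(score_at_percentile(75, mean=80, sd=5), 2))  # prints 83.37
```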
Area under a portion of the normal curve
If the scores from a test are normally distributed with a mean of 60 and a standard deviation of 10, what proportion of the scores are above 85?
This problem is very similar to figuring out the percentile rank of a person scoring 85. The first step is to figure out the proportion of scores less than or equal to 85. This is done by figuring out how many standard deviations above the mean 85 is. Since 85 is 85-60 = 25 points above the mean and since the standard deviation is 10, a score of 85 is 25/10 = 2.5 standard deviations above the mean. Or, in terms of the formula,
$$ z = \frac{85 - 60}{10} = 2.5 $$
A z table (see Table 2) can be used to determine that .9938 of the scores are less than or equal to a score 2.5 standard deviations above the mean. It follows that only 1-.9938 = .0062 of the scores are above a score 2.5 standard deviations above the mean. Therefore, only .0062 of the scores are above 85.
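The same upper-tail calculation can be written as a short function: one minus the proportion at or below the score. This is a sketch using NormalDist from Python's standard statistics module; the function name proportion_above is illustrative.

```python
from statistics import NormalDist


def proportion_above(score, mean, sd):
    """Proportion of a normal distribution that lies above score."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(score)


# Scores above 85 on a test with mean 60 and standard deviation 10.
print(round(proportion_above(85, mean=60, sd=10), 4))  # prints 0.0062
```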
Example 2
Suppose you wanted to know the proportion of students receiving scores between 70 and 80 on the same test (mean of 60, standard deviation of 10). The approach is to figure out the proportion of students scoring below 80 and the proportion below 70. The difference between the two proportions is the proportion scoring between 70 and 80. First, calculate the proportion below 80. Since 80 is 20 points above the mean and the standard deviation is 10, 80 is 2 standard deviations above the mean: z = (80-60)/10 = 2.
A z table can be used to determine that .9772 of the scores are below a score 2 standard deviations above the mean.
To calculate the proportion below 70, z = (70-60)/10 = 1.
A z table can be used to determine that the proportion of scores less than 1 standard deviation above the mean is .8413. The proportion between 70 and 80 is therefore .9772 - .8413 = .1359. Equivalently, if .1587 of the scores are above 70 and .0228 are above 80, then .1587 - .0228 = .1359 are between 70 and 80.
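The "subtract one proportion from the other" approach generalizes to any pair of scores. A minimal sketch, again assuming NormalDist from Python's standard statistics module; the function name proportion_between is illustrative.

```python
from statistics import NormalDist


def proportion_between(low, high, mean, sd):
    """Proportion of a normal distribution that lies between low and high."""
    if low > high:
        raise ValueError("low must not exceed high")
    dist = NormalDist(mu=mean, sigma=sd)
    # Proportion below the higher score minus proportion below the lower score.
    return dist.cdf(high) - dist.cdf(low)


# Scores between 70 and 80 on a test with mean 60 and standard deviation 10.
print(round(proportion_between(70, 80, mean=60, sd=10), 4))  # prints 0.1359
```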
Example 3
Assume a test is normally distributed with a mean of 100 and a standard deviation of 15. What proportion of the scores would be between 85 and 105?
The solution to this problem is similar to the solution to the last one. The first step is to calculate the proportion of scores below 85. Next, calculate the proportion of scores below 105. Finally, subtract the first result from the second to find the proportion scoring between 85 and 105.
Begin by calculating the proportion below 85. 85 is one standard deviation below the mean: z=(85-100)/15=-1
Using a z table with the value of -1 for z, the area below -1 (or 85 in terms of the raw scores) is .1587.
Doing the same thing for 105, z = (105-100)/15 = 0.333.
A z table shows that the proportion scoring below a z of .333 (105 in raw scores) is .6306.
The difference is .6306 - .1587 = .4719. So .4719 of the scores are between 85 and 105.

HOMEWORK
1. If scores are normally distributed with a mean of 30 and a standard deviation of 5, what percent of the scores is: (a) greater than 30? (b) greater than 37? (c) between 28 and 34?
2. (a) What are the mean and standard deviation of the standard normal distribution? (b) What would be the mean and standard deviation of a distribution created by multiplying the standard normal distribution by 10 and then adding 50?
3. The normal distribution is defined by two parameters. What are they?
4. (a) What proportion of a normal distribution is within one standard deviation of the mean? (b) What proportion is more than 1.8 standard deviations from the mean? (c) What proportion is between 1 and 1.5 standard deviations above the mean?
5. A test is normally distributed with a mean of 40 and a standard deviation of 7. (a) What score would be needed to be in the 85th percentile? (b) What score would be needed to be in the 22nd percentile?