Lesson 3 - Mean and Standard Deviation
Frequently an experiment is performed several times under (assumed) identical conditions, producing what is referred to as a data set: several values that purport to measure the same thing. In this lesson we look at ways of characterizing a data set. Perhaps the most common way is to give the mean and the standard deviation of the data.
The mean is simply the arithmetic average of the data. To calculate it, add up all of the values in your data set and divide by the number of elements in the set. For example, suppose you run the 100 meter dash on eight different occasions, under identical conditions, and your times are 12.23, 12.14, 12.45, 12.18, 12.26, 12.51, 12.47 and 12.36 seconds. You would find the mean by adding these eight numbers and dividing by eight. Do this calculation, either by hand or using your calculator, and write down the value you obtain for the mean. You will submit this value to your laboratory instructor later. We can represent the mean mathematically by the formula
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
where $\bar{x}$ (read "x bar") is the mean, $x_i$ represents the individual data points, and $n$ is the number of points.
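As a check on a hand calculation, the definition of the mean can be written out in a few lines of Python. This is only a sketch; the variable names (`times`, `n`, `mean`) are illustrative and are not part of the lesson.

```python
# Times (in seconds) for the eight 100 meter dashes listed in the lesson.
times = [12.23, 12.14, 12.45, 12.18, 12.26, 12.51, 12.47, 12.36]

# Mean: add up all of the values and divide by the number of values, n.
n = len(times)
mean = sum(times) / n

print(f"n = {n}, mean = {mean:.3f} s")
```

Compare the printed value with the one you obtained by hand before reporting it.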
Most values in any data set are not equal to the mean, and we can define the deviation of any value from the mean as
$$d_i = x_i - \bar{x}$$
Calculate the deviation of each of the data points given above from the mean you calculated, and record the results for reporting to your laboratory instructor.
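The deviation of each point from the mean can be tabulated the same way. The sketch below assumes the eight dash times from the example; the names `times`, `mean` and `deviations` are illustrative only.

```python
# Times (in seconds) for the eight 100 meter dashes listed in the lesson.
times = [12.23, 12.14, 12.45, 12.18, 12.26, 12.51, 12.47, 12.36]
mean = sum(times) / len(times)

# Deviation of each value from the mean: value minus mean (sign is kept).
deviations = [x - mean for x in times]

for x, d in zip(times, deviations):
    print(f"{x:.2f} s   deviation = {d:+.3f} s")

# The signed deviations cancel: their sum is zero apart from rounding error.
total = sum(deviations)
print(f"sum of deviations = {total:.1e}")
```

Because the signed deviations always cancel, their plain average is not a useful measure of spread; the absolute values or the squares are used instead.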
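Occasionally you will encounter the average deviation, which is the mean of the absolute values of the deviations, given by the expression
$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|$$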
Calculate and report the average deviation for the data above. Although the average deviation is easy to calculate, it is seldom used in statistical analysis. The form of the deviation that is used in statistical work, and which we will use in the next lesson to predict the form of the distribution curve for a set of data, is the standard deviation.
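A minimal sketch of the average deviation calculation follows, again using the eight dash times. It assumes the average deviation is the mean of the absolute deviations; the name `average_deviation` is illustrative.

```python
# Times (in seconds) for the eight 100 meter dashes listed in the lesson.
times = [12.23, 12.14, 12.45, 12.18, 12.26, 12.51, 12.47, 12.36]
n = len(times)
mean = sum(times) / n

# Average deviation: mean of the absolute deviations from the mean.
average_deviation = sum(abs(x - mean) for x in times) / n

print(f"average deviation = {average_deviation:.4f} s")
```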
You will encounter two similar, but slightly different, formulas for the standard deviation. The first one,
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
is known as the standard deviation of the population, and refers to the case when n is very large (or infinite). For a small data set, such as the ones you usually have in an experimental science, the standard deviation is approximated by what is referred to as the standard deviation of the sample, usually represented by s to distinguish it from the population standard deviation σ:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
This is the form of the standard deviation that we will use. Calculate both expressions for your data set (and report them, naturally). What will happen to the difference between the two expressions as n increases? Why?
Now let's perform the calculations you have done above on a larger data set. When you click on the button labeled go, you will proceed to an Excel spreadsheet which shows the masses of a number of pre-1985 U.S. pennies. You should characterize that data set by the mean and explore several ways of expressing the deviations from the mean. In doing so, you will use the computational capabilities of Excel, rather than calculating by hand. The spreadsheet will show the data only, and will have a number of other column headings. You need to fill out those other columns and to compute the mean, average deviation and standard deviation. If you lose track of where you are while working on the spreadsheet, you can always minimize it and look back at this page.
When you are finished with your calculations, record your values for the mean, average deviation, maximum deviation and standard deviation. You will enter these in the electronic form to send to your laboratory instructor. Before leaving this exercise, quit Excel, and answer no when asked if you wish to save any changes in the spreadsheet.
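Returning to the two standard deviation expressions for the 100 meter dash times, the sketch below computes each one directly from its definition and cross-checks the results against Python's standard `statistics` module (`pstdev` divides by n, `stdev` divides by n - 1). The variable names are illustrative, and the code is meant only as a check on hand or spreadsheet work.

```python
import math
import statistics

# Times (in seconds) for the eight 100 meter dashes listed in the lesson.
times = [12.23, 12.14, 12.45, 12.18, 12.26, 12.51, 12.47, 12.36]
n = len(times)
mean = sum(times) / n

# Sum of the squared deviations from the mean, shared by both formulas.
sum_sq = sum((x - mean) ** 2 for x in times)

# Population standard deviation: divide by n.
sigma = math.sqrt(sum_sq / n)

# Sample standard deviation: divide by n - 1.
s = math.sqrt(sum_sq / (n - 1))

print(f"population standard deviation = {sigma:.4f} s")
print(f"sample standard deviation     = {s:.4f} s")

# Cross-check against the standard library implementations.
print(f"statistics.pstdev = {statistics.pstdev(times):.4f} s")
print(f"statistics.stdev  = {statistics.stdev(times):.4f} s")

# The two expressions differ only by the factor sqrt(n / (n - 1)).
ratio = s / sigma
print(f"s / sigma = {ratio:.4f}")
```

Since the two formulas share the same numerator, their ratio depends only on n, which is a useful starting point for the question about what happens as n increases.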