Standard deviation is a crucial concept in statistics that helps us measure the amount of variation or dispersion within a set of data. In simpler terms, it tells us how spread out the values are from the average. Often abbreviated as SD and represented by the symbol '`σ`', standard deviation is essentially the positive square root of the variance.
The formula used to calculate standard deviation depends on the nature of the data: whether it is grouped or ungrouped. The sections below walk through each formula and show how standard deviation is calculated in each scenario.
Standard deviation represents the degree of dispersion or scatter of data points relative to their mean. It gauges how values are spread across a data sample, measuring the variation from the mean.
For a set of observations \(x_1, x_2, \ldots, x_n\) with mean \(\bar{x}\), dispersion about the mean is measured using the sum of squared deviations \(\sum_{i=1}^{n}(x_i - \bar{x})^2\).
This sum of squares helps us assess the degree of dispersion or scatter.
The standard deviation formulas differ slightly when dealing with a sample versus a population. Here are the standard deviation formulas for both:
Population Standard Deviation `(σ)`:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}} \]
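In this formula, \(\sigma\) is the population standard deviation, \(N\) is the number of values in the population, \(x_i\) is each individual value, and \(\mu\) is the population mean.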
Sample Standard Deviation `(s)`:
\[ s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \]
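In this formula, \(s\) is the sample standard deviation, \(n\) is the number of values in the sample, \(x_i\) is each individual value, and \(\bar{x}\) is the sample mean.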
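Note the difference in the denominator: the population formula divides by \(N\), while the sample formula divides by \(n-1\).
This correction (dividing by \(n-1\) instead of \(n\)) in the Sample Standard Deviation formula is known as Bessel's correction. It makes the sample variance \(s^2\) an unbiased estimate of the population variance, and \(s\) is then used as the estimate of the Population Standard Deviation.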
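As a rough illustration of the two denominators, the sketch below implements both formulas directly and compares them with Python's standard `statistics` module, where `pstdev` divides by \(N\) and `stdev` divides by \(n-1\). The data values and the helper names `population_sd` and `sample_sd` are made up for this example.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

def population_sd(values):
    """Square root of the mean squared deviation (divide by N)."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_sd(values):
    """Same sum of squares, divided by n - 1 (Bessel's correction)."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))

pop = population_sd(data)
samp = sample_sd(data)

print("population SD:", pop, statistics.pstdev(data))
print("sample SD:    ", samp, statistics.stdev(data))
```

For the same data, the sample value is always slightly larger than the population value, and the gap shrinks as the number of observations grows.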
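The standard deviation formula is a mathematical expression used to calculate the standard deviation of a set of data. It involves several steps: find the mean, subtract the mean from each value, square each deviation, average the squared deviations to get the variance, and take the square root of the variance.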
Let's consider a simple example to demonstrate how to calculate standard deviation. Suppose we have a set of exam scores for a class of five students: `75, 82, 90, 88,` and `79`.
Step `1`: Find the mean (average)
\( \bar{x} = \frac{75 + 82 + 90 + 88 + 79}{5} = 82.8 \)
Step `2`: Calculate deviations from the mean
\( (75 - 82.8) = -7.8 \)
\( (82 - 82.8) = -0.8 \)
\( (90 - 82.8) = 7.2 \)
\( (88 - 82.8) = 5.2 \)
\( (79 - 82.8) = -3.8 \)
Step `3`: Square each deviation
\( (-7.8)^2 = 60.84 \)
\( (-0.8)^2 = 0.64 \)
\( (7.2)^2 = 51.84 \)
\( (5.2)^2 = 27.04 \)
\( (-3.8)^2 = 14.44 \)
Step `4`: Calculate variance
\( \text{Variance} = \frac{60.84 + 0.64 + 51.84 + 27.04 + 14.44}{5} = 30.96 \)
Step `5`: Take the square root to find standard deviation
\( \sigma = \sqrt{30.96} \approx 5.56 \)
Therefore, the standard deviation of the exam scores is approximately `5.56`. Because the variance was divided by `5`, the number of scores, this is the population standard deviation. The value indicates the extent of variation or spread in the students' performance, helping to assess the data's overall consistency.
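The five steps above can be sketched in a few lines of Python. This is only a minimal illustration that treats the five scores as a whole population (dividing by \(N\)); the variable names are arbitrary.

```python
import math

scores = [75, 82, 90, 88, 79]

mean = sum(scores) / len(scores)          # Step 1: mean
deviations = [x - mean for x in scores]   # Step 2: deviations from the mean
squared = [d ** 2 for d in deviations]    # Step 3: squared deviations
variance = sum(squared) / len(scores)     # Step 4: population variance (divide by N)
sd = math.sqrt(variance)                  # Step 5: standard deviation

print(round(mean, 2), round(variance, 2), round(sd, 2))
# prints: 82.8 30.96 5.56
```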
Let's consider the following discrete grouped data set, with values \(x_i\) of `5, 10, 15, 20` and corresponding frequencies \(f_i\) of `3, 5, 7, 4`, and calculate its standard deviation.
Solution:
Step `1`. Calculate the mean (\(\bar{x}\)):
Use the data points and frequencies to find the mean of the data set.
\( \bar{x} = \frac{\sum_{i=1}^{n} (x_i \cdot f_i)}{\sum_{i=1}^{n} f_i} \)
\( \bar{x} = \frac{(5 \cdot 3) + (10 \cdot 5) + (15 \cdot 7) + (20 \cdot 4)}{3 + 5 + 7 + 4} \)
\( \bar{x} = \frac{15 + 50 + 105 + 80}{19} \)
\( \bar{x} = \frac{250}{19} \)
\( \bar{x} \approx 13.158 \)
Step `2`. Calculate the deviations
Find the deviation of each data point from the mean and square the result.
\( (x_i - \bar{x})^2 \)
Step `3`. Calculate the variance (\(\sigma^2\)):
Formula for variance:
\( \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot f_i}{N} \)
Find the variance by averaging the squared deviations.
\( \sigma^2 = \frac{(66.551 \cdot 3) + (9.972 \cdot 5) + (3.393 \cdot 7) + (46.814 \cdot 4)}{19} \)
\( \sigma^2 = \frac{199.653 + 49.860 + 23.751 + 187.256}{19} \)
\( \sigma^2 = \frac{460.520}{19} \)
\( \sigma^2 \approx 24.24 \)
Step `4`. Calculate the standard deviation (\(\sigma\)):
Finally, take the square root of the variance to get the standard deviation.
\( \sigma = \sqrt{24.24} \)
\( \sigma \approx 4.92 \)
So, the standard deviation of the given discrete grouped data set is approximately \(4.92\).
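The frequency-weighted calculation above can be checked with a short sketch. The function name `grouped_sd` is hypothetical, and the sketch assumes the population form of the formula (dividing by the total frequency \(N\)), as in the worked solution.

```python
import math

def grouped_sd(values, freqs):
    """Population standard deviation for values x_i with frequencies f_i."""
    n_total = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n_total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n_total
    return mean, variance, math.sqrt(variance)

values = [5, 10, 15, 20]
freqs = [3, 5, 7, 4]

mean, variance, sd = grouped_sd(values, freqs)
print(round(mean, 3), round(sd, 2))
```

Keeping the mean unrounded until the final step, as the code does, avoids small rounding drift in the squared deviations.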
Let's consider the following grouped data set and calculate its standard deviation.
Solution:
Step `1`. Find the midpoints
Find the midpoint (\(x_i\)) for each class interval. It is calculated by adding the lower and upper limits of each class interval and dividing by `2`. For this data set the midpoints are `15, 25, 35, 45`, with frequencies `5, 8, 12, 10`.
Step `2`. Calculate the mean (\(\bar{x}\)):
Use the midpoint values and frequencies to find the mean of the data set.
\( \bar{x} = \frac{\sum_{i=1}^{n} (x_i \cdot f_i)}{\sum_{i=1}^{n} f_i} \)
\( \bar{x} = \frac{(15 \cdot 5) + (25 \cdot 8) + (35 \cdot 12) + (45 \cdot 10)}{5 + 8 + 12 + 10} \)
\( \bar{x} = \frac{75 + 200 + 420 + 450}{35} \)
\( \bar{x} = \frac{1145}{35} \)
\( \bar{x} \approx 32.714 \)
Step `3`. Calculate the deviations
Find the deviation of each midpoint from the mean and square the result.
\( (x_i - \bar{x})^2 \)
Step `4`. Calculate the variance (\(\sigma^2\)):
Find the variance by averaging the squared deviations.
\( \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot f_i}{N} \)
\( \sigma^2 = \frac{(313.796 \cdot 5) + (59.510 \cdot 8) + (5.224 \cdot 12) + (150.939 \cdot 10)}{35} \)
\( \sigma^2 = \frac{1568.980 + 476.080 + 62.688 + 1509.390}{35} \)
\( \sigma^2 = \frac{3617.138}{35} \)
\( \sigma^2 \approx 103.35 \)
Step `5`. Calculate the standard deviation (\(\sigma\)):
Finally, take the square root of the variance to get the standard deviation.
\( \sigma = \sqrt{103.35} \)
\( \sigma \approx 10.166 \)
So, the standard deviation of the given grouped data set is approximately \(10.166\).
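For class-interval data, the same procedure can be sketched in Python with one extra step for the midpoints. The class limits below are an assumption, chosen only so that the midpoints come out as `15, 25, 35, 45`; the frequencies are the ones used in the calculation above.

```python
import math

# Assumed class limits (lower, upper); only the midpoints matter for the result.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 10]

# Step 1: midpoint of each class interval
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Step 2: frequency-weighted mean of the midpoints
n_total = sum(freqs)
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n_total

# Steps 3 and 4: frequency-weighted average of the squared deviations
variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / n_total

# Step 5: square root of the variance
sd = math.sqrt(variance)

print(midpoints, round(mean, 3), round(variance, 2), round(sd, 3))
```

Because every value in a class is replaced by its midpoint, the result is an approximation of the standard deviation of the underlying raw data.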
Standard deviation is a vital statistical measure that can give us several insights.
Q`1`. Calculate the standard deviation rounded to the nearest hundredth.
`68, 72, 74, 80, 82`
Answer: c
Q`2`. Calculate the standard deviation rounded to the nearest hundredth.
`15, 18, 19, 20, 22, 22, 25, 26, 28, 30`
Answer: a
Q`3`. Calculate the standard deviation rounded to the nearest hundredth.
Answer: d
Q`4`. Calculate the standard deviation rounded to the nearest hundredth.
Answer: b
Q`1`. What is standard deviation and why is it important in statistics?
Answer: Standard deviation is a measure of the amount of variation or dispersion in a set of values. It indicates how much individual data points differ from the mean of the data set. It is crucial in statistics as it helps assess the spread of data, identify outliers, and compare the variability between different datasets.
Q`2`. How is standard deviation different from variance?
Answer: Standard deviation and variance both measure the dispersion of data, but they differ in their units. Standard deviation is the square root of variance. While variance is expressed in squared units, standard deviation returns to the original units of the data, making it more interpretable and user-friendly.
Q`3`. What does a high or low standard deviation indicate about a dataset?
Answer: A high standard deviation indicates that data points in a dataset are spread out over a larger range, suggesting higher variability and inconsistent data.
Conversely, a low standard deviation implies that data points are clustered closely around the mean, indicating lower variability and a more consistent dataset.
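A small sketch can make the contrast concrete. The two data sets below are invented for illustration; they share the same mean but differ widely in spread, and `statistics.pstdev` from the Python standard library computes the population standard deviation.

```python
import statistics

# Hypothetical data sets with the same mean (50) but different spread.
clustered = [48, 49, 50, 51, 52]
scattered = [10, 30, 50, 70, 90]

sd_clustered = statistics.pstdev(clustered)
sd_scattered = statistics.pstdev(scattered)

print(statistics.mean(clustered), round(sd_clustered, 2))
print(statistics.mean(scattered), round(sd_scattered, 2))
```

The mean alone cannot tell the two sets apart; the standard deviation does.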
Q`4`. Can Standard Deviation be negative?
Answer: No, standard deviation cannot be negative. It is defined as the non-negative square root of the variance, and the variance is an average of squared differences between the data points and the mean, which can never be negative. The standard deviation is zero only when all values in the data set are identical; otherwise it is positive.