Standard Deviation

    • Introduction
    • Definition of Standard Deviation
    • Formula for Standard Deviation
    • Calculating Standard Deviation
    • Calculating Standard Deviation of Ungrouped Data
    • Calculating Standard Deviation of Grouped Data (Discrete)
    • Calculating Standard Deviation of Grouped Data (Continuous)
    • Real-Life Applications of Standard Deviation
    • Practice Problems
    • Frequently Asked Questions
       

    Introduction

    Standard deviation is a core concept in statistics that measures the amount of variation or dispersion within a set of data. In simpler terms, it tells us how spread out the values are from the average. Often abbreviated as SD and represented by the symbol '`σ`', standard deviation is the non-negative square root of the variance.

    • A low standard deviation indicates that the values are close to the average (mean).
    • A high standard deviation suggests that the values deviate significantly from the mean. 

    The formula used to calculate standard deviation depends on the nature of the data: whether it is grouped or ungrouped, and whether it describes a population or a sample. The sections below work through each case.

     

    Definition of Standard Deviation

    Standard deviation represents the degree of dispersion or scatter of data points relative to their mean. It gauges how values are spread across a data sample, measuring the variation from the mean. 

    For a set of observations \(x_1, x_2, \ldots, x_n\) with mean \(\bar{x}\), the dispersion about the mean is assessed using the sum of squared deviations \(\sum_{i=1}^{n}(x_i - \bar{x})^2\).

    This sum of squares helps us assess the degree of dispersion or scatter. 

    • A small sum indicates that observations are close to the mean, signifying a lower degree of dispersion.
    • A large sum suggests a higher degree of scatter from the mean. 
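
    As a rough illustration of this idea, the following Python sketch compares the sum of squared deviations for two data sets that share the same mean. Both data sets and the helper name `sum_of_squared_deviations` are hypothetical and chosen only for illustration.

```python
def sum_of_squared_deviations(data):
    """Return the sum of (x_i - mean)^2 over all observations."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


# Two hypothetical data sets, both with mean 10.
tight = [9, 10, 11]    # values close to the mean
spread = [0, 10, 20]   # values far from the mean

print(sum_of_squared_deviations(tight))   # 2.0
print(sum_of_squared_deviations(spread))  # 200.0
```

    The second data set has the same mean but a far larger sum, reflecting its greater scatter.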
       

    Formula for Standard Deviation

    The standard deviation formulas differ slightly when dealing with a sample versus a population. Here are the standard deviation formulas for both:

    Population Standard Deviation `(σ)`:

    \[ \sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}} \]

    In this formula:

    • \( \sigma \) represents the population standard deviation.
    • \( N \) is the total number of data points in the population.
    • \( x_i \) denotes each individual data point.
    • \( \mu \) is the mean of the population.

    Sample Standard Deviation `(s)`:

    \[ s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \]

    In this formula:

    • \( s \) represents the sample standard deviation.
    • \( n \) is the total number of data points in the sample.
    • \( x_i \) denotes each individual data point.
    • \( \bar{x} \) is the mean of the sample.

    Note the difference in the denominator:

    • For the population standard deviation, the sum of squared deviations is divided by the total number of data points (\(N\)).
    • For the sample standard deviation, it is divided by one less than the number of data points (\(n-1\)).

    Dividing by \(n-1\) instead of \(n\) is known as Bessel's correction. It makes the sample variance \(s^2\) an unbiased estimator of the population variance \(\sigma^2\). The sample standard deviation \(s\) itself is still slightly biased as an estimate of \(\sigma\), but less so than it would be without the correction.
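
    The effect of the two denominators can be seen with Python's standard `statistics` module, where `pstdev` divides by \(N\) and `stdev` divides by \(n-1\). This is a minimal sketch; the data values are hypothetical and chosen so the arithmetic is easy to check by hand.

```python
import statistics

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: divides by N.
sigma = statistics.pstdev(data)

# Sample standard deviation: divides by n - 1 (Bessel's correction).
s = statistics.stdev(data)

print(f"population SD = {sigma:.4f}")  # 2.0000
print(f"sample SD     = {s:.4f}")      # 2.1381
```

    For the same data, the sample value is always the larger of the two, and the gap shrinks as the number of data points grows.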
     

    Calculating Standard Deviation

    Calculating the standard deviation of a data set involves the following steps:

    1. Find the mean (average) of the data set, denoted by \(\bar{x}\).
    2. Subtract the mean from each data point to determine the deviation of each value from the mean.
    3. Square each deviation to eliminate negative values and emphasize larger differences.
    4. Calculate the average of these squared deviations, known as the variance (divide by \(N\) for a population, or by \(n-1\) for a sample).
    5. Take the square root of the variance to obtain the standard deviation.
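
    The five steps can be sketched as a small Python function. This is one possible way to write it, for the population case; the function name `population_std_dev` and the input values are hypothetical.

```python
import math


def population_std_dev(data):
    """Population standard deviation, computed step by step."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one value")

    mean = sum(data) / n                       # Step 1: mean
    deviations = [x - mean for x in data]      # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]     # Step 3: squared deviations
    variance = sum(squared) / n                # Step 4: variance (divide by N)
    return math.sqrt(variance)                 # Step 5: square root


print(population_std_dev([1, 2, 3, 4, 5]))  # sqrt(2), about 1.414
```

    For a sample rather than a population, Step 4 would divide by `n - 1` instead of `n`.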
       

    Calculating Standard Deviation of Ungrouped Data

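    Let's consider a simple example to demonstrate how to calculate standard deviation. Suppose we have a set of exam scores for a class of five students: `75, 82, 90, 88,` and `79`. Because these five students make up the whole class, we treat the data as a population and divide by \(N = 5\).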

    Step `1`: Find the mean (average)

    \( \bar{x} = \frac{75 + 82 + 90 + 88 + 79}{5} = 82.8 \)

    Step `2`: Calculate deviations from the mean

    \( (75 - 82.8) = -7.8 \)
    \( (82 - 82.8) = -0.8 \)
    \( (90 - 82.8) = 7.2 \)
    \( (88 - 82.8) = 5.2 \)
    \( (79 - 82.8) = -3.8 \)

    Step `3`: Square each deviation

    \( (-7.8)^2 = 60.84 \)
    \( (-0.8)^2 = 0.64 \)
    \( (7.2)^2 = 51.84 \)
    \( (5.2)^2 = 27.04 \)
    \( (-3.8)^2 = 14.44 \)

    Step `4`: Calculate variance

    \( \text{Variance} = \frac{60.84 + 0.64 + 51.84 + 27.04 + 14.44}{5} = 30.96 \)

    Step `5`: Take the square root to find the standard deviation

    \( \sigma = \sqrt{30.96} \approx 5.56 \)

    Therefore, the standard deviation of the exam scores is approximately `5.56`. This value indicates the extent of variation or spread in the students' performance: a typical score lies roughly `5.6` points from the mean of `82.8`.
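
    The arithmetic in this worked example can be checked with a few lines of Python. This is only a verification sketch; the variable names are arbitrary.

```python
import math

scores = [75, 82, 90, 88, 79]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]
variance = sum(squared_deviations) / len(scores)  # population variance
std_dev = math.sqrt(variance)

print(f"mean     = {mean:.1f}")      # 82.8
print(f"variance = {variance:.2f}")  # 30.96
print(f"std dev  = {std_dev:.2f}")   # 5.56
```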
     

    Calculating Standard Deviation of Grouped Data (Discrete)

    Let's consider the following discrete grouped data set and calculate its standard deviation. Each value \(x_i\) occurs with frequency \(f_i\):

    \(x_i\): `5, 10, 15, 20`
    \(f_i\): `3, 5, 7, 4`

    Solution:

    Step `1`. Calculate the mean (\(\bar{x}\)):

    Use the data points and frequencies to find the mean of the data set.
    \( \bar{x} = \frac{\sum_{i=1}^{n} (x_i \cdot f_i)}{\sum_{i=1}^{n} f_i} \)
    \( \bar{x} = \frac{(5 \cdot 3) + (10 \cdot 5) + (15 \cdot 7) + (20 \cdot 4)}{3 + 5 + 7 + 4} \)
    \( \bar{x} = \frac{15 + 50 + 105 + 80}{19} \)
    \( \bar{x} = \frac{250}{19} \)
    \( \bar{x} \approx 13.158 \)

     

    Step `2`. Calculate the deviations

    Find the deviation of each data point from the mean and square the result. Using the unrounded mean \(\bar{x} = 250/19\) to avoid rounding error:
    \( (x_i - \bar{x})^2 \)
    \( (5 - \bar{x})^2 \approx 66.551 \)
    \( (10 - \bar{x})^2 \approx 9.972 \)
    \( (15 - \bar{x})^2 \approx 3.393 \)
    \( (20 - \bar{x})^2 \approx 46.814 \)

    Step `3`. Calculate the variance (\(\sigma^2\)):

    Formula for variance, where \(N = \sum f_i = 19\) is the total frequency:
    \( \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot f_i}{N} \)

    Find the variance by taking the frequency-weighted average of the squared deviations.
    \( \sigma^2 \approx \frac{(66.551 \cdot 3) + (9.972 \cdot 5) + (3.393 \cdot 7) + (46.814 \cdot 4)}{19} \)
    \( \sigma^2 \approx \frac{199.653 + 49.860 + 23.751 + 187.256}{19} \)
    \( \sigma^2 \approx \frac{460.520}{19} \)
    \( \sigma^2 \approx 24.24 \)

     

    Step `4`. Calculate the standard deviation (\(\sigma\)):

    Finally, take the square root of the variance to get the standard deviation.
    \( \sigma = \sqrt{24.24} \)
    \( \sigma \approx 4.92 \)

    So, the standard deviation of the given discrete grouped data set is approximately \(4.92\).
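
    The same calculation can be sketched in Python by weighting each value by its frequency. The function name `grouped_std_dev` is hypothetical; the values and frequencies are the ones used in the calculation above.

```python
import math


def grouped_std_dev(values, freqs):
    """Mean and population SD of data given as values with frequencies."""
    total = sum(freqs)  # N = sum of f_i
    mean = sum(x * f for x, f in zip(values, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total
    return mean, math.sqrt(variance)


values = [5, 10, 15, 20]
freqs = [3, 5, 7, 4]

mean, sigma = grouped_std_dev(values, freqs)
print(f"mean = {mean:.3f}")   # 13.158
print(f"SD   = {sigma:.2f}")  # 4.92
```

    Because the code keeps the mean unrounded, its intermediate values can differ in the last decimal place from a hand calculation that rounds at each step.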

     

    Calculating Standard Deviation of Grouped Data (Continuous)

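    Let's consider a grouped data set with four class intervals and calculate its standard deviation. The class midpoints \(x_i\) and frequencies \(f_i\) are:

    \(x_i\): `15, 25, 35, 45`
    \(f_i\): `5, 8, 12, 10`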

    Solution:

    Step `1`. Find the midpoints

    Find the midpoint (\(x_i\)) for each class interval. It is calculated by adding the lower and upper limits of each class interval and dividing by `2`.

    Step `2`. Calculate the mean (\(\bar{x}\)):

    Use the midpoint values and frequencies to find the mean of the data set.

    \( \bar{x} = \frac{\sum_{i=1}^{n} (x_i \cdot f_i)}{\sum_{i=1}^{n} f_i} \)

    \( \bar{x} = \frac{(15 \cdot 5) + (25 \cdot 8) + (35 \cdot 12) + (45 \cdot 10)}{5 + 8 + 12 + 10} \)

    \( \bar{x} = \frac{75 + 200 + 420 + 450}{35} \)

    \( \bar{x} = \frac{1145}{35} \)

    \( \bar{x} \approx 32.714 \)

     

    Step `3`. Calculate the deviations

    Find the deviation of each midpoint from the mean and square the result. Using the unrounded mean \(\bar{x} = 1145/35\) to avoid rounding error:
    \( (x_i - \bar{x})^2 \)
    \( (15 - \bar{x})^2 \approx 313.796 \)
    \( (25 - \bar{x})^2 \approx 59.510 \)
    \( (35 - \bar{x})^2 \approx 5.224 \)
    \( (45 - \bar{x})^2 \approx 150.939 \)

    Step `4`. Calculate the variance (\(\sigma^2\)):

    Find the variance by taking the frequency-weighted average of the squared deviations, where \(N = \sum f_i = 35\) is the total frequency.

    \( \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot f_i}{N} \)

    \( \sigma^2 \approx \frac{(313.796 \cdot 5) + (59.510 \cdot 8) + (5.224 \cdot 12) + (150.939 \cdot 10)}{35} \)

    \( \sigma^2 \approx \frac{1568.980 + 476.080 + 62.688 + 1509.390}{35} \)

    \( \sigma^2 \approx \frac{3617.138}{35} \)

    \( \sigma^2 \approx 103.35 \)

     

    Step `5`. Calculate the standard deviation (\(\sigma\)):

    Finally, take the square root of the variance to get the standard deviation.

    \( \sigma = \sqrt{103.35} \)

    \( \sigma \approx 10.166 \)

    So, the standard deviation of the given grouped data set is approximately \(10.166\).
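
    For continuous grouped data, the only extra step is deriving the midpoints. The Python sketch below shows this under one loudly stated assumption: the class intervals `10-20`, `20-30`, `30-40`, and `40-50` are hypothetical, chosen only so that their midpoints match the values `15, 25, 35, 45` used in the calculation above.

```python
import math

# Hypothetical class intervals (lower, upper), chosen only so that their
# midpoints match the values 15, 25, 35, 45 used in the worked example.
intervals = [(10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 10]

# Step 1: midpoint of each class interval.
midpoints = [(lo + hi) / 2 for lo, hi in intervals]

# Step 2: frequency-weighted mean of the midpoints.
total = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / total

# Steps 3-5: weighted squared deviations, variance, then square root.
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / total
sigma = math.sqrt(variance)

print(midpoints)              # [15.0, 25.0, 35.0, 45.0]
print(f"mean = {mean:.3f}")   # 32.714
print(f"SD   = {sigma:.3f}")
```

    Note that replacing every observation in a class by the class midpoint makes the result an approximation of the standard deviation of the underlying raw data.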

     

    Real-Life Applications of Standard Deviation

    Standard deviation is a vital statistical measure that can give us several insights.

    1. Many investment firms use standard deviation to understand how much a fund’s performance deviates from its expected return. This deviation allows them to measure the risk associated with market securities. It also helps them forecast the future performance of market funds.
       
    2. Advertising companies calculate the standard deviation of the revenue earned for each ad. This helps them understand the fluctuation they can expect for a given ad.
       
    3. In human resources, recruiting managers calculate the standard deviation of pay in a certain field. This helps them determine the range of salaries they may need to offer new hires.
       
    4. Standard deviation is also widely used in weather forecasting. Meteorologists use it to study the variation in daily and monthly temperatures in different cities.
       
    5. Standard deviation is also widely used in the healthcare industry. Insurance analysts calculate the standard deviation of the ages of the patients they insure, which shows how much the ages of those individuals vary.
       
    6. Standard deviation is a metric used often by real estate agents. Real estate agents calculate the standard deviation of house prices in a particular area. This informs their clients of the type of variation in house prices they can expect.
       
    7. Professors at universities use standard deviation to calculate the spread of test scores among students. This helps them understand whether most students score close to the average or have a wide spread in test scores.
       
    8. As in academics, standard deviation is also applied to sports scores. This helps a coach assess the consistency or variability of an athlete’s or team’s performances.

    Practice Problems

    Q`1`. Calculate the population standard deviation of the following data, rounded to the nearest hundredth.

    `68, 72, 74, 80, 82`

    a. `3.51`
    b. `4.95`
    c. `5.15`
    d. `5.76`

    Answer: c

     

    Q`2`. Calculate the population standard deviation of the following data, rounded to the nearest hundredth.

    `15, 18, 19, 20, 22, 22, 25, 26, 28, 30`

    a. `4.48`
    b. `4.95`
    c. `4.72`
    d. `5.16`

    Answer: a

     

    Q`3`. Calculate the standard deviation rounded to the nearest hundredth.

    a. `2.48`
    b. `2.45`
    c. `2.67`
    d. `2.60`

    Answer: d

     

    Q`4`. Calculate the standard deviation rounded to the nearest hundredth.

    a. `5.20`
    b. `6.02`
    c. `5.82`
    d. `6.20`

    Answer: b

     

    Frequently Asked Questions

    Q`1`. What is standard deviation and why is it important in statistics?

    Answer: Standard deviation is a measure of the amount of variation or dispersion in a set of values. It indicates how much individual data points differ from the mean of the data set. It is crucial in statistics as it helps assess the spread of data, identify outliers, and compare the variability between different datasets.

     

    Q`2`. How is standard deviation different from variance?

    Answer: Standard deviation and variance both measure the dispersion of data, but they differ in their units. Standard deviation is the square root of variance. While variance is expressed in squared units, standard deviation returns to the original units of the data, making it more interpretable and user-friendly.

     

    Q`3`. What does a high or low standard deviation indicate about a dataset?

    Answer: A high standard deviation indicates that data points in a dataset are spread out over a larger range, suggesting higher variability and less consistent data.

    Conversely, a low standard deviation implies that data points are clustered closely around the mean, indicating lower variability and a more consistent dataset.

     

    Q`4`. Can Standard Deviation be negative?

    Answer: No, standard deviation cannot be negative. It is the non-negative square root of the variance, and the variance is an average of squared differences between the data points and the mean, so it can never be below zero. The standard deviation equals zero only when every data point has the same value; otherwise it is positive.