Geometric Distribution

    • Introduction
    • What is Geometric Distribution?
    • Geometric Distribution Formula
    • Geometric Random Variable
    • Expected Value of Geometric Distribution
    • Standard Deviation of a Geometric Distribution
    • Comparing Geometric Distribution With Binomial Distribution
    • Real-life Applications of Geometric Distribution
    • Solved Examples
    • Practice Problems
    • Frequently Asked Questions

     

    Introduction

    Geometric distribution is a discrete probability distribution that describes the number of trials needed to achieve the first success in a series of independent trials. Each trial is called a Bernoulli trial and has only two possible outcomes: success or failure. The probability of success is the same on every trial, and the trials are independent, so no trial is affected by past or future trials: a failure on one trial does not change the probability that the next trial is a success. The distribution has applications in areas such as quality control, customer support, and online marketing.

     

    What is Geometric Distribution?

    The geometric distribution gives the probability that the first success in a series of independent Bernoulli trials occurs on a given trial, that is, after a given number of consecutive failures. A Bernoulli trial is an experiment with two possible outcomes: success or failure.

    Understanding the geometric distribution and its applications can help in making informed decisions and analyzing outcomes in various fields.

     

    Geometric Distribution Formula

    The probability mass function (PMF) of the geometric distribution provides the probability of obtaining the first success on the \( k \)-th trial. The formula for the PMF of the geometric distribution is as follows:

    \( P(X = k) = (1-p)^{k-1} \times p \)

    where

    \( P(X = k) \) is the probability that the first success occurs on the \( k \)-th trial.

    \( p \) is the probability of success on each trial.

    \( k \) is the number of trials until the first success.

    In this formula, \( (1 - p)^{k-1} \) is the probability of \( k-1 \) consecutive failures on the first \( k-1 \) trials, and \( p \) is the probability of success on the \( k \)-th trial. Because the trials are independent, the two probabilities multiply.

    This formula captures the essence of the geometric distribution: the probability of success remains constant across trials, and the probabilities \( P(X = 1), P(X = 2), P(X = 3), \ldots \) form a geometric progression with common ratio \( 1 - p \), which gives the distribution its name.

    Example: Calculate the probability of rolling a `4` for the first time on the fifth roll of a standard six-sided die.

    Solution:

    We can use the geometric distribution to find the probability of rolling a `4` for the first time on the fifth roll.

    Here, the "success" is rolling a `4`, and a "failure" is rolling any number other than `4`.

    Parameters for geometric distribution

    Probability of success, \( p \): The probability of rolling a `4` on a single roll of a six-sided die is \( \frac{1}{6} \) since there is one `4` on the die and six possible outcomes.

    Probability of failure, \( 1-p \): The probability of not rolling a `4` is \( \frac{5}{6} \).

    Number of trials until the first success, \( k \): We are interested in the first success (rolling a `4`) occurring on the `5`th roll.

    Applying the formula, the probability mass function (PMF) of the geometric distribution is:

    \( P(X = k) = (1-p)^{k-1} \times p \)

    Plugging in the values:

    \( p = \frac{1}{6} \)

    \( k = 5 \)

    \( P(X = 5) = \left(1 - \frac{1}{6}\right)^{5-1} \times \frac{1}{6} = \left(\frac{5}{6}\right)^4 \times \frac{1}{6} \)

    \( \left(\frac{5}{6}\right)^4 = \left(\frac{5}{6}\right) \times \left(\frac{5}{6}\right) \times \left(\frac{5}{6}\right) \times \left(\frac{5}{6}\right) = \frac{625}{1296} \)

    \( P(X = 5) = \frac{625}{1296} \times \frac{1}{6} = \frac{625}{7776} \)

    The probability that a person rolls a `4` for the first time on the fifth roll of a six-sided die is \( \frac{625}{7776} \), which is approximately `0.08038` or about `8.038%`.
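
    The arithmetic in this example can be checked with a few lines of Python. The sketch below is illustrative only: the function name `geometric_pmf` is an assumption made here, not part of any library, and `Fraction` is used so that the result stays an exact fraction.

```python
from fractions import Fraction


def geometric_pmf(k, p):
    """Return P(X = k): first success on trial k, success probability p."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # k - 1 failures, each with probability 1 - p, then one success.
    return (1 - p) ** (k - 1) * p


# Die example: first 4 on the fifth roll, with p = 1/6 kept as an exact fraction.
prob = geometric_pmf(5, Fraction(1, 6))
print(prob)                    # 625/7776
print(round(float(prob), 5))   # 0.08038
```

    Passing a float such as `1 / 6` instead of a `Fraction` also works, at the cost of floating-point rounding.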

     

    Geometric Random Variable

    A geometric random variable (geometric RV) is a discrete random variable that represents the number of trials needed to achieve the first success in a series of independent Bernoulli trials, where each trial has a constant probability \( p \) of success and \( 1-p \) of failure.

    The possible values of a geometric random variable are \( 1, 2, 3, \ldots \), representing the number of trials until the first success occurs.

    The expected value (mean) of a geometric random variable is \( \frac{1}{p} \), and the variance is \( \frac{1-p}{p^2} \). 
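
    One way to get a feel for a geometric random variable is to simulate it directly from its definition: repeat Bernoulli trials until the first success and record how many trials were needed. The following is a minimal sketch under stated assumptions: the helper name `sample_geometric`, the seed, the value \( p = 0.2 \), and the sample size are all choices made for illustration.

```python
import random


def sample_geometric(p, rng):
    """Run Bernoulli(p) trials until the first success; return the trial count."""
    trials = 1
    while rng.random() >= p:   # rng.random() < p counts as a success
        trials += 1
    return trials


rng = random.Random(0)          # fixed seed so the run is repeatable
p = 0.2
draws = [sample_geometric(p, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / len(draws)

print(min(draws))                     # values start at 1, never 0
print(sample_mean, 1 / p)             # sample mean next to the theoretical 1/p
print(sample_var, (1 - p) / p ** 2)   # sample variance next to (1-p)/p^2
```

    With a sample this large, the sample mean and variance should land close to the theoretical values, though the exact figures depend on the seed.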

    Geometric random variables find applications in various fields, including probability theory, statistics, engineering, finance, and biology, among others, where modeling the number of trials until a certain event occurs is important.

     

    Expected Value of Geometric Distribution

    The expected value (or mean) of a random variable \( X \), denoted by \( E(X) \), represents the average value that would be obtained if an experiment were repeated a large number of times. Mathematically, it is defined as the weighted average of all possible values of \( X \), where each value is weighted by its corresponding probability.

    For a geometric random variable \( X \) with probability of success \( p \), the expected value is given by:

    \( E(X) = \frac{1}{p} \)

    This formula represents the average number of trials needed to achieve the first success.

    Example: The probability of success on each trial is \( p = 0.2 \). Find the expected value of the geometric distribution.

    Solution:

    The expected value of the geometric distribution would be \( E(X) = \frac{1}{0.2} = 5 \).

    This means that, on average, it takes `5` trials to achieve the first success.

    The expected value is an important measure of central tendency in probability distributions and provides insight into the average behavior of the random variable.

    The mean of the geometric distribution and the expected value of the geometric random variable are therefore one and the same: both describe the average number of trials until the first success over many repetitions of the experiment.
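
    The definition of the expected value as a probability-weighted average can be checked numerically: summing \( k \times P(X = k) \) over more and more values of \( k \) should approach \( \frac{1}{p} \). The sketch below assumes \( p = 0.2 \); the function names are hypothetical and chosen only for this illustration.

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)^(k - 1) * p for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def truncated_mean(p, max_k):
    """Sum k * P(X = k) over k = 1..max_k, a partial sum of the series for E(X)."""
    return sum(k * geometric_pmf(k, p) for k in range(1, max_k + 1))


p = 0.2
for max_k in (5, 20, 100, 500):
    # The partial sums increase toward 1 / p = 5 as max_k grows.
    print(max_k, truncated_mean(p, max_k))
```

    The partial sum with only the first `5` terms falls well short of `5`, which shows how much of the average comes from long runs of failures.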

     

    Standard Deviation of a Geometric Distribution 

    The standard deviation of a geometric distribution measures the spread or dispersion of the distribution. For the geometric distribution, the standard deviation can be calculated using the following formula:

    \( \text{SD}(X) = \sqrt{\frac{1-p}{p^2}} \)

    Where:

    • \( \text{SD}(X) \) is the standard deviation of the geometric distribution.
       
    • \( p \) is the probability of success on each trial.

    This formula quantifies the variability or spread of the number of trials until the first success in a series of Bernoulli trials with probability of success \( p \).

    Example: If \( p = 0.2 \), find the standard deviation of the geometric distribution.

    Solution:

    Standard deviation of the geometric distribution would be 

    \( \text{SD}(X) = \sqrt{\frac{1-0.2}{0.2^2}} = \sqrt{\frac{0.8}{0.04}} = \sqrt{20} \approx 4.47 \)

    So, for a geometric distribution with \( p = 0.2 \), the standard deviation would be approximately \( 4.47 \).

    The standard deviation provides a measure of how much the values of the random variable \( X \) typically deviate from the mean (expected value) of the distribution. A larger standard deviation indicates greater variability, while a smaller standard deviation indicates less variability.
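
    To see how the spread changes with \( p \), the mean, variance, and standard deviation can be tabulated together. This is a small sketch, not a reference implementation: `geometric_summary` is a name invented here, and the values of \( p \) in the loop are arbitrary.

```python
import math


def geometric_summary(p):
    """Return (mean, variance, standard deviation) for success probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    mean = 1 / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)


for p in (0.1, 0.2, 0.5, 0.9):
    mean, variance, sd = geometric_summary(p)
    print(f"p={p}: mean={mean:.2f}, variance={variance:.2f}, sd={sd:.2f}")
```

    Smaller values of \( p \) give both a larger mean and a larger standard deviation: rare successes take longer to arrive and arrive less predictably.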

     

    Comparing Geometric Distribution With Binomial Distribution

    The binomial and geometric distributions are both fundamental in probability theory and have important applications in various fields. Both are built on independent Bernoulli trials with a constant probability of success \( p \), but they model different scenarios and answer different questions.

    In summary, the binomial distribution is used when there is a fixed number of trials and you're interested in the total number of successes, while the geometric distribution is used when you're interested in the number of trials until the first success in a series of independent trials.
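
    The difference between the two questions shows up clearly when both are computed for the same \( p \). The sketch below assumes \( p = 0.2 \) and `5` trials purely for illustration; the function names are hypothetical, and the binomial PMF used is the standard \( \binom{n}{x} p^x (1-p)^{n-x} \).

```python
from math import comb


def binomial_pmf(successes, n, p):
    """P(exactly `successes` successes in a fixed number n of trials)."""
    return comb(n, successes) * p ** successes * (1 - p) ** (n - successes)


def geometric_pmf(k, p):
    """P(first success occurs on trial k); the number of trials is not fixed."""
    return (1 - p) ** (k - 1) * p


p = 0.2
# Binomial question: exactly one success somewhere in 5 trials.
print(binomial_pmf(1, 5, p))
# Geometric question: the first success lands on trial 5 specifically.
print(geometric_pmf(5, p))
```

    Here the binomial probability is five times the geometric one, because the single success may fall on any of the `5` trials rather than only on the last.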

     

    Real-life Applications of Geometric Distribution

    Geometric distribution is used for modeling the number of attempts or trials needed to achieve success in various scenarios. A few of them are listed below:

    Call Center Operations

    Consider a call center where operators handle customer inquiries. Let's say that each call has a certain probability of resulting in a successful resolution, and this probability remains constant for each call. The call center managers may be interested in understanding how many calls it takes, on average, until a successful resolution is achieved. This information is valuable for resource allocation, staffing decisions, and optimizing customer service operations.

    Example: 

    Suppose the probability of successfully resolving a customer inquiry on each call is \( p = 0.2 \). Using the geometric distribution, we can calculate the expected number of calls until the first successful resolution:

    \( E(X) = \frac{1}{p} = \frac{1}{0.2} = 5 \)

    This means that, on average, it takes `5` calls until a successful resolution is achieved.

    Importance:

    Understanding the number of attempts or trials needed until success occurs is crucial for optimizing operations, managing resources efficiently, and providing satisfactory service to customers. By using the geometric distribution, organizations can make informed decisions and improve their performance in various real-life scenarios, such as call center operations, quality control processes, and marketing campaigns.

     

    Quality Control and Manufacturing

    In manufacturing processes, items are often inspected for defects. Each item may have a certain probability of being defective. The geometric distribution can be used to model the number of items inspected until the first defective item is found. This information is valuable for quality control purposes and helps in optimizing production processes.

     

    Website Conversion Rate

    In online marketing, businesses aim to convert website visitors into customers. The conversion rate represents the probability that a visitor will make a purchase or take a desired action (e.g., signing up for a newsletter) on each visit. The geometric distribution can be used to model the number of website visits until the first conversion occurs. This helps businesses understand customer behavior and optimize their marketing strategies.

     

    Equipment Maintenance

    In maintenance management, equipment failures are inevitable, and maintenance activities aim to reduce downtime and ensure operational efficiency. The geometric distribution can be used to model the number of operating periods (for example, hours or shifts) until the first equipment failure occurs, provided the chance of failure is the same in every period. This information helps in scheduling preventive maintenance tasks and predicting equipment reliability.

     

    Loan Default Prediction

    In the financial industry, lenders assess the risk of loan default by borrowers. Each borrower has a certain probability of defaulting on a loan. The geometric distribution can be used to model the number of loan repayments until the first default occurs. This helps lenders estimate credit risk and make informed decisions about loan approval and interest rates.

     

    Disease Spread

    In epidemiology, the spread of infectious diseases among a population is of significant concern. The geometric distribution can be used to model the number of contacts or interactions until the first infection occurs. This information helps public health officials understand disease transmission dynamics and develop effective intervention strategies, such as vaccination campaigns and social distancing measures.

     

    Customer Churn

    In customer relationship management, businesses aim to retain customers and reduce churn rates. Each customer has a certain probability of discontinuing their relationship with the business. The geometric distribution can be used to model the number of interactions or subscription renewals until the first customer churns. This helps businesses identify at-risk customers and implement retention strategies.

     

    Solved Examples

    Example `1`. A website has a click-through rate of `0.1`, meaning that each time a user visits the website, there is a `10%` chance that they will click on an advertisement. Use the geometric distribution to find the probability that it takes at least `3` visits for a user to click on an advertisement.

    Solution:

    Given: Probability of clicking on an advertisement (\( p \)) `= 0.1`

    We want to find \( P(X \geq 3) \), where \( X \) is the number of visits until the first click on an advertisement.

    Using the geometric distribution formula, we find:

    \( P(X \geq 3) = 1 - P(X < 3) = 1 - (P(X = 1) + P(X = 2)) \)

    \( P(X = k) = (1 - p)^{k-1} \times p \)

    For \( k = 1 \):

    \( P(X = 1) = (1 - 0.1)^{1-1} \times 0.1 = 0.1 \)

    For \( k = 2 \):

    \( P(X = 2) = (1 - 0.1)^{2-1} \times 0.1 = 0.09 \)

    So,

    \( P(X \geq 3) = 1 - (0.1 + 0.09) = 1 - 0.19 = 0.81 \)

    Therefore, the probability that it takes at least `3` visits for a user to click on an advertisement is \( 0.81 \) or \( 81\% \).
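
    The same answer follows from a shortcut: the first success comes on trial \( k \) or later exactly when the first \( k-1 \) trials are all failures, so \( P(X \geq k) = (1-p)^{k-1} \). The sketch below compares both routes for this example; the function names are assumptions made for illustration.

```python
def geometric_pmf(k, p):
    """P(X = k): first success on trial k."""
    return (1 - p) ** (k - 1) * p


def prob_at_least(k, p):
    """P(X >= k): the first k - 1 trials must all be failures."""
    return (1 - p) ** (k - 1)


p = 0.1
via_complement = 1 - sum(geometric_pmf(j, p) for j in range(1, 3))
via_shortcut = prob_at_least(3, p)
print(via_complement, via_shortcut)   # both are 0.81 up to floating-point rounding
```

    The shortcut avoids summing individual PMF terms, which matters when \( k \) is large.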

     

    Example `2`. A customer support hotline has a `20%` chance of resolving each call successfully. What is the probability that the first successful resolution occurs on the third call?

    Solution:

    Probability of Success (\(p\)) `= 0.20` (`20%` chance that a given call is resolved successfully)

    Probability of Failure (\(1-p\)) `= 0.80` (`80%` chance that a given call is not resolved).

    Use the geometric distribution formula:

    \( P(X = k) = (1-p)^{k-1} \times p \)

    Here, \(k = 3\) because we are interested in finding the probability of the first success occurring on the third call.

    \( P(X = 3) = (1 - 0.20)^{3-1} \times 0.20 = 0.80^2 \times 0.20 \)

    \( P(X = 3) = 0.64 \times 0.20 = 0.128 \)

    Hence, the probability that the first successful resolution happens on the third call is `0.128`, or `12.8%`. 

     

    Example `3`. Adrian is taking a multiple-choice quiz where each question has `4` options, and only one is correct. He randomly guesses the answers. What is the probability that Adrian's first correct answer comes on the `5`th question?

    Solution:

    Given: Probability of guessing the correct answer (\( p \)) `= 1/4`

    We want to find \( P(X = 5) \), where \( X \) is the number of attempts until the first correct answer.

    Using the geometric distribution formula:

    \( P(X = 5) = (1 - p)^{5-1} \times p \)

    \( P(X = 5) = (1 - \frac{1}{4})^{5-1} \times \frac{1}{4} \)

    \( P(X = 5) = (\frac{3}{4})^4 \times \frac{1}{4} \)

    \( P(X = 5) = \frac{81}{256} \times \frac{1}{4} \)

    \( P(X = 5) = \frac{81}{1024} \)

    Therefore, the probability that Adrian's first correct answer comes on the `5`th attempt is \( \frac{81}{1024} \), or about \( 7.9\% \).

     

    Example `4`. A technician is repairing a faulty machine that has a `30%` chance of being fixed with each repair attempt. What is the probability that the machine is fixed on the `3`rd repair attempt?

    Solution:

    Given: Probability of fixing the machine (\( p \)) `= 0.30`

    We want to find \( P(X = 3) \), where \( X \) is the number of repair attempts until the machine is fixed.

    Using the geometric distribution formula:

    \( P(X = 3) = (1 - p)^{3-1} \times p \)

    \( P(X = 3) = (1 - 0.30)^{3-1} \times 0.30 \)

    \( P(X = 3) = (0.70)^2 \times 0.30 \)

    \( P(X = 3) = 0.49 \times 0.30 \)

    \( P(X = 3) = 0.147 \)

    Therefore, the probability that the machine is fixed on the `3`rd repair attempt is \( 0.147 \) or \( 14.7\% \).

     

    Example `5`. A basketball player has a free-throw success rate of `60%`. What is the probability that it takes exactly `2` attempts for the player to make a successful free throw?

    Solution:

    Given: Probability of making a successful free throw (\( p \)) `= 0.60`

    We want to find \( P(X = 2) \), where \( X \) is the number of attempts until the first successful free throw.

    Using the geometric distribution formula:

    \( P(X = 2) = (1 - p)^{2-1} \times p \)

    \( P(X = 2) = (1 - 0.60)^{2-1} \times 0.60 \)

    \( P(X = 2) = (0.40)^1 \times 0.60 \)

    \( P(X = 2) = 0.40 \times 0.60 \)

    \( P(X = 2) = 0.24 \)

    Therefore, the probability that it takes exactly `2` attempts for the player to make a successful free throw is \( 0.24 \) or \( 24\% \).

     

    Practice Problems

    Q`1`. A student is taking a multiple-choice test with `5` answer choices per question and guesses every answer at random. What is the probability that the student's first correct answer comes on the `4`th question?

    1. \( \frac{64}{625} \)
    2. \( \frac{25}{625} \)
    3. \( \frac{25}{664} \)
    4. \( \frac{4}{5} \)

    Answer: 1, since \( \left(\frac{4}{5}\right)^3 \times \frac{1}{5} = \frac{64}{625} \)

     

    Q`2`. A student is trying to solve a riddle, and each attempt has a `25%` chance of being correct. What is the probability that the student solves the riddle on the `3`rd attempt?

    1. \( 12.5\% \)
    2. \( 14.06\% \)
    3. \( 7.06\% \)
    4. \( 25.03\% \)

    Answer: 2, since \( 0.75^2 \times 0.25 = 0.140625 \approx 14.06\% \)

     

    Q`3`. A light bulb manufacturer knows that `10%` of its bulbs are defective. What is the probability that the first defective bulb found is the `6`th bulb inspected?

    1. \( 6.10\% \)
    2. \( 10\% \)
    3. \( 5.90\% \)
    4. \( 16.90\% \)

    Answer: 3, since \( 0.9^5 \times 0.1 = 0.059049 \approx 5.90\% \)

     

    Q`4`. A student is taking a true-or-false quiz and guesses every answer at random. What is the probability that the student's first correct answer comes on the `5`th question?

    1. \( 5.125\% \)
    2. \( 1.125\% \)
    3. \( 6.125\% \)
    4. \( 3.125\% \)

    Answer: 4, since \( 0.5^4 \times 0.5 = 0.03125 = 3.125\% \)

     

    Q`5`. A factory produces smoke detectors and determines that `3` out of every `75` smoke detectors are defective. What is the probability that a quality analyst will find the first faulty smoke detector on the `6`th one that he tested?

    1. `≈` \( 2.36\% \)
    2. `≈` \( 4.85\% \)
    3. `≈` \( 3.26\% \)
    4. `≈` \( 8.35\% \)

    Answer: 3, since \( p = \frac{3}{75} = 0.04 \) and \( 0.96^5 \times 0.04 \approx 0.0326 \)
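
    The answers to the five practice problems can be checked with a short script. The sketch below reads each problem as "first success on trial \( k \)" with the stated \( p \); the dictionary layout and names are assumptions made for this illustration.

```python
def geometric_pmf(k, p):
    """P(X = k): first success on trial k."""
    return (1 - p) ** (k - 1) * p


# (k, p) for each practice problem, read off the problem statements.
problems = {
    "Q1": (4, 1 / 5),
    "Q2": (3, 0.25),
    "Q3": (6, 0.10),
    "Q4": (5, 0.5),
    "Q5": (6, 3 / 75),
}

for label, (k, p) in problems.items():
    # Print each probability as a percentage for comparison with the options.
    print(f"{label}: {geometric_pmf(k, p):.4%}")
```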

     

    Frequently Asked Questions

    Q`1`. What is the geometric distribution?

    Answer: The geometric distribution is a probability distribution that models the number of independent trials needed to achieve the first success in a series of Bernoulli trials, where each trial has two possible outcomes: success or failure.

     

    Q`2`. What are the key characteristics of the geometric distribution?

    Answer: The key characteristics of the geometric distribution include:

    • It is discrete, meaning it deals with countable outcomes.
    • It is memoryless: however many failures have already occurred, the distribution of the number of additional trials needed is unchanged, that is, \( P(X > m + n \mid X > m) = P(X > n) \).
    • It has a single parameter \( p \), the probability of success on each trial.
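
    The memoryless property can be written as \( P(X > m + n \mid X > m) = P(X > n) \): having already seen \( m \) failures does not change the chance of needing more than \( n \) further trials. The sketch below checks this numerically using \( P(X > n) = (1-p)^n \); the values of \( p \), \( m \), and \( n \) are arbitrary choices for illustration.

```python
def prob_more_than(n, p):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n


p, m, n = 0.2, 3, 4
# Conditional probability P(X > m + n | X > m) = P(X > m + n) / P(X > m).
conditional = prob_more_than(m + n, p) / prob_more_than(m, p)
unconditional = prob_more_than(n, p)
print(conditional, unconditional)   # equal up to floating-point rounding
```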

     

    Q`3`. What is the probability mass function (PMF) of the geometric distribution?

    Answer: The PMF of the geometric distribution gives the probability of obtaining the first success on the \( k \)-th trial. It is given by: 

    \( P(X = k) = (1 - p)^{k-1} \times p \)

     

    Q`4`. What is the mean (expected value) of the geometric distribution?

    Answer: The mean (expected value) of the geometric distribution is \( \frac{1}{p} \), where \( p \) is the probability of success on each trial. It represents the average number of trials needed to achieve the first success.

     

    Q`5`. What is the standard deviation of the geometric distribution?

    Answer: The standard deviation of the geometric distribution is \( \sqrt{\frac{1-p}{p^2}} \). It measures the spread or dispersion of the distribution.

     

    Q`6`. In what situations is the geometric distribution applicable?

    Answer: The geometric distribution is applicable in situations where:

    • There is a series of independent trials, each with two possible outcomes.
       
    • The probability of success on each trial remains constant.
       
    • We are interested in the number of trials needed until the first success occurs.

     

    Q`7`. Can the geometric distribution count failures instead of trials?

    Answer: Yes. An alternative form of the geometric distribution models the number of failures \( Y = X - 1 \) before the first success. It takes the values \( 0, 1, 2, \ldots \), with \( P(Y = k) = (1 - p)^{k} \times p \) and mean \( \frac{1-p}{p} \); the variance \( \frac{1-p}{p^2} \) is unchanged.
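
    The two conventions differ only by a shift of one: if \( X \) counts trials up to and including the first success, then \( Y = X - 1 \) counts the failures before it. The sketch below places the two PMFs side by side; the function names and \( p = 0.2 \) are assumptions made for illustration.

```python
def pmf_trials(k, p):
    """P(X = k), X = number of trials up to and including the first success."""
    return (1 - p) ** (k - 1) * p


def pmf_failures(y, p):
    """P(Y = y), Y = X - 1 = number of failures before the first success."""
    return (1 - p) ** y * p


p = 0.2
for y in range(4):
    # Shifting by one trial maps one convention onto the other.
    print(y, pmf_failures(y, p), pmf_trials(y + 1, p))

print((1 - p) / p)   # mean of Y, one less than the mean 1 / p of X
```

    When using a statistics library, it is worth checking which of the two conventions it follows before comparing results.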

     

    Q`8`. What are some real-life applications of geometric distribution?

    Answer: Real-life applications include modeling the number of visits until a customer makes a purchase in online marketing, the number of calls until a successful resolution in customer support, and the number of items inspected until the first defective one is found in manufacturing quality control.