Inaugurated by IN-SPACe
ISRO Registered Space Tutor

S3-SA3-0263

What is the Cumulative Distribution Function?

Grade Level:

Class 8

AI/ML, Data Science, Physics, Economics, Cryptography, Computer Science, Engineering

Definition
What is it?

The Cumulative Distribution Function (CDF) tells us the probability that a random value will be less than or equal to a certain number. Think of it as a running total of probabilities, showing how probabilities build up as you consider larger values.

Simple Example
Quick Example

Imagine you are tracking the number of wickets a bowler takes across 5 matches: 0 wickets in 1 match, 1 wicket in 2 matches, and 2 wickets in 2 matches. The CDF tells you the probability of the bowler taking, say, 1 or fewer wickets in a match: CDF(1) = P(0 wickets) + P(1 wicket) = 1/5 + 2/5 = 3/5 = 0.6.
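The wicket example can be checked with a few lines of Python. This is only a minimal sketch: the names `matches_by_wickets` and `cdf` are made up for illustration, and exact fractions are used so no rounding gets in the way.

```python
from fractions import Fraction

# Number of matches in which the bowler took each number of wickets
# (taken from the example: 0 wickets once, 1 wicket twice, 2 wickets twice).
matches_by_wickets = {0: 1, 1: 2, 2: 2}
total_matches = sum(matches_by_wickets.values())  # 5 matches in total


def cdf(x):
    """Probability that the bowler takes x or fewer wickets in a match."""
    matches_at_or_below = sum(
        count for wickets, count in matches_by_wickets.items() if wickets <= x
    )
    return Fraction(matches_at_or_below, total_matches)


print(cdf(1))  # 3/5
```

Calling `cdf(2)` gives 1, because every match in the data had 2 or fewer wickets.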

Worked Example
Step-by-Step

Let's say we have data for the number of samosas sold at a stall each hour: 0 samosas (1 hour), 1 samosa (2 hours), 2 samosas (3 hours), 3 samosas (4 hours). Total hours = 1+2+3+4 = 10 hours.

Step 1: Find the probability for each number of samosas.
P(0 samosas) = 1/10
P(1 samosa) = 2/10
P(2 samosas) = 3/10
P(3 samosas) = 4/10

---

Step 2: Calculate the CDF for 0 samosas.
CDF(0) = P(samosas <= 0) = P(0 samosas) = 1/10 = 0.1

---

Step 3: Calculate the CDF for 1 samosa.
CDF(1) = P(samosas <= 1) = P(0 samosas) + P(1 samosa) = 1/10 + 2/10 = 3/10 = 0.3

---

Step 4: Calculate the CDF for 2 samosas.
CDF(2) = P(samosas <= 2) = P(0 samosas) + P(1 samosa) + P(2 samosas) = 1/10 + 2/10 + 3/10 = 6/10 = 0.6

---

Step 5: Calculate the CDF for 3 samosas.
CDF(3) = P(samosas <= 3) = P(0 samosas) + P(1 samosa) + P(2 samosas) + P(3 samosas) = 1/10 + 2/10 + 3/10 + 4/10 = 10/10 = 1.0

Answer: The CDF values are CDF(0)=0.1, CDF(1)=0.3, CDF(2)=0.6, CDF(3)=1.0.
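The five steps above amount to a running total, which can be sketched in Python as follows. The variable names are illustrative assumptions; `itertools.accumulate` from the standard library produces the running sum.

```python
from fractions import Fraction
from itertools import accumulate

# Hours in which each number of samosas was sold (from the worked example).
hours_by_samosas = {0: 1, 1: 2, 2: 3, 3: 4}
total_hours = sum(hours_by_samosas.values())  # 10 hours

# Step 1: probability of each value, in increasing order of samosas sold.
values = sorted(hours_by_samosas)
probabilities = [Fraction(hours_by_samosas[v], total_hours) for v in values]

# Steps 2 to 5: the CDF is the running total of those probabilities.
cdf_values = dict(zip(values, accumulate(probabilities)))

for v in values:
    print(f"CDF({v}) = {float(cdf_values[v])}")
# CDF(0) = 0.1
# CDF(1) = 0.3
# CDF(2) = 0.6
# CDF(3) = 1.0
```

Sorting the values first matters: the running total only gives the CDF when the probabilities are added from the smallest value upward.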

Why It Matters

The CDF summarizes how data is spread out and answers "at most" questions directly. Data scientists use it to analyze customer behavior, engineers to estimate the chance that a component fails within a given time, and economists to study what share of people earn up to a given income.

Common Mistakes

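MISTAKE: Confusing the CDF with the probability of a single value. | CORRECTION: The probability of one exact value is given by the probability mass function (PMF) for discrete data; the probability density function (PDF) is the matching idea for continuous data. The CDF instead covers 'less than or equal to' a value, so it accumulates those probabilities.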

MISTAKE: Forgetting that the CDF always increases or stays the same, and its maximum value is 1. | CORRECTION: The CDF represents a cumulative total, so it can never decrease, and the total probability for all possible outcomes is always 1.

MISTAKE: Not adding all previous probabilities when calculating CDF for a specific point. | CORRECTION: Always sum up the probabilities of all values from the smallest up to and including the value you are calculating the CDF for.
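The last two mistakes can be caught with a quick check. The helper below is a hypothetical sketch, not a standard library function; it assumes a non-empty list of CDF values given in increasing order of the underlying values.

```python
from fractions import Fraction


def is_valid_cdf(cdf_values):
    """Check two CDF properties: it never decreases, and it ends at exactly 1."""
    never_decreases = all(a <= b for a, b in zip(cdf_values, cdf_values[1:]))
    ends_at_one = cdf_values[-1] == 1
    return never_decreases and ends_at_one


# CDF values from the samosa example.
samosa_cdf = [Fraction(1, 10), Fraction(3, 10), Fraction(6, 10), Fraction(1)]
print(is_valid_cdf(samosa_cdf))  # True

# Forgetting to add the earlier probabilities leaves the plain probabilities,
# which stop at 4/10 instead of reaching 1.
not_cumulative = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(is_valid_cdf(not_cumulative))  # False
```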

Practice Questions
Try It Yourself

QUESTION: A spinner has outcomes: 1 (prob 0.2), 2 (prob 0.3), 3 (prob 0.5). What is the CDF at 2, that is, the probability of spinning 2 or less? | ANSWER: CDF(2) = P(outcome <= 2) = P(1) + P(2) = 0.2 + 0.3 = 0.5

QUESTION: In a survey, the number of siblings students have is: 0 siblings (10 students), 1 sibling (20 students), 2 siblings (15 students), 3 siblings (5 students). What is the CDF for having 2 or fewer siblings? | ANSWER: Total students = 10+20+15+5 = 50. P(0 siblings) = 10/50 = 0.2. P(1 sibling) = 20/50 = 0.4. P(2 siblings) = 15/50 = 0.3. CDF(2) = P(0) + P(1) + P(2) = 0.2 + 0.4 + 0.3 = 0.9

QUESTION: A fair die is rolled. What is the CDF at 5, the largest odd number on the die? Is it the same as the probability of rolling an odd number? (Hint: First find P(1), P(2), P(3), P(4), P(5), P(6).) | ANSWER: P(each number) = 1/6. CDF(5) = P(roll <= 5) = P(1)+P(2)+P(3)+P(4)+P(5) = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 5/6. This is not the probability of rolling an odd number: the CDF counts every value up to 5, including the even numbers 2 and 4, whereas P(odd) = P(1)+P(3)+P(5) = 3/6 = 1/2.
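For the die question, a short sketch can compare CDF(5) with the probability of rolling an odd number. It assumes a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
p_face = Fraction(1, 6)  # assumption: a fair die, each face equally likely

# CDF(5) adds the probability of every face that is 5 or less, even or odd.
cdf_5 = sum(p_face for face in faces if face <= 5)

# P(odd) adds only the odd faces 1, 3 and 5.
p_odd = sum(p_face for face in faces if face % 2 == 1)

print(cdf_5)  # 5/6
print(p_odd)  # 1/2
```

The two results differ because a CDF is always about "less than or equal to" a number, not about membership in a group such as the odd numbers.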

MCQ
Quick Quiz

Which of the following statements about a Cumulative Distribution Function (CDF) is true?

A. It gives the probability of a single, exact value.

B. Its value always decreases as the input value increases.

C. It gives the probability that a value is less than or equal to a given number.

D. Its maximum value can be greater than 1.

The Correct Answer Is:

C

Option C correctly defines the CDF as the probability of a value being less than or equal to a given number. Option A describes the probability of a single value rather than a cumulative one, option B is wrong because the CDF never decreases, and option D is wrong because its maximum value is always 1.

Real World Connection
In the Real World

Imagine you are tracking the waiting times for an auto-rickshaw using an app like Ola or Uber. A CDF could tell you the probability that your waiting time will be 5 minutes or less, 10 minutes or less, and so on. This helps the app optimize service and give you better estimates for your ride.
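As a rough illustration, the waiting-time idea can be sketched with a handful of numbers. The waiting times below are hypothetical, made up purely for this example and not taken from any app, and `empirical_cdf` is an illustrative helper name.

```python
# Hypothetical waiting times in minutes (invented for illustration only).
waiting_times = [2, 4, 4, 6, 7, 9, 11, 12, 15, 3]


def empirical_cdf(samples, x):
    """Fraction of recorded waits that were x minutes or less."""
    return sum(1 for s in samples if s <= x) / len(samples)


print(empirical_cdf(waiting_times, 5))   # 0.4
print(empirical_cdf(waiting_times, 10))  # 0.7
```

With these made-up numbers, 4 of the 10 waits were 5 minutes or less and 7 of the 10 were 10 minutes or less.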

Key Vocabulary
Key Terms

PROBABILITY: The chance of an event happening. | RANDOM VARIABLE: A variable whose value is determined by the outcome of a random phenomenon. | DISCRETE DATA: Data that can only take certain values (like whole numbers). | CUMULATIVE: Increasing by successive additions.

What's Next
What to Learn Next

Now that you understand the CDF, you can explore the probability mass function (PMF), which gives the probability of each exact value for discrete data, and the probability density function (PDF), its counterpart for continuous data. You can also learn about different types of probability distributions, which are essential for understanding data in subjects like AI and Machine Learning.
