S7-SA3-0352
What is Stratified Random Sampling?
Grade Level:
Class 12
AI/ML, Physics, Biotechnology, FinTech, EVs, Space Technology, Climate Science, Blockchain, Medicine, Engineering, Law, Economics
Definition
What is it?
Stratified Random Sampling is a method where you first divide a large group (population) into smaller, non-overlapping subgroups called 'strata', so that the members of each stratum share a characteristic such as class, city, or age group. Then, you pick a random sample from each of these subgroups. This ensures that every subgroup is represented in your final sample.
Simple Example
Quick Example
Imagine your school has students from Class 6, 7, and 8. If you want to know their favourite subject, asking the first 50 students you see is risky, because most of them might come from the same class. Instead, you divide students into three strata: Class 6, Class 7, and Class 8. Then, you randomly pick 10 students from Class 6, 10 from Class 7, and 10 from Class 8 (equal numbers are reasonable here if the three classes are about the same size). This way, all classes get a voice.
Worked Example
Step-by-Step
Let's say a mobile company wants to survey 100 people about their new phone model across three cities: Mumbai, Delhi, and Bengaluru. There are 500 customers in Mumbai, 300 in Delhi, and 200 in Bengaluru.
Step 1: Identify the strata. Here, the cities are the strata: Mumbai, Delhi, Bengaluru.
Step 2: Calculate the total population. Total customers = 500 + 300 + 200 = 1000.
Step 3: Determine the proportion of each stratum in the total population. Mumbai: 500/1000 = 0.5 (50%), Delhi: 300/1000 = 0.3 (30%), Bengaluru: 200/1000 = 0.2 (20%).
Step 4: Calculate the number of samples to take from each stratum based on its proportion and the total sample size (100). Mumbai: 0.5 * 100 = 50 people. Delhi: 0.3 * 100 = 30 people. Bengaluru: 0.2 * 100 = 20 people.
Step 5: Randomly select the calculated number of people from each city's customer list.
Answer: The company will survey 50 customers from Mumbai, 30 from Delhi, and 20 from Bengaluru, chosen randomly from each city's customer list.
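The arithmetic in Steps 2 to 4 follows one rule: sample size for a stratum = total sample size × (stratum size / total population). Here is a minimal Python sketch of that rule. It is an illustration only: the function name `proportional_allocation` is made up for this example, and the way leftover places are handed to the strata with the largest remainders is an added assumption, so that the counts still add up to the requested total when the shares are not whole numbers.

```python
def proportional_allocation(strata_sizes, total_sample):
    """Split total_sample across strata in proportion to their sizes."""
    population = sum(strata_sizes.values())

    # Whole-number share and leftover remainder for each stratum,
    # using integer arithmetic: e.g. (500 * 100) / 1000 = 50 remainder 0.
    quotas, remainders = {}, {}
    for name, size in strata_sizes.items():
        quotas[name], remainders[name] = divmod(size * total_sample, population)

    # Assumption: any places still unassigned after rounding down go to
    # the strata with the largest remainders, one place each.
    leftover = total_sample - sum(quotas.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


# Customer counts from the worked example.
customers = {"Mumbai": 500, "Delhi": 300, "Bengaluru": 200}
allocation = proportional_allocation(customers, 100)
print(allocation)  # {'Mumbai': 50, 'Delhi': 30, 'Bengaluru': 20}
```

In this example every share is a whole number, so the leftover step does nothing; it only matters for cases such as splitting 10 places across three equal strata.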
Why It Matters
This method matters in fields like AI/ML, where it helps keep each category present in training and test data in the right proportions, which reduces the risk of biased results. In medicine, it helps ensure clinical trials include diverse patient groups for accurate drug testing. Market researchers and economists use it to understand consumer behaviour or economic trends across different income groups or regions, helping them make better predictions and decisions.
Common Mistakes
MISTAKE: Not dividing the population into distinct, non-overlapping strata. | CORRECTION: Ensure each member of the population belongs to only one stratum. For example, a student cannot be in both Class 7 and Class 8 strata.
MISTAKE: Not sampling randomly *within* each stratum. | CORRECTION: After dividing into strata, you must use a random method (like drawing names from a hat or using a random number generator) to pick individuals from each stratum.
MISTAKE: Choosing the number of samples from each stratum without considering its share of the population, for example taking the same number from strata that are vastly different in size. | CORRECTION: Usually, the number of samples taken from each stratum should be proportional to its size in the overall population to maintain representativeness.
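The first two corrections above can be turned into checks in code. The sketch below is one possible way to do it in Python, not a standard recipe: the function name `stratified_sample`, the customer IDs, and the seed value are all invented for illustration. It refuses to run if any member appears more than once across the strata, and it uses the standard library's `random.Random.sample` to pick members within each stratum purely by chance. The per-city counts are the proportional ones from the worked example, which covers the third correction.

```python
import random


def stratified_sample(strata, allocation, seed=None):
    """Draw allocation[name] members at random from each stratum."""
    # Check for overlap: no member may appear more than once across the strata.
    seen = set()
    for members in strata.values():
        for member in members:
            if member in seen:
                raise ValueError(f"{member!r} appears more than once across the strata")
            seen.add(member)

    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    sample = {}
    for name, members in strata.items():
        # sample() picks without replacement, so nobody is chosen twice
        # and every member of the stratum has the same chance.
        sample[name] = rng.sample(members, allocation[name])
    return sample


# Hypothetical customer lists, sized to match the worked example.
strata = {
    "Mumbai": [f"MUM-{i:03d}" for i in range(1, 501)],
    "Delhi": [f"DEL-{i:03d}" for i in range(1, 301)],
    "Bengaluru": [f"BLR-{i:03d}" for i in range(1, 201)],
}
allocation = {"Mumbai": 50, "Delhi": 30, "Bengaluru": 20}

sample = stratified_sample(strata, allocation, seed=42)
for city, chosen in sample.items():
    print(city, len(chosen), chosen[:3])
```

Passing a seed is optional; it is used here only so that the same customers are picked each time the example is run.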
Practice Questions
Try It Yourself
QUESTION: A school has 400 boys and 600 girls. If you want to survey 100 students using stratified random sampling to understand their favourite sport, how many boys and how many girls should you select? | ANSWER: Boys: 40, Girls: 60
QUESTION: An online shopping platform wants to survey 200 users. They have 1000 users in Tier 1 cities, 800 users in Tier 2 cities, and 200 users in Tier 3 cities. How many users should be sampled from each tier? | ANSWER: Tier 1: 100 users, Tier 2: 80 users, Tier 3: 20 users
QUESTION: A dairy company wants to test the quality of milk from 3 different farms: Farm A (200 cows), Farm B (300 cows), and Farm C (500 cows). If they decide to test milk from 50 cows in total, how many cows should be randomly selected from each farm using stratified sampling? What proportion of the sample comes from Farm B? | ANSWER: Farm A: 10 cows, Farm B: 15 cows, Farm C: 25 cows. Proportion of the sample from Farm B: 15/50 = 30%
MCQ
Quick Quiz
Which of the following is the primary reason for using stratified random sampling?
A. To make the sampling process faster and easier.
B. To ensure every subgroup is represented in the sample.
C. To only select individuals who are easy to reach.
D. To avoid any form of randomness in selection.
The Correct Answer Is:
B
Stratified random sampling ensures that important subgroups within a population are adequately represented in the sample, which might not happen with simple random sampling. It doesn't necessarily make it faster or easier, nor does it avoid randomness.
Real World Connection
In the Real World
In India, election polling agencies often use stratified random sampling. They divide voters into strata based on factors like age groups, gender, rural/urban areas, or even specific constituencies. Then, they randomly survey a proportional number of people from each stratum to predict election results, ensuring a fair representation of different voter segments across the country.
Key Vocabulary
Key Terms
POPULATION: The entire group of individuals or items you are interested in studying. | STRATA: Smaller, distinct subgroups formed by dividing a population based on shared characteristics. | SAMPLE: A smaller, representative subset of the population chosen for study. | RANDOM SELECTION: Choosing individuals from a group purely by chance, where each has an equal probability of being chosen.
What's Next
What to Learn Next
Now that you understand stratified random sampling, you can explore other sampling methods like Cluster Sampling or Systematic Sampling. Each takes a different approach to drawing a representative sample, and comparing them will help you understand when to use each technique for collecting data effectively.


