{"id":1808,"date":"2021-05-15T15:32:29","date_gmt":"2021-05-15T15:32:29","guid":{"rendered":"http:\/\/192.168.31.181\/muthu\/?p=1808"},"modified":"2021-05-15T15:32:29","modified_gmt":"2021-05-15T15:32:29","slug":"birthday-problem-and-monte-carlo-simulation","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/birthday-problem-and-monte-carlo-simulation\/","title":{"rendered":"Birthday Problem and Monte Carlo Simulation"},"content":{"rendered":"\n
If you board a plane carrying 100 or more passengers, there is a better than 99% chance that at least two of the passengers share a birthday. On a bus with 50 passengers, the chance that some pair shares a birth date is still about 97%! Surprising, isn’t it? This is the birthday paradox, also known as the birthday problem.<\/p>\n\n\n\n
In this post, I will solve the birthday problem first analytically and then with a Monte Carlo simulation.<\/p>\n\n\n\n
The problem is to compute the probability that at least two people in a group of N <\/em> people share a birthday. Suppose we have a group of 10 people. The probability that two or more of them share a birth date is found by computing the probability that everyone has a different birthday and subtracting it from 1, using the complement rule P<\/em>(A<\/em>) = 1 \u2212\u00a0P<\/em>(A<\/em>\u2032), where A<\/em>\u2032 represents “everyone having different birthdays”.<\/p>\n\n\n\n We will ignore leap years for simplicity’s sake and treat every day of the year as equally likely. The first person can have a birthday on any of the 365 days. For the second person to have a different birthday, 364 days remain, for the third 363, and so on. So the number of ways everyone can have a different birthday is 365 \u00d7 364 \u00d7 363 \u00d7 \u2026 \u00d7 (365 \u2212 (N \u2212 1)), where N represents the number of people. In terms of probabilities, the first person has 365 options out of 365, so the probability is 365\/365; the probability that the next person has a different birthday is 364\/365, and so on. Multiplying these terms gives the total probability below.<\/p>\n\n\n\n \\(P(A') = \\frac{365}{365}\\times \\frac{364}{365}\\times \\frac{363}{365}\\times \\cdots \\times \\frac{365-(N-1)}{365} \\)<\/p>\n\n\n\n so the probability of two or more people sharing the same birthday is given by<\/p>\n\n\n\n \\(P(A) = 1- \\frac{365}{365}\\times \\frac{364}{365}\\times \\frac{363}{365}\\times \\cdots \\times \\frac{365-(N-1)}{365} \\)<\/p>\n\n\n\n which is equivalent to<\/p>\n\n\n\n \\(P(A) = 1- \\frac{365\\times 364\\times 363\\times \\cdots \\times (365-(N-1))}{365^{N}} \\)<\/p>\n\n\n\n Now that we have the formula, let’s check the probability for different values of N using a simple Python script:<\/p>\n\n\n\n <\/p>\n\n\n\n The results are printed below.<\/p>\n\n\n\n As the table shows, the probability that at least two people share a birthday in a group of 100 or more people is, surprisingly, nearly 100%.
The growth of this probability with the group size can be seen in the graph below.<\/p>\n\n\n\nimport numpy as np\nimport pandas as pd\n\nresults = []\n# Evaluate group sizes 5, 10, ..., 125\nfor number_of_people in np.arange(5, 130, 5):\n    # Probability that all birthdays in the group are different\n    probability = 1\n    for count in range(1, number_of_people + 1):\n        probability *= (365 - count + 1) \/ 365\n    results.append({\n        'n': number_of_people,\n        'p(n)': (1 - probability)\n    })\nprint(pd.DataFrame(results))<\/code><\/pre>\n\n\n\n
N<\/th> P(N)<\/th><\/tr><\/thead> 5<\/td> 0.027136<\/td><\/tr> 10<\/td> 0.116948<\/td><\/tr> 15<\/td> 0.252901<\/td><\/tr> 20<\/td> 0.411438<\/td><\/tr> 25<\/td> 0.568700<\/td><\/tr> 30<\/td> 0.706316<\/td><\/tr> 35<\/td> 0.814383<\/td><\/tr> 40<\/td> 0.891232<\/td><\/tr> 45<\/td> 0.940976<\/td><\/tr> 50<\/td> 0.970374<\/td><\/tr> 55<\/td> 0.986262<\/td><\/tr> 60<\/td> 0.994123<\/td><\/tr> 65<\/td> 0.997683<\/td><\/tr> 70<\/td> 0.999160<\/td><\/tr> 75<\/td> 0.999720<\/td><\/tr> 80<\/td> 0.999914<\/td><\/tr> 85<\/td> 0.999976<\/td><\/tr> 90<\/td> 0.999994<\/td><\/tr> 95<\/td> 0.999999<\/td><\/tr> 100<\/td> 1.000000<\/td><\/tr> 105<\/td> 1.000000<\/td><\/tr> 110<\/td> 1.000000<\/td><\/tr> 115<\/td> 1.000000<\/td><\/tr> 120<\/td> 1.000000<\/td><\/tr> 125<\/td> 1.000000<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n\n\n
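The table steps through N in increments of 5, so it does not show exactly where the probability crosses one half. A minimal sketch, assuming the same 365 equally likely days and using two illustrative helper names that are not part of the original script, searches for the smallest group size that reaches a given threshold:

```python
def shared_birthday_probability(group_size, days=365):
    # Probability that at least two people in the group share a birthday,
    # assuming every one of the `days` days is equally likely.
    all_different = 1.0
    for k in range(group_size):
        all_different *= (days - k) / days
    return 1.0 - all_different


def smallest_group_for(threshold, days=365):
    # Smallest group size whose shared-birthday probability reaches
    # the threshold (threshold is assumed to lie between 0 and 1).
    group_size = 1
    while shared_birthday_probability(group_size, days) < threshold:
        group_size += 1
    return group_size


print(smallest_group_for(0.5))  # prints 23
```

The result, 23, sits between the table rows for N = 20 (about 0.41) and N = 25 (about 0.57).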