Boy or girl paradox

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Thesoxlost (talk | contribs) at 22:24, 25 February 2009 (changed intro to explicitly quote the Ask Marilyn and Gardner formulations of the question). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The Boy or Girl paradox is a well-known set of questions in probability theory.[1] There are many variants. Martin Gardner published one of the earliest in Scientific American in 1959, describing it as The Two Children Problem:

Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys? Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?

Another variant has been discussed in the Ask Marilyn column of Parade Magazine, which posed the question:

A woman and a man (unrelated) each have two children. At least one of the woman's children is a boy, and the man's older child is a boy. Do the chances that the woman has two boys equal the chances that the man has two boys?

Many other variants of this problem can be found online[2][3][4][5] and in the recent book The Drunkard's Walk.[6]

The paradox stems from the intuition that the answers to these questions are the same[3][6], although in fact they are not.

Common assumptions

There are four possible combinations of children. Labeling boys B and girls G, and using the first letter to represent the older child, the possible combinations are:

{BB, BG, GB, GG}.

These four possibilities are taken to be equally likely a priori. This follows from three assumptions:

  1. That the determination of the sex of each child is an independent event.
  2. That each child is either male or female.
  3. That each child has the same chance of being male as of being female.

These conditions form an incomplete model. By adopting them, we ignore the possibility that a child is intersex, the fact that the ratio of boys to girls is not exactly 50:50, and (among other factors) the possibility of identical twins, which means that sex determination is not entirely independent. However, each of these exceptions is rare enough to have little effect on this simple analysis of the general population.

First question

  • A (random) family has two children, and the older child is a boy. What is the probability that the younger child is a girl?

In this problem, a random family is selected. In this sample space, there are four equally probable events:

Older child     Younger child
Girl            Girl
Girl            Boy
Boy             Girl
Boy             Boy

Only two of these events meet the criterion specified in the question (namely BG and BB). Since the two possibilities in the new sample space {BG, BB} are equally likely, and only one of them, BG, has a girl as the younger child, the probability that the younger child is a girl is 1/2.
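
The counting argument above can be checked by brute-force enumeration. The following is a minimal sketch under the three common assumptions; the helper name conditional_probability and the two predicates are illustrative choices, not part of the original problem.

```python
from fractions import Fraction
from itertools import product

# Sample space: (older, younger), each "B" or "G"; all four are equally likely.
FAMILIES = list(product("BG", repeat=2))

def conditional_probability(event, given):
    """Return P(event | given) by counting equally likely families."""
    matching = [f for f in FAMILIES if given(f)]
    favourable = [f for f in matching if event(f)]
    return Fraction(len(favourable), len(matching))

def older_is_boy(family):
    return family[0] == "B"

def younger_is_girl(family):
    return family[1] == "G"

print(conditional_probability(younger_is_girl, older_is_boy))  # 1/2
```

Using exact fractions rather than floating point keeps the result directly comparable with the hand count.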

Second question

  • A (random) family has two children, and at least one of the two children is a boy. What is the probability that the younger child is a girl?

This question is identical to question one, except instead of specifying that the older child is a boy, it is specified that one of them is a boy. Again, there are four equally probable events for a two-child family as seen in the sample space above. Three of these families meet the criteria of having at least one boy. The set of possibilities (possible combinations of children that meet the given criteria) is:

Older child     Younger child
Girl            Boy
Boy             Girl
Boy             Boy

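Only one of the three equally likely combinations with at least one boy, BG, has a girl as the younger child. Thus, the answer to question 2 is 1/3. (Two of the three, BG and GB, include a girl, so the probability that the family has a girl at all is 2/3.)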
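
The same figure can be estimated by simulation. This is a rough sketch, assuming independent children with equal chances of each sex; the function name simulate, the trial count, and the fixed seed are arbitrary illustrative choices. For contrast it also estimates the chance that a family with at least one boy has a girl at all, which is a different event.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate two conditional probabilities given at least one boy."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    at_least_one_boy = 0
    younger_girl = 0
    any_girl = 0
    for _ in range(trials):
        older, younger = rng.choice("BG"), rng.choice("BG")
        if "B" not in (older, younger):
            continue  # discard girl-girl families: they fail the condition
        at_least_one_boy += 1
        younger_girl += younger == "G"
        any_girl += "G" in (older, younger)
    return younger_girl / at_least_one_boy, any_girl / at_least_one_boy

p_younger_girl, p_any_girl = simulate()
print(p_younger_girl, p_any_girl)
```

The two estimates should land near 1/3 and 2/3 respectively, up to sampling noise.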

Third question

  • A (random) family has two children, and one of the two children is a boy named Jacob. What is the probability that the other child is a girl?

Treating a boy named Jacob separately from other boys, the possible combinations that include Jacob are:

Older child     Younger child
Girl            Jacob
Jacob           Girl
Jacob           Boy
Boy             Jacob

Or, the set {GJ, JG, JB, BJ}, in which two out of the four possibilities include a girl.

Therefore we might think that the probability returns to 1/2. But this reasoning is flawed because it does not take into account the different frequencies of these outcomes. A boy being named Jacob and a boy not being named Jacob are not equally likely. Thus, we must replace our classical interpretation of probability with either a frequentist or a Bayesian interpretation. (Note that in real life children's names are not independent of each other. In particular, people usually do not give the same name to two children. Thus, this discussion is purely theoretical.)

Frequentist approach

Consider 10,000 families that have two children. Assume that the gender and name of each child are independent, within families and between families. Assume that the probability of each individual child being a girl is 0.5; otherwise the child is a boy. Assume that the probability of a child having the name Jacob is 0.01, and that all children named Jacob are boys (so 1 boy in 50 is named Jacob).

In the table above, we have a list of all possible unique outcomes. But these outcomes do not have the same frequency. If we start with the assumption that the family has two children, we get the following frequency table:

Older child     Younger child   Frequency
Girl            Girl            2500
Girl            Boy             2500
Boy             Girl            2500
Boy             Boy             2500

With the additional bit of information that the family has a boy named Jacob, we can break every instance of "Boy" into two: "Jacob" and "Boy not Jacob". For every 50 Boys, 1 will fall into the "Jacob" bin and 49 into the "Boy not Jacob" bin. Thus, we have the following table:

Older child     Younger child   Frequency
Girl            Girl            2500
Girl            Jacob           50
Girl            Boy not Jacob   2450
Jacob           Girl            50
Boy not Jacob   Girl            2450
Jacob           Jacob           1
Boy not Jacob   Jacob           49
Jacob           Boy not Jacob   49
Boy not Jacob   Boy not Jacob   2401

If we eliminate all instances that do not meet our given criteria ({Girl, Girl} {Girl, Boy not Jacob} {Boy not Jacob, Girl} {Boy not Jacob, Boy not Jacob}), then we eliminate 9801 of our events, leaving 199 possible events. Of those, the successful events are {Girl, Jacob} and {Jacob, Girl}, or 100 cases.
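
The frequency table and the elimination step can be reproduced mechanically. The sketch below assumes the per-child probabilities stated above (girl 1/2, Jacob 1/100, any other boy 49/100); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

FAMILIES = 10_000
# Per-child probabilities taken from the stated assumptions.
CHILD = {
    "Girl": Fraction(1, 2),
    "Jacob": Fraction(1, 100),
    "Boy not Jacob": Fraction(49, 100),
}

# Expected number of families for each (older, younger) pair.
table = {
    (older, younger): int(FAMILIES * CHILD[older] * CHILD[younger])
    for older, younger in product(CHILD, repeat=2)
}

# Keep only families with a child named Jacob, then those that also have a girl.
with_jacob = {pair: n for pair, n in table.items() if "Jacob" in pair}
with_girl = {pair: n for pair, n in with_jacob.items() if "Girl" in pair}

total = sum(with_jacob.values())  # 199
girls = sum(with_girl.values())   # 100
print(Fraction(girls, total))     # 100/199
```

Membership is tested on the tuple of labels, so "Boy not Jacob" does not count as "Jacob".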

So if the probability of a boy being named Jacob is 1 in 50, then the probability that the family has a girl is 100/199, or roughly 50%. But this value will change depending on the popularity of the name. At the extreme, if all boys were given the same name, then being named Jacob would provide no more information than being a boy, and thus the probability would still be 2/3 that the family has a girl. As the likelihood of the name decreases, the likelihood of the two-Jacob case also decreases, and the probability of the family having a girl approaches the limit of 50%.
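
The dependence on the popularity of the name can be written in closed form. In the sketch below, q is a symbol introduced here (not in the original) for the probability that a boy is named Jacob, so that each child independently is a boy named Jacob with probability q/2; the independence assumptions are those stated above.

```latex
\begin{align*}
P(\text{at least one Jacob}) &= 1 - \left(1 - \tfrac{q}{2}\right)^{2} = q - \tfrac{q^{2}}{4}, \\
P(\text{one girl and one Jacob}) &= 2 \cdot \tfrac{1}{2} \cdot \tfrac{q}{2} = \tfrac{q}{2}, \\
P(\text{girl} \mid \text{at least one Jacob}) &= \frac{q/2}{\,q - q^{2}/4\,} = \frac{2}{4 - q}.
\end{align*}
```

For q = 1/50 this gives 2/(4 - 1/50) = 100/199, for q = 1 (every boy is named Jacob) it gives 2/3, and as q tends to 0 it tends to 1/2, matching the three cases described above.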

If we further assume that parents never name two children with the same name, we can eliminate {Jacob, Jacob}, leaving 198 possible events; thus it would appear that the probability of the family having a girl is 100/198, or 50/99. However, there are now 50 occurrences each of {Jacob, Boy not Jacob} and {Boy not Jacob, Jacob} making the probability of a girl 100/200, or exactly 1/2.

Conclusion

Many people coming across this paradox for the first time will agree with the answer to the first question, but some may be confused by the answer to the second question.

Two ways of explaining the error are as follows:

  1. The second question does not assume anything about the age of the boy. He might be the older or he might be the younger sibling. Therefore the thought that there are only three possibilities (2 boys {BB}, 2 girls {GG}, or a mix) does not take into account that the last of these three is twice as likely as either of the first two, because it can be either {GB} or {BG}.
  2. The chance that there are two boys is 1/4, the same as the chance that there are two girls. The chance that there is one boy and one girl (or one girl and one boy) consumes the remainder (1/2), therefore two boys are half as likely as a mixture.

Mistakes

A look at why some "explanations" are flawed can be very instructive.

For example, to answer the second question someone may make this list of possibilities:

  1. The boy has an elder brother
  2. The boy has a younger brother
  3. The boy has an elder sister
  4. The boy has a younger sister

Apparently only the latter two are the ones sought, giving a total probability of 1/2. The error here is that the first two statements are counted twice. If there are two boys, we have no referent for "the boy". Therefore the first two possibilities should read:

  1. A boy has an elder brother
  2. A boy has a younger brother

But now it is clear that these two statements are equivalent – both effectively state that there are two boys – and therefore one should be removed.

Incomplete problem statements

The problem is often posed in a way that leaves other interpretations open.

Example 1

Two old classmates, Mary and Brian, meet in the street, not having seen each other since they left school.

Mary asks Brian: "Have you got any children?"
Brian answers: "Yes, I've got two."
Mary: "Do you have a boy?"
Brian: "Yes, I do!"

Here, for some reason, the conversation is cut short.

Formally, this corresponds to the second version, as Brian has told Mary only that at least one child is a boy. Accordingly, the probability that Brian has a girl should be 2/3. However, in real conversation, if Brian had two boys, he would be more likely to answer, e.g., "Yes, they are both boys" (Grice's maxim of quantity).[citation needed] The fact that he does not answer like that could reasonably be taken by Mary as a clue that raises her posterior probability of one child being a girl above 2/3. This highlights the need for precision when stating such problems in probability.
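
The effect of Brian's choice of words can be illustrated with a small Bayesian update. This is only a sketch: the parameter r (the probability that a father of two boys would volunteer "they are both boys") is hypothetical, as is the assumption that a father with a boy and a girl always answers just "Yes, I do!".

```python
from fractions import Fraction

def posterior_girl(r):
    """P(Brian has a girl | he answers only "Yes, I do!").

    r is a hypothetical parameter: the probability that a father of two
    boys would instead say "they are both boys".
    """
    r = Fraction(r)
    prior_two_boys = Fraction(1, 4)
    prior_mixed = Fraction(1, 2)
    # Assumption: a father with one boy and one girl always says "Yes, I do!".
    likelihood_mixed = 1
    likelihood_two_boys = 1 - r
    evidence = (prior_mixed * likelihood_mixed
                + prior_two_boys * likelihood_two_boys)
    return prior_mixed * likelihood_mixed / evidence

print(posterior_girl(0))               # 2/3
print(posterior_girl(Fraction(1, 2)))  # 4/5
```

With r = 0 the formal answer of 2/3 is recovered; any r above 0 pushes the posterior higher, which is the point made in the paragraph above.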


References

  1. ^ Howard E. Reinhardt, Don O. Loftsgaarden (1979). "Using simulation to resolve probability paradoxes". International Journal of Mathematical Education in Science and Technology. 10. doi:10.1080/0020739790100212.
  2. ^ "The Boy or Girl Paradox". BBC.
  3. ^ a b "Finishing The Game". Jeff Atwood. Retrieved 15 February 2009.
  4. ^ "Probability Paradoxes". Sho Fukamachi. Retrieved 15 February 2009.
  5. ^ Debra Ingram. "Mathematical Paradoxes". www.csm.astate.edu/~dingram/MAA/Paradoxes.RPSmith.ppt. Retrieved 15 February 2009.
  6. ^ a b Leonard Mlodinow (2008). The Drunkard's Walk. Pantheon. ISBN 0375424040.