Proving the Obvious and Understanding the Not-So-Obvious
Continuing my exploration of the National Survey on Drug Use and Health, I thought I should calculate some simple conditional frequency statistics. The graph below strikes me as a very good example of how conditional probabilities play out in the real world. It shows how the right piece of information can radically improve your ability to guess the answer to another question.
To quantify the pattern that you can see in the chart: only 4% of those who’ve tried cocaine have never tried cigarettes. In contrast, 49% of those who’ve never tried cocaine have tried cigarettes. In general, people are unlikely to try cocaine, but those who do are almost certain to have tried cigarettes as well. In other words, cocaine use tells you a lot about cigarette use, but cigarette use tells you far less about cocaine use, since roughly half of the people who’ve never tried cocaine have tried cigarettes too. If you meet someone who’s tried cocaine and assume that they’ve also tried cigarettes, these statistics suggest that your assumption will be wrong less than 5% of the time.
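For readers who want to see the asymmetry as arithmetic, here is a minimal Python sketch that runs the two quoted percentages through Bayes' rule. The post does not give the overall share of respondents who have tried cocaine, so `ASSUMED_P_COCAINE` is a hypothetical placeholder chosen purely for illustration, not a survey figure. The function names are likewise my own.

```python
# Conditional frequencies quoted in the post.
P_CIG_GIVEN_COCAINE = 0.96     # 1 - 0.04: tried cocaine and also tried cigarettes
P_CIG_GIVEN_NO_COCAINE = 0.49  # never tried cocaine but tried cigarettes

# HYPOTHETICAL base rate of having tried cocaine. This number is an
# assumption for illustration only; it does not come from the survey.
ASSUMED_P_COCAINE = 0.10


def p_cocaine_given_cigarettes(p_cocaine):
    """Bayes' rule: P(tried cocaine | tried cigarettes)."""
    both = P_CIG_GIVEN_COCAINE * p_cocaine
    cigarettes_only = P_CIG_GIVEN_NO_COCAINE * (1.0 - p_cocaine)
    return both / (both + cigarettes_only)


def p_cocaine_given_no_cigarettes(p_cocaine):
    """Bayes' rule: P(tried cocaine | never tried cigarettes)."""
    cocaine_only = (1.0 - P_CIG_GIVEN_COCAINE) * p_cocaine
    neither = (1.0 - P_CIG_GIVEN_NO_COCAINE) * (1.0 - p_cocaine)
    return cocaine_only / (cocaine_only + neither)


if __name__ == "__main__":
    print(f"P(cigarettes | cocaine)    = {P_CIG_GIVEN_COCAINE:.2f}")
    print(f"P(cocaine | cigarettes)    = "
          f"{p_cocaine_given_cigarettes(ASSUMED_P_COCAINE):.2f}")
    print(f"P(cocaine | no cigarettes) = "
          f"{p_cocaine_given_no_cigarettes(ASSUMED_P_COCAINE):.2f}")
```

Under the assumed 10% base rate, the probability in the reverse direction comes out far below 96%, which is the asymmetry described above. Swapping in a different base rate changes the exact numbers, so treat the output as a sketch rather than a result from the survey.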
[Post-Publication Edit]: Fixed an error in the second paragraph regarding the 49% figure.