The D.I.Y. way of getting a probability estimate from your doctor

One frustrating thing about dealing with doctors is that they tend to be unwilling or unable to talk about probabilities. I run into this problem in particular when they’ve told me there is “a chance” of something, like a chance of a complication of a procedure, or a chance of transmitting an infection, or a chance of an illness lasting past some time threshold, and so on. Whenever I’ve pressed them to try to tell me approximately how much of a chance there is, they’ve told me something to the effect of, “It varies” or “I can’t say.” I sometimes tell them, look, I know you’re not going to have exact numbers for me, but I just want to know if we’re talking more like 50% or, you know, 1%? Still, they balk.

My interpretation is that this happens due to a combination of (1) people not having a good intuitive sense of how to estimate probabilities and (2) doctors not wanting to be held liable for making me a “promise” – perhaps they’re concerned that if they give me a low estimate and it happens anyway, then I’ll get angry or sue them or something.

So I wanted to share a useful tip from my friend, the mathematician who blogs at www.askamathematician.com, who was about to have his wisdom teeth removed and was trying unsuccessfully to get his surgeon to tell him the approximate risks of various possible complications from surgery. He discovered that you can actually get a percentage out of your doctor if you’re willing to just construct it yourself:

Friend: “I’ve heard that it’s possible to end up with permanent numbness in your mouth or lip after this surgery… what’s the chance of that happening?”

Surgeon: “It’s pretty low.”

Friend: “About how low? Are we talking, like five percent? Or only a fraction of one percent?”

Surgeon: “I really can’t say.”

Friend: “Okay, well… how many of these surgeries have you done?”

Surgeon: “About four thousand.”

Friend: “How many of your patients have had permanent numbness?”

Surgeon: “Two.”

Friend: “Ah, okay. So, about one twentieth of one percent.”

Surgeon: “I really can’t give you a percentage.”
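For what it’s worth, the friend’s mental math checks out. Here’s a minimal sketch of the same back-of-the-envelope calculation in Python, using only the two numbers the surgeon actually gave:

```python
# Back-of-the-envelope check using only the two numbers from the conversation
# above: 2 cases of permanent numbness out of roughly 4,000 surgeries.

def rough_rate_percent(events: int, total: int) -> float:
    """Return events/total expressed as a percentage."""
    return 100.0 * events / total

rate = rough_rate_percent(events=2, total=4000)
print(f"Roughly {rate:.2f}%, i.e. about 1 in {4000 // 2:,}")
# Roughly 0.05%, i.e. about 1 in 2,000 -- one twentieth of one percent.
```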

Visualizing data with lines, blocks, and roller coasters

Randall Munroe’s infographic on radiation dose levels

I’m a huge fan of clever ways of visualizing data, especially when there’s something challenging about the data in question. For example, if it contains more than three important dimensions and therefore can’t be easily graphed with the typical representations (e.g., position on x-axis, position on y-axis, color of dot). Or if it contains a few huge outliers which distort the scale of the data.

This recent infographic in Scientific American by my friend (and co-blogger at Rationally Speaking) Lena Groeger is a great example of the latter. The challenge in displaying relative levels of radioactivity is that there are a few outliers (e.g., Chernobyl) which are so many times higher than the rest of the data that when you try to graph them on the same scale, you end up with the outlier at one end and all the rest of the data clumped together in an indistinguishable mass at the other end.
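To see the clumping problem concretely, here’s a small illustrative sketch. The two big figures are the ones mentioned below (the 100,000-microsievert mark and Chernobyl’s 6 million); the smaller doses are rough, made-up stand-ins rather than values taken from the infographic:

```python
# Illustrative only: why a single linear scale fails when the data contains a
# huge outlier. Doses are in microsieverts; the two small values are rough,
# made-up stand-ins, not figures taken from the infographic.
doses = {
    "dental x-ray (rough stand-in)": 5,
    "chest CT scan (rough stand-in)": 7_000,
    "100,000 uSv mark": 100_000,
    "Chernobyl area (6 million uSv)": 6_000_000,
}

width = 60  # characters available for a text "axis"
top = max(doses.values())

for label, dose in doses.items():
    bar = "#" * round(width * dose / top)  # position on one linear scale
    print(f"{label:<32} |{bar}")
# Everything except Chernobyl collapses into the first character or so of the axis.
```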

Randall Munroe over at the webcomic XKCD came up with a pretty good, inventive solution that relies on our intuitive sense of area rather than length. Each successive grid represents only one small block of the next grid, which is how he manages to cram the entire skewed scale into one page. It’s cool, but I don’t think it works all that intuitively: you have to consciously keep reminding yourself how big each grid is relative to the next, and it’s easy to lose your grip on the relative scales involved.

However, one of the benefits of online infographics, as opposed to print, is that you don’t have to fit the whole image in view at once. Lena and her colleagues created a long, leisurely scale that has the space at one end to show the differences between various low levels of radiation dose, below 100,000 microsieverts… and then it hits you with a sense of relative magnitude as you have to scroll down, down, down, until you get to Chernobyl at 6 million microsieverts.

It reminded me of one of my all-time favorite data visualizations: over one hundred years of housing prices, transformed into a first-person perspective roller coaster ride. There are a number of wonderful things about this design choice. For one thing, it works on a visceral level: reaching unprecedented heights actually makes you feel giddy, and sudden steep declines are a little scary.

I also love the way it captures the most recent housing bubble — as you keep climbing higher, and higher, and higher, and higher, and higher, the repetitive climb starts to feel relaxing, and you even forget that you’re on a roller coaster. You forget, in other words, that you’re not going to keep going up forever. And that moment at the end, when the coaster pauses and you turn around to look down at how far away the ground is (this video stops right before the 2008 crash) — shiver. Just perfect.

Food, Bias, and Justice: a Case for Statistical Prediction Rules

We’re remarkably bad at making good decisions. Even when we know what goal we’re pursuing, we make mistakes predicting which actions will achieve it. Are there strategies we can use to make better policy decisions? Yes – we can gain insight by looking at cognitive science.

On the surface, all we need to do is experience the world and figure out what does and doesn’t work for achieving our goals (the focus of instrumental rationality). That’s why we tend to respect expert opinion: experts have far more experience with an issue and have considered and evaluated different approaches.

Let’s take the example of deciding whether or not to grant prisoners parole. If the goal is to reduce repeat offenses, we tend to trust a panel of expert judges who evaluate the case and use their subjective opinion. They’ll do a good job, or at least as good a job as anyone else, right? Well… that’s the problem: everyone does a pretty bad job. Quite frankly, even experts’ decision-making is influenced by factors that are unrelated to the matter at hand. Ed Yong calls attention to a fascinating study which finds that a prisoner’s chance of being granted parole is strongly influenced by when their case is heard in relation to the judges’ snack breaks:

The graph is dramatic. It shows that the odds that prisoners will be successfully paroled start off fairly high at around 65% and quickly plummet to nothing over a few hours (although, see footnote). After the judges have returned from their breaks, the odds abruptly climb back up to 65%, before resuming their downward slide. A prisoner’s fate could hinge upon the point in the day when their case is heard.

Curse our fleshy bodies and their need for “Food” and “breaks”! It’s obviously a problem that human judgment is influenced by irrelevant, quasi-random factors. How can we counteract those effects?

Statistical Prediction Rules do better

Fortunately, we have science and statistics to help. We can objectively record evidential cues, look at the resulting target property, and find correlations. Over time, we can build an objective model, with our meat-brain limitations out of the way.

This was the advice of Bishop and Trout in “Epistemology and the Psychology of Human Judgment”, an excellent book recommended by Luke Muehlhauser of Common Sense Atheism (and a frequent contributor to Less Wrong).

Bishop and Trout argued that we should use such Statistical Prediction Rules (SPRs) far more often than we do. Not only are they faster, but it turns out they’re also more trustworthy: using the same amount of information (or often less), a simple mathematical model consistently outperforms expert opinion.
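What counts as a “simple mathematical model” here? Typically something like a weighted sum of a few cues compared against a threshold. Here’s a purely hypothetical sketch, where the cue names, weights, and threshold are invented for illustration rather than taken from Bishop and Trout or any real parole SPR:

```python
# Hypothetical sketch of a Statistical Prediction Rule (SPR): a weighted sum of
# a few objective cues, compared against a threshold. The cue names, weights,
# and threshold are made up for illustration; a real SPR would be fit to
# historical outcome data.

CUE_WEIGHTS = {
    "prior_offenses": 0.9,      # more priors -> higher predicted risk
    "age_at_release": -0.05,    # older at release -> lower predicted risk
    "completed_program": -1.2,  # completed a rehab program -> lower predicted risk
}
THRESHOLD = 1.0  # scores above this get flagged as higher risk

def spr_score(case: dict) -> float:
    """Linear SPR: multiply each cue value by its weight and sum."""
    return sum(CUE_WEIGHTS[cue] * value for cue, value in case.items())

def flags_as_higher_risk(case: dict) -> bool:
    return spr_score(case) > THRESHOLD

example = {"prior_offenses": 3, "age_at_release": 42, "completed_program": 1}
print(spr_score(example), flags_as_higher_risk(example))
# 0.9*3 - 0.05*42 - 1.2*1 is about -0.6, below the threshold, so not flagged.
```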

They point out that when Grove and Meehl surveyed 136 different studies comparing an SPR to expert opinion, they found that “64 clearly favored the SPR, 64 showed approximately equivalent accuracy, and 8 clearly favored the clinician.” The target properties the studies were predicting ranged from medical diagnoses to academic performance to – yup – parole violation and violence.

So based on a handful of cues, a Statistical Prediction Rule would probably give a better prediction than the judges about whether a prisoner will break parole or commit a crime. And it would do so very quickly – just by plugging the numbers into an equation! So all we need to do is show the judges the SPRs, and they’ll save time and do a better job, right? Well, not so much.

“More like OKStupid, amirite?”

That was the subject line of an email my friend James sent me yesterday. His email contained a link to this post on OKCupid’s blog, where the OKC team sifts through their massive amounts of data to find interesting facts about people’s dating habits.

This latest post is called “The Mathematics of Beauty” and it purports to reveal a startling finding: women whose looks inspire a lot of disagreement among men (i.e., with some men rating them hot and others rating them ugly) get more messages. And the number of messages you receive is positively correlated with the number of men rating you a “5 out of 5,” but negatively correlated with the number of men rating you a “4 out of 5.” OKCupid says, “This is a pretty crazy result, but every time we ran the numbers—changing the constraints, trying different data samples, and so on—it came back to stare us in the face.”

To explain these odd results, the OKCupid bloggers came up with two game-theoretic stories. First, men who see a woman and think “She’s a 4” will also think “That’s cute enough for plenty of other men to be into her, so I’ll have lots of competition… but that’s not hot enough for it to be worth it for me to try anyway.” And second, if men think, “She’s really hot to me, but I bet other men will disagree,” they’ll be more likely to message her, because they expect less competition. So a woman with a polarizing look will turn off some men, but the men who are turned on will be even more likely to message her, knowing that other men are turned off.

Based on these stories, OKCupid offers the following advice to its female users who want to get more messages from men:

“We now have mathematical evidence that minimizing your “flaws” is the opposite of what you should do. If you’re a little chubby, play it up. If you have a big nose, play it up. If you have a weird snaggletooth, play it up: statistically, the guys who don’t like it can only help you, and the ones who do like it will be all the more excited.”

Oh my. That sounds like really bad advice. Before people start enthusiastically pointing the camera at their fat rolls, maybe we should check and make sure this analysis is sound. Because my opinion is that OKCupid’s crazy results can easily be explained by much less counterintuitive stories than the ones they concoct.

First of all, the “attractiveness” ratings they’re using aren’t really attractiveness ratings. They come from a feature on the site called Quickmatch, which presents you with the profile pictures of a succession of people for you to rate from 1 to 5. But you’re free to click through to each person’s full profile. And if you like the way they present themselves through the written part of the profile, you might well rate them highly on Quickmatch; conversely, if you don’t like their written profiles, you might well rate them poorly. Treating those scores as pure “attractiveness” ratings is way off the mark.

Second of all, the way Quickmatch works is that if you rate someone a 4 or 5 and they similarly rate you a 4 or 5, then you both receive emails informing you of each other’s interest. So this data is even more tainted, because people are not simply thinking “How attractive is this person?” — they’re thinking “Do I want this person to contact me?” If you think someone’s not that attractive but you’d still want to date her, you might well rate her a 4 just in case she’s also interested in you.

In fact, I strongly suspect there are a lot of guys who just rate every single girl a 4 or 5, giving 5’s to the girls they think are good-looking and 4’s to everyone else. It’s a carpet-bombing strategy — why rule anyone out right off the bat? (My suspicion is grounded in some results from a speed-dating study I worked on in college with a psychology professor at Columbia; I got to look at the rating sheets after each speed-dating session, and there were plenty of guys who just circled the entire row of “YES” rather than circling YES or NO for each girl individually.)

And as you can imagine, if a lot of guys are using “4” to mean “anyone who’s not a 5,” then of course 4’s are going to be negatively correlated with the number of messages a girl gets, because many or most of those 4’s actually indicate 1’s, 2’s, and 3’s.
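One way to check that this story holds up, at least in principle, is to simulate it. The following is a toy model with completely made-up parameters (how looks translate into messages, how often raters carpet-bomb); it isn’t OKCupid’s data, it just shows that a negative correlation between 4’s and messages falls out naturally once some raters use “4” to mean “anyone who isn’t a 5”:

```python
# Toy simulation: if many raters use "4" to mean "anyone who isn't a 5", the
# count of 4's can correlate negatively with messages even though messages are
# driven only by genuine attractiveness. All parameters here are invented.
import random

random.seed(0)

def simulate_profile(n_raters=200, carpet_bomb_rate=0.5):
    looks = random.uniform(1, 5)  # hidden "true attractiveness"
    fours = fives = 0
    for _ in range(n_raters):
        honest = min(5, max(1, round(random.gauss(looks, 1))))
        if honest == 5:
            fives += 1
        elif random.random() < carpet_bomb_rate:
            fours += 1  # carpet-bomber: 4 = "not a 5, but why rule her out?"
        elif honest == 4:
            fours += 1  # honest rater who really means 4
    messages = max(0.0, random.gauss(10 * looks, 5))  # messages track looks only
    return fours, fives, messages

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

fours, fives, messages = zip(*(simulate_profile() for _ in range(2000)))
print(f"corr(# of 5's, messages) = {pearson(fives, messages):+.2f}")  # positive
print(f"corr(# of 4's, messages) = {pearson(fours, messages):+.2f}")  # negative
```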

What I think the OKCupid blog post illustrates is how easy it is to come up with a story to explain any result, whether or not the result is real. To paraphrase my friend James for a minute: if you find yourself saying “I know this is crazy, but numbers don’t lie,” you should really calm down and check to see if you’ve made a mistake, because chances are, you have.
