Assumptions behind a curtain

This is basically an overly long response to a recent blog post by Scott Alexander. It’s not very interesting outside of that context, so read that first unless you did so already. Also, most of this is further simplification of Cosma Shalizi’s ancient and semi-famous blog posts on IQ, so if you understood those well you probably won’t find much new here.

Let me start off with a factor analysis that happened long before we knew the math of factor analysis: female sexual attractiveness.

One could build a very IQ-like measure of that by having a few random men rank a sample of women and then taking average percentiles.

Then one could do a lot of correlational studies to find things correlated with beauty and assign them to input and output categories, drawing a flow-chart from inputs through attractiveness to outputs.

On the input side of the flow-chart one might have things like facial symmetry, make-up, breast size, skin clarity, waist-to-hip ratio, hair length and shininess, etc.

On the output side one would have things like the cost of favours men are likely to do for a woman, popularity, other women getting angry when their boyfriends look at the woman, etc.

Also on that side one would have things like the probability of her getting pregnant per sex event and the probability of the baby being born healthy if she gets pregnant.

In reality though only some of those arrows should flow through the central box. Some of the input factors are directly correlated with some of the output factors, namely those that contribute to prospects of offspring. Those arrows aren’t actually going through a real central box, it’s just that the outputs are relevant for basically the same reason.

On the other hand, some arrows here actually flow through a central box one might label “the brain stem’s estimate of biological procreation prospects” or female sexual attractiveness in the strict sense. Things correlating with the outputs not actually going through the central box still tend to correlate with it, because the brain stem is somewhat good at estimating such prospects.


Let’s look at how evolution designed the central box. (I’ll talk of evolution like an agent here. I know that’s not how it works in reality, but everyone talks that way for good reason.)

It had a lot of facts available, like “broad hips make it easier for the baby to get out alive”, “visible diseases are sometimes inherited by the baby”, etc. It also knew that some of these facts were more important than others. Using all those rules directly would be computationally inefficient though, and evolution didn’t want to waste too many resources on a large look-up table. So basically it created a weighted sum of many known female physiological influences on procreation and then tinkered with the weights until predictions with that sum became sufficiently similar to predictions made with the original data. “Sufficiently” in that context means that the better results one would get from the real calculation are not worth the cost of doing that calculation.

Basically this is an efficiency hack for reasoning with the facts evolution had available.
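To make the "weighted sum with tinkered weights" idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the four traits, the `TRUE_WEIGHTS`, the interaction term standing in for the expensive "real calculation", and the random hill-climbing used as the tinkering. It is not a claim about how evolution actually did it.

```python
import random

random.seed(0)

N_TRAITS = 4
# Hypothetical "true" importance of each trait (an assumption for this sketch).
TRUE_WEIGHTS = [0.5, 0.3, 0.15, 0.05]

def true_prospects(traits):
    # Stand-in for the expensive real calculation: a weighted effect plus
    # an interaction term that a plain weighted sum cannot represent.
    base = sum(w * t for w, t in zip(TRUE_WEIGHTS, traits))
    return base + 0.1 * traits[0] * traits[1]

def score(weights, traits):
    # The cheap central box: one weighted sum.
    return sum(w * t for w, t in zip(weights, traits))

sample = [[random.gauss(0, 1) for _ in range(N_TRAITS)] for _ in range(500)]
targets = [true_prospects(t) for t in sample]

def mean_sq_error(weights):
    total = sum((score(weights, t) - y) ** 2 for t, y in zip(sample, targets))
    return total / len(sample)

# Tinkering: try a small random change, keep it only if predictions improve.
weights = [0.0] * N_TRAITS
initial_error = mean_sq_error(weights)
error = initial_error
for _ in range(2000):
    candidate = [w + random.gauss(0, 0.02) for w in weights]
    candidate_error = mean_sq_error(candidate)
    if candidate_error < error:
        weights, error = candidate, candidate_error

print("weights:", [round(w, 2) for w in weights])
print("error before and after:", round(initial_error, 3), round(error, 3))
```

The remaining error never reaches zero, because the interaction term is thrown away by construction. That leftover is the price of the efficiency hack.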


Let’s also look at where the central box works and where it doesn’t.

Nowadays there are a lot of ways to confuse the input variables, like make-up, chemical hair-shinyfication, breast-implants etc. They do change the decisions the brain stem makes through the central box, like male favor cost, girlfriend look-triggered angriness etc. But they don’t change any of the outputs correlated with the original inputs of the central box, which it was supposed to optimize. In other words, from a designer’s perspective these are things the box is bad at.

On the other side, modern medicine has also changed the consequences of some of the traits the box is adding up. For example, the correlation between sexual attractiveness and probability of conception is probably smaller than it used to be in the ancestral environment, because nowadays people having sex might be contracepting. Also there are now caesarians, so the correlation between obstacles and the baby not getting out alive probably went down. In other words, from the designer’s perspective even the original inputs are now probably weighted wrongly.

Still, the central box works reasonably well where nothing has changed.


Now compare the situations that evolution considered in designing the central box with the situations it does a good job on. As you might have noticed, they are identical.

This is not a coincidence. Remember that the box was created by taking all the correlations available at the time and then throwing some of the data away until it was boiled down to a single score. In other words it doesn’t contain any information that didn’t already go into constructing it and not even all of that. So it does a reasonably good job on the kind of correlations it was built to simplify, but there is no reason why it should work on correlations that came up later and so it doesn’t.

So even though make-up correlates positively with sexual attractiveness, and sexual attractiveness correlates negatively with miscarriage, even the dumbest conceivable doctor wouldn’t prescribe more make-up to prevent miscarriage. The make-up-attractiveness correlation is not among the ones the box summarizes, and reasoning through the box and unsummarized correlations is not a valid argument.
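The make-up argument can be checked on made-up numbers. The sketch below assumes a deliberately crude model (all of it invented for illustration): attractiveness is health plus make-up, miscarriage risk depends on health only, and make-up is independent of health.

```python
import random

random.seed(1)

def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

n = 20000
health = [random.gauss(0, 1) for _ in range(n)]
makeup = [random.gauss(0, 1) for _ in range(n)]  # independent of health
# The box adds up whatever it sees, real signal and make-up alike.
attractiveness = [h + m for h, m in zip(health, makeup)]
# Risk is driven by health (plus noise), never by make-up.
miscarriage_risk = [-h + random.gauss(0, 1) for h in health]

r_makeup_attr = pearson(makeup, attractiveness)
r_attr_risk = pearson(attractiveness, miscarriage_risk)
r_makeup_risk = pearson(makeup, miscarriage_risk)

print("make-up vs attractiveness:", round(r_makeup_attr, 2))
print("attractiveness vs risk:   ", round(r_attr_risk, 2))
print("make-up vs risk:          ", round(r_makeup_risk, 2))
```

The first correlation comes out clearly positive, the second clearly negative, and the third hovers around zero. Chaining the first two through the box would predict an effect that isn't there.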


Compare this to one of Scott’s examples, blood pressure. I’ll come back to his points about blood pressure not working that well, but first let me talk about why it works when it does.

My oversimplified layman’s understanding is that we basically have a good idea of how blood pressure works. If it is high, blood will press against blood vessels more and sometimes that will make them break. This is a bad thing, particularly if it happens in the brain. On the other hand, more pressure makes the blood go faster, which means the cells get more oxygen per time. This is why people with a blood pressure of zero (Or maybe equal to ambient pressure, don’t want to figure the details out right now) tend to go brain-dead a few minutes later.

On the other side, we also know how it is influenced. Things changing blood pressure either change how hard the heart presses or how big the blood vessels are, which in turn changes how hard they press back or maybe how much blood there is in total, which determines how hard the body must push to keep it in.

So, while we only care about changing blood pressure because of its effects, we actually know these effects are mediated through a real thing called blood pressure. Eating too much salt will give you seizures by increasing blood pressure and not, say, by chemically corroding the blood vessels. Likewise, a heart attack will kill your brain by reducing blood pressure to zero, not by, say, just phoning the brain and telling it the apocalypse is here so it might as well go home now.

Blood pressure being real in that way means we can validly make arguments on newly found correlations between blood pressure and other things. So if we find a new drug increasing blood pressure and if that drug doesn’t have any other direct effects on the body (to be honest that latter one is the mother of all ifs, but I’m arguing principles here), then yes, that drug will cause the kind of things higher blood pressure causes.

On to the ways it doesn’t work. As Scott explains, pressure is different in different parts of the body. We actually care about blood pressure in the body parts where we care about effects but measure it somewhere else and that’s close enough except when it isn’t. So the explanation I gave above isn’t quite right, which in turn means it doesn’t work perfectly. Note however, that the explanation’s success is due to it being like actual reality and its failure is due to it not being like reality. Also, Scott notes measurement methods suck. Fine, but again measuring blood pressure works because the measurement is close enough to the actual physical quantity and fails because it isn’t.

Flow chart wise, we actually know the picture is similar to reality. In particular, the box in the middle corresponds to something in reality, there is only one such thing and the arrows are rightly drawn in going through that one box. They shouldn’t really go directly to the boxes on the right side or maybe through fifteen boxes we omitted that also have arrows between each other, some of which go backwards. Again, this isn’t perfectly true, but blood pressure is a good concept because and in so far as it is close enough and we have a justified expectation of it being close enough for newly discovered correlations.

In cool stats lingo, blood pressure is a causal node. Female attractiveness is a causal node only for the things evolution conditioned on it, but not for the things evolution was trying to achieve.

What evolution did with sexual attractiveness can be done with math for things we care about. The method doing that is called factor analysis. Sometimes we may be lucky and discover an actual causal node that way. But the way the math works, we almost always can build a box from a large bunch of correlations, even if no causal node is out there. Such a box is called a factor. Sometimes it represents something real. Sometimes it doesn’t.
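Here is a small sketch of how a box can be built even when no causal node exists, in the spirit of Thomson’s sampling model. All numbers are made up: 200 mutually independent abilities, 8 tests that each draw on a random half of them. As a crude stand-in for a proper factor-analysis fit, it just takes the first principal component of the correlation matrix (found by power iteration), so treat it as an illustration of the effect, not of the exact method.

```python
import random

random.seed(3)

N_PEOPLE = 1000
N_ABILITIES = 200   # independent abilities; no common cause anywhere
N_TESTS = 8
PER_TEST = 100      # each test draws on a random half of the abilities

people = [[random.gauss(0, 1) for _ in range(N_ABILITIES)]
          for _ in range(N_PEOPLE)]
tests = [random.sample(range(N_ABILITIES), PER_TEST) for _ in range(N_TESTS)]
# scores[t][p] is person p's result on test t.
scores = [[sum(person[a] for a in test) for person in people]
          for test in tests]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

corr = [[pearson(scores[i], scores[j]) for j in range(N_TESTS)]
        for i in range(N_TESTS)]
off_diagonal = [corr[i][j] for i in range(N_TESTS)
                for j in range(N_TESTS) if i != j]

# Power iteration for the largest eigenvalue of the correlation matrix.
vec = [1.0] * N_TESTS
for _ in range(200):
    new = [sum(corr[i][j] * vec[j] for j in range(N_TESTS))
           for i in range(N_TESTS)]
    norm = sum(x * x for x in new) ** 0.5
    vec = [x / norm for x in new]
top_eigenvalue = sum(vec[i] * corr[i][j] * vec[j]
                     for i in range(N_TESTS) for j in range(N_TESTS))
share = top_eigenvalue / N_TESTS  # variance share of the single "factor"

print("smallest correlation between two tests:", round(min(off_diagonal), 2))
print("variance share of the first component: ", round(share, 2))
```

All tests correlate positively and one component carries a large share of the variance, yet by construction nothing in this model corresponds to a single underlying cause. The overlap between the tests does all the work.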

Spearman did this for the various parts of IQ-tests and came up with a box he called the g-factor. (For the pedantic, he used a now-obsolete predecessor of factor analysis, because factor analysis proper hadn’t been invented yet.) This was particularly cool, because at the time he had good evidence for the g-factor being a causal node. That evidence turned out to be a fluke though.

So now we don’t have good proof of the g-factor actually being causal. Some people think it is though, and that matters because some arguments will be valid if it is but not if it isn’t. So they can think they proved things they actually only assumed in the disguise of assuming g to be causal.

At this point I’ll make a slight digression on heritability.

Scott thinks IQ-sceptics are trying to avoid thinking about claims like “Intelligence is at least 50% heritable”.

I’m actually fine with that claim, as long as we stick to its technical meaning.

Problem is, the word “heritable” sounds like it should mean but actually doesn’t mean “unchangeable short of bioengineering”.

To illustrate, let me make up a toy model where the two things differ. I’m not claiming that’s how it works, just showing one easy example of how they could differ. So in my fake model intelligence consists of 10000 binary abilities, each of which you either have or don’t have. Some of these abilities depend on each other, so you can’t learn the advanced ones before the primitive ones. All are learnable. The genetic part is that for each ability you have a genetically determined teaching time necessary to master it. For some abilities some people need less learning time than others, but given enough time everybody can learn every ability. If you get enough teaching time for a given ability you learn it, if not not.

Actual teaching time isn’t strongly enough dependent on needed teaching time, so ceteris paribus, people who would need more time on fundamental abilities (though perhaps less on advanced ones, for an equal total travel time to the ceiling) tend to learn fewer abilities. Thus intelligence is strongly heritable in our present environment.

As schooling improves we discover abilities many people didn’t learn previously and spend more time on them. Thus the Flynn effect.

Eventually we will figure out all the critical abilities and human differences in IQ will vanish without any bioengineering.
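For the curious, here is a stripped-down simulation of that toy model. It is a sketch under loud simplifications that are mine, not part of the story above: only 200 abilities instead of 10000, no prerequisites between abilities, needed teaching times drawn uniformly, and a bit of random noise in the teaching each person actually receives. Identical twins share their needed times and differ only in the noise.

```python
import random

random.seed(2)

N_ABILITIES = 200     # scaled down from the 10000 in the text
N_TWIN_PAIRS = 500
MAX_NEEDED = 2.0      # hypothetical upper bound on needed teaching time

def genome():
    # Genetically determined teaching time needed for each ability.
    return [random.uniform(0, MAX_NEEDED) for _ in range(N_ABILITIES)]

def score(needed, teaching_time):
    # Count abilities for which the (noisy) teaching received was enough.
    learned = 0
    for need in needed:
        received = teaching_time + random.gauss(0, 0.1)
        if received >= need:
            learned += 1
    return learned

def twin_scores(teaching_time):
    firsts, seconds = [], []
    for _ in range(N_TWIN_PAIRS):
        shared = genome()  # identical twins: same needed times
        firsts.append(score(shared, teaching_time))
        seconds.append(score(shared, teaching_time))
    return firsts, seconds

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Present environment: teaching time is scarce.
scarce_a, scarce_b = twin_scores(teaching_time=1.0)
twin_corr = pearson(scarce_a, scarce_b)

# Improved schooling: everybody gets more than anybody needs.
rich_a, rich_b = twin_scores(teaching_time=3.0)
rich_spread = max(rich_a + rich_b) - min(rich_a + rich_b)

print("twin correlation with scarce teaching:", round(twin_corr, 2))
print("score spread with generous teaching:  ", rich_spread)
```

With scarce teaching the twins’ scores track each other closely, which is what a twin study would read as high heritability. With generous teaching everybody learns everything and the differences are gone, without any change to the genes.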

Again, I’m not claiming this is how it works, mostly because it’s a complex just-so story I just made up. But it is very compatible with all the results we get from twin studies. In fact more so than the story the average “human biodiversity” guy on the intertubes professes as a necessary result of that data. Which is to say, heritability doesn’t prove strict biodetermination.

However, if you assume g is causal and make a second and actually falsified assumption no one really believes to outlaw models like this one and add in the results of twin studies, then you can conclude intelligence differences can’t be much reduced short of bioengineering. Problem is, a lot of people think this follows from twin studies alone. It doesn’t.


Like Scott, I’m concerned there’s a motte and bailey tactic involved here, only I think it’s on the other side of the controversy.

The motte:
When you need to screen people for the kind of properties IQ tests are designed to screen for, and when you don’t have more specific tests to screen for a more specific version of what you’re looking for (for example subject matter tests in college admissions) then IQ tests will do a better job than nothing. So far so good, this is pretty obviously true and actually not all that controversial.

Now for the bailey:
Almost all human societies have fairly hereditary social strata, where people tend to end up with approximately the same amounts of power, prestige, and money their parents also had, and so on until the tenth generation. This is somewhat embarrassing for societies that follow a nominal ideology of giving everyone equal chances. It is particularly embarrassing in America, where the underclass has a distinct skin colour, making it comparatively hard to just ignore the problem.

Here comes the IQ ideologue with a cruel but comforting story dressed up as science: The underclass is so stable because they are irredeemably dumb for genetic reasons. No kind of affirmative action can ever alleviate that genetic stupidity so we had better not even try. See, it’s nature itself that is highly unfair, not, perish the thought, the social structure. So those of us getting fairly high positions on the totem pole won a genetic lottery, not a benefit-from-structural-sin lottery.

There actually is no good evidence for that story. To the extent it’s directly testable it’s wrong. For example, shithole countries tend to have low average IQs and one could argue about what causes what. Except that occasionally countries do emerge from poverty and bad institutions, and when they do, the average achievements of their inhabitants go up in degrees this theory declares impossible in a single country.

But such stories are very attractive for an upper class. Look how Malthus previously Eulered basically the same consequences from then-exotic math (exponential growth), and some people still want to stick with that bullshit.

So if you put in some additional assumptions, assumptions that are so subtle you don’t even need to understand them or know you’re making them, then you can derive this bailey from the actually justifiable motte. And from the unfounded assumptions, but don’t look at that curtain too hard.

7 Responses to Assumptions behind a curtain

  1. Joe says:

    Great post! Keep’em coming.

  2. lmm says:

    Your just-so story is making extra unjustified assumptions though, so we should penalise it under Occam’s razor. There’s no reason to assume that people who take a long time to learn low-level skills should be able to learn high-level skills faster. And without that part I think the IQ realist arguments go through even if you assume the heritable part of IQ is some sort of “teachability factor”.

    > occasionally countries do emerge from poverty and bad institutions, and when they do, the average achievements of their inhabitants go up in degrees this theory declares impossible in a single country.

    This part sounds like a relevant argument. Do expand.

  3. Scott Alexander says:

    I think you’re trying to establish a qualitative dichotomy between things where we know the causal structure and things where we don’t. I think both of them involve only probabilistic knowledge and that in some cases our probabilistic knowledge rises to the level where we can feel somewhat confident.

    Consider a brilliant general, like Napoleon. Suppose Napoleon can consistently win battles even when he has fewer men and worse weapons than his opponents, solely due to his tactical genius. Suppose also we have a good idea why Napoleon is such a genius. Maybe he was the end result of an Ender’s-Game-style eugenics program to breed great generals, and in fact he was the son of the two greatest generals the program directors could find.

    Intelligence seems to serve as a causal node here.

    Napoleon’s intelligence explains why he wins his battles. This gives us non-zero information. Perhaps some other general is stupid but wins because he has more men. Or some third general is stupid and has few men, but wins because he gives very inspirational speeches. If I know Napoleon is intelligent and you do not, I can make better predictions about a variety of things.

    Further, the correct causal graph – EUGENICS -> INTELLIGENCE -> WINS BATTLES is much harder to draw if you forbid yourself from interpreting “intelligence” as a node.

    Now, granted, INTELLIGENCE is a black box. But to a degree everything is a black box. Let me give an example.

    You say that blood pressure causes strokes by bursting small arteries in the brain. But this is only the cause of a small minority of blood-pressure-induced strokes. 90% of strokes are ischaemic (not relating to blood vessel bursting) and these are also strongly linked to blood pressure. The mechanism seems to be “blood pressure damages artery walls, which causes complicated biological effects which encourage clotting.” I don’t know whether “complicated biological effects” is as far down as anyone’s dug, or whether there’s some biochemist somewhere who understands the entire process.

    Suppose in fact the mechanism of blood pressure on clotting is poorly understood. In that case, blood pressure is also, in some sense, a black box. It might be a smaller black box, but it is a black box nonetheless.

    A great example of this is cholesterol. Like blood pressure, cholesterol is widely known to cause cardiovascular disease. If you lower your cholesterol by (for example) eating right and exercising, or by taking statins, you get less cardiovascular disease.

    A while ago, some researchers developed a couple new classes of drugs that were also very good at lowering cholesterol. They tested them on people and they worked great and were approved for sale.

    These drugs lowered people’s cholesterol a lot, but were later found to not affect heart attacks in the slightest. So apparently cholesterol was a very very close proxy for some other variable, such that almost everything that lowers cholesterol also lowers heart attacks, but not if you do it with these particular drugs.

    This seems to very quickly bump up into Humean limits. When things are correlated enough, and they fit our intuitive ideas of causal structure, and they respond to our attempts to manipulate them, we call them causation. Even if every intervention we have used to lower cholesterol thus far has also lowered heart attacks, it may be some undiscovered intervention won’t. There will always be black boxes at certain joints, but we can make them arbitrarily small with careful observation and testing.

    I don’t think the structure:


    is very different from the structure


    except that in the latter we’ve done a little more of our homework and have fleshed out more (though not all) of the intermediate steps.

    I hope neuroscientists flesh out some of the remaining fine structure in the INTELLIGENCE node in the same way I hope vascular biochemists flesh out some of the remaining fine structure in the BLOOD PRESSURE node. But in both cases our current understanding is a best guess (in certain systems we understand very well, it’s a best guess with >0.9999 probability). And I think we understand both systems well enough that our best guesses are enough to make some conjectures on.

    I wish to register disagreement with some of your object level points in part VIII, but for obvious reasons I’m not going to debate it.

    • Alexander Stanislaw says:

      > I think you’re trying to establish a qualitative dichotomy between things where we know the causal structure and things where we don’t. I think both of them involve only probabilistic knowledge and that in some cases our probabilistic knowledge rises to the level where we can feel somewhat confident.

      Why yes, the entire point is that “IQ has statistical predictive value” and “IQ is the cause of several different outcomes” are separate claims. The fact that there is uncertainty in both claims doesn’t change the fact that no amount of arguing for the former can imply the latter. And certainly arguing for either one does not imply that increasing IQ will increase those outcomes. (Sorry if this seems like pandering, but to illustrate why, consider why no amount of bench presses will allow you to beat an average boxer even if you are stronger than them, despite the fact that arm/chest strength explains a great deal of why boxers are good at boxing).

      From my limited experience, I think that models are great in theory and can be tweaked to agree with all of the available data, but lousy when you extend them past their scope. In computational chemistry, a field we supposedly understand, you can make fantastic optimizations, create sensible parameters and make your models agree with empirical measurements, but then when you go to test your model on a new case reality suddenly diverges. And as phenomena get more complex, modeling them can only get more difficult.

      On the blood pressure point, perhaps we could agree on the following? Increasing IQ might increase life outcomes, but in order to show that we would need to measure how much IQ interventions increase those outcomes, then check if the degree of improvement is conditioned on how much it increases IQ. It can’t be shown just by correlating IQ to those interventions and then correlating IQ to those outcomes.

      • Alexander Stanislaw says:

        Regarding the bench press example, I initially had squats, but I thought that was ungenerous to the case for a causal g. Either one works though, one could construct a strength factor and correlate it to success in boxing, but it wouldn’t imply that anything that increases the strength factor is going to impact your boxing ability by much. If boxers have a strength factor that is a standard deviation above normal then increasing your own strength factor by a standard deviation via weightlifting will _not_ make you a good boxer.

      • Gilbert says:

        I agree. Scott is very likely no longer listening, given that this post is more than a year old.

        • Alexander Stanislaw says:

          Doh! I either didn’t notice the year or it didn’t register.
