This is basically an overly long response to a recent blog post by Scott Alexander. It’s not very interesting outside of that context, so read that first unless you did so already. Also, most of this is further simplification of Cosma Shalizi’s ancient and semi-famous blog posts on IQ, so if you understood those well you probably won’t find much new here.
Let me start off with a factor analysis that happened long before we knew the math of factor analysis: female sexual attractiveness.
One could build a very IQ-like measure of that by having a few random men rank a sample of women and then taking average percentiles.
Then one could do a lot of correlational studies to find things correlated with beauty and assign them to input and output categories, drawing a flow-chart from inputs through attractiveness to outputs.
On the input side of the flow-chart one might have things like facial symmetry, make-up, breast size, skin clarity, waist-to-hip ratio, hair length and shininess, etc.
On the output side one would have things like the cost of favours men are likely to do for a woman, popularity, other women getting angry when their boyfriends look at the woman, etc.
Also on that side one would have things like the probability of her getting pregnant per sex event and the probability of the baby being born healthy if she gets pregnant.
In reality though only some of those arrows should flow through the central box. Some of the input factors are directly correlated with some of the output factors, namely those that contribute to prospects of offspring. Those arrows aren’t actually going through a real central box, it’s just that the outputs are relevant for basically the same reason.
On the other hand, some arrows here actually flow through a central box one might label “the brain stem’s estimate of biological procreation prospects” or female sexual attractiveness in the strict sense. Things correlating with the outputs not actually going through the central box still tend to correlate with it, because the brain stem is somewhat good at estimating such prospects.
Let’s look at how evolution designed the central box. (I’ll talk of evolution like an agent here. I know that’s not how it works in reality, but everyone talks that way for good reason.)
It had a lot of facts available, like “broad hips make it easier for the baby to get out alive”, “visible diseases are sometimes inherited by the baby”, etc. It also knew that some of these facts were more important than others. Using all those rules directly would be computationally inefficient though, and evolution didn’t want to waste too many resources on a large look-up table. So basically it created a weighted sum of many known female physiological influences on procreation and then tinkered with the weights until predictions with that sum became sufficiently similar to predictions made with the original data. “Sufficiently” here means that the better results one would get from the real calculation are not worth the cost of doing that calculation.
Basically this is an efficiency hack for reasoning with the facts evolution had available.
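To make the "weighted sum plus tinkering" idea concrete, here is a minimal sketch. Everything in it is invented for illustration: the four traits, the `true_outcome` rule standing in for the expensive "real calculation", and all the numbers. It only shows that blind nudging of weights, keeping whatever fits better, ends up with a cheap score that tracks the expensive rule fairly closely.

```python
import random

random.seed(0)

def true_outcome(traits):
    """Hypothetical stand-in for the 'real calculation', including an interaction
    term and some noise that no plain weighted sum can capture."""
    x0, x1, x2, x3 = traits
    return 2.0 * x0 + 1.0 * x1 + 0.5 * x2 + 0.3 * x0 * x1 + random.gauss(0, 0.5)

# 200 made-up individuals, each described by 4 standardized traits.
samples = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
outcomes = [true_outcome(t) for t in samples]

def mean_squared_error(weights):
    """How far the cheap weighted sum is from the 'real' outcome, on average."""
    total = 0.0
    for traits, y in zip(samples, outcomes):
        score = sum(w * x for w, x in zip(weights, traits))
        total += (score - y) ** 2
    return total / len(samples)

# Tinkering: nudge one weight at random, keep the nudge only if the fit improves.
weights = [0.0, 0.0, 0.0, 0.0]
initial_error = mean_squared_error(weights)
error = initial_error
for _ in range(1500):
    i = random.randrange(4)
    candidate = list(weights)
    candidate[i] += random.gauss(0, 0.1)
    candidate_error = mean_squared_error(candidate)
    if candidate_error < error:
        weights, error = candidate, candidate_error

print("weights:", [round(w, 2) for w in weights])
print("error before:", round(initial_error, 2), "after:", round(error, 2))
```

The leftover error is the part of the rule the single score threw away. The score also knows nothing about traits or situations that were not in `samples` when the weights were tuned.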
Let’s also look at where the central box works and where it doesn’t.
Nowadays there are a lot of ways to confuse the input variables, like make-up, chemical hair-shinyfication, breast implants, etc. They do change the decisions the brain stem makes through the central box, like male favor cost, girlfriend look-triggered angriness, etc. But they don’t change any of the outputs correlated with the original inputs of the central box, which it was supposed to optimize. In other words, from a designer’s perspective these are things the box is bad at.
On the other side, modern medicine has also changed the consequences of some of the traits the box is adding up. For example, the correlation between sexual attractiveness and probability of conception is probably smaller than it used to be in the ancestral environment, because nowadays people having sex might be contracepting. Also there are now caesarians, so the correlation between obstacles and the baby not getting out alive probably went down. In other words, from the designer’s perspective even the original inputs are now probably weighted wrongly.
Still, the central box works reasonably well where nothing has changed.
Now compare the situations that evolution considered in designing the central box with the situations it does a good job on. As you might have noticed, they are identical.
This is not a coincidence. Remember that the box was created by taking all the correlations available at the time and then throwing some of the data away until it was boiled down to a single score. In other words, it doesn’t contain any information that didn’t already go into constructing it, and not even all of that. So it does a reasonably good job on the kind of correlations it was built to simplify, but there is no reason why it should work on correlations that came up later, and so it doesn’t.
So even though make-up correlates positively with sexual attractiveness and sexual attractiveness correlates negatively with miscarriage, even the dumbest conceivable doctor wouldn’t prescribe more make-up to prevent miscarriage. The make-up-attractiveness correlation is not among the ones the box summarizes, and reasoning through the box and unsummarized correlations is not a valid argument.
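The make-up argument can be checked with a small simulation. This is a sketch under made-up assumptions: a hidden `health` variable drives both attractiveness and miscarriage risk, while `makeup` only feeds attractiveness. The variable names and effect sizes are invented, not taken from any data.

```python
import random

random.seed(1)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

n = 5000
health = [random.gauss(0, 1) for _ in range(n)]   # what the box was built to track
makeup = [random.gauss(0, 1) for _ in range(n)]   # a later way to game the inputs

# The score reacts to both; the outcome depends on health only.
attractiveness = [h + m + random.gauss(0, 0.5) for h, m in zip(health, makeup)]
miscarriage_risk = [-h + random.gauss(0, 0.5) for h in health]

r_makeup_attr = pearson(makeup, attractiveness)
r_attr_risk = pearson(attractiveness, miscarriage_risk)
r_makeup_risk = pearson(makeup, miscarriage_risk)

print("make-up vs attractiveness:", round(r_makeup_attr, 2))
print("attractiveness vs risk:   ", round(r_attr_risk, 2))
print("make-up vs risk:          ", round(r_makeup_risk, 2))
```

Under these assumptions the first correlation comes out clearly positive, the second clearly negative, and the third close to zero. Chaining the first two through the score would predict an effect that is not there.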
Compare this to one of Scott’s examples, blood pressure. I’ll come back to his points about blood pressure not working that well, but first let me talk about why it works when it does.
My oversimplified layman’s understanding is that we basically have a good idea of how blood pressure works. If it is high, blood will press against blood vessels more and sometimes that will make them break. This is a bad thing, particularly if it happens in the brain. On the other hand, more pressure makes the blood go faster, which means the cells get more oxygen per unit of time. This is why people with a blood pressure of zero (or maybe equal to ambient pressure, don’t want to figure the details out right now) tend to go brain-dead a few minutes later.
On the other side, we also know how it is influenced. Things changing blood pressure either change how hard the heart presses or how big the blood vessels are, which in turn changes how hard they press back or maybe how much blood there is in total, which determines how hard the body must push to keep it in.
So, while we only care about changing blood pressure because of its effects, we actually know these effects are mediated through a real thing called blood pressure. Eating too much salt will give you seizures by increasing blood pressure and not, say, by chemically corroding the blood vessels. Likewise, a heart attack will kill your brain by reducing blood pressure to zero, not by, say, just phoning the brain and telling it the apocalypse is here so it might as well go home now.
Blood pressure being real in that way means we can validly make arguments on newly found correlations between blood pressure and other things. So if we find a new drug increasing blood pressure and if that drug doesn’t have any other direct effects on the body (to be honest that latter one is the mother of all ifs, but I’m arguing principles here), then yes, that drug will cause the kind of things higher blood pressure causes.
On to the ways it doesn’t work. As Scott explains, pressure is different in different parts of the body. We actually care about blood pressure in the body parts where we care about effects but measure it somewhere else, and that’s close enough except when it isn’t. So the explanation I gave above isn’t quite right, which in turn means it doesn’t work perfectly. Note however, that the explanation’s success is due to it being like actual reality and its failure is due to it not being like reality. Also, Scott notes measurement methods suck. Fine, but again measuring blood pressure works because the measurement is close enough to the actual physical quantity and fails because it isn’t.
Flow chart wise, we actually know the picture is similar to reality. In particular, the box in the middle corresponds to something in reality, there is only one such thing and the arrows are rightly drawn in going through that one box. They shouldn’t really go directly to the boxes on the right side or maybe through fifteen boxes we omitted that also have arrows between each other, some of which go backwards. Again, this isn’t perfectly true, but blood pressure is a good concept because and in so far as it is close enough and we have a justified expectation of it being close enough for newly discovered correlations.
In cool stats lingo, blood pressure is a causal node. Female attractiveness is a causal node only for the things evolution conditioned on it, but not for the things evolution was trying to achieve.
What evolution did with sexual attractiveness can be done with math for things we care about. The method doing that is called factor analysis. Sometimes we may be lucky and discover an actual causal node that way. But the way the math works, we almost always can build a box from a large bunch of correlations, even if no causal node is out there. Such a box is called a factor. Sometimes it represents something real. Sometimes it doesn’t.
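Here is a small sketch of how a strong general factor can show up with no causal node behind it, in the spirit of Godfrey Thomson's sampling model. The setup is an assumption for illustration only: each simulated person has 200 mutually independent abilities, and each of 8 tests adds up a random half of them. The first factor is approximated by the largest eigenvalue of the correlation matrix (principal-component style), not by a full factor-analysis fit.

```python
import random

random.seed(2)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

N_PEOPLE, N_ABILITIES, N_TESTS, PER_TEST = 400, 200, 8, 100

# Every ability is drawn independently: there is no common cause by construction.
people = [[random.gauss(0, 1) for _ in range(N_ABILITIES)] for _ in range(N_PEOPLE)]

# Each test taps a random subset of abilities, so any two tests overlap by chance.
test_items = [random.sample(range(N_ABILITIES), PER_TEST) for _ in range(N_TESTS)]
scores = [[sum(person[a] for a in items) for person in people] for items in test_items]

corr = [[pearson(scores[i], scores[j]) for j in range(N_TESTS)] for i in range(N_TESTS)]

# Power iteration for the largest eigenvalue of the correlation matrix.
v = [1.0] * N_TESTS
for _ in range(200):
    w = [sum(corr[i][j] * v[j] for j in range(N_TESTS)) for i in range(N_TESTS)]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]
top_eigenvalue = sum(
    v[i] * sum(corr[i][j] * v[j] for j in range(N_TESTS)) for i in range(N_TESTS)
)

# The trace of a correlation matrix equals the number of tests.
share = top_eigenvalue / N_TESTS
min_off_diag = min(corr[i][j] for i in range(N_TESTS) for j in range(N_TESTS) if i != j)

print("smallest correlation between two tests:", round(min_off_diag, 2))
print("share of variance on the first factor: ", round(share, 2))
```

In this setup all tests correlate positively and one factor soaks up a large share of the variance, even though nothing like a single underlying quantity was put in.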
Spearman did this for the various parts of IQ-tests and came up with a box he called the g-factor. (For the pedantic, he used a now-obsolete predecessor method of factor analysis, which hadn’t been invented yet.) This was particularly cool, because at the time he had good evidence for the g-factor being a causal node. That evidence turned out to be a fluke though.
So now we don’t have good proof of the g-factor actually being causal. Some people think it is though, and that matters because some arguments will be valid if it is but not if it isn’t. So they can think they proved things they actually only assumed in the disguise of assuming g to be causal.
At this point I’ll make a slight digression on heritability.
Scott thinks IQ-sceptics are trying to avoid thinking about claims like “Intelligence is at least 50% heritable”.
I’m actually fine with that claim, as long as we stick to its technical meaning.
Problem is, the word “heritable” sounds like it should mean but actually doesn’t mean “unchangeable short of bioengineering”.
To illustrate, let me make up a toy model where the two things differ. I’m not claiming that’s how it works, just showing one easy example of how they could differ. So in my fake model intelligence consists of 10000 binary abilities, each of which you either have or don’t have. Some of these abilities depend on each other, so you can’t learn the advanced ones before the primitive ones. All are learnable. The genetic part is that for each ability you have a genetically determined teaching time necessary to master it. For some abilities some people need less learning time than others, but given enough time everybody can learn every ability. If you get enough teaching time for a given ability you learn it, if not, not.
Actual teaching time doesn’t depend strongly enough on needed teaching time, so ceteris paribus, people who would need more time on fundamental abilities (though perhaps less on advanced ones, for an equal total travel time to the ceiling) tend to learn fewer abilities. Thus intelligence is strongly heritable in our present environment.
As schooling improves we discover abilities many people didn’t learn previously and spend more time on them. Thus the Flynn effect.
Eventually we will figure out all the critical abilities and human differences in IQ will vanish without any bioengineering.
Again, I’m not claiming this is how it works, mostly because it’s a complex just-so story I just made up. But it is very compatible with all the results we get from twin studies. In fact more so than the story the average “human biodiversity” guy on the intertubes professes as a necessary result of that data. Which is to say, heritability doesn’t prove strict biodetermination.
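The toy model is simple enough to simulate. What follows is a stripped-down sketch, not the full story: it uses 40 abilities instead of 10000, drops the dependencies between abilities, and invents the distributions (needed time uniform between 0 and 2, a teaching `budget` with a bit of random environmental luck). Identical twins share their needed times but get independent luck, and their score correlation serves as a rough stand-in for a heritability estimate.

```python
import random

random.seed(3)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

N_ABILITIES = 40
N_PAIRS = 1000

def genome():
    """Genetically fixed teaching time needed for each ability (made-up range)."""
    return [random.uniform(0.0, 2.0) for _ in range(N_ABILITIES)]

def score(needed, budget):
    """Count the abilities for which the time actually offered covers the need."""
    learned = 0
    for need in needed:
        offered = budget + random.uniform(-0.2, 0.2)  # environmental luck
        if offered >= need:
            learned += 1
    return learned

def twin_study(budget):
    """Score identical twins: same genome, independent environmental luck."""
    first, second = [], []
    for _ in range(N_PAIRS):
        genes = genome()
        first.append(score(genes, budget))
        second.append(score(genes, budget))
    return first, second

# Present-day schooling: the budget covers roughly half of the needs.
a, b = twin_study(budget=1.0)
twin_correlation = pearson(a, b)

# Better schooling: the budget exceeds even the slowest learner's needs.
c, d = twin_study(budget=2.5)
spread_after = max(c + d) - min(c + d)

print("twin correlation at budget 1.0:", round(twin_correlation, 2))
print("score spread at budget 2.5:    ", spread_after)
```

With these invented numbers the twin correlation is high, which is what a twin study would report as strong heritability, and yet the differences disappear entirely once the teaching budget is large enough. No genes were changed between the two runs.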
However, if you assume g is causal and make a second and actually falsified assumption no one really believes to outlaw models like this one and add in the results of twin studies, then you can conclude intelligence differences can’t be much reduced short of bioengineering. Problem is, a lot of people think this follows from twin studies alone. It doesn’t.
Like Scott, I’m concerned there’s a motte-and-bailey tactic involved here, only I think it’s on the other side of the controversy.
First the motte: When you need to screen people for the kind of properties IQ tests are designed to screen for, and when you don’t have more specific tests to screen for a more specific version of what you’re looking for (for example subject matter tests in college admissions), then IQ tests will do a better job than nothing. So far so good, this is pretty obviously true and actually not all that controversial.
Now for the bailey:
Almost all human societies have fairly hereditary social strata, where people tend to end up with approximately the same amounts of power, prestige, and money their parents also had, and so on until the tenth generation. This is somewhat embarrassing for societies that follow a nominal ideology of giving everyone equal chances. It is particularly embarrassing in America, where the underclass has a distinct skin colour, making it comparatively hard to just ignore the problem.
Here comes the IQ ideologue with a cruel but comforting story dressed up as science: The underclass is so stable because they are irredeemably dumb for genetic reasons. No kind of affirmative action can ever alleviate that genetic stupidity, so we had better not even try. See, it’s nature itself that is highly unfair, not, perish the thought, the social structure. So those of us getting fairly high positions on the totem pole won a genetic lottery, not a benefit-from-structural-sin lottery.
There actually is no good evidence for that story. To the extent it’s directly testable it’s wrong. For example, shithole countries tend to have low average IQs and one could argue about what causes what. Except that occasionally countries do emerge from poverty and bad institutions, and when they do, the average achievements of their inhabitants go up by degrees this theory declares impossible in a single country.
But such stories are very attractive for an upper class. Look how Malthus previously Eulered basically the same consequences from then-exotic math (exponential growth), and some people still want to stick with that bullshit.
So if you put in some additional assumptions, assumptions that are so subtle you don’t even need to understand them or know you’re making them, then you can derive this bailey from the actually justifiable motte. And from the unfounded assumptions, but don’t look at that curtain too hard.