# Thingspace is a mysterious answer

[Edit: My interpretation of the thingspace text was probably colored by the context in which I was thinking of it. I still have criticisms of it, but it’s not as useless as I was implying it to be. Read the comments for details or proceed with caution.]

Yvain of Less Wrong fame is presently investigating scholastic metaphysics and has posted a list of exploratory questions on his blog. The list is a bit hefty, so I basically referred him to a book. But now I’ll cherry-pick a question I found interesting:

> 5. On the very remote chance that there’s anyone here who is familiar both with Aristotelian forms and with the idea of cluster-structures in thingspace, does the latter totally remove the need for the former, do they address different questions, or what?

Do follow his link, because that’s the argument I want to bash and that won’t make much sense unless you know it.

On the philosophical side the easy answer would be no, because we also need forms to explain diachronic identity as well as creation and corruption. But right now I feel like arguing something more radical: Cluster-structures in thingspace remove the need for nothing because they are bunk. The idea is less than wrong, it is simply meaningless.

To see the meaninglessness consider this ladder of questions: What is a cluster? Basically a cloud of data points that each are close to the next member of the cluster but further away from data points not belonging to the cluster. What does “being close” mean? It means the distance is small. And here comes the main point: What does “distance in thingspace” mean?

The hand-wavy math metaphor of a space is a trap here, because it makes you think of a “nice” space like the real 3D space we (seem to) live in, where the normal laws of geometry apply. [1] If that were a correct analogy, you could just use the Pythagorean theorem to calculate the distance from the coordinate differences. For example, if I walk 1 meter to the right and then 1 meter forward, I’ll end up $$\sqrt{2}$$ meters away from where I started. So if I know the coordinates of two points I also know their distance.

But thingspace is not nearly that nice. You can see that from its different dimensions having different units. To take the sparrows example, you might measure volume in liters and mass in kilograms. But what is the sum of 1 liter and 3 kilograms? Is it more or less than the sum of 3 liters and 1 kilogram?[2] Even if you assume some conversion ratio of liters and kilograms, this way of calculating distances will not coincide with your intuitive sense of similarity. For example, if another man is a kilogram heavier than I am, that probably isn’t all that dissimilar. But if one amoeba is a kilogram heavier than another one, you’d better watch out for aliens planning to lay their eggs in you.
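To make the unit problem concrete, here is a minimal sketch in Python. The points, the function names and the choice of grams as the alternative unit are all invented for illustration; none of it comes from the thingspace text. It computes plain Pythagorean distances between (volume, mass) pairs and shows that which point counts as “nearest” flips when mass is expressed in grams instead of kilograms.

```python
import math

def euclidean(p, q):
    """Pythagorean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest(target, candidates, scale):
    """Name of the candidate closest to target after rescaling each axis."""
    def rescale(p):
        return tuple(x * s for x, s in zip(p, scale))
    return min(
        candidates,
        key=lambda name: euclidean(rescale(target), rescale(candidates[name])),
    )

# Hypothetical things described as (volume in liters, mass in kilograms).
target = (1.0, 1.0)
candidates = {
    "a": (3.0, 1.0),   # differs from target by 2 liters
    "b": (1.0, 1.01),  # differs from target by 0.01 kilograms
}

# Sanity check from the text: 1 m right, then 1 m forward.
unit_walk = euclidean((0.0, 0.0), (1.0, 1.0))

# Same data, two unit conventions for the mass axis.
in_kilograms = nearest(target, candidates, scale=(1.0, 1.0))
in_grams = nearest(target, candidates, scale=(1.0, 1000.0))
```

With mass in kilograms the nearest point is `"b"`; with mass in grams it is `"a"`. Nothing about the things themselves changed, only the arbitrary conversion between axes.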

In fairness, Eliezer Yudkowsky knows it’s not quite that simple and proposes

> using a distance metric that corresponds as well as possible to perceived similarity in humans

Mathematically, this doesn’t make much sense either[3], but at that point we get the meaning: “distance in thing-space” is just another formulation for “perceived dissimilarity”. And then a “cluster in thingspace” is just a set of things that seem similar to other things in the cluster and not to things outside of it. The whole pseudo-math-talk is just borrowing plausibility from a prestigious domain of knowledge that turns out not to have much of a relation to the question at hand.

But dropping the math-stuff, doesn’t the idea make sense?  Can’t the squirrality of a squirrel consist simply in being similar to other squirrels and not so much to dogs? Well, it depends on what you mean by “similar”.  If you apply it naively, it will turn whales into fishes, because a whale clearly looks a lot more similar to a shark than to a cow. So basically your standard of similarity will have to use a lot of your knowledge of the problem domain. But at that point it gets circular: Essentially similarity is supposed to impose a structure on reality while being defined by that structure. Different fish are all fish by virtue of being similar to each other, but that similarity pretty much consists in them all being fish. At the end of the day you still need an explanation of what all fish have in common. Which means the whole idea of phlogiston thingspace buys you nothing.

While I’m at it, I think the “clusters in thingspace” idea is an instance of a more general failure mode that is fairly common in Less Wrong style arguments. The steps to reproduce the problem on other questions are (1) hand-wavingly map your question to a mathematical structure that isn’t well-defined, (2) use that mapping to transfer intuitions, and (3) pretend that settles it. Note that doing steps 1&2 without step 3 is a fine way to generate ideas. But those ideas can still be wrong. If you want them to be right, you either need to replace step 1 by something much more rigorous or restate the ideas without the mathematical analogy and check if they still make sense.

Major examples of the Less Wrong groupthink falling into this particular trap include their vulgar utilitarianism, where the individual utility functions and their sums turn out not to be well-definable, and their radical Bayesianism, which basically assumes a universal probability measure that has no sample space or σ-algebra to live on.

Footnotes

1. In this context the technical meaning of “nice” would be that it’s a Hilbert space and the given coordinates refer to an orthonormal basis. But don’t worry if you don’t know what that means, it’s unimportant for the argument.
2. Actually that should be squared liters and kilograms, but let’s not get pedantic.
3. Because that metric would necessarily presuppose knowledge of the clusters, violate the triangle inequality, or result in highly counter-intuitive clusters.

### 17 Responses to Thingspace is a mysterious answer

1. Yvain says:

>  Essentially similarity is supposed to impose a structure on reality while being defined by that structure. Different fish are all fish by virtue of being similar to each other, but that similarity pretty much consists in them all being fish. At the end of the day you still need an explanation of what all fish have in common. Which means the whole idea of phlogiston thingspace buys you nothing.

I don’t understand this point. Fish have things in common like spines, underwater breathing, leglessness, scales, fins, tails, genetics, egg-laying, et cetera. These things are noticeable even to someone who doesn’t already know that we want to classify all of these as fish. I agree they also have differences among them like size, color, etc, but the whole point of cluster-structures in thingspace is that these differences generally aren’t well-correlated: that is, the category “red animals” has almost as much variability within it in non-color dimensions as the category “animals” does, whereas the category “animals that breathe underwater” has some power predicting things like having fins, having tails, having certain genetics, etc. So it seems justifiable to concentrate on the spines-breathing-leglessness-scales-fins-tails-genetics-eggs cluster as an actual cluster once we’ve noticed it.
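A minimal sketch of that correlation claim, in Python. The eight animals and their attribute values are toy data invented purely for illustration, and `fin_rate` is a hypothetical helper. It compares how often an animal has fins overall, given that it is red, and given that it breathes underwater.

```python
# Toy data, made up for illustration:
# (name, is_red, breathes_underwater, has_fins)
animals = [
    ("red snapper", True,  True,  True),
    ("cherry barb", True,  True,  True),
    ("trout",       False, True,  True),
    ("herring",     False, True,  True),
    ("cardinal",    True,  False, False),
    ("ladybug",     True,  False, False),
    ("sparrow",     False, False, False),
    ("cow",         False, False, False),
]

def fin_rate(rows):
    """Fraction of the given animals that have fins."""
    return sum(1 for row in rows if row[3]) / len(rows)

overall = fin_rate(animals)
given_red = fin_rate([row for row in animals if row[1]])
given_underwater = fin_rate([row for row in animals if row[2]])
```

In this toy sample, knowing that an animal is red leaves the fin rate where it was (`given_red` equals `overall`), while knowing that it breathes underwater raises it to 1.0. Whether real animal data behaves like this is of course an empirical question the sketch does not settle.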

Whales have some of these things similar to fish (spines, leglessness, fins, tails) and others not (underwater breathing, scales, genetics, egg-laying). I agree it might be possible to come up with a classification system in which whales are fish, but this doesn’t seem necessarily wrong – our society uses a classification system based on evolutionary history, but if some other society (like the Biblical Hebrews who may have been referring to a whale when they talked about Jonah’s “great fish”) wants to make a classification system based on observable characteristics, I have no problems with whales being a “fish” atypical in its evolutionary history instead of a “mammal” atypical in its shape/habitat.

I think our difference might be that you are still thinking of categories as potentially objectively correct, whereas LW (like me) thinks of categories as a convenient grouping mechanism. So it’s not wrong to categorize a whale as a fish (same caveat as above). Classification on different schemes is perfectly reasonable – for example, a Jew wanting to know what’s kosher might have a category that looks like “fish” but is subtly different from the biologists’ who want to know taxonomy. But to classify a whale into, say, the category “electrical appliances” would not be wrong in the sense that 1+1=3 is wrong, but it would be stupid, in the sense that the UN trying to draw an Israel-Palestine border that stuck Tel Aviv in Palestine would be stupid.

More generally, I think the important hypothesis of “cluster structure” is that we form categories not based on necessary-and-sufficient conditions, but rather on many dimensions simultaneously, not all of which need to be satisfied to grant category membership as long as the others are a sufficiently good fit. Thus, Plato was wrong in calling Man a featherless biped (because of Diogenes’ plucked chicken), Aristotle was wrong in calling Man a reasoning animal (because of the chimps in Planet of the Apes, not to mention babies and brain-damaged people who can’t reason), and the true definition of Man (which in the absence of real “forms” means “the sort of thing that would convince us to classify someone as a Man”) is that we have an idea of a typical person, from which we are more reluctant to call something a human the greater the total sum of the differences in every dimension may be. A spatial metaphor seems like a natural way to express this idea, and I think this conception is pretty hard to dispute, unless you either think you can come up with necessary-and-sufficient conditions or you want to go full-on Aristotelian objective forms.

(see http://lesswrong.com/lw/7tz/philosophy_by_humans_1_concepts_dont_work_that_way/ for a more complete analysis of this idea; the author doesn’t seem to realize he’s talking about cluster structures in thingspace, but I’m pretty sure he is)

So overall, aside from saying that the metaphor is not a mathematical proof (…) I don’t really see your objection.

A friend of mine in the machine learning field seems to think she can get computers to actually classify things based on the cluster-structure-in-thingspace model. I should talk to her and let you know if that’s true.

Mind if I post a link to this on LW?

• Gilbert says:

I don’t mind the existence of several correct classification schemes, though more hardline Thomists might disagree with me on that one. One example is the classification of humans. If he had seen Planet of the Apes, Aristotle would have concluded that the Apes were human. So that actually isn’t an objection to his definition, he just had a different concept in mind. I don’t think your kosher example counts as an objectively correct scheme though, because that is a relational property. You could still be a rational animal if you were the only one, but the kosher/treif distinction would not exist in a world without Jews, i.e. it is actually external to the foods it classifies.

Getting to how we identify categories, I’ll have most of my reaction in my answer to lucidian, but here I’ll just remark that the post you linked to has an ingenious comment explaining how looking for necessary and sufficient definitions of concepts still makes sense.

You’re free to link me on Less Wrong though I wonder in what kind of context you might want to do that. Depending on that context I may or may not turn up to reply to it.

• Yvain says:

“If there are no objective categories there clearly are no forms either and that doesn’t depend on how our subjective categorizations of things work. So I interpreted your question as granting that at least for the sake of argument and then read the thingspace text metaphysically…So the more salient answer is that forms and thingspace answer radically different questions.”

The reason I brought it up was that, given that one should not multiply entities unnecessarily, and since forms are an extra entity, you’ve got to show why they’re necessary in order to justify them. I thought Aristotle’s original justification was that they were necessary in order to solve the Problem of Universals, i.e. why we perceive all fish as having a certain fishness. If a more cognitive science-y approach to categories, of which thingspace is one example but not the only one, can explain that without forms, then forms start to look pretty useless. So that’s why I asked.

Just so I know exactly where you’re coming from: do you think that Lucidian and people like her will never be able to invent a categorizing algorithm that comes up with categories about as good as our categories are from a similar amount of a priori data, because the mathematical problems you mentioned above and in your response to her comment are insurmountable? Or do you think even if they do, it won’t really matter for the broader philosophical point?

• Gilbert says:

Well, once you’re talking of objective categories cogsci explanations of how we might understand them are pretty much irrelevant, because they depend on an external observer and the objective categories don’t.

I think the root question here is if the objects we experience actually really exist. I think the Yudkowskyan answer is pretty much no, the only thing really existing is some timeless wave-function. The Eleatics actually thought basically the same though they of course couldn’t dress it up in modern physics. I think that is basically rationality pulling the rug from under itself. A few days ago you said you had read about 2/3s of The Last Superstition. The rest is basically an explanation of and polemic about that rug-pulling, so I don’t think I need to explain it here.

Once you do accept that objects exist you need explanations of how they have an identity, how they can change (I think this one is particularly important for Plato & Aristotle), and what kind of rules they obey. Since all this depends on what exactly we’re talking about you’ll pretty much need some kind of objective whatness.

I’m not sure what your question is after.

I think “a categorizing algorithm that comes up with categories about as good as our categories are from a similar amount of a priori data” is pretty much AI-complete. AIs are very probably possible in principle. (The singularity is bullshit though.) Right now nobody has a clue how to make one and I don’t see signs of that changing during our lifetimes. So Lucidian probably won’t live to do it, but people like her probably will. If Lucidian is even smarter than I think she is and gets the job done next week, I would be very surprised but it wouldn’t matter for the broader philosophical point.

If they get that level of concept identification by thingspace methods alone without the program also having other parts I will have been wrong about something though I wouldn’t know quite what. I can’t say whether this particular question would be affected, because there would have to have been something wrong with my thinking several steps back.

• Yvain says:

Okay, so it sounds like the real work of our disagreement is being done by arguments other than whether cluster-structure Occam-shaves away the need for forms, and that you’re rejecting my argument “If a naturalistic/reductionist explanation of categories is possible, then there’s no point in coming up with a non-naturalistic/reductionist one” on the grounds that forms are useful for other reasons. I’ll try to address some of those arguments when I review Last Superstition.

I of course agree that cluster-structure is a sketch of how one might go about cog-sci-explaining categories and not anything remotely near the actual cognitive science explanation. If you agree that an AI might theoretically be able to use something *like* cluster-structure with a thousand times more formality and math and auxiliary insights, I don’t think we have any disagreements left on this point.

• MugaSofer says:

> I think the root question here is if the objects we experience actually really exist. I think the Yudkowskyan answer is pretty much no, the only thing really existing is some timeless wave-function.

I don’t understand this. Your chair still exists if you learn it’s actually made of quantum magic or tiny billiard balls or whatever. It still continues to perform chairlike functions. It is still useful to classify that particular subset of the waveform/collection of atoms/whatever as a “chair”, belonging to the same category as the other thingies you already classified as chairs.

• Gilbert says:

To say it in a more LessWrongy way, the root question is if objects we experience exist in the territory or only in the map. I think the Yudkowskyan answer is only in the map. I believe if you think it through that world-view basically rules out the possibility of the map describing the territory.

• MugaSofer says:

The chair exists – it just doesn’t have any extra chair-ness property. Since we’re speaking Yudkowskyan, it doesn’t have little tags on the atoms saying “these are part of a: CHAIR”. Those labels exist in the map, not the territory. But the chair exists in the territory.

Suspect I’m missing something here.

• Gilbert says:

For your thinking to make sense, the map must somehow be similar to the territory. Your brain’s physical configuration while thinking about the chair is very different from the chair’s. They still have something in common and that something isn’t a projection of the state vector or a configuration of particles or whatever.

• Alexander Stanislav says:

“For your thinking to make sense, the map must somehow be similar to the territory. Your brain’s physical configuration while thinking about the chair is very different from the chair’s. They still have something in common”

I think I understood where this conversation was going until this point. I don’t know what you mean by similar. Maps are often made of paper, mountains are not. There is nothing contradictory about a map describing a mountain.

I also don’t see what this has to do with forms being necessary to account for categories. It still seems to me that a universe without objective categories (but with human conceived categories) would be indistinguishable from this universe.

PS: Is there a way to have multiple answers to the anti-spam quiz? I tried “cluster structures” and “cluster-structures” before realizing the answer was “clusters”.

• Gilbert says:

But a map actually is similar to the mountain, i.e. it shares a structure with it. There is a reciprocal relationship between points on the mountain surface and on the map. (Not quite, because a map exaggerates more important features and separates things that would overlap at its resolution, but you know what I mean.) The map encodes heights of some of the points and those match those of the mountain points. Things on the mountain correspond to pictographs on the map. Features of mountain-parts may be color coded. So the only way the map is useful is that it presents a structure the parts of which relate to each other like the parts of the territory relate to each other.

So on to the philosophical metaphor of our understanding being a map of the world. Here again this is only possible because our understanding shares a structure with the world. But that can only work if things actually exist at a level corresponding to our understanding’s level of abstraction, otherwise there simply are no things to relate to each other in the same way their images in our understanding relate. (Putting the same thing more nerdily: In Gödel, Escher, Bach, Douglas R. Hofstadter calls the relationships between such structures isomorphisms. As long as we remember this is just a metaphor it’s actually a good one. But then actual mathematical isomorphisms can’t be defined without defining the spaces they connect, which, unrolling the metaphor, would basically be categories. And anyway “isomorphic” is just Greek for “of the same form”.)

Now you could bite the bullet and just say the map doesn’t actually represent the territory. But since all our thinking happens on the map that would mean we actually can’t think about reality, which is basically a way of saying our thinking doesn’t make sense.

I don’t think the words “a universe without objective categories (but with human conceived categories)” have a meaning.

On the PS, “cluster structures” and “cluster-structures” now work. Multiple answers are easy, but only if I think of them beforehand, which in this case I didn’t. Sorry for the inconvenience.

And a PS of mine: If you reply to this I’m afraid my sur-reply will take a while. Right now I’m busy with a writing project I gather you’re also busy with. Then after the 18th I need to write posts that should have been comments replying to Scott and Andrew more than a month ago. Plus I’m hunting for a flat in meatspace. Which is a long way of saying I’ll be happy to engage your arguments eventually, but right now there is a significant back-log.

• MugaSofer says:

> Here again this is only possible because our understanding shares a structure with the world. But that can only work if things actually exist at a level corresponding to our understanding’s level of abstraction, otherwise there simply are no things to relate to each other in the same way their images in our understanding relate.

Patterns.

• Gilbert says:

As long as it’s understood they are actually out in the territory (and not just features of the map) I think the difference between “patterns” and “things” is basically semantic.

• Alexander Stanislav says:

Good luck on said writing project. Unfortunately I wasn’t chosen to participate.

I agree with everything you said regarding maps and isomorphisms, that was very well put.

Regarding objective categories, I should have been a bit more careful in my wording. What I intended was “a universe in which most human conceived categories do not correspond to objective categories”. In other words most categories are things that humans came up with to conveniently represent the world rather than most categories being ephemeral properties that humans discover.

Of course I should probably at this point explain what I mean by objective, which I will do via an example:

There is no such thing as objective blueness. The reason why we call things blue, why we have a category called blue, is because most humans perceive objects that reflect light in a certain way as “blue”. So blueness is not a property of an object, but rather a combination of how an object reflects light combined with how that light is perceived by human eyes. Does that mean that I think that blueness doesn’t exist? Of course not. Just that it doesn’t exist independently of humans. While it’s true that most humans do perceive blueberries as blue, it is not an objective property of them and sure enough there is a tribe in Africa (the Himba tribe) that doesn’t have a word for blue. My position is that most human conceived categories are like color in this regard. Categories are not features of the territory, but features of how humans represent that territory.

• MugaSofer says:

> As long as it’s understood they are actually out in the territory (and not just features of the map) I think the difference between “patterns” and “things” is basically semantic.

There you go then. The chair is a pattern in the sub-atomic particles or the waveform or whatever, and that pattern is what’s recorded on the map, not the stuff it’s a pattern *in*.

2. lucidian says:

Hello!  =)  I’m Yvain’s friend; I study machine learning, where we often work with clusters in thingspace.  I’ll do my best to defend them: far from being some half-baked analogy, clusters-in-thingspace is a legitimate mathematical model that’s often used in machine learning and cognitive science.  However, before I discuss the math, I want to make sure that I’m understanding your post.  As a warning, I have no background in metaphysics or Aristotelian forms.

On LessWrong, clusters in thingspace are primarily used for epistemology.  On the other hand, it seems that Aristotelian forms are part of metaphysics.  Now, I’m pretty convinced that clusters in thingspace are great for epistemology.  But if I’m interpreting your argument correctly, you’re asking the question “Are clusters in thingspace good for metaphysics?”, and concluding that they aren’t.  In order to figure out whether I’m understanding your argument correctly, I’m going to try rephrasing it.  I’ll stick with Yvain’s fish example for convenience.

So, as far as I can tell, you’re starting out with the premise “There is an objective category ‘fish’”, and you’re asking “How can we describe that category?”, and then claiming that clusters in thingspace are not a good description of the category.  And I have to say, I agree with you.  A much better approach would be to list some fundamental attributes of fish, like the ones Yvain gave (“spines”, “no legs”, “scales”, “fins”, etc.).  From what I can tell, your objection to clusters in thingspace is as follows: suppose we use clusters in thingspace to describe the category ‘fish’.  Well, then something is a fish because it’s similar to other fish.  But in what way is it similar to other fish?  Well, it has a spine and scales and has no legs, and it breathes water.  So we’re right back to describing fish in terms of attributes (which form the axes of the thingspace), and there’s no need to bother with the clusters at all.  If this is your argument, I agree with it – clusters in thingspace are pretty useless for metaphysics.

However, I’ll argue that they’re extremely useful for epistemology. Let’s go ahead and assume for the moment that objective categories exist.  (I don’t actually believe this, but whatever.)  Now suppose my knowledge about these categories is limited, and I want to learn about them.  Here are two cases in which clusters in thingspace could be useful.

– Suppose you’ve seen a bunch of fish.  You also know a bunch of possible attributes – has scales, has fins, color, weight, whether they breathe fresh or salt water, etc.  However, you don’t know which of these attributes are part of ‘fishness’, and you’d like to figure that out.  (This sort of problem is frequently posed in machine learning.)  Well, you could look at a bunch of fish, and see which attributes they all have in common.  If all your fish have scales and fins, then those attributes are probably important; also, since fish vary wildly in their color, weight, and whether they breathe fresh or salt water, those attributes must not matter.  Thus, knowing the similarities between the fish allows you to infer the qualities of ‘fishness’.  To put this more mathematically, you could plot all the fish in a space whose axes are the attributes you’re considering. Then, you could look at the variance in each attribute; the lower the variance, the more likely the attribute is to be a quality of ‘fishness’.  (This is, of course, a bit of an oversimplification, but hopefully you get the idea.)

– Suppose you know a bunch of possible attributes, but you don’t know what the objective categories are.  Well, you could take all the animals, plot them in a space whose axes are your attributes, and discover that there’s a cluster in the “spines, no legs, fins…” quadrant.  Since this cluster is so nice and neat, it must correspond to one of your objective categories!
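The two cases above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the attribute names, the made-up observations, the choice of exactly two clusters and the way the clusters are seeded. It is a toy version of the idea, not how a real machine learning system would be built.

```python
from statistics import pvariance

# Case 1: which attributes do all observed fish share?
ATTRIBUTES = ("has_scales", "has_fins", "weight_kg", "saltwater")

# Hypothetical fish observations; every number is made up.
fish = [
    (1, 1, 0.2, 0),
    (1, 1, 4.0, 1),
    (1, 1, 0.9, 0),
    (1, 1, 12.0, 1),
]

def variance_by_attribute(rows):
    """Population variance of each attribute column."""
    return {name: pvariance(col) for name, col in zip(ATTRIBUTES, zip(*rows))}

variances = variance_by_attribute(fish)
# Attributes that never vary are candidates for being part of 'fishness'.
stable = [name for name, v in variances.items() if v == 0]

# Case 2: find groups without being told the categories.
def two_means(points, iterations=10):
    """Tiny k-means with k=2, seeded with the first and last point."""
    centers = [points[0], points[-1]]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return groups

# Hypothetical animals as (has_spine, has_legs, has_fins).
animals = [(1, 0, 1), (1, 1, 0), (1, 0, 1), (1, 1, 0), (1, 0, 1), (1, 1, 0)]
groups = two_means(animals)
```

Here `stable` comes out as `has_scales` and `has_fins`, and `two_means` separates the finned, legless animals from the legged ones. Note that the sketch had to be told which attributes to look at and how many clusters to find.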

It’s getting late, so I won’t go into detail on how this is done mathematically in the field of machine learning. Very quickly, in cognitive science, there’s a model of how humans represent concepts mentally, called prototype theory. In this model, each concept consists of a prototype and a similarity metric. The prototype is a representative object of the category; I think it’s like a Platonic form. The similarity metric describes how objects in the category can differ from the prototype; it specifies how we should weight the dimensions in thingspace, and thus solves the issue you brought up with the axes having different units. The similarity metric allows us to specify that fish are more likely to differ in color than in their ability to breathe water.

The multivariate Gaussian distribution is a very convenient way of implementing prototype theory mathematically. The mean of the Gaussian corresponds to the prototype, while the covariance matrix describes the similarity metric. One can view the concept as a probability distribution over thingspace: “given a fish, what is the probability that it will have these attributes?” In order to actually learn the cluster, you just need to plot your datapoints in thingspace and fit the Gaussian to them.
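As a rough illustration of the Gaussian version of prototype theory, here is a small Python sketch restricted to two dimensions so it needs no libraries. The attribute names (`hue`, `gill_score`), the data points and both function names are hypothetical. The fitted mean plays the role of the prototype, and the covariance matrix induces a distance (the squared Mahalanobis distance) that weights each dimension by how much the observed fish vary along it.

```python
def fit_gaussian_2d(points):
    """Maximum-likelihood mean and covariance of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from mean under a 2x2 covariance."""
    (a, b), (_, d) = cov
    det = a * d - b * b  # must be non-zero for the inverse to exist
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Inverse of [[a, b], [b, d]] is [[d, -b], [-b, a]] / det.
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det

# Hypothetical fish as (hue, gill_score): hue varies a lot, gill_score barely.
fish = [(0.0, 1.0), (10.0, 1.1), (20.0, 0.9), (30.0, 1.0)]
prototype, similarity = fit_gaussian_2d(fish)

odd_color = (25.0, 1.0)  # 10 hue units away from the prototype
no_gills = (15.0, 0.5)   # only 0.5 gill units away from the prototype

d_color = mahalanobis_sq(odd_color, prototype, similarity)
d_gills = mahalanobis_sq(no_gills, prototype, similarity)
```

In raw coordinates `odd_color` is much further from the prototype than `no_gills`, but under the fitted covariance the ordering reverses: `d_color` is smaller than `d_gills`, because the sample fish differ widely in hue and hardly at all in gill score.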

• Gilbert says:

Hi lucidian, nice to meet you.

I think I disagree [Edit: sorry, that typo was rather meaning-changing] with everything you said and I don’t know if you even think the part I would disagree with.

Basically I accept the methods you talk about as useful tools for the formation of some concepts, just not as the whole thing for all concepts. One way to see that is that you are still using a lot of domain knowledge.

For example, if you do the concept-identification thing, your attribute space probably isn’t a thingspace containing “all the information in the real object itself.  Rather redundantly represented, too”. Contrariwise, you probably have already identified some attributes the machine is supposed to use in making the decision. That would be true even if you haven’t done it explicitly, because the machine will have a limited number of data input channels and won’t be able to identify clusters that have other properties. Also, the number of objects of various types you use for input probably relates to those concepts. If you tried to do clustering on a reasonably sized sample of all organisms on earth, the sample probably wouldn’t include anything multicellular. So I guess you need to structure your training data to include reasonable numbers of specimens for any cluster you plan to identify. And finally you will need some similarity metric. If you fit multiple multivariate normal distributions you need some way of deciding how many. And then you probably end up judging the results by how well they match the classification an infinitely patient version of you would have come up with.

If you do the identification thing you have all that plus more. You will already have identified the concepts and feed your machine with a number of pre-classified objects to learn from. Also, your multivariate Gaussian is decidedly a feature of the individual concept and not of the space, so it’s not really about clusters. For example, it seems to me that you could even do concepts with partial extensional overlap that actually aren’t clusters in the attribute space you’re looking at.

And then more complicated concepts will still include reasoning on top of the identified ones. If your machine, for example, is supposed to learn to identify dogs then you might think about using different distributions for different kinds of dogs and then have explicit rules that chihuahuas and shepherds are both dogs but rats and sheep aren’t. At least that’s how I think my intuitive dog-identification-algorithm works and short of breeding or dissection I don’t see how else I could do it.

Don’t get me wrong, I’m not trying to diss your work. It is actually way cool that you can make a computer learn classifications you never explicitly code into it. But I still think it also needs to use lots of a priori knowledge to do it. Which is fine, because the same is true for humans.

So, bottom line, I will grant this as a cool set of heuristics, but not as a complete approach to concept-identification and therefore not as a possible definition of what concepts are even in a (counterfactual and impossible) world where they don’t correspond to estimated objective categories.