Ben Kovitz’s home page

I’m a Ph.D. student in cognitive science & computer science at Indiana University.

Email address: bkovitz

"Experiments with Cascading Design", EvoEvo 2015. Slides: Keynote PDF

"Tagging in Metaheuristics" in Workshop on Metaheuristic Design Patterns, GECCO 2014
paper | talk

"Structural Stigmergy: A Speculative Pattern Language for Metaheuristics" in Workshop on Metaheuristic Design Patterns, GECCO 2014
paper | talk

"Slide Rules: Math that Fits the Hand and Eye" at Enfascination 2012

Main research interests

(Slightly edited from my application to grad school.)

I am interested in how representations of a domain grow from being incoherent or poorly fitted (“confused”, “murky”, “poorly defined”) to being highly structured and tuned to the domain (“clear and rigorous”, “able to approach the topic systematically”, “well-defined”). This includes especially the way that a change from one representation to another can suddenly make a complex problem simple or make obscure features obvious.

This is both a computer-science interest and a cognitive-science interest. A lot of thought about logic says, “Everything should be clearly defined—no fumbling around and no vagueness. That’s the only way to approach things rationally.” This kind of advice was once fairly common in software engineering: get systematic and formal during requirements so everything will go smoothly during architecture, coding, etc. “Rationality” and writing computer programs would seem to require that one’s domain be bounded, all criteria be defined, and all terms of discourse be locked onto their meanings before thought even begins.

I am most interested in the “murky” period before there is clarity. The ability to fumble around finding a good way to bound a domain, cast about for good criteria for success or goodness, and let the semantics of terms slip around to become more expressive is what makes us smarter than computers. But those processes are themselves computational, and can themselves be mathematically modeled. Such modeling would serve two purposes: accumulation of useful search heuristics suitable for computers, and helping understand something of the “messy” nature of real-world human reasoning. I believe it would also shed some light on how brains and living things in general—and computers, before long—navigate the unknown and adapt to what they cannot anticipate.

The above are very broad interests, of course. For specific research projects to start right away, I am currently most interested in these:

  1. Automated theorem-proving, where instead of starting with a theorem and searching for a proof, the program starts with some axioms and searches for the interesting proofs. In other words, the program’s job is to explore the “proof graph” and sniff out the most interesting nodes. What makes a node interesting? Well, that’s a research question. But I have a tentative answer: a node is interesting if, before you had it, it was hard to find, and if finding it increases your ability to search the proof graph.

    It will not have escaped your notice that there is a huge problem with this: the vast majority of all the theorems in any system are boring. How does the program get around that? Well, that’s a research question. But I have a tentative answer: by representing the current set of “known” theorems in a very compressed way, which makes searching by simple pattern recognition easy. The direction I’m exploring right now is to restrict all searches for proofs to something close to O(1) time: the system either recognizes a proof as “obvious” very quickly, or it gives up. The system spends most of its time refining its representation of all the known theorems, and searching for “cracks” in its coverage of the proof graph so far.

  2. Increasing evolvability. A certain kind of design becomes easier to modify in radical ways as it becomes more complex; specifically, designs that heavily exploit “design leverage”: each element does almost nothing but trigger the other elements to do things. Genomes and extremely modular software systems are two examples. As the system acquires new small elements, they open up new degrees of freedom for future small changes that produce large, beneficial effects. I’d like to experiment with pushing a computational system based on “design for evolvability” to adapt extremely well to some tricky domain.

Both of these are attempts to find a vocabulary to explain how it is possible to explore domains when you don't already have a conceptual map to guide you.
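To make the first of these concrete, here is a tiny, entirely hypothetical sketch (none of the names below come from any existing theorem-prover): “theorems” are plain strings, the only inference rule is modus ponens over implications written "a->b", and a candidate theorem’s interestingness is crudely proxied by how many new one-step derivations it would enable, i.e., by how much it increases the ability to search the proof graph.

```python
# Toy sketch: explore a "proof graph" by always adding the theorem that
# most increases our ability to derive more theorems. This is a stand-in
# for the real interestingness measure, not an implementation of it.

def modus_ponens(known):
    """Yield theorems derivable in one step from the known set."""
    for t in known:
        if "->" in t:
            antecedent, consequent = t.split("->", 1)
            if antecedent in known and consequent not in known:
                yield consequent

def explore(axioms, steps=10):
    known = set(axioms)
    for _ in range(steps):
        candidates = list(modus_ponens(known))
        if not candidates:
            break
        # Score each candidate by the new derivations it would enable.
        def enables(c):
            return sum(1 for _ in modus_ponens(known | {c}))
        known.add(max(candidates, key=enables))
    return known

theorems = explore(["p", "p->q", "q->r", "r->s"])
print(sorted(theorems))
```

Even in this toy, the search per step looks only at what is immediately derivable, in the spirit of the “close to O(1)” restriction: a proof is either one obvious step away or it is ignored for now.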

Other curiosities

Here are some more things I’m curious about. I’m still figuring out how or whether to turn some of these into research projects to do during my Ph.D. Some of these you could call philosophy, others psychology (cognitive science), and still others computer science or informatics. These might all seem like a chaotic mish-mash, but I hope you can see that they are just a variety of different, specific aspects of that one fundamental thing discussed above. Here's a way of clustering them:


A paradox of pedagogy

In elementary algebra, there are reasons why polynomials are considered important, but they emerge somewhat holistically from vast experience. Consequently they are very hard to explain to beginners. This creates a difficulty in any sort of pedagogy, perhaps the greatest difficulty of all: how do you lead a student to understand and appreciate these things, when the student lacks the needed kinds of experience and concepts?

A traditional solution to this problem in many cultures is to “pull authority” on the students by demanding that they do things without understanding the reasons for them. “You’ll understand when you’re older.” I think that, on the one hand, an authoritarian approach cannot be completely eliminated, but on the other hand, if you give some thought to why things are important, and what kinds of experiences and lines of thought would lead a person to see that, you can often find a pretty quick set of activities to do or things to point out, which give the student some very valuable first-hand understanding. (Looking to the history of the subject nearly always turns something up.)

Another example of this sort of paradox is that sometimes you need to understand the whole in order to understand the parts, but you have no choice but to start with the parts. For example, Ancient Chinese tends to be very terse: so terse that it’s difficult to make grammatical sense of it. I’ve heard that a fundamental heuristic for parsing Confucius’s sentences is, “Just think of what Confucius would probably say with these words.” Of course, if you don’t already know what sort of thinker Confucius was, you might find this difficult. But if you spent enough time puzzling over those terse sentences, you might gradually get a sense of what kind of thinker Confucius was, which in turn would enable you to parse his sentences.

The immediate research interest, though, is just to collect a lot of examples of these things that are hard to explain to a beginner, and examine the relationship between the “hard to get” knowledge and the ideas that emerged as important.

Rhetorical clash

There are different, incompatible sets of unwritten rules of rhetoric. I think those reflect different, incompatible ways that people represent things in their brains.

There are lots of incompatible underlying frameworks that make things persuasive or relevant, and when people are not using the same framework, they often think the other person is being unreasonable. A quick list of some differences: management vs. understanding, cognitive delegation vs. first-hand understanding, negotiation vs. psychological attunement.

The type of rhetorical clash I find most interesting, though, is between two “rules”, described here in extremes: make all meaning independent of context, or entangle all meaning with context.

By “context”, I don’t just mean “the surrounding words”, I mean everything else in the world, that relates in any way to the subject at hand. I especially mean the concrete reality of whatever you’re dealing with.

Should you make meaning independent of context or entangle meaning with context? I think the answer is obvious: “It depends.” But it appears to me that people have strong preferences about how context-independent or context-entangled their approach to meaning will be, and they hold those preferences fairly consistently across varying circumstances. For example, the uproar over “situational ethics” illustrates two opposed ways of grasping meaning. Hypothesis: If people simply had a vocabulary to describe this variable, they would naturally find themselves becoming more flexible, better able to choose an approach that fits the topic. (If this hypothesis is false, and people can’t really vary this variable, that would be really interesting.)

One thing I’d like to do is just catalog various rhetorical styles and figure out why people found them appealing. For example, the theological argumentation of the Protestant Reformation strikes me as having a peculiar, consistent quality: it’s somewhat legalistic even while it tries to make sense of a not-very-legalistic text, it intelligently seeks consistency at a very deep level, it’s permeated by a dour tone of “you’re not good enough”, and, at least as it seems to me, it pretty well reverses the main ideas of the very text it works so hard to treat as authoritative; the manner of thinking almost necessitates this. Just the heavy emphasis on exegesis of an authoritative text is a kind of rhetorical style (or even “style of logic”).

Semantic arguments and sophistry

I don’t like semantic arguments. I want to put an end to them. And what better way than to research them in detail and write dry, scholarly papers about them? Seriously, if people had some conceptual vocabulary for describing semantic arguments and what’s wrong with them, I think they’d find themselves almost unable to continue engaging in them.

I expect, though, that examining semantic arguments will yield some surprising insights about other things. Now, I believe that the solution to all semantic arguments is pretty simple: just give some examples, draw the distinction that you want to talk about, designate it with a word, and off you go. I have come across a strange objection to stipulative definitions, though: I’ve heard people object, say, to professors’ declaring that they will use a certain word to have a more-precise meaning than usual, because this is somehow overbearing—that the professor has no right to use a word to stand for a meaning of his choosing. That objection strikes me as insane, but I’ve come across it in enough forms that it seems clear that there is some connection in the human brain between semantics and social dominance.

Logic (in the old sense)

Logical structure

Over time, “logic” has grown more and more to mean “rules to follow mechanically to be sure you never derive a falsehood from a truth.” I think there are lots of other important things to know about how ideas connect logically to other ideas. Some are opposed, even. For example, the logical structure of most legal arguments is quite different from the logical structure in The Origin of Species. A lack of vocabulary to name these differences has often led people to discount or misunderstand one or the other.

Perhaps the most important part of logical structure is the way in which single facts gain importance due to the logical structure they’re part of. For example, a few measurements provided strong evidence that there is no luminiferous ether. But very often when people say “studies show (some conclusion)”, the facts in the study don’t really show much of anything. For example, if 70 out of 100 students got higher grades by using Technique X, does that prove that you will get higher grades by using Technique X? Does it mean there’s a 70% chance? The error here is using statistical techniques that help sort out multiple causes to defend a conclusion that amounts to “this one cause produces this effect”.

A peculiar kind of logical relationship is that between a summary and the details that it summarizes. We all know that one can summarize "logically" or "illogically", but what vocabulary do we have for describing the difference? An illogical summary gives undue importance to trivialities or omits matters of importance. These are matters of how the elements fit into their logical structure, but there is surely more to say than that.

Information is lost in a summary, but something is gained, too: a kind of vagueness. Vagueness is extremely valuable: it’s one of our strategies for dealing with the unknown. There is a kind of logical relationship between vague ideas and both the known and unknown things they relate to—the past basis and the future ability to navigate.

The unstated reasons why various mathematical definitions and theorems are considered important

Have you ever wondered why they spend a year or two of high school making you learn all about polynomials? If you ever looked at group theory, did you ever wonder why groups are so important? Why, for example, is associativity included in the definition?

There are answers to these questions, but they're hard to explain, especially to a beginner. I think one important reason why polynomials are important is that in the physical sciences, differential equations describe causal relations, but in practice most differential equations lack closed-form solutions, meaning that there is no finite series of steps to calculate, say, where the rocket will be at time t. But, in practice, these differential equations can be approximated by polynomials to any desired accuracy, and polynomials are straightforward and easy to calculate.
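That claim in miniature (a textbook Taylor-series fact, nothing novel): the differential equation dy/dt = -y with y(0) = 1 has the solution e^(-t), which cannot be computed exactly in a finite series of arithmetic steps, but its Taylor polynomials can, and their error shrinks rapidly as the degree grows.

```python
import math

def taylor_exp_neg(t, degree):
    """Polynomial approximation of e^(-t): the sum of (-t)^k / k!."""
    return sum((-t) ** k / math.factorial(k) for k in range(degree + 1))

t = 0.5
exact = math.exp(-t)
for degree in (2, 4, 8):
    approx = taylor_exp_neg(t, degree)
    print(f"degree {degree}: {approx:.6f}  (error {abs(approx - exact):.2e})")
```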

I'd like to spend a solid month doing nothing but grilling mathematicians about these things, and assemble a "map" of the main concepts across the main topics of mathematics, showing the kinds of difficulties that arise from some pieces and how those difficulties are resolved by other pieces. I find that kind of explanation of importance particularly fascinating.

Describing the unknown

We are constantly poking around in the unknown, finding things and adding them to our knowledge. Just asking someone their name is fishing in the unknown. We expect that the person would have a name; we just don’t know it yet. That’s probably the easiest way that the unknown relates to the known. I call it “multiple choice”.

At the opposite extreme are revolutionary scientific theories, like the microbe theory of disease, relativity, and evolution by natural selection. These are so far “outside the box” of previous understanding that you can’t even ask well-defined questions ahead of time. There is no way to systematically find this kind of unknown. I call this region of the unknown “the yawning void”. The yawning void is everything that doesn’t have a place in your current conceptual framework.

Somewhere in between is any new person that you meet. I think that other people are much further out in the “yawning void” than we usually suspect. When you get to know someone closely, you invariably find that they are different from you in ways that you had no concepts to describe, no way to imagine in advance.

Working out a more-refined vocabulary for ways to talk about the unknown sounds to me like a pretty do-able research project.

*** I once gave a conference talk about the yawning void—at a conference on software specification. Slides, summary.

Incompatible conceptual frameworks for roughly the same things

Sometimes different people, or different cultures, develop conceptual frameworks that seem to be about the same things but actually don’t correspond. Each is tuned in to the other’s yawning void, but they think they’re semantically aligned because they don’t know that a non-alignment is possible. This often leads to conceptual “impedance-mismatches”: ways in which one set of ideas gets woefully distorted when translated one-to-one into another.

For example, when people today read Aristotle, they often think that he was just doing a dopey job of inventing set theory. I think what’s happening is that Aristotle had a weird theory of causation, and his theory of logic is mostly concerned with how human understanding relates to causes—and none of this fits anywhere in modern philosophy. The “transplanted” Aristotle looks like a bumbler, making “undergrad” mistakes with the modern concepts (like “All A are B” implies “There exist some As”). I hesitate even to use the word “causation” to characterize the mismatch, because the kinds of things in Aristotle that get translated as “causes” don’t fit contemporary language well. (For example, Aristotle’s “the from what” gets translated as “material cause”.)

You don’t need to go to ancient philosophy to find conceptual impedance-mismatches, though. (BTW, I think “impedance-mismatch” is a weak metaphor. If you know a better one, please email me.) When people make plans about things involving information, such as software development, they are often surprised to discover, if they rely heavily on written “requirements documents”, that they were semantically out of sync. It simply can’t be avoided. A very simple example from software engineering is trying to export data from one calendar program to another. Even a translation of something so rigid and mathematical invariably results in “Procrustean fits”.

The logical forces that led scientific concepts to change

Definitions change, because the conceptual frameworks in which the distinctions are made change. For example, “gravity” comes from the Latin word for “heaviness”, but eventually we redefined it to mean an attractive force between all masses (and redefined it even more after that).

I'd like to closely research the history of various fundamental scientific concepts and terms, and identify what considerations led to each change. One in particular, that I could spend a solid month on, is energy. The word was originally coined by Aristotle, one of his many clumsy neologisms. Translated into Anglo-Saxon roots, energy is "at-work-ness". That fit into Aristotle's worldview, where biology was primary (rather than mechanistic physics), and one of the most important ideas is that organisms sometimes are engaged in their distinctive activity, and sometimes not (because they're still immature, because they're prevented from doing so, etc.). When the word got transplanted into the Newtonian/atomistic view of the world, energy underwent a very radical change, and still another in Einsteinian physics. Along the way, it spawned a sibling: entropy. (Even the word physics started off with a biological orientation, which it has now completely lost: the ancient Greek word meant something more like "what comes to be by birth and growth".)


Causation

Causation is one of the most important ideas we have for making sense of the world, but we seem to almost completely lack vocabulary for talking about it. I think causation is the worldly (“external”) basis of logic and concepts. You can’t really understand logic unless you understand causation.

Usually when people explain something or talk about causes, they talk as if there must be a single cause that produced the effect in question. For example, suppose some people put on a play, and it went poorly. People will often search for a single causal factor to “blame”. But really, everything that ever really happens results from trillions of simultaneous causes. The air in the room (try putting on a play without it), the acoustics of the room, microbes in a performer’s bloodstream, microbes in an audience member’s bloodstream, the overall world economic situation, the length of the day (this affects mood), other plays that the audience has seen, other life events of the performers and audience members, the natural rhythm of getting worse and getting better as you continue to practice, and nobody knows what else.

Another area where I think a vocabulary for causation would do some good is getting past the kind of confusion where one person says something like, “Garages are best placed in the back of the house, facing an alley,” and someone else objects, “But in some towns, they don’t have alleys.” The first statement is meant to identify a resolution of many simultaneous forces at play in homes and towns, such as the need for parking, the need for buildings to have a “public face”, the need for people to walk along pleasant routes, etc. When we make highly general or universal claims, we are usually talking about causal forces, not proposing simplistic rules like “100% of all garages should go in the back of the house, regardless of the situation.” (See "Rhetorical clash" for why such claims are sometimes misconstrued so crudely.) Obviously, on a street with no alley, you couldn’t make a garage face the alley. But by exploring the causal forces involved, you can sniff out good designs.

Christopher Alexander’s work on building and town design is an inspiration here, along with the “patterns” community in software engineering.

Heuristic (discovery)

Note: People sometimes use the word heuristic to mean "a rule of thumb" or an imperfect, rough-and-ready method of making an estimate or guess. I am using it here strictly in the older sense of "the art and science of discovery", or a specific technique for discovery.


What makes something interesting—a news story, a mathematical idea, a scientific paper, a piece of gossip, a move in a game of chess, a period in history, a fact? No doubt there are many factors, but one I'd like to focus on is: portending a discovery of a sort that has no place in your current conceptual framework. Or in terminology from above, something is interesting if it seems to provide an opening into the yawning void. That is, it suggests to you that by pursuing this lead further, you will change the mental framework in terms of which you are understanding the situation.

Now, what about a fact could suggest such unspecified future cognitive shifts? I think it's a combination of these things: the fact seems pretty solid (that is, definitely "known"), it exists within a domain that we have already mapped out with some pretty solid logical/mathematical relationships, and yet it doesn't fit neatly into those relationships even though it "should". More succinctly: unexplained facts are interesting. You expect that the fact has an explanation, but you don't know what it is yet.

Here's how I'd like to research this kind of interestingness: do an activity, or write a computer program to do an activity, in a way that makes seeking out interestingness the highest priority.

I've done this while playing board games: instead of playing what seems like "the best move", defined by the criteria of the game (in chess, checkmating the opponent's king; in go, surrounding the most territory), play "the most interesting move". My experience was that my apparent skill level greatly increased. Hypothesis: My play was "smarter than I was", because I was exploiting the unknown; the unknown was on my side in some way. I'm sure it didn't hurt that the unfamiliarity of the resulting situations threw my opponents off their game a little, too.

A computer program that could model interestingness in some domain could generate some, er, interesting results. Modeling interestingness is hard, though. Before you can do that, you need to model having a conceptual framework: both navigating by it and modifying it in response to new stimuli that don't fit the framework. That's hard because it's messy and sloppy in a way that is hard to model (mostly what we know how to do with computers is implement extreme neatness and rigidity), and even beyond that, you would have to implement the "calculation" of the degree of interestingness, and then translate that into an action to explore a given avenue. So this is not a research project to do right away.

I think, though, that the following idea should be workable in some form of relatively modest research: the best move, by the criteria of the game, cannot be calculated, but interestingness can. To illustrate: In go, the common saying is "Play the biggest move", meaning the move that increases the size of your territory the most, relative to your opponent's. If you think about that for a minute, you'll realize that it's almost completely vacuous advice. It's like "buy low and sell high" in stock-trading. To determine the "size" of a move, especially in the early and middle stages of the game, would require more look-ahead than can possibly be done. Almost all of the relevant information is unknown to you, and unknowable, because the mathematical space is too vast and too craggy. But there is a way in which some of that unknown protrudes into your present, knowable situation: the sense that some moves are scary, hard to evaluate, neither obviously good nor obviously bad. That difficulty of evaluation is interestingness. By playing the most interesting move, you tap into the unknown, and you do it in a way that is feasible. "Play the most interesting move" is scary advice, but it's not vacuous.
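Here is one hypothetical way to render “interestingness is computable” as code (my own toy proxy, not a go engine): give each candidate move a handful of cheap, noisy evaluations, and treat the spread of those evaluations as its interestingness. The (value, difficulty) pairs below are made-up stand-ins for real positions.

```python
import random
import statistics

def noisy_eval(move, noise):
    """Hypothetical cheap evaluator: a move's true value, blurred by
    noise proportional to how hard the move is to evaluate."""
    value, difficulty = move
    return value + random.gauss(0, difficulty * noise)

def most_interesting(moves, samples=20, noise=1.0):
    """Pick the move whose evaluations disagree the most: difficulty
    of evaluation as a computable proxy for interestingness."""
    def spread(move):
        return statistics.stdev(noisy_eval(move, noise) for _ in range(samples))
    return max(moves, key=spread)

random.seed(0)  # for reproducibility
moves = [(0.5, 0.1), (0.4, 2.0), (0.6, 0.3)]  # (true value, difficulty)
print(most_interesting(moves))  # the hard-to-evaluate move wins
```

The point of the sketch is only that the spread is cheap to compute while the true value is not; a move everyone agrees about, good or bad, is boring by this measure.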

Ways that information percolates

I’ve read that before the Green Revolution, the water temples of Bali had achieved an extraordinary balance among all the causal forces in the system, including the insects and their capacity to reproduce out of control. I assume (perhaps wrongly) that they had no rigorous, “scientifically” tested mathematical theory of all those factors. Somehow, the real-world ecological information had percolated into their religious rituals.

To take another example, if you are catching a cold, sometimes you get a craving for just the sort of food you need to stop the cold. You don’t consciously know what molecule you are seeking. Somehow, the information about the molecule, its ability to help you fight a cold, and the fact that it is in the food you crave, have all percolated from the raw reality into your conscious mind, without your understanding a thing or doing anything systematically. I think that’s pretty amazing, especially when you consider that the feedback loop surrounding this unknown molecule even gets extended into the economic system, when it leads you to walk to the grocery store and buy carrots or whatever you were craving, and thereby affects the price of carrots.

How groups of people can be smarter or stupider than individuals

Wiki pages are often written by an “intelligence” greater than that of any of their contributors. Wikipedia is a great example. Decisions made by committees are often much less intelligent than what any individual member would do. But there are ways of getting groups to consistently make decisions that are smarter than what any individual member could do. They involve peculiar, slightly unnatural constraints, which create a peculiar, unexpected freedom of exploration.

Wild hypothesis: The mechanisms that make groups of people smarter than their members are pretty similar to the mechanisms that make non-conscious things work together to form "smart" things: enzymes work together to make metabolic networks and entire living things; neurons work together to make a control system for the body; low-level brain representations (presumably) work together to make consciousness. More to the point, perhaps, watching human groups collaborate might reveal mechanisms to search for in those lower-level networks.

How improvisational theater can possibly work

Improvisational theater is much easier than it looks, even though, when it’s done well, every moment is pure discovery and the actors truly have no more idea of what’s going to happen than the audience does. I think the same sorts of techniques are at work in all sorts of human creativity. In improv, these heuristics appear in an especially concentrated form. Improv is also a structure that makes the group behave much more intelligently than any of its members.

I think improv works because the basic techniques “stack the deck” so that you can pretty much stumble onto extraordinary material. Here’s an example of this deck-stacking: a basic trick in improv is to use concrete words. Concrete words trigger imagination much better than abstract words. (Not in all cases, of course; see "Causation" above.) But why does using concrete words work so well? Because concrete words are heavily enmeshed in networks of associations with emotions and other concrete words. They are so richly enmeshed that you can “modulate” the kinds of connections you make, by choosing a style, an attitude, a personality, a motive, etc., to find connections that fit that characteristic—and you can connect in that chosen way with almost any other concrete thing via a hop or two.

Improv culture includes explicit teaching of various heuristics for spontaneously and collaboratively making scenes that are interesting, coherent, and emotionally rich. The most famous of these heuristics is "yes-and": whatever your scene partner has put into the scene, say "yes" to it (accept it as real) "and" add more information to it (add detail, move the action a step forward, connect to it). Others are remarkably close to "play the most interesting move", like “raise the stakes”. When an improv scene is going well, you have the feeling that the scene is "smarter than you are", that you are an organic part of a collective intelligence made from all the players.