Ben Kovitz’s home page

I’m a Ph.D. student in cognitive science & computer science at Indiana University.

Email address: bkovitz

"Experiments with Cascading Design", EvoEvo 2015. Slides: Keynote PDF

"Tagging in Metaheuristics" in Workshop on Metaheuristic Design Patterns, GECCO 2014
paper | talk

"Structural Stigmergy: A Speculative Pattern Language for Metaheuristics" in Workshop on Metaheuristic Design Patterns, GECCO 2014
paper | talk

"Slide Rules: Math that Fits the Hand and Eye" at Enfascination 2012

Main research interests

(Slightly edited from my application to grad school.)

I am interested in how representations of a domain grow from being incoherent or poorly fitted (“confused”, “murky”, “poorly defined”) to being highly structured and tuned to the domain (“clear and rigorous”, “able to approach the topic systematically”, “well-defined”). This includes especially the way that a change from one representation to another can suddenly make a complex problem simple or make obscure features obvious.

This is both a computer-science interest and a cognitive-science interest. A lot of thought about logic says, “Everything should be clearly defined—no fumbling around and no vagueness. That’s the only way to approach things rationally.” This kind of advice was once fairly common in software engineering: get systematic and formal during requirements so everything will go smoothly during architecture, coding, etc. “Rationality” and writing computer programs would seem to require that one’s domain be bounded, all criteria be defined, and all terms of discourse be locked onto their meanings before thought even begins.

I am most interested in the “murky” period before there is clarity. The ability to fumble around finding a good way to bound a domain, cast about for good criteria for success or goodness, and let the semantics of terms slip around to become more expressive is what makes us smarter than computers. But those processes are themselves computational, and can themselves be mathematically modeled. Such modeling would serve two purposes: accumulation of useful search heuristics suitable for computers, and helping understand something of the “messy” nature of real-world human reasoning. I believe it would also shed some light on how brains and living things in general—and computers, before long—navigate the unknown and adapt to what they cannot anticipate.

The above are very broad interests, of course. For specific research projects to start right away, I am currently most interested in these:

  1. Automated theorem-proving, where instead of starting with a theorem and searching for a proof, the program starts with some axioms and searches for the interesting proofs. In other words, the program’s job is to explore the “proof graph” and sniff out the most interesting nodes. What makes a node interesting? Well, that’s a research question. But I have a tentative answer: a node is interesting if, before you had it, it was hard to find, and if finding it increases your ability to search the proof graph.

    It will not have escaped your notice that there is a huge problem with this: the vast majority of all the theorems in any system are boring. How does the program get around that? Well, that’s a research question. But I have a tentative answer: by representing the current set of “known” theorems in a very compressed way, which makes searching by simple pattern recognition easy. The direction I’m exploring right now is to restrict all searches for proofs to something close to O(1) time: the system either recognizes a proof as “obvious” very quickly, or it gives up. The system spends most of its time refining its representation of all the known theorems, and searching for “cracks” in its coverage of the proof graph so far.

  2. Increasing evolvability. A certain kind of design becomes easier to modify in radical ways as it becomes more complex; specifically, designs that heavily exploit “design leverage”: each element does almost nothing but trigger the other elements to do things. Genomes and extremely modular software systems are two examples. As the system acquires new small elements, they open up new degrees of freedom for future small changes that produce large, beneficial effects. I’d like to experiment with pushing a computational system based on “design for evolvability” to adapt extremely well to some tricky domain.

Both of these are attempts to find a vocabulary to explain how it is possible to explore domains when you don't already have a conceptual map to guide you.
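To make the first of these concrete, here is a tiny, entirely hypothetical sketch (none of the names below come from any existing theorem-prover): “theorems” are plain strings, the only inference rule is modus ponens over implications written "a->b", and a candidate theorem’s interestingness is crudely proxied by how many new one-step derivations it would enable, i.e., by how much it increases the ability to search the proof graph.

```python
# Toy sketch: explore a "proof graph" by always adding the theorem that
# most increases our ability to derive more theorems. This is a stand-in
# for the real interestingness measure, not an implementation of it.

def modus_ponens(known):
    """Yield theorems derivable in one step from the known set."""
    for t in known:
        if "->" in t:
            antecedent, consequent = t.split("->", 1)
            if antecedent in known and consequent not in known:
                yield consequent

def explore(axioms, steps=10):
    known = set(axioms)
    for _ in range(steps):
        candidates = list(modus_ponens(known))
        if not candidates:
            break
        # Score each candidate by the new derivations it would enable.
        def enables(c):
            return sum(1 for _ in modus_ponens(known | {c}))
        known.add(max(candidates, key=enables))
    return known

theorems = explore(["p", "p->q", "q->r", "r->s"])
print(sorted(theorems))
```

Even in this toy, the search per step looks only at what is immediately derivable, in the spirit of the “close to O(1)” restriction: a proof is either one obvious step away or it is ignored for now.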

Other curiosities

Here are some more things I’m curious about. I’m still figuring out how or whether to turn some of these into research projects to do during my Ph.D. Some of these you could call philosophy, others psychology (cognitive science), and still others computer science or informatics. These might all seem like a chaotic mish-mash, but I hope you can see that they are just a variety of different, specific aspects of that one fundamental thing discussed above. Here's a way of clustering them:


A paradox of pedagogy

In elementary algebra, there are reasons why polynomials are considered important, but they emerge somewhat holistically from vast experience. Consequently they are very hard to explain to beginners. This creates a difficulty in any sort of pedagogy, perhaps the greatest difficulty of all: how do you lead a student to understand and appreciate these things, when the student lacks the needed kinds of experience and concepts?

A traditional solution to this problem in many cultures is to “pull authority” on the students by demanding that they do things without understanding the reasons for them. “You’ll understand when you’re older.” I think that, on the one hand, an authoritarian approach cannot be completely eliminated, but on the other hand, if you give some thought to why things are important, and what kinds of experiences and lines of thought would lead a person to see that, you can often find a pretty quick set of activities to do or things to point out, which give the student some very valuable first-hand understanding. (Looking to the history of the subject nearly always turns something up.)

Another example of this sort of paradox is that sometimes you need to understand the whole in order to understand the parts, but you have no choice but to start with the parts. For example, Ancient Chinese tends to be very terse: so terse that it’s difficult to make grammatical sense of it. I’ve heard that a fundamental heuristic for parsing Confucius’s sentences is, “Just think of what Confucius would probably say with these words.” Of course, if you don’t already know what sort of thinker Confucius was, you might find this difficult. But if you spent enough time puzzling over those terse sentences, you might gradually get a sense of what kind of thinker Confucius was, which in turn would enable you to parse his sentences.

The immediate research interest, though, is just to collect a lot of examples of these things that are hard to explain to a beginner, and examine the relationship between the “hard to get” knowledge and the ideas that emerged as important.

Rhetorical clash

There are different, incompatible sets of unwritten rules of rhetoric. I think those reflect different, incompatible ways that people represent things in their brains.

There are lots of incompatible underlying frameworks that make things persuasive or relevant, and when people are not using the same framework, they often think the other person is being unreasonable. A quick list of some differences: management vs. understanding, cognitive delegation vs. first-hand understanding, negotiation vs. psychological attunement.

The type of rhetorical clash I find most interesting, though, is between two “rules”, described here in extremes: make all meaning independent of context, or entangle all meaning with context.

By “context”, I don’t just mean “the surrounding words”, I mean everything else in the world, that relates in any way to the subject at hand. I especially mean the concrete reality of whatever you’re dealing with.

Should you make meaning independent of context or entangle meaning with context? I think the answer is obvious: “It depends.” But it appears to me that people have strong preferences about how context-independent or context-entangled their approach to meaning will be, and they hold those preferences fairly consistently across varying circumstances. For example, the uproar over “situational ethics” illustrates two opposed ways of grasping meaning. Hypothesis: If people simply had a vocabulary to describe this variable, they would naturally find themselves becoming more flexible, better able to choose an approach that fits the topic. (If this hypothesis is false, and people can’t really vary this variable, that would be really interesting.)

One thing I’d like to do is just catalog various rhetorical styles and figure out why people found them appealing. For example, the theological argumentation of the Protestant Reformation strikes me as having a peculiar, consistent quality: it’s somewhat legalistic even while it tries to make sense of a not-very-legalistic text, it intelligently seeks consistency at a very deep level, it’s permeated by a dour tone of “you’re not good enough”, and, at least as it seems to me, it pretty well reverses the main ideas of the very text it works so hard to treat as authoritative; the manner of thinking almost necessitates this. Just the heavy emphasis on exegesis of an authoritative text is a kind of rhetorical style (or even “style of logic”).

Semantic arguments and sophistry

I don’t like semantic arguments. I want to put an end to them. And what better way than to research them in detail and write dry, scholarly papers about them? Seriously, if people had some conceptual vocabulary for describing semantic arguments and what’s wrong with them, I think they’d find themselves almost unable to continue engaging in them.

I expect, though, that examining semantic arguments will yield some surprising insights about other things. Now, I believe that the solution to all semantic arguments is pretty simple: just give some examples, draw the distinction that you want to talk about, designate it with a word, and off you go. I have come across a strange objection to stipulative definitions, though: I’ve heard people object, say, to professors’ declaring that they will use a certain word to have a more-precise meaning than usual, because this is somehow overbearing—that the professor has no right to use a word to stand for a meaning of his choosing. That objection strikes me as insane, but I’ve come across it in enough forms that it seems clear that there is some connection in the human brain between semantics and social dominance.

Logic (in the old sense)

Logical structure

Over time, “logic” has grown more and more to mean “rules to follow mechanically to be sure you never derive a falsehood from a truth.” I think there are lots of other important things to know about how ideas connect logically to other ideas. Some are opposed, even. For example, the logical structure of most legal arguments is quite different from the logical structure in The Origin of Species. A lack of vocabulary to name these differences has often led people to discount or misunderstand one or the other.

Perhaps the most important part of logical structure is the way in which single facts gain importance due to the logical structure they’re part of. For example, a few measurements provided strong evidence that there is no luminiferous ether. But very often when people say “studies show (some conclusion)”, the facts in the study don’t really show much of anything. For example, if 70 out of 100 students got higher grades by using Technique X, does that prove that you will get higher grades by using Technique X? Does it mean there’s a 70% chance? The error here is using statistical techniques that help sort out multiple causes to defend a conclusion that amounts to “this one cause produces this effect”.

A peculiar kind of logical relationship is that between a summary and the details that it summarizes. We all know that one can summarize "logically" or "illogically", but what vocabulary do we have for describing the difference? An illogical summary gives undue importance to trivialities or omits matters of importance. These are matters of how the elements fit into their logical structure, but there is surely more to say than that.

Information is lost in a summary, but something is gained, too: a kind of vagueness. Vagueness is extremely valuable: it’s one of our strategies for dealing with the unknown. There is a kind of logical relationship between vague ideas and both the known and unknown things they relate to—the past basis and the future ability to navigate.

The unstated reasons why various mathematical definitions and theorems are considered important

Have you ever wondered why they spend a year or two of high school making you learn all about polynomials? If you ever looked at group theory, did you ever wonder why groups are so important? Why, for example, is associativity included in the definition?

There are answers to these questions, but they're hard to explain, especially to a beginner. I think one important reason why polynomials are important is that in the physical sciences, differential equations describe causal relations, but in practice most differential equations lack closed-form solutions, meaning that there is no finite series of steps to calculate, say, where the rocket will be at time t. But, in practice, these differential equations can be approximated by polynomials to any desired accuracy, and polynomials are straightforward and easy to calculate.
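That claim in miniature (a textbook Taylor-series fact, nothing novel): the differential equation dy/dt = -y with y(0) = 1 has the solution e^(-t), which cannot be computed exactly in a finite series of arithmetic steps, but its Taylor polynomials can, and their error shrinks rapidly as the degree grows.

```python
import math

def taylor_exp_neg(t, degree):
    """Polynomial approximation of e^(-t): the sum of (-t)^k / k!."""
    return sum((-t) ** k / math.factorial(k) for k in range(degree + 1))

t = 0.5
exact = math.exp(-t)
for degree in (2, 4, 8):
    approx = taylor_exp_neg(t, degree)
    print(f"degree {degree}: {approx:.6f}  (error {abs(approx - exact):.2e})")
```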

I'd like to spend a solid month doing nothing but grilling mathematicians about these things, and assemble a "map" of the main concepts across the main topics of mathematics, showing the kinds of difficulties that arise from some pieces and how those difficulties are resolved by other pieces. I find that kind of explanation of importance particularly fascinating.

Describing the unknown

We are constantly poking around in the unknown, finding things and adding them to our knowledge. Just asking someone their name is fishing in the unknown. We expect that the person would have a name; we just don’t know it yet. That’s probably the easiest way that the unknown relates to the known. I call it “multiple choice”.

At the opposite extreme are revolutionary scientific theories, like the microbe theory of disease, relativity, and evolution by natural selection. These are so far “outside the box” of previous understanding that you can’t even ask well-defined questions ahead of time. There is no way to systematically find this kind of unknown. I call this region of the unknown “the yawning void”. The yawning void is everything that doesn’t have a place in your current conceptual framework.

Somewhere in between is any new person that you meet. I think that other people are much further out in the “yawning void” than we usually suspect. When you get to know someone closely, you invariably find that they are different from you in ways that you had no concepts to describe, no way to imagine in advance.

Working out a more-refined vocabulary for ways to talk about the unknown sounds to me like a pretty do-able research project.

*** I once gave a conference talk about the yawning void—at a conference on software specification. Slides, summary.

Incompatible conceptual frameworks for roughly the same things

Sometimes different people, or different cultures, develop conceptual frameworks that seem to be about the same things but actually don’t correspond. Each is tuned in to the other’s yawning void, but they think they’re semantically aligned because they don’t know that a non-alignment is possible. This often leads to conceptual “impedance-mismatches”: ways in which one set of ideas gets woefully distorted when translated one-to-one into another.

For example, when people today read Aristotle, they often think that he was just doing a dopey job of inventing set theory. I think what’s happening is that Aristotle had a weird theory of causation, and his theory of logic is mostly concerned with how human understanding relates to causes—and none of this fits anywhere in modern philosophy. The “transplanted” Aristotle looks like a bumbler, making “undergrad” mistakes with the modern concepts (like “All A are B” implies “There exist some As”). I hesitate even to use the word “causation” to characterize the mismatch, because the kinds of things in Aristotle that get translated as “causes” don’t fit contemporary language well. (For example, Aristotle’s “the from what” gets translated as “material cause”.)

You don’t need to go to ancient philosophy to find conceptual impedance-mismatches, though. (BTW, I think “impedance-mismatch” is a weak metaphor. If you know a better one, please email me.) When people make plans about things involving information, such as software development, they are often surprised to discover, if they rely heavily on written “requirements documents”, that they were semantically out of sync. It simply can’t be avoided. A very simple example from software engineering is trying to export data from one calendar program to another. Even a translation of something so rigid and mathematical invariably results in “Procrustean fits”.

The logical forces that led scientific concepts to change

Definitions change, because the conceptual frameworks in which the distinctions are made change. For example, “gravity” comes from the Latin word for “heaviness”, but eventually we redefined it to mean an attractive force between all masses (and redefined it even more after that).

I'd like to closely research the history of various fundamental scientific concepts and terms, and identify what considerations led to each change. One in particular, that I could spend a solid month on, is energy. The word was originally coined by Aristotle, one of his many clumsy neologisms. Translated into Anglo-Saxon roots, energy is "at-work-ness". That fit into Aristotle's worldview, where biology was primary (rather than mechanistic physics), and one of the most important ideas is that organisms sometimes are engaged in their distinctive activity, and sometimes not (because they're still immature, because they're prevented from doing so, etc.). When the word got transplanted into the Newtonian/atomistic view of the world, energy underwent a very radical change, and still another in Einsteinian physics. Along the way, it spawned a sibling: entropy. (Even the word physics started off with a biological orientation, which it has now completely lost: the ancient Greek word meant something more like "what comes to be by birth and growth".)


Causation

Causation is one of the most important ideas we have for making sense of the world, but we seem to almost completely lack vocabulary for talking about it. I think causation is the worldly (“external”) basis of logic and concepts. You can’t really understand logic unless you understand causation.

Usually when people explain something or talk about causes, they talk as if there must be a single cause that produced the effect in question. For example, suppose some people put on a play, and it went poorly. People will often search for a single causal factor to “blame”. But really, everything that ever really happens results from trillions of simultaneous causes. The air in the room (try putting on a play without it), the acoustics of the room, microbes in a performer’s bloodstream, microbes in an audience member’s bloodstream, the overall world economic situation, the length of the day (this affects mood), other plays that the audience has seen, other life events of the performers and audience members, the natural rhythm of getting worse and getting better as you continue to practice, and nobody knows what else.

Another area where I think a vocabulary for causation would do some good is getting past the kind of confusion where one person says something like, “Garages are best placed in the back of the house, facing an alley,” and someone else objects, “But in some towns, they don’t have alleys.” The first statement is meant to identify a resolution of many simultaneous forces at play in homes and towns, such as the need for parking, the need for buildings to have a “public face”, the need for people to walk along pleasant routes, etc. When we make highly general or universal claims, we are usually talking about causal forces, not proposing simplistic rules like “100% of all garages should go in the back of the house, regardless of the situation.” (See "Rhetorical clash" for why such claims are sometimes misconstrued so crudely.) Obviously, on a street with no alley, you couldn’t make a garage face the alley. But by exploring the causal forces involved, you can sniff out good designs.

Christopher Alexander’s work on building and town design is an inspiration here, along with the “patterns” community in software engineering.

Heuristic (discovery)

Note: People sometimes use the word heuristic to mean "a rule of thumb" or an imperfect, rough-and-ready method of making an estimate or guess. I am using it here strictly in the older sense of "the art and science of discovery", or a specific technique for discovery.


What makes something interesting—a news story, a mathematical idea, a scientific paper, a piece of gossip, a move in a game of chess, a period in history, a fact? No doubt there are many factors, but one I'd like to focus on is: portending a discovery of a sort that has no place in your current conceptual framework. Or in terminology from above, something is interesting if it seems to provide an opening into the yawning void. That is, it suggests to you that by pursuing this lead further, you will change the mental framework in terms of which you are understanding the situation.

Now, what about a fact could suggest such unspecified future cognitive shifts? I think it's a combination of these things: the fact seems pretty solid (that is, definitely "known"), it exists within a domain that we have already mapped out with some pretty solid logical/mathematical relationships, and yet it doesn't fit neatly into those relationships even though it "should". More succinctly: unexplained facts are interesting. You expect that the fact has an explanation, but you don't know what it is yet.

Here's how I'd like to research this kind of interestingness: do an activity, or write a computer program to do an activity, in a way that makes seeking out interestingness the highest priority.

I've done this while playing board games: instead of playing what seems like "the best move", defined by the criteria of the game (in chess, checkmating the opponent's king; in go, surrounding the most territory), play "the most interesting move". My experience was that my apparent skill level greatly increased. Hypothesis: My play was "smarter than I was", because I was exploiting the unknown; the unknown was on my side in some way. I'm sure it didn't hurt that the unfamiliarity of the resulting situations threw my opponents off their game a little, too.

A computer program that could model interestingness in some domain could generate some, er, interesting results. Modeling interestingness is hard, though. Before you can do that, you need to model having a conceptual framework: both navigating by it and modifying it in response to new stimuli that don't fit the framework. That's hard because it's messy and sloppy in a way that is hard to model (mostly what we know how to do with computers is implement extreme neatness and rigidity), and even beyond that, you would have to implement the "calculation" of the degree of interestingness, and then translate that into an action to explore a given avenue. So this is not a research project to do right away.

I think, though, that the following idea should be workable in some form of relatively modest research: the best move, by the criteria of the game, cannot be calculated, but interestingness can. To illustrate: In go, the common saying is "Play the biggest move", meaning the move that increases the size of your territory the most, relative to your opponent's. If you think about that for a minute, you'll realize that it's almost completely vacuous advice. It's like "buy low and sell high" in stock-trading. To determine the "size" of a move, especially in the early and middle stages of the game, would require more look-ahead than can possibly be done. Almost all of the relevant information is unknown to you, and unknowable, because the mathematical space is too vast and too craggy. But there is a way in which some of that unknown protrudes into your present, knowable situation: the sense that some moves are scary, hard to evaluate, neither obviously good nor obviously bad. That difficulty of evaluation is interestingness. By playing the most interesting move, you tap into the unknown, and you do it in a way that is feasible. "Play the most interesting move" is scary advice, but it's not vacuous.
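Here is one hypothetical way to render “interestingness is computable” as code (my own toy proxy, not a go engine): give each candidate move a handful of cheap, noisy evaluations, and treat the spread of those evaluations as its interestingness. The (value, difficulty) pairs below are made-up stand-ins for real positions.

```python
import random
import statistics

def noisy_eval(move, noise):
    """Hypothetical cheap evaluator: a move's true value, blurred by
    noise proportional to how hard the move is to evaluate."""
    value, difficulty = move
    return value + random.gauss(0, difficulty * noise)

def most_interesting(moves, samples=20, noise=1.0):
    """Pick the move whose evaluations disagree the most: difficulty
    of evaluation as a computable proxy for interestingness."""
    def spread(move):
        return statistics.stdev(noisy_eval(move, noise) for _ in range(samples))
    return max(moves, key=spread)

random.seed(0)  # for reproducibility
moves = [(0.5, 0.1), (0.4, 2.0), (0.6, 0.3)]  # (true value, difficulty)
print(most_interesting(moves))  # the hard-to-evaluate move wins
```

The point of the sketch is only that the spread is cheap to compute while the true value is not; a move everyone agrees about, good or bad, is boring by this measure.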

Ways that information percolates

I’ve read that before the Green Revolution, the water temples of Bali had achieved an extraordinary balance among all the causal forces in the system, including the insects and their capacity to reproduce out of control. I assume (perhaps wrongly) that they had no rigorous, “scientifically” tested mathematical theory of all those factors. Somehow, the real-world ecological information had percolated into their religious rituals.

To take another example, if you are catching a cold, sometimes you get a craving for just the sort of food you need to stop the cold. You don’t consciously know what molecule you are seeking. Somehow, the information about the molecule, its ability to help you fight a cold, and the fact that it is in the food you crave, have all percolated from the raw reality into your conscious mind, without your understanding a thing or doing anything systematically. I think that’s pretty amazing, especially when you consider that the feedback loop surrounding this unknown molecule even gets extended into the economic system, when it leads you to walk to the grocery store and buy carrots or whatever you were craving, and thereby affects the price of carrots.

How groups of people can be smarter or stupider than individuals

Wiki pages are often written by an “intelligence” greater than that of any of their contributors. Wikipedia is a great example. Decisions made by committees are often much less intelligent than what any individual member would do. But there are ways of getting groups to consistently make decisions that are smarter than what any individual member could do. They involve peculiar, slightly unnatural constraints, which create a peculiar, unexpected freedom of exploration.

Wild hypothesis: The mechanisms that make groups of people smarter than their members are pretty similar to the mechanisms that make non-conscious things work together to form "smart" things: enzymes work together to make metabolic networks and entire living things; neurons work together to make a control system for the body; low-level brain representations (presumably) work together to make consciousness. More to the point, perhaps, watching human groups collaborate might reveal mechanisms to search for in those lower-level networks.

How improvisational theater can possibly work

Improvisational theater is much easier than it looks, even though, when it’s done well, every moment is pure discovery and the actors truly have no more idea of what’s going to happen than the audience does. I think the same sorts of techniques are at work in all sorts of human creativity. In improv, these heuristics appear in an especially concentrated form. Improv is also a structure that makes the group behave much more intelligently than any of its members.

I think improv works because the basic techniques “stack the deck” so that you can pretty much stumble onto extraordinary material. Here’s an example of this deck-stacking: a basic trick in improv is to use concrete words. Concrete words trigger imagination much better than abstract words. (Not in all cases, of course; see "Causation" above.) But why does using concrete words work so well? Because concrete words are heavily enmeshed in networks of associations with emotions and other concrete words. They are so richly enmeshed that you can “modulate” the kinds of connections you make, by choosing a style, an attitude, a personality, a motive, etc., to find connections that fit that characteristic—and you can connect in that chosen way with almost any other concrete thing via a hop or two.

Improv culture includes explicit teaching of various heuristics for spontaneously and collaboratively making scenes that are interesting, coherent, and emotionally rich. The most famous of these heuristics is "yes-and": whatever your scene partner has put into the scene, say "yes" to it (accept it as real) "and" add more information to it (add detail, move the action a step forward, connect to it). Others are remarkably close to "play the most interesting move", like “raise the stakes”. When an improv scene is going well, you have the feeling that the scene is "smarter than you are", that you are an organic part of a collective intelligence made from all the players.