Cognitive Science of Philosophy Symposium: Iterated Learning

Welcome to the Brains Blog’s Symposium series on the Cognitive Science of Philosophy. The aim of the series is to examine the use of methods from the cognitive sciences to generate philosophical insight. Each symposium consists of two parts. In the target post, a practitioner describes their use of the method under discussion and explains why they find it philosophically fruitful. A commentator then responds to the target post and discusses the strengths and limitations of the method.

In this symposium, Sara Aronowitz discusses iterated learning as an experimental method and tool for simulation, with Thomas Icard providing commentary on its application to explanation.

* * * * * * * * * * * *

Iterated Learning

Sara Aronowitz
_________

It’s important to experience life and to describe things in four dimensions, not just two or three; that is, you need to cover all of the angles and explore all of the possibilities, and leave your mind open to new possibilities!

This is a quote from a participant in a study I ran with Tania Lombrozo. If you were in our study, you read an explanation, and then explained what you had read to another participant. Each explanation was passed between 8 people, forming a single chain. The quote above is from the 8th and last person. Here’s the very first explanation in the same chain, which we took from a college-level textbook:

Imagine you have to describe to worried school officials the fire that broke out in your room when your roommate tried cooking shish kebabs in the fireplace. You explain that your dorm is at 6400 College Avenue, a street that runs in the left-right direction on a map of your town; you are on the fifth floor, which tells where you are in the up-down direction; and you are the sixth room back from the elevator, which tells where you are in the forward-backward direction. Then you explain that the fire broke out at 6:23 p.m. (but was soon brought under control), which specifies the event in time.

A substantial shift! But you can see how, slowly leached of all content, this fairly informative explanation about localization became a motivational quote. This technique where each person is given an input generated as output by a previous participant doing the same task is called “iterated learning”. I picked it for this post because iterated learning is used in some very interesting ways, and somewhat uniquely, is both a study paradigm and a kind of model or simulation of a further system.

For example, Kirby et al. (2008) had participants assign nonsense-word labels to little shapes, with each person in the chain expanding the labels they were given to new little shapes. Over time, the mapping of labels to shapes began to exhibit compositional behavior. In their illustration, by the 10th person, each part of the made-up word has come to stand for a feature of the picture.

Here’s the interesting part. Kirby et al. take this finding to be evidence not just about how people use labels or solve new communication problems, but about how language might have evolved through a process of cultural evolution (i.e. evolution of behaviors through social mechanisms like imitation, rather than through genetic changes). The behavior they’re studying directly is a quick task between minimally interacting participants in small chains. But this behavior is acting as a model, licensing inferences about more complex interactions, between many more people, over much longer periods of time. For instance, the interactions behind the conventionalization of idioms, the addition of new polysemous meaning, or even the development of language itself.

Langlois et al. (2021) use iterated learning in a different context. They had strings of participants observe and recall visual stimuli by locating a point in an image. Each participant was shown the image with the point placed by the previous person, and then had to recall where the point was. By the end of the chains, they observed, essentially, a drift toward critical sections of the image: in an image of a triangle, the point moved toward the vertices, whereas in an image of a face, the point moved towards the eyes, nose and mouth. The authors argue that this reveals features of encoding in memory: because each link in the interpersonal chain shifts the point as a function of memory-induced noise, the whole chain magnifies a bias present even in a single individual, merely functioning to make it visible.
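To make the amplification idea concrete, here is a toy sketch in Python. It is not Langlois et al.'s model: it assumes a one-dimensional "image", two hypothetical attractor locations standing in for critical regions, and made-up values for the size of the memory bias (`pull`) and the noise (`noise_sd`). Each simulated participant recalls the point slightly closer to the nearest attractor, plus random error, and passes the result on.

```python
import random

# Hypothetical "critical" locations on a 1-D image spanning [0, 1].
ATTRACTORS = (0.2, 0.8)


def nearest_attractor(x):
    """Return the attractor closest to position x."""
    return min(ATTRACTORS, key=lambda a: abs(a - x))


def recall(x, rng, pull=0.15, noise_sd=0.03):
    """One participant: a small bias toward the nearest attractor plus noise.

    The pull and noise_sd values are illustrative assumptions.
    """
    target = nearest_attractor(x)
    recalled = x + pull * (target - x) + rng.gauss(0.0, noise_sd)
    return min(1.0, max(0.0, recalled))  # keep the point inside the image


def run_chain(start, length, rng):
    """Pass the point through `length` participants; return every position."""
    positions = [start]
    for _ in range(length):
        positions.append(recall(positions[-1], rng))
    return positions


def mean_distance(points):
    """Average distance from each point to its nearest attractor."""
    return sum(abs(p - nearest_attractor(p)) for p in points) / len(points)


def simulate(n_chains=500, length=10, seed=0):
    """Mean distance to the nearest attractor at each step, across chains."""
    rng = random.Random(seed)
    chains = [run_chain(rng.random(), length, rng) for _ in range(n_chains)]
    return [mean_distance([c[i] for c in chains]) for i in range(length + 1)]


if __name__ == "__main__":
    for step, distance in enumerate(simulate()):
        print(step, round(distance, 3))
```

In this sketch a single recall moves the point only a little, so the bias would be hard to see in one participant. Across the chain the small shifts compound, and the average distance to the nearest attractor shrinks, which is the sense in which the chain makes an individual-level bias visible.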

Tania and I were using this method for a third purpose: we’re interested in the epistemic benefits of different types of explanation. In particular, “experiential” explanations, which have a narrative structure like the example from our study at the beginning, and “abstractive” explanations, which have a structure more familiar to philosophers: relating particulars to regularities, generalizations or laws. For us, iterated learning was a way to test our prediction that experiential explanations have an epistemic advantage over abstractive explanations when it comes to transmission. (The data so far suggests we were wrong – both types were transmitted with similar fidelity!)

Here you can see the promise and also the potential for overreach in the use of iterated learning. Kirby et al. take the interpersonal chain to reflect something about much bigger patterns of language change. Langlois et al. take it to reflect something about much deeper patterns of memory encoding. Tania and I used iterated learning as a model of social transmission more generally, a somewhat less ambitious aim. But in each case, the method is a way of uncovering biases that drive learning. If this method works, it’s an incredible shortcut. We’d have a super fast, experimentally controlled, and non-invasive way to study topics that resist this kind of treatment. Think of other ways of studying large-scale language change, which might involve historical, observational investigation of a particular language. Or think of methods of modeling memory encoding through electrophysiology or very precise psychophysical experiments that can only work under controlled conditions in the lab.

But the more ambitious the application of the method, the greater the gap between what’s happening in the experiment and what we want to model in the world. The Kirby et al. work is at the extreme end. A study in the lab has at most thousands of participants, each completing a task that takes minutes. The system it’s meant to be modeling in the world involves hundreds of years and millions of people. Most significantly, the interaction between participants in the lab is minimal and only in one direction. In real language development, interaction is (usually) cooperative and bi-directional.

There may be dynamics that emerge in the lab setting that are systematic and reproducible, but don’t apply in other contexts. This is a familiar problem. But more interesting is the way this method is used, at least by Kirby et al., as a kind of analog simulation of the system in the world. I call it a simulation because the behavior in the lab is not an instance of the behavior of interest (whether that instance is highly artificial, or otherwise altered). Instead, the bit of cognition between a study participant seeing a shape and inferring its name is related fairly indirectly to the kind of process that takes place in the learning of natural languages, and particularly indirectly to the cognitive processing of a child learning her first language.

One way to see this is to think: what would happen if we discovered the neural mechanisms behind these two types of thought were very different? Even in this case (and adopting a background assumption that different neural processes usually suggest different psychological ones), there would still be several ways in which the model would be informative: (a) by showing how in both cases, similar rational mechanisms are at work to favor certain symbol-meaning combinations over others, or (b) by revealing a very general inductive bias that applies to many kinds of thinking. If part of the use of iterated learning takes route (a), then those uses treat the behavior in the study as a model, aiming to extract structural features of the situation and their consequences, with the aim of applying these features in a distinct context.

This use of a psychological experiment as a model or simulation of broader dynamics is only implicit, and there might be other ways to interpret the relationship between the lab study and the broader cultural evolution question. The Langlois et al. paper uses the method in a more straightforward way, where the mechanisms they aim to model are treated as an aggregation of the small changes directly studied. Each small memory encoding event is part of the longer process they are trying to study, with the key being that these encoding events are spread out over thousands of people in the experiment, but thought of as related to a string of memory events inside a person in the real world. This use of the group may also be thought of as a model for an individual, or alternately, the group and the individual might be settings in which we find the same, short memory events aggregated in different ways. This latter interpretation is more application than analogy.

Iterated learning has been used in a surprisingly wide range of contexts. It’s also, sometimes, both an experimental method and a kind of sandbox version of a larger, more complex world. For us, it was a way to exaggerate features of explanations that might otherwise go undetected, a part of a project aimed at understanding the varieties of explanatory structures. In our case, and the others, the ambition of this method seems glued together with a background theory of rationality.

* * * * * * * * * * * *

Commentary

Thomas Icard
_________

Thanks so much to Sara for the fascinating work on these topics and for the thought-provoking piece here. I would like to pick up on just one of the many rich themes raised in the post, namely what we might hope or expect to learn from iterated learning experiments on explanation in particular.

As Sara nicely puts it, iterated learning experiments can be seen as controlled “analog simulations” whereby chains of transmission from person to person in the lab are used to elucidate otherwise elusive cognitive phenomena. Of course, we learn from simulations and other representational stand-ins to the extent that they mirror relevant structure in the systems or scenarios of interest. The piece here highlights several potentially relevant sources of discrepancy: time (duration of the process), scale (number of individuals involved), and issues of broader context (e.g., the fact that communication is often bi-directional) all threaten the validity of inferences from experiment to target. Perhaps the most critical assumption to interrogate in general is that the experimental task adequately reflects the task under study. This matter is front and center in much of the previous literature on iterated learning. Some of the cited work on language evolution hypothesizes, for instance, that iterative assignment of labels to shapes bears suitable similarity to more general patterns of language learning and imitation. To some degree (tempered in the ways Sara articulates) the natural-language-like phenomena revealed in these experiments lend support to such hypotheses.

What task is involved in giving an explanation, and what would we expect of participants when asked to explain what they read to another participant? A prominent feature of explanation-giving is that speakers tailor their explanations to their addressees whenever possible, taking subtle account of what a listener knows or wants to know (see, for instance, Hilton 1990). However, if the standard iterated learning paradigm tends to minimize listener-specific information, we might predict a retreat to more general-purpose — indeed, more abstract — explanations. Such a motivation makes sense in science too, where explanatory attempts are not targeted at any particular individual, but are rather aimed at a wide range of epistemic situations and potential sources of opposition. 

To be sure, speaker specificity (or lack thereof) is not the only catalyst for abstraction and generality. Familiar rationales such as portability and transfer (e.g., Lombrozo 2011) evidently also play a role. Moreover, studies of iterated learning for tasks that do not overtly involve explanation at all have identified similar trends toward more “schematic” and “compressed” signals over time (see, e.g., Tamariz & Kirby 2015). The suggestion in this literature is that such features may ultimately improve learnability and transmission, typically at the cost of some initial inaccuracy.

In other words, even if more experiential explanations are associated with epistemic benefits in ordinary conversational contexts, those benefits may be eclipsed by the many benefits of abstraction in this relatively impersonal context. What would we expect to happen in a chain of individuals, each of whom is well acquainted with the next? And what if the first person in the chain were (known by all) to generate their own explanation for a given situation? How strongly would the drive toward abstractions and generalities persist?

Aside from pressures promoting abstraction, there is perhaps a more general reason why one might expect responses in these experiments to change quite dramatically, even over short spans. Arguably one of the core features of explanation is that it is not mere recapitulation. It always involves “going beyond” in some way: extracting essences, morals, or gists; identifying hidden causes, laws, norms, reasons, and so forth. Is it conceivable that participants are attempting to distill some deeper lesson in what they read and convey this to the next participant in the chain? Creative leaps at each step are just what we should predict on this view. Notably, such a possibility would be compatible with participants retaining the explanations they are given with perfect fidelity, such that they could simply reproduce them verbatim for the next participant should they be asked to do so. This seems unlikely given the history of iterated learning studies going back to the 1930s, simply for reasons of memory. But it might suggest a more sizable gulf between faithful transmission and what participants take to be their task.

Another prominent idea in the philosophy and psychology of explanation is that, given the right context, simple explanatory statements like “A because C” can encode a surprising wealth of information. This suggests a kind of inverse of the present use of iterated learning: what kind of task might actually induce explanation-giving even when participants are not explicitly asked to do so? For instance, are there contexts where communicative pressures suffice to induce explanatory talk when the goal is mere (re)description? Could such phenomena amplify over longer chains?

As Sara emphasizes with the sandbox metaphor, one of the exciting features of iterated learning experiments is their exploratory potential. The fact that the results reported so far appear to be tracking — even magnifying — meaningful patterns in people’s explanatory behavior certainly appears to vindicate the method. I very much look forward to the many more insights that are sure to come from this exciting project.

* * * * * * * * * * * *

References
_________

Aronowitz, S., Lewry, C., & Lombrozo, T. (ms). Experiential Explanations in Iterated Learning.

Hilton, D. (1990). Conversational processes and causal explanation. Psychological Bulletin.

Kirby, S., Cornish, H., & Smith, K. (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences, 105(31), 10681-10686.

Langlois, T. A., Jacoby, N., Suchow, J. W., & Griffiths, T. L. (2021). Serial reproduction reveals the geometry of visuospatial representations. Proceedings of the National Academy of Sciences, 118(13).

Lombrozo, T. (2011). The instrumental value of explanations. Philosophy Compass.

Tamariz, M., & Kirby, S. (2015). Culture: Copying, compression, and conventionality. Cognitive Science.
