The Language Puzzle.

Steven Mithen (/maɪðən/), a British archaeologist seen in these parts a couple of years ago, published a book called The Language Puzzle: How We Talked Our Way Out of the Stone Age that was recently reviewed in the LRB by Francis Gooding (Vol. 48 No. 7 · 23 April 2026; archived); even though you won’t learn anything new and exciting, it’s a useful roundup of ideas on the topic. Some excerpts:

Saussure steered linguistics away from questions about the beginnings of language: for him it was a red herring, since words take meaning only in relation to one another, within the boundaries of their histories. The study of words can’t illuminate what came before words: there is no thread to be found in language which would help us trace human speech back to the moment of its emergence. ‘No society … knows or has ever known language other than as a product inherited from preceding generations, and one to be accepted as such,’ Saussure says in Cours de linguistique générale (1916). ‘That is why the question of the origin of speech is not so important as it is generally assumed to be. The question is not even worth asking; the only real object of linguistics is the normal, regular life of an existing idiom.’

Yet whether it is worth asking or not, the question of the origin of language never goes away. It remains one of the most fundamental mysteries of human evolution. So far as we know, true symbolic language is unique to the human species. (On the most generous reading it may go a bit further back in the human lineage. And there is an open question about cetaceans – it was recently discovered that the structure of humpback whale vocalisations is remarkably similar to the organisation of human speech.) And it continually recurs as the most probable explanation for the differences between human behaviour and that of all other living things. If you ask why we have been able to make pyramids and spaceships and musical instruments, while no other animal has managed anything of the sort in three billion years, the answer will always cite language as a decisive factor. So the question of how we alone came to be blessed – or cursed – with words is not to be lightly dismissed. But it does come with a serious difficulty: language is an evolved feature of the human organism, but words don’t fossilise like bones. How then to find the missing links?

The Language Puzzle is a grand tilt at that seemingly intractable problem. In it, Steven Mithen marshals the disparate factors and fields of research that might give us some clue as to how language evolved, and tries to build a plausible account of how we ended up as the only speaking animal. Of necessity, the book ranges very widely, because the fields that touch on the evolution of language are in no way unified. Mithen draws from palaeontology, archaeology, primatology, the study of animal communication, linguistics, neurobiology, philosophy of mind, evolutionary genetics and more.

When investigating the ancient past, typically there is at best only partial evidence for a proposed evolutionary sequence: often that evidence will consist of little more than a few morphologically similar fossils, the remnants of creatures separated from one another sometimes by millions of years. And even by these standards, the evolution of human language is a particularly tricky case. Not only is language in large part a behavioural phenomenon, so that the consequences of its development can only be inferred on the basis of ambiguous and circumstantial archaeological evidence (stone tools, traces of fire etc), it is also dependent on the use of soft parts of the body (the tongue and the larynx, but of course mainly the brain), which don’t leave a fossil trace. As a result, it’s hard to ascertain which physical shifts may have accompanied the development of speech. […]

At a certain point during the development of human speech, and out of a huge array of more or less complex communicative sounds, the true sign, with its signifier and signified, must somehow have emerged. Is it right to say that the sign – the word – ‘evolved’? Or did sign-words emerge into communication and consciousness out of a complex of other mental and communicative functions that previously did other jobs? Did this happen gradually or suddenly? Can we hazard a guess at what sort of creatures first spoke true words, as distinct from making other kinds of sound? Is there any trace of this transformation? Or was the point of entry into the forest of symbols sealed up behind us long ago?

These questions bring us back to those ‘iconic’ or ‘sound-symbolic’ words. In the 20th century linguistics tended not to take much notice of them. With their awkward and apparently not quite arbitrary resemblance to sounds and textures, their stubborn resistance to change, and their long association with the outmoded inquiry into the origins of language, they were relegated to the status of what Steven Pinker could call, as late as 1994, ‘a quaint curiosity’. But as Mithen makes clear, such dismissals were premature. Recent research suggests that iconic words may after all have the crucial originary role that thinkers from Socrates to Herder assigned to them, as a genuine remnant – or analogue at least – of one of the evolutionary staging posts that marked the way to modern human speech. […]

Several decades of indifference later, the American linguist Roger Wescott returned to the problem of iconism. Writing in 1971, he observed that i and ee sounds are preponderant in words signifying ‘small’ (for instance ‘tiny’, ‘light’ or ‘wee’), and suggested that the round-sounding vowels a, o and u are associated with things that are large and slow (as in ‘vast’, ‘huge’ or ‘sluggish’). Many consonants, he went on, have sound-symbolic roles too: words featuring laterals like l (‘in which the tip of the tongue blocks the passage of air’) seem to correspond to smallness or lightness, while labials, made using the lips, as in b or m, are linked to largeness (as in ‘big’, ‘boom’ or ‘massive’). Wescott even pushed beyond the sound and production of words to see iconic elements in morphology, syntax and stress. Words that indicate extension or growth, for instance, often themselves get longer (as in ‘big’, ‘bigger’ and ‘biggest’), and reversals in meaning are frequently signified by a reversal in word order (‘I will’ v. ‘will I?’).

Subsequent investigations in the 1990s and into the 2000s sharpened the accuracy of such observations, finally cementing iconic words and sound symbolism as significant parts of all languages. There is now a mass of work on the subject, which has demonstrated a ‘universal propensity’, as Mithen writes, ‘to associate specific sounds with specific meanings … a considerable proportion of one hundred basic vocabulary items show persistent sound-meaning associations irrespective of language families, environment or culture.’ […]

Infants and children learn iconic words earlier and more easily than they learn arbitrary words, and iconic words remain dominant in the vocabulary of children until around the age of six, after which there is a gradual shift towards arbitrary words. ‘Iconic words are easier to learn,’ Mithen writes, because ‘their meaning is grounded in the sensations experienced by the child – the sound, size, shape, texture, movement and other properties of the object or action being named.’ By providing a fundamental link between speech sounds and objects in the world, they ‘scaffold the entire process of language acquisition’. […]

In 2001, two cognitive scientists, V.S. Ramachandran and Ed Hubbard, published a paper proposing that synaesthetic links between vocalisation, bodily movement and the sense perception of objects could have prompted the creation of iconic sounds in an early human ancestor, thus opening the gateway to speech. They returned to maluma and takete, to the ‘small’ sound of ‘i’, and to other cases in which the movement of the mouth seemed to mimic the meaning of a word, or even the movement of other parts of the body: for instance, when the mouth or lips appear to borrow from the typical action of the hand, as in the numerous words for ‘you’ that involve the ‘pointing’ of the lips towards another person; or the way in which the making of the small i or ee sound could correspond to the pincer action of forefinger and thumb when picking up something small. If synaesthetic links were operating in the increasingly flexible brains of early hominins, perhaps they could have had an effect on vocalisations, resulting in the creation of the first mutually intelligible words – mutually intelligible because their meanings would have been established through shared experience.

Fascinating though all this is, it remains, like so much in the field, a hypothesis lacking crucial evidence. It is also dependent on a loose analogy between early childhood and the evolutionary past – an echo of Ernst Haeckel’s largely discredited idea that ‘ontogeny recapitulates phylogeny,’ or that the developing organism moves successively through the forms of its ancestors. […] Ultimately, Mithen is forced to conclude that, for all the research and thinking done about iconism, synaesthesia and cortical leakage, nobody has any idea when or how the arbitrary sign emerged.

Faced with such frustrating stubs, plausible speculations and partial truths, Mithen turns to archaeology. When were there leaps in the design and innovation of stone tools? When did controlled fire start to be used widely? And after that, the brain. What is the timeline for increases in brain size among hominins? What can we learn from casts of the brain cases of extinct human relatives? Mithen quarters the field with care and imagination, cross-checking, noting correspondences and filling in blanks, finally emerging with a carefully synthesised narrative account of the way language developed and finally flourished into modern speech across three million years of human evolution. We are left with the impression that the ancients’ chief error was in thinking that the process took place consciously among already fully human people.

We now see that instead of a prisca lingua created by principal name-givers or people in a state of nature, the process of language acquisition took place over millions of years in the bodies and minds of a series of ancient beings: Homo erectus, Homo habilis, Homo heidelbergensis, Homo neanderthalensis and, finally, the last human standing, Homo sapiens. No doubt there were many others. At some point, perhaps a hominin made a sound that mimicked the movement of a snake or a fish, or a sound that was small like an insect, and eventually these became words; and then a sound that was originally made in imitation of something steadily departed from it in everyday usage, until it was no longer mimicry but was instead an abstract word; and then another abstract word was needed in order to be more specific about how to make a stone tool, and people told stories around the fire, and mothers cooed at babies and so on and so on, until we arrived at modern language. The potential role of ‘motherese’ in language evolution, and the perspective that a more female-centred history of language evolution might provide, doesn’t get much airtime in The Language Puzzle. There are many more fires, hunts and stone tools in Mithen’s story than babies and comforted children, and when it comes to thinking about talking it isn’t immediately obvious why that should be so. But however the story is told and whatever refinements may be made, the arrival of the arbitrary sign is the crux, and although we know more than ever about what may have preceded it, what was necessary for it to happen, and what the first sort-of sign may have been, the event itself remains stubbornly out of reach. In the beginning was the word, and the word is lost.

I’m grateful for the skepticism of the reviewer; it’s all too common for journalists to take the most exciting suggestions and run with them.
