Are We “Meant” to Have Language and Music?

By Mark Changizi | March 15, 2012 8:31 am

Mark Changizi is an evolutionary neurobiologist and director of human cognition at 2AI Labs. He is the author of The Brain from 25,000 Feet, The Vision Revolution, and his newest book, Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man.

What do ironing and hang-gliding have in common? Not much really, except that we weren’t designed to do either of them. And that goes for a million other modern-civilization things we regularly do but are not “supposed” to do. We’re fish out of water, living in radically unnatural environments and behaving ridiculously for a great ape. So, if one were interested in figuring out which things are fundamentally part of what it is to be human, then those million crazy things we do these days would not be on the list.


But what would be on the list?

At the top of the list of things we do that we’re supposed to be doing, and that are at the core of what it is to be human rather than some other sort of animal, are language and music. Language is the pinnacle of usefulness, and was key to our domination of the Earth (and the Moon). And music is arguably the pinnacle of the arts. Language and music are fantastically complex, and we’re brilliantly capable at absorbing them, and from a young age. That’s how we know we’re meant to be doing them, i.e., how we know we evolved brains for engaging in language and music.

But what if this gets language and music all wrong? What if we’re not, in fact, meant to have language and music? What if our endless yapping and music-filled hours each day are deeply unnatural behaviors for our species? (What if the parents in Footloose* were right?!)

I believe that language and music are, indeed, not part of our core—that we never evolved by natural selection to engage in them. The reason we have such a head for language and music is not that we evolved for them, but, rather, that language and music evolved—culturally evolved over millennia—for us. Our brains aren’t shaped for these pinnacles of humankind. Rather, these pinnacles of humankind are shaped to be good for our brains.

But how on Earth can one argue for such a view? If language and music have shaped themselves to be good for non-linguistic and amusical brains, then what would their shapes have to be?

They’d have to possess the auditory structure of…nature. That is, we have auditory systems which have evolved to be brilliantly capable at processing the sounds from nature, and language and music would need to mimic those sorts of sounds in order to harness—to “nature-harness,” as I call it—our brain.

And language and music do nature-harness, a case I make in my third book, Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man (Benbella, 2011). The two most important classes of auditory stimuli for humans are (i) events among objects (most commonly solid objects), and (ii) events among humans (i.e., human behavior). And, in my research I have shown that the signature sounds in these two auditory domains drive the sounds we humans use in (i) speech and (ii) music, respectively.

For example, the principal source of modulation of pitch in the natural world comes from the Doppler shift, where objects moving toward you have a high pitch and objects moving away have a low pitch; from these pitch modulations a listener can hear an object’s direction of movement relative to his or her position. In the book I provide a battery of converging evidence that melody in music has culturally evolved to sound like the (often exaggerations of) Doppler shifts of a person moving in one’s midst. Consider first that a mover’s pitch will modulate within a fixed range, the top and bottom pitches occurring when the mover is headed, respectively, toward and away from you. Do melodies confine themselves to fixed ranges? They tend to, and tessitura is the musical term to refer to this range. In the book I run through a variety of specific predictions. Here’s one. If melody is “trying” to sound like the Doppler shifts of a mover—and thereby convey to the auditory system the trajectory of a fictional mover—then a faster mover will have a greater difference between its top and bottom pitch. Does faster music tend to have a wider tessitura? That is, does music with a faster tempo—more beats, or footsteps, per second—tend to have a wider tessitura? Notice that the performer of faster tempo music would ideally like the tessitura to narrow, not widen! But what we found is that, indeed, music having a greater tempo tends to have a wider tessitura, just what one would expect if the meaning of melody is the direction of a mover in your midst.
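To put rough numbers on the prediction above, here is a minimal sketch (not taken from the book) that uses the textbook Doppler formula for a moving source and a stationary listener. The speed of sound, the walking and running speeds, and the function names are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def doppler_pitch(source_hz, radial_speed):
    """Pitch heard by a stationary listener.

    radial_speed is the mover's speed along the line to the listener,
    in m/s: positive when approaching, negative when receding.
    """
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - radial_speed)


def pitch_range_semitones(speed):
    """Gap between the top pitch (mover headed straight toward you)
    and the bottom pitch (headed straight away), in semitones."""
    top = doppler_pitch(1.0, speed)
    bottom = doppler_pitch(1.0, -speed)
    return 12.0 * math.log2(top / bottom)


# Hypothetical speeds for a human mover, in m/s.
for label, speed in [("walking", 1.5), ("running", 6.0)]:
    print(f"{label}: {pitch_range_semitones(speed):.2f} semitones")
```

Under these assumptions the range grows with speed, which is the direction of the tempo and tessitura prediction, but it stays under a semitone at human speeds. That is consistent with the idea that melody would have to exaggerate the shifts rather than copy them literally.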

For the full set of arguments for language and music you’ll have to read the book, but the preliminary conclusion of the research is that human speech sounds like solid objects events, and music sounds like human behavior!

That’s just what we expect if we were never meant to do language and music. Language and music have the fingerprints of being unnatural (i.e., of not having their origins via natural selection)…and the giveaway is, ironically, that their shapes are natural (i.e., have the structure of natural auditory events).

We also find this for another core capability that we know we’re not “meant” to do: reading. Writing was invented much too recently for us to have specialized reading mechanisms in the brain (although there are new hints of early writing as old as 30,000 years), and yet reading has the hallmarks of instinct. As I have argued in my research and in my second book, The Vision Revolution, writing slides so well into our brain because it got shaped by cultural evolution to look “like nature,” and, specifically, to have the signature contour-combinations found in natural scenes (which consist mostly of opaque objects strewn about).

My research suggests that language and music aren’t any more part of our biological identity than reading is. Counterintuitively, then, we aren’t “supposed” to be speaking and listening to music. They aren’t part of our “core” after all.

Or, at least, they aren’t part of the core of Homo sapiens as the species originally appeared. But, it seems reasonable to insist that, whether or not language and music are part of our natural biological history, they are indeed at the core of what we take to be centrally human now. Being human today is quite a different thing than being the original Homo sapiens.

So, what is it to be human? Unlike Homo sapiens, we’re grown in a radically different petri dish. Our habitat is filled with cultural artifacts—the two heavyweights being language and music—designed to harness our brains’ ancient capabilities and transform them into new ones.

Humans are more than Homo sapiens. Humans are Homo sapiens who have been nature-harnessed into an altogether novel creature, one designed in part via natural selection, but also in part via cultural evolution.

* Correction: This bit originally mentioned Dirty Dancing instead of Footloose. Thanks to @LaurenAMichael for the 80’s pop-culture refresher.

An earlier, shorter, version of this piece appeared originally on the Nook.

  • kirk

    If an ant colony cultivates a fungus, is that natural selection? Because the fungus, left to its Selfish Gene ways, might never have discovered merely through ‘descent with fungus modification’ but instead ‘descent with ant modification’ how to be the new, best kind of fungus. At root, when a product of DNA (me and some ants) selects the best sounds to make or the best fungus to grow it sounds kinda, you know, natural.

  • Neuroskeptic

    “human speech sounds like solid objects events, and music sounds like human behavior!

    That’s just what we expect if we were never meant to do language and music.”

    But that’s also what we’d expect if we were meant to do language and music, thanks to evolution building on earlier building blocks. No?

    E.g. suppose that we first evolved to be good at hearing a certain range of sounds. Then we started to evolve to make sounds to communicate. Of course, all other things being equal, it would be likely that we’d end up making the sounds that other people can hear best, because that would be the best way to communicate.

    With writing, clearly writing evolved culturally to make use of our visual system. But I can’t see why speech couldn’t have evolved biologically in exactly the same way. In fact I’d be very surprised if it had evolved any other way…

  • Barbara Mater

    Are we perhaps “meant” (by whom or what?) to invent things that are not dictated by our evolution? Perhaps it’s in our nature to intervene in the process and influence the design of our future selves. We have selected body styles by selective breeding in most cultures, e.g. taller people get more chances to breed and lead in Western culture.

    Ironing, which is not as prevalent now as it was 50 years ago, had health benefits in terms of sanitizing clothing, bed linens, towels and bandages, as well as enhancing people’s comfort. So it turned out to be a wise practice for its time. Hang-gliding gives people a personal aerial view of the environment as well as a thrill. But neither of these activities seems to be pre-existent in the design of our world. Language and music, whether they are subroutines of our own brain functions or natural options for understanding our world, raise our abilities to understand each other and to enjoy our own lives. However we came to them (or they to us) they are marvels.

  • Mark Changizi

    Neuroskeptic: But if speech does a good job at harnessing our solid-object-event-processing auditory system, then no special speech-processing auditory mechanisms would have been needed. And if natural selection *had* “gotten involved,” one would expect that the quite different design demands for interpersonal communication would lead to quite different sorts of speech sounds — it would lead to sounds somewhat optimizing the entire communication process, and these sounds would be peculiar to the animal, who would have an auditory system that co-evolved to process these special conspecific signals well. That’s what *appears* to be the case for other animals with innate signaling systems (modulo some nice examples of receiver bias).

  • Mark Changizi

    Kirk: Cultural selection is a fundamentally distinct mechanism leading to “design” in the world, albeit for only one or several species (and principally us). I’d bet that there’s some suitably general way to define “natural” such that it covers cultural selection, but it would be glossing over a crucial distinction, one needed in our attempts to understand the different kinds of forces shaping the world as we find it.

  • Emily Willingham

    “Human speech sounds like solid objects events, and music sounds like human behavior!” Could we not have been selected for to detect and produce these sounds, with selection against those–e.g., via mate choice–who cannot detect or produce them in ways our auditory processing can appreciate/find attractive/receive as communication? I think of the “loneliest whale” and the fact that this whale uses frequencies that appear to preclude its communication–and mating–with other whales because it produces sounds that are some sort of unrecognizable hybrid in whale lingo.

    How does this idea that language or what we call “music” apply for other species with complex linguistics involving auditory reception, e.g., Is that “natural” or “cultural” selection?

    Finally, I’m wondering how clear the distinction is between “natural” and “cultural” selection in the context of the potential influence of transgenerational epigenetics.

  • Elizabeth Able

    Take a deep breath. Relax. Do it again. Feels good, doesn’t it? Want to do it again? Creative expression can be like that… and your hypothesis makes complete sense.

  • Jim Johnson

    Interesting proposition. I do have difficulty imagining a human-detectable Doppler-shifted sound at the speeds experienced in a natural environment.

    In our modern lives, we experience significantly Doppler-shifted sounds all the time, but we are surrounded by fast-moving things. Most speeds a primitive human would have frequently experienced would be very low compared to that. While some creatures do put on bursts of speed, they would rarely have done so while passing in close proximity to humans, and most creatures that emit sounds don’t do so while moving at high speed.

    Predators’ footfalls do produce sounds, and they do make speed. But is the speed and duration such that a Doppler shift would be detectable by human ears? Maybe. I’d really like to see studies before I subscribe to that. It seems the fainter-to-louder change in an approacher’s footsteps would be a more useful discriminator in such low-speed short-approach attacks.

    I suppose a swooping bird of prey might make an audible cry and at the same time maintain sufficient speed both in the approach and departure for the listener to hear the Doppler-shifted difference, but how often would this happen? Frequently enough to have an impact on evolution? Possibly, if this were inherited from our smaller primate ancestors, but then why no signs of such a development as music appreciation by other primates?

    I imagine certain natural phenomena might produce sound at such speed, such as avalanches or landslides, but again, would early humans be affected by avalanches or landslides (at a proximity to sense the different sounds of “coming” and “going”, while still surviving the encounter) frequently enough to affect our evolution? (I dismiss fast moving rivers, because not only are they relatively slow, but the sound of the nearest (passing) water tends to overwhelm more distant “approach” and “departing” sounds.)

    One phenomenon I can think of that might carry all the requirements (sound, speed, proximity to humans, and evolutionary advantage) would be stampedes in plains environments. It certainly would confer an advantage to know if that herd of 13 million wild cattle were headed toward you or away. Detecting the difference between a low rumble and a slightly less low rumble as the difference between “nothing notable” and “run for it” might be enough evolutionary advantage to affect early human hearing/brain evolution. I can see that leading to our love of basslines and drumbeats, but to cause enough changes to explain Beethoven’s 5th symphony or “Yesterday”?

    I concede it’s possible that we detect Doppler-shifted sound from slower-moving things than I’m imagining; studies might show that. I’d like to see them.

    It’s an interesting proposition, but I’d first have to be convinced that Doppler-shifted sound played the necessary role in our evolution before moving on to the question of whether it contributed to the genesis of music.

  • Michael Soso

    “What if our endless yapping and music-filled hours each day are deeply unnatural behaviors for our species? (What if the parents in Footloose* were right?!)”

    Based on my dancing abilities, I conclude the parents in Footloose were correct, but your considering language and music “deeply unnatural” seems a stretch. Anatomy (Broca/Wernicke/Geschwind), genetics (FOXP2), hemispheric lateralization, functional MRIs, etc all attest to some remarkable brain designs facilitating language. To me, these suggest our brains ARE shaped for these pinnacles of humankind.

    “I’d bet that there’s some suitably general way to define “natural” such that it covers cultural selection, but it would be glossing over a crucial distinction, one needed in our attempts to understand the different kinds of forces shaping the world as we find it.” Best wishes for your efforts to distinguish hereditary and environmental factors affecting the evolution of language.

  • Daniel

    @ Jim. I had very similar thoughts to yours. I don’t think our ancestors would have heard Doppler shifts in the natural environment very much at all. Certainly not enough to have affected our hearing abilities.

    However, we do find things beautiful that are different. I can imagine our primitive stone-age ancestors swinging a hollow length of wood around on a strip of hide, hearing a Doppler-shift and thinking to themselves “Wow, isn’t that beautiful? I’ve never heard that before.” And so music is invented, using Doppler-shifts as something that is ‘special’ and ‘unique’.
    The rhythm of music might have a similar origin. It juxtaposes the more random sounds found in nature, like the wind or running water. These more regular, unique and special sounds may have been what our ancestors found attractive in music.

    We didn’t evolve to hear a Doppler-shift. Just having hearing lets us hear the effect. As a completely unnatural effect for our ancient ancestors, it would have been very special, and cultural forces took over till this day.

  • Sean Oliver

    We’re “meant” to do anything and everything that helps the group survive. Not the individual, or the individual genes. That’s the nature of all human learned behavior (culture). We speak using language because it was adaptive. We sing and dance because it was adaptive. Culture is subject to natural selection, period.

    I really don’t get your point at all; you speak of “unnatural” modern-day environments. Ever since the first Homo erectus figured out how to trap fish in a stream, or build a lean-to, we’ve been living “unnaturally”. “Natural” is one of those terms that is essentially meaningless, in terms of discussing human life styles. You seem to hold a belief that there’s some greater purpose or meaning intended for human beings; I’d like to see you prove that using data.

  • Alan

    The stuff that’s hard-wired into humans and necessary for survival is basically the same for all of us – we all breathe the same way, our hearts beat the same way, our CNS controls motor function the same way. It’s clear that language and music are far less constrained or required. We know from the animal world that language is not essential for survival and while sitting down at a piano and performing a Beethoven concert may help your chances of reproducing that night, it, too, is not required for survival of the species.

    Along the lines of Jim and Daniel (above) what if it wasn’t actually the Doppler effect that is mimicked in music, but the related idea that when our ancestors heard a quiet sound it generally meant it was farther away and therefore was less threatening. On the other hand, a loud sound triggered a more emotional response and put them on alert because it meant something was very close. So is it possible that music evolved not to mimic the Doppler effect of sounds moving past us, which our ancestors may or may not have heard, but instead to use volume and pitch to manipulate our instinctual emotional responses to our own perceived safety.

    It makes sense that rhythm would also mimic nature, as certain sounds profoundly affect our emotions. For example, listening to the repeated breaking of waves on a beach has a very calming effect. Of course, it’s not clear if that’s instinctual or if it’s because it usually means we’re on vacation with a pina colada in our hand.

    Although, after further consideration my theory of music volume and emotion does break down at one point. Test for yourself: Put on a Neil Diamond album at full blast: Terrifying. Now turn it down to a comfortable volume: Still terrifying.

    Back to the drawing board.

  • John

    “Language and music have the fingerprints of being unnatural (i.e., of not having their origins via natural selection)…”

    No particular language is inherent, but the ABILITY to learn language is inherent indeed. A child, even if not spoken to directly, will still learn from hearing those around him, albeit somewhat more slowly and less effectively.

    Huge bodies of evidence show that language is inherent. Naturally, varieties of hominids who communicated better had greater survival skills; hunting parties at first, then agriculture. These things, which confer survival advantages, rely on communication.

    The basic assertions in this post all sound, sadly, like pseudo-science.

  • CoyoteFish

    @Jim @Daniel

    Y’all haven’t spent much time outdoors, have you? Certain insects such as flies achieve speeds of up to 100mph. Certainly enough speed for Doppler effects.

    If you’ve ever spent time in a meadow in the summertime, wild or pasture, you’ll have enough bugs flying past at speeds fast enough to amply demonstrate the existence of Doppler shifts in vast abundance.

    And the human ear is incredibly sensitive. It can hear the difference between cold water being poured into a cup, and hot water being poured into a cup. Try it.

  • Callum James Hackett

    When I read this, I had the same thoughts as Neuroskeptic – this seems rather like a chicken and egg problem.

  • Geack


    Haven’t read the book yet, but here’s the immediate question: Language and music are to all appearances manifestations of the capacity for abstract or representational thought that gives us other characteristically “human” behaviors such as mathematics and a sense of time. How does the fact that we express the output of that mental capacity using sounds modeled on our environment lead to the conclusion that the mental capacity is not due to natural selection? Or to put it another way: our evolved mental capacity to assign and communicate “meaning” would unavoidably exhibit itself in some set of behaviors; the fact that those behaviors which survive to the modern era appear to be derived from the environment doesn’t seem all that dramatic. What am I missing here?

  • dcwarrior

    Well, how about logic? Isn’t logic a byproduct of our need to persuade others… many people seem to do perfectly well without it and those who have it generally have to have it taught to them. Is logic another thing we are not supposed to be doing?

  • Amos Zeeberg (Discover Web Editor)

    I didn’t believe the claim above that insects could fly at 100mph, but apparently it’s (approximately) true: “‘A tabanid fly (such as a deer or horse fly) has been clocked at 90 miles per hour (145 km/h),’ says Rudy Scheibner, entomologist emeritus at the University of Kentucky in Lexington.”

  • Daniel

    @ CoyoteFish
    Hardly a situation that would be considered important in the evolution of our ancestors.

  • m


    Our brains are wired for mathematics. And sounds as a function of time (aka a “beat”) are in essence mathematics in action. The faster the beat our brains tune it in as part of our “fight or flight” instinct. (on a mobile device – sorry I cannot cite the research validating that). The same with volume and pitch.

    Since music is applied mathematics, I would argue that music is indeed a “core” aspect of being human. It has a distinct mathematical timing. That’s why songs are addictive.

    I’m curious what the author has to say about that?

    (only fostering discussion here; please don’t think I’m “attacking” your theory or anything)


  • Greengo

    The term in the title “meant” is a very vague and unscientific one. Steven Pinker (The Language Instinct, The Stuff of Thought, et al.) makes a powerful case that our brains HAVE evolved with the language absorption potential. And, in ways that no other primates can compete with.

  • David Turell

    Since changes in the larynx and tongue started to appear 1.5 million years ago, and chimps don’t have this lower larynx position at all, why all this fuss about humans and language? We were prepared for it, and obviously are supposed to have it. Chimps don’t have it and don’t speak. Our evolution and theirs started to differ many millions of years ago with remarkably different results. It seems Changizi is unaware of this or is ignoring it. His research into how language evolved may be entirely correct, but we were meant to have language.

  • Sean Finn

    If true, a real blow to Universal Grammar and Chomskian linguistics.

  • Joan

    Since reading “Harnessed” I have been listening for these sounds and rhythms in nature. And, they are there! I pretty much have a “tin” ear for musical tones, but I can sure hear the rhythms of nature. Like some others who commented, I doubted the possibility of Doppler effects in nature too because nothing in nature goes as fast as a train. But, I live in an Amish community and I can hear whether a horse and buggy is coming or going. Also, an interesting tidbit, we have a wood-burning stove and the other day when I started the fire, I could hear the fire crackling in time with what could have been musical beats.




