Not Through Ignorance

The ears and what’s between them


I think I started down the path to becoming a neuroscientist when I learned about the visual system in college.  Soon after, I joined a neuroscience lab studying the sense of taste.  As fascinating as these two sensory systems are, I don’t think any set of facts about vision or taste – or olfaction or somatosensation – has blown me away quite as much as some of the more amazing facts about hearing.

Our sound experience is usually characterized by three perceptual qualities:  loudness, which reflects to a large extent the amplitude of the sound wave; pitch, which reflects to a large extent the frequency of the sound wave; and timbre, a complex quality that emerges from the complexity of a sound wave.  Two singers, one male and one female, might sing the same note (the pitch) at the same volume (the loudness), yet we will still be able to recognize the distinctive timbres of the male and female voices (provided by the different amplitudes of the overtones in the sound wave).  The strategies that the brain uses to analyze the timbre of a sound are a deeply fascinating topic in and of themselves, but one for a future essay.

Today I want to focus on a fourth, and critically important, quality of a sound experience – and that is its origin (or location).  When we hear a sound, even with our eyes closed, it almost always seems to come from somewhere – behind us, over our heads, off to the side.  We may even be conscious of whether the sound’s origin is near, or far away, or outside the room and down the hall.  Indeed, unexpected sounds usually trigger an orienting response – we hear something, and we turn to look at the spot where the sound seemed to come from.

Now, this is remarkable in and of itself.  All sounds, no matter their origin, hit our eardrums.  Yet we do not make the mistake of thinking that the sound arose from our eardrums.  A sound in front, or off to the side, or ten feet away, or a mile away – these sounds all affect our bodies in the same way, by vibrating our eardrums – yet they are experienced (correctly) as coming from different places.  Even better, we can simultaneously be aware of the couple conversing at the next table, the wine glasses clinking at the table behind us, and the music emanating from the speakers overhead – we can localize three or more sounds at once even as all of them are vibrating those same two eardrums.

Try an experiment.  This works best in an outdoor, quiet space where echoes and background noises aren’t an issue, but it will work well enough anywhere (assuming you have two good ears and don’t have a head cold).  Close your eyes.  Ask a friend to stand 10-20 feet away from you, anywhere in the half-circle in front of you.  Have him or her call your name once, then, with eyes still closed, raise your arm and point to exactly where you think your friend is standing.  I do this every year in my classes, and with very few exceptions, when the students open their eyes, they can see their fingers pointing straight at me.  Our sound localization is normally accurate to within about 2-4 degrees of arc.

How do we do it?  For low pitched noises (like the sound of a cello or a male voice), the best cue we have to locate a sound is the interaural timing difference – the delay between when a sound strikes the left ear versus the right ear.  Sounds that originate from in front of us strike our two ears at the same time.  Sounds straight to the left of us strike our left ear first and our right ear second.  The difference in timing – the delay – thus indicates the azimuthal angle of the sound source.

But hold on.  How long does it take for the sound to travel from the left ear to the right ear in the situation where someone is speaking to us from off to the side?  That’s easily computed.  The speed of sound is about 1126 feet per second.  (“About” because the speed of sound varies with altitude, humidity, and temperature.)  Our two eardrums are about 6 inches apart, which is convenient for the math, since 6 inches = 0.5 feet.  If sound travels 1126 feet per second, it travels 2252 head widths per second – so sound travels one head width in 1/2252 seconds.  In other words, a trip from the left ear to the right takes a sound wave about 0.00044 seconds – or less than half a millisecond.  We learn, then, that any sound with a delay of about 440 microseconds, with the left ear leading, indicates a sound coming from 90 degrees to the left of our nose.  (A microsecond is 1/1,000,000 of a second.)
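The arithmetic is easy to check.  Here is a minimal Python sketch using the essay’s round numbers (1126 feet per second, ears 0.5 feet apart); the function and constant names are only illustrative choices.

```python
# Round numbers taken from the essay; real values vary with air
# temperature, humidity, and the size of the head in question.
SPEED_OF_SOUND_FT_PER_S = 1126.0
EAR_SEPARATION_FT = 0.5


def max_delay_microseconds(ear_separation_ft=EAR_SEPARATION_FT,
                           speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Time for sound to cross from one ear to the other, in microseconds."""
    seconds = ear_separation_ft / speed_ft_per_s
    return seconds * 1_000_000


print(f"{max_delay_microseconds():.0f} microseconds")
```

This prints a value of about 444 microseconds, which the essay rounds to 440.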

That’s amazing enough, but it’s not the end of the story.  Our brains have to register timing differences far tinier than 440 microseconds.  After all, we can discriminate the origin of a sound to within 2 degrees of arc (at least for sounds originating from in front of us; we do a little worse for sounds off to the side and still worse for sounds arising from overhead).  The delay between the two ears is 440 microseconds for sounds coming from 90 degrees to our left, and the delay is obviously 0.0 microseconds for sounds coming from directly in front of us (i.e., the sound hits our two ears at the same instant).  Put another way, the difference in the delay at 0 degrees vs. 90 degrees is 440 microseconds, so if we assume, as a rough approximation, that the delay grows evenly with angle, the difference in delay between a sound at 0 degrees vs. 2 degrees is about 440/90*2 = 9.8 microseconds.  Our nervous system is capable of representing this small timing difference despite the fact that individual nerve cells are not themselves nearly this reliable.
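Dividing 440 microseconds evenly across 90 degrees is a simplification.  A common textbook alternative, which is an assumption added here and not something the essay relies on, treats the ears as two points and sets the delay proportional to the sine of the angle.  The sketch below compares the two; the function names are illustrative only.

```python
import math

MAX_DELAY_US = 440.0  # delay at 90 degrees, from the essay


def delay_linear_us(angle_deg):
    """The essay's estimate: delay spread evenly over 0 to 90 degrees."""
    return MAX_DELAY_US / 90.0 * angle_deg


def delay_sine_us(angle_deg):
    """Assumed two-point model: delay proportional to sin(angle)."""
    return MAX_DELAY_US * math.sin(math.radians(angle_deg))


for angle in (0, 2, 45, 90):
    print(f"{angle:>2} deg: linear {delay_linear_us(angle):6.1f} us, "
          f"sine {delay_sine_us(angle):6.1f} us")
```

At 2 degrees the linear estimate gives about 9.8 microseconds and the sine model roughly 15.  Either way, the brain is resolving a delay on the order of 10 microseconds.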

(How is it possible at all then?  Think of it this way.  Imagine a referee is responsible for timing a runner in a 40-yard dash.  In order to time the runner accurately, the referee would have to press the start button on the stopwatch exactly when the gun went off – requiring accuracy of the referee’s auditory system and sensorimotor system which is beyond the limits of human ability – and would have to press the stop button exactly when the runner crossed the finish line – requiring accuracy of the referee’s visual system and sensorimotor system which is beyond the limits of human ability.  On the other hand, if you had 100 referees all timing the same race and averaged the results, the time would likely be highly accurate.  The nervous system does something like this.)
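The referee analogy can be tried out with a small simulation.  This is only a sketch of the statistics, not a model of real neurons: it assumes each referee’s error is independent, unbiased, and normally distributed, and the race time and error size are made-up numbers.

```python
import random
import statistics

TRUE_TIME_S = 4.50   # hypothetical true time for the 40-yard dash
ERROR_SD_S = 0.10    # hypothetical timing error of a single referee


def timed_race(n_referees, rng):
    """Average the stopwatch readings of n_referees noisy referees."""
    readings = [TRUE_TIME_S + rng.gauss(0.0, ERROR_SD_S)
                for _ in range(n_referees)]
    return statistics.mean(readings)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Repeat the race many times and see how much the reported time wobbles.
single = [timed_race(1, rng) for _ in range(2000)]
panel = [timed_race(100, rng) for _ in range(2000)]

spread_single = statistics.stdev(single)
spread_panel = statistics.stdev(panel)
print(f"one referee:  spread of about {spread_single:.3f} s")
print(f"100 referees: spread of about {spread_panel:.3f} s")
```

Under these assumptions the spread of the averaged time shrinks roughly with the square root of the number of referees, so 100 referees are about ten times more precise than one.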

[Image: Ormia ochracea – it looks ugly, but it sounds good.  Or rather, it hears good.]

But wait, there’s more, which is what I love about the auditory system.  It goes from incredible to incredibler (not a word, but then I think you need new words to describe the awesomeness of our auditory systems).

Enter the fly.

More specifically, enter Ormia ochracea, a parasitic fly.  This is one disgusting creature, at least if you’re a cricket.  Ormia zeroes in on a singing cricket and deposits its eggs on the poor fellow – because the eggs will hatch into maggots that feed on the unlucky cricket.  Behavioral research has indicated that these flies can localize sound to within 2 degrees of arc or better, just like humans.  But what makes their 2 degrees far more incredible than our 2 degrees is that their heads – their tiny, tiny heads – contain eardrums that are only half a millimeter apart!  Half a millimeter is 0.0016 feet, meaning the distance between our ears is roughly 300 times the distance between the fly’s – meaning that to discriminate auditory location as well as we do, the fly’s brain must represent interaural timing differences that are hundreds of times briefer than the ones our brains deal with.
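To put numbers on the fly’s problem, the same travel-time arithmetic used for the human head can be rerun with eardrums half a millimeter apart.  This is a rough sketch: it reuses the essay’s 1126 feet per second, and the 2-degree figure assumes the delay grows evenly with angle, which is only an approximation.

```python
SPEED_OF_SOUND_FT_PER_S = 1126.0  # round figure from the essay
MM_PER_FOOT = 304.8
FLY_EAR_SEPARATION_MM = 0.5       # eardrum spacing quoted in the essay


def max_delay_nanoseconds(separation_mm):
    """Time for sound to travel separation_mm, in nanoseconds."""
    separation_ft = separation_mm / MM_PER_FOOT
    return separation_ft / SPEED_OF_SOUND_FT_PER_S * 1e9


fly_max_ns = max_delay_nanoseconds(FLY_EAR_SEPARATION_MM)
# Rough estimate for a 2-degree shift, assuming delay is linear in angle.
fly_two_degree_ns = fly_max_ns / 90.0 * 2.0

print(f"fly, sound from 90 degrees: about {fly_max_ns:.0f} ns")
print(f"fly, 2-degree difference:   about {fly_two_degree_ns:.0f} ns")
```

Under these assumptions the largest delay the fly ever encounters is around one and a half microseconds, and a 2-degree shift changes it by only a few tens of nanoseconds.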

I said earlier that humans use interaural timing differences for low-pitched sounds.  What about high-pitched sounds?  After all, the fly is usually trying to find a cricket, which makes a pretty high-pitched sound.  When humans are trying to locate a high-pitched sound, we rely more on interaural loudness differences.  That is, the sound is louder in the near ear than in the far ear.  Timing differences are less reliable for high-pitched sounds for reasons beyond the scope of this essay, but loudness differences become more reliable the higher the pitch.  Why?  Because our heads absorb high-pitched sounds better than low-pitched sounds, dampening the sound for the far ear.  This is true for all objects.  If you hear a radio playing loudly in the neighboring car, or in the apartment next door, all you hear is the low, throbbing bass, because all of the higher-pitched sounds have been absorbed by the walls or other objects.  Likewise, our head absorbs the sound of the cricket’s chirp, making the sound substantially louder in the near ear than in the far ear.

None of which helps our little fly.  Humans can use interaural loudness differences, but flies can’t.  The small head problem again.  The fly’s head is too small to appreciably absorb even the cricket’s high-pitched chirp.  Small-headed creatures can only use interaural timing differences, at least for sounds (like the cricket’s chirp) that are in the range of human hearing.

How do they do it?  These flies do indeed represent timing differences on the nanosecond scale!  (A nanosecond is 1/1,000,000,000 of a second.)  The fly’s skill at representing very brief timing differences depends only partly on modifications of the nervous system, and even more on a unique mechanical linkage between the two eardrums: a lever that is essentially a very stiff hair.  This linkage serves to amplify very slight differences in phase between the two eardrums, making the asymmetry in their movement more pronounced.  A brief delay is thus turned into an ever-so-slightly less brief delay, making just enough difference that populations of neurons can encode the timing difference.

This unique biological adaptation is now inspiring the development of smaller, cheaper, and more accurate hearing aids.  This is another recurring theme in the scientific study of perceptual systems: biological systems have often evolved mechanisms more accurate and efficient than anything engineers had previously dreamed up.  So one day we may owe hearing aid technology to parasitic flies, bomb and drug detection robots to bloodhounds, and face-recognition software to the nuts and bolts of our own human visual systems.
