~ OR: Your grandparents were correct: radio WAS better than TV

Podcasts.  Audio books.  Clubhouse.  AM, FM, and XM Radio.  Alexander Graham Bell’s: The Telephone.  Walkie Talkies.  Intercoms.  Tin cans connected by taut fishing line.

What do all of those things have in common?

Ok, aside from almost all of them requiring electricity.

And aside from most of them being ancient artifacts.

Fine.  Aside from them all being nouns.  WORK WITH ME HERE, COME ON!!!!

Yes!  They are all forms of voice-only transmission mediums!

Contrary to popular belief and regardless of widespread availability of video chat methods, voice-only still exists and is thriving.  Despite any attempt to imply otherwise.  Reports of Clubhouse’s imminent death, as an example, are an exaggeration.  Like Guitar Center.  Guitar Center’s demise has been assured every year now for <checking watch> a good 17 years.  Yet here it still is.

Also like terrestrial radio.  I mean, who still listens to that stuff?  According to Pew Research, 83% of Americans ages 12 or older listened to terrestrial radio in a given week in 2020.  Read that percentage again.  Eighty-three percent.  That’s more than 273 million people. So apparently QUITE A FEW PEOPLE STILL DO.

“But George,” you ask, “What’s the point of a voice-only medium when there are so many ways to do video chat now?  Why wouldn’t we want to leverage that kind of technology and actually SEE each other when trying to communicate?  You know, like in the olden days before the telephone when that was the only way to talk to someone?”

To which I would respond, “Please shorten your question to a more manageable length.  Too many words.  Brain hurts.”

I will concede the question though.  Why???  Why would people do this?  Is there some kind of draw that we’re missing, something that explains why people would be into voice-only delivery methods when so many combined sensory formats exist?

Turns out, there IS.  And that’s what we’re going to dive into today.

Without further ado, I present this freshly-baked blog offering for your enjoyment.  With accompanying audio version, of course.  Because really.  This would NOT be the week to skip the audio version given the context of the post.  With the implied efficaciousness of the subject at hand, missing out on the audio version would be KINDA SILLY.

So here we go!

Why are we talking about voice-only communication anyway?

One of the major reasons that this subject is so captivating to me is in my role as a voiceoverister.  Since I was blessed with a face made for radio, voiceover has been – for quite some time now – my preferred means of getting my acting fix in.  The ability to convey emotion via the use of audio only is something that has been a significant part of my life for close to two decades.

Once in a while I would stop and wonder about the effability of just the human voice where conveyance of emotion would be concerned.  Is it still effective?  It seems to have some degree of cultural significance still, even in the face of … er … the face.  Even though the face and the words that come out of the hole in said face are more accessible than ever before, voice-only options still persist.  And yet I continued to wonder about the efficacy.

That wonder ceased thanks to a document I received that blew apart my fragile little mind.

Quite a bit of this blog post stems from reading this document recounting the results of a Yale University study.  It’s an eyeful, lengthy, and dives real deep into analyses and statistics.  I still recommend taking a look at it if you’re into that kind of thing.  And even if you’re not. But just in case you’re really not, that’s why I’m writing all this stuff down.  Because it’s SUPER fascinating to me.  And I like to share!

The crux of the study indicates that voice-only communication is one of the most effective ways of increasing empathic accuracy.  Let’s get into that here; the subject is very interesting and a new one to me.

What exactly is Empathic Accuracy?

So glad you asked!  (fine, you didn’t ask, I just said that to pretend to be cool)

According to the study document, empathic accuracy is defined as the ability to judge the emotions, thoughts, and feelings of other individuals.  The study continues on to indicate that greater degrees of empathic accuracy help people resolve conflicts at work, relate to romantic partners, and even wade through the quagmires of political organizations and social networks.

I feel that it is worth noting that empathic accuracy and empathy are two very distinct and separate concepts.  While empathy is the ability to feel the emotions of another individual, empathic accuracy is the correct identification of those emotions.  They can certainly go along with each other, however they are not always going to be a package deal.

Continuing on with the results of the study document, researchers worked to determine which communication modes provided for the highest degree of empathic accuracy.  Their hypothesis going in was that voice-only was the most efficacious of the available options.  The study went on to attempt to prove that.

What are the three modes of communication studied?

It’s worth noting that there were a few communication modes tested in the study.  They were as follows:

  • Visual Only – This modality covers imagery either still or moving.  Watching interactions without an audible component to accompany them
  • Voice Only – Audio components only.  No visual or other sensory stimulation included
  • Multi-sensory – The combination of both visual and audible information

One of the experiments even compared results between rooms that were dark and conveyed information vs rooms that were fully lit conveying the same information.  The level of detail in the 5 different experiments is very deep and quite interesting!  Again, I encourage you to take a peek at it.

What were the findings?

The title of the blog post kinda gives it all away so chances are you aren’t even this far into the post.  I’m not very good at click bait or luring people in or drawing them through a start-to-finish journey.  I just write stuff as it comes to mind and hope a few of you stick around!

Given the methods of communication above, each and every test resulted in a greater degree of empathic accuracy when voice-only was the means by which something was communicated.

It wasn’t close.

The study goes on to get into the weeds of why that might be the case though.  The following are notations of some of those reasons.

We’re not as good at multitasking as we think we are

Listening to someone talk without any other distraction is already a metric boatload of cognitive processing.  Think about it.  (there’s irony in telling someone to think about cognitive processing.  like a double positive.  it might be meta, i’m not sure) To process information and emotion conveyed by the voice alone, the brain is attempting to sort through and de-garble tone, inflection, nuance, language, emphases, and volume.  That’s quite a bit just by itself!

Add any other communication modality to this and all we’re doing is piling onto a list of things to parse that is already close to cognitive overload.  Sure, we can communicate with each other face to face on a regular basis.  Prior to and since pandemic life, this is a thing.  Even during, we had video chat.  It doesn’t change that – at least scientifically speaking – voice-only communication is still going to bring about more empathic accuracy, which allows us better opportunities to interact on a deeper level. Which makes the entire history of the world prior to the telephone kinda miraculous.

The face is a lie

According to research referenced in the document linked earlier, facial and non-verbal expressions are better able to hide what is happening internally for an individual.  Studies have shown that it is easier to deceive someone via use of the face to mask emotions, which in turn can impact empathic accuracy significantly.  This makes sense; if facial and non-verbal communication can more easily obfuscate the internal state of an individual, then that explains why multi-level marketing scouts are so successful at convincing people to go to seminars when they least expect it.

Vocal cues, on the other hand, are not nearly as adept at hiding the internal state of an individual.  There is a significant degree of training and attention needed to maintain and control a mask for an internal state, which makes leakage of said state more likely to happen.  This ties back into voice acting splendidly, and I’m going to come back to that soon.

“Your eyes can deceive you, don’t trust them”

Thank you, Obi Wan Kenobi.

This one kinda stinks but it’s a reality of how we exist as extremely flawed human beings.  Perceptual biases based on visual information are an unfortunate but real thing.  It’s something that we do.  The whole “first impression” thing is something that takes place and we, as humans, will make judgement calls about someone’s personality based on how they look over anything else.  Once that map is set, it’s challenging to re-write it.  Take that one step further and add the component of racial biases into the whole mix and a perception of someone can be instantly written based on nothing more than their ethnicity.

The study did make one note that was worth taking into consideration: all of the subjects used in testing were Americans with an assumed uniform level of American accents.  As such, biases based on accents were not studied this time around.  It would be worth digging into the maps that form from audible information alone when the accents involved come from outside the ethnic circle of the person evaluating a scenario.  I can guess what the result might end up being, but it would be interesting to see it studied in a scientific and controlled environment.

Funny example of how effective verbal cues can be, even without actual language

There was one paragraph about vocal bursts without language being sufficient to accurately communicate emotions.  Nonsense words verbalized that still conveyed how someone felt about something.  My absolute most favorite version of this would be an acceptance speech by Rush guitarist Alex Lifeson at the Rock and Roll Hall of Fame induction in 2013.  He had an entire speech written out and, instead of reading the words of the speech, he replaced every single word with the word “blah.”  Every.  Single.  Word.  I encourage you to take a look at the linked video to get a sense of what on earth I’m talking about.  Also, I’m fairly certain that this is a much shorter version than the one that actually took place. I may or may not have been in the audience for that.  I’ve said too much.

Is there a point to any of this voice-only stuff?

Conveyance of effectual emotion via voice-only means is a subject that is near and dear to my heart.  And skill-set.  I mentioned earlier that the study indicated that there is a significant degree of control required to mask the internal state of an individual where voice-only communication is concerned.  Let’s expand on that a teensy bit.

Voice acting is a two part word.  Voice, and acting.  Do we have a voice?  Check.  Great!  Halfway there.  But then comes the acting.  This is where we get real deep into the weeds about training and practice.  The study referenced above notes other studies done that indicate that voice-only communication isn’t as easily capable of masking the internal state of someone speaking.  What that ultimately means is that folks who want to act with just their voice are at a disadvantage compared to their on-camera and on-stage compatriots.  It is much more difficult to hide what is happening internally while providing voice-only communication and someone on the receiving end can tell.

If you’re a voice actor and you have been through acting coaching for just the voice, you know what I’m talking about.  When you think you’ve nailed something and you hear “I didn’t believe it,” it’s frustrating, isn’t it?  There can be a variety of reasons for this.

Back to voice-only training and control

When we’re told, as voice actors, to internalize a script or figure out what question is being answered or just get into the scenario, what we are doing is making the scenario our internal state. Our goal is to convey something that has nothing to do with our actual internal state.  At least, not most of the time.  Your internal state is going to be defined by your surroundings and life. What is happening is your reality.

Example.  If your task is to dub English over a foreign film for Netflix and your cat peed on your carpet 20 minutes ago because your cat is a jerk, your internal state is presently focused on the fact that you’re going to have to deal with that carpet before it sets and stinks up the whole house and next thing you know you have to rip out all the carpets because, again, your cat is a jerk.



Anybody else have a cat that is a jerk?  See me after class. We’ll commiserate.

The point being: in order to successfully act under the conditions of voice-only environments, we must find a way to change what our internal state is to match that of the task at hand.  Which is challenging.  Thus the need for training, coaching, rehearsal, and more training, coaching, and rehearsal.  It’s worth it but it’s not as easy as buying a USB mic and saying “I’m a voice actor now!  Please present to me the basket of all the jobs.”

Attentiveness is key

One final thought I wanted to bring up where voice-only stuff relates to empathic accuracy is something they don’t mention in the study: the fact that people have to actually pay attention.

They did touch on the fact that we’re not nearly as good at multi-tasking as we think we are.  How many of y’all know the telltale glazed look of someone who is in your Zoom meeting and is very clearly looking at another screen or a device that has nothing to do with the meeting at hand?  Raise your hand.  You know that look.

While voice-only communications and conference calls do not have the benefit of getting to see what someone is doing when they’re supposed to be paying attention, the mute button is just as effective.  You have no idea if someone has muted you and is having a conversation with someone else entirely or had to run to the rest facility.  Unless they forgot to mute.  And especially in the case of the latter, that is really a w k w a r d.

Empathic accuracy only works when we are attentive.  Focused.  Actually paying attention to the cues.  Without attentiveness, nuances are missed.  Meanings are lost.  Missing out on social cues isn’t pleasant for anyone.  It leads to hurt feelings, angst, and weeping and gnashing of teeth.

The solution?  Give the person or people you’re speaking with your undivided attention.  Become a true aficionado of the audible.  Having a real sense of the people you’re communicating with isn’t just good form, it’s beneficial to how you interact with them on an ongoing basis. And it makes you look really, really cool.

So, like, pay attention and stuff.

Closing out!

I hope you enjoyed going down this rabbit hole of empathic accuracy and voice communication with me.  This is some super fascinating and interesting stuff and there’s so much more to learn!  If you happen to know anything else about this subject and want to share, please please please send me an email and let’s chat about it!

The end.

Until next week!

-= george =-




About the Author

Straddling the line between the arts - voiceover, music composition, session performer, album mixing - and the world of durable medical equipment. Probably should have spent more time playing on the balance beam as a kid instead of obsessing over Commodore 64 games.
