What’s in an (artificial) voice?


Voice lies at the heart of human identity. When we hear someone speak, we immediately start picking up on accent and intonation and make assumptions about their class, background and education. We also make inferences about their age, stature and personality, although these are often wrong and compromised by prejudice.

Model Repliee Q2

But what about artificial voices? We’re increasingly hearing our computers speak, whether that is Alexa, a sat nav or a computerised announcement at a railway station. How do our preconceptions learnt from human voices shape our response to artificial voices? What other factors influence our response to speaking computers? These are some of the questions BBC R&D is exploring in a new study.

Scientists have only recently started to explore how people feel about artificial voices. Speech is our main way of communicating and socialising, so naturally we react differently to devices that speak. According to Daren Gill, director of product management for Amazon Alexa, ‘Every day, hundreds of thousands of people say “good morning” to Alexa.’[i] Hundreds of thousands of people have also professed their love for the smart home assistant and some have even proposed to it. Can you imagine typing ‘I love you’ into a computer?

Adding speech to a device suggests agency and changes our expectations for how a device will respond to us.[ii] Clifford Nass, the late professor of communication at Stanford University, believed that some of our irritation with such technology arises from the fact that we treat artificial voices as human, and start to judge our devices for their trustworthiness, sincerity and personality.

One particularly important aspect of a human voice is accent, because it signals whether a talker is ‘one of us’ or not. Accent is of particular interest to the BBC R&D study. Do we want our devices to sound like they’re from our home town? We know people anthropomorphise robots more strongly when the voice matches the gender of the listener.[iii] And we respond more positively to robots that sound like they come from our own country.[iv] But what about regional accents? We certainly have strong reactions to human regional accents:

It is impossible for an Englishman to open his mouth without making some other Englishman hate or despise him.

George Bernard Shaw

But what about robots with regional accents?

You can take part in the study at https://voicestudy.api.bbc.co.uk/. It’s simple to do and takes about 15 minutes to complete.


[i] Victoria T., 2016. ‘How we fell in love with our voice-activated home assistants’. New Scientist, 3104.

[ii] Cox, T., 2018. Now You’re Talking: Human Conversation from the Neanderthals to Artificial Intelligence. Random House. p. 181.

[iii] Eyssel, F., De Ruiter, L., Kuchenbrandt, D., Bobinger, S. and Hegel, F., 2012, March. ‘If you sound like me, you must be more human’: On the interplay of robot and user features on human-robot acceptance and anthropomorphism. In 2012 7th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 125-126). IEEE.

[iv] Tamagawa, R., Watson, C.I., Kuo, I.H., MacDonald, B.A. and Broadbent, E., 2011. The effects of synthesized voice accents on user perceptions of robots. International Journal of Social Robotics, 3(3), pp. 253-262.


2 responses to “What’s in an (artificial) voice?”

  1. As an English speaker originally from South Wales, I was intrigued that you lumped all Welsh accents together as my perception is that there are sounds in accents from North Wales that I would struggle to reproduce. For example, when I say Rhys there is no aspirated H and I’m not sure I could ever learn to pass as from Bangor, lacking this sound.

    I also wonder why none of the artificial accents sounded BAME to me. Is this just my prejudice? I have recently been delighted by my perception that there are an increasing number of BAME voices on BBC radio. Is this because of attractive sound qualities or is it just something about me? (For example, my first doctor was a BAME male and I was terrified of the other doctor in the practice, a white female. By the way, I am a white female!)

    • Agreed that Welsh isn’t one accent, but we had to go for broad regions for practical reasons of making voices and also to enable us to do meaningful statistical analysis.

      I wasn’t involved in recording the human voices that were used as the basis of the artificial ones, so I can’t comment on the ethnic balance in any detail. But I know that there was at least one with non-white ethnicity. You only hear a subset of the voices when you do the experiment, so you might not have heard it.
