How to design for voice and smart speakers – with Ilarna Nche

Taking Turns with Ilarna Nche - design for voice

Published by: Eran Soroka

Making a machine feel human is a tall order, way taller than the average smart speaker. That doesn’t mean it’s impossible, though; you just have to find the right person. When it comes to designing for voice, Ilarna Nche – CEO and co-founder of Adassa Innovations – has plenty of experience.

In the latest episode of Taking Turns, the Cambridge, UK-based conversation designer talks with us about the advantages brands find in voice tech, how to prevent your conversational experience from becoming a disaster, and how a Christmas present turned a hobby into a career.

How did you become a conversation designer, first of all?

This kind of fell into my lap. Although I had worked with chatbots for a bit before, it was really when I saw Amazon advertising their Echo Dot that I got drawn into this. So I bought one for Christmas for my mother, and when we set it up, we played a lot with the apps. At the time, I didn’t realize that those are made by third-party developers. So when I got an email from AWS saying ‘learn how to build an Alexa skill’, the journey began. I became completely immersed in this whole world of voice and conversational design; I was really getting my feet wet and learning a completely new skill.

What was the background that led you up to this one?

Mainly web development. That was what I was first introduced to, and then mobile apps as well. However, what attracted me to conversational design and development was the fact that I could focus more on the user experience in something that lacks a GUI, a graphical interface. So I was really attracted to just focusing on bringing the experience to people and solving their requests.

You’ve come a long way since. You’re now an Alexa Champion, a certified Alexa developer and a Bixby first-class Premier partner. Out of all the projects you’ve created, what’s the one that you’re most proud of?

It would have to be Music Bop Adventures – that is the reason why I’m here today. When I was 11 years old, I had this idea about reducing screen time for children. Then when I saw the Alexa platform, I put it on there. Since then, I’ve had such good engagement and feedback from parents and grandparents. It’s so rewarding that kids are enjoying something that I put out, and it’s actively helping them improve their imagination. Also, in terms of reducing screen time, the mission is actually succeeding. So Music Bop Adventures is something I’m proud of, and I hope to continue to impact lots of kids around the world.

Actually, it’s quite ironic that voice technology is being used to reduce the time people spend interacting with technology.

Definitely. I think it is kind of ironic, because now we’re going towards voice as a companion that complements screens and multimodal uses. That was one thing I wanted: to create something which actively reduces screen time for kids, and which also helps them socially, physically and mentally.

That’s amazing. So how do you create this sense of companionship when you design for voice? After all, this is a machine.

Context is where you need to start. If it’s for personal interaction and it’s supposed to be very intimate, then I expect a very personalized experience. However, if it’s something like a smart home interaction, I want it just to do the job. Here, it’s not about the complexities of how it would interact with you. So it all depends on the context, and that’s what people should identify first.

When you develop for a smart speaker, what’s the most important thing for you in a skill?

First of all, I have to identify why I am using my smart speaker over my phone. If I want to interact with something, it needs to be quicker and more efficient than if I were to go on my phone. One of the things I especially like to do is check what channel the football’s on, for example: ‘what time is Liverpool playing today?’ You have to identify where that person will be. In this scenario, I could be in the bedroom, the living room, or the kitchen, so it’s about identifying the use case. If it’s something related to food, you would expect them to be in the kitchen. When it’s a booking, like a restaurant or a Lyft, then you have to identify where the user would be and the environment, and also the type of person as well. It’s so important to understand the user.

Although you can’t build for every type of user, you need to identify the different scenarios. So that’s one thing. That’s a key point, I would say, when designing for smart speakers.

Did it also lead you to some funny or awkward exchanges with the smart speakers?

Yes. I definitely get the odd person talking to it as if they were trying to communicate with a different experience. When you design for voice experiences, if you get those awkward and funny situations, it means that there’s still work to be done. You want to make sure it’s clear how people can converse with your application. So, yeah, there have been some funny ones.

Now, let’s talk about the place where you design for voice – Adassa Innovations.

So, Adassa Innovations is a voice consultancy and also a voice studio. We make products in-house, such as games and entertainment, productivity and education applications. We also make products for clients and brands. Our main mission is to bring as many brands and companies onto the platforms as we can, and to try to impact as many users as possible, so both parties can win.

What’s the biggest advantage brands find in designing for voice tech?

For example, with mobile, there are two main operating systems, Android and iOS. Brands have access to millions of users on those platforms, and the same kind of thing goes for voice tech. If you go for a platform like Alexa, Google, or even Bixby, there’s access to millions of users, potential and existing ones, who you can bring closer to your brand. So from a marketing point of view, brands can really get involved in voice, because there are loads of use cases where brands could surface their content. For example, if you say ‘find me a car’, who is it that’s going to be surfacing that? A car brand, for example, should definitely hop on and create something for that kind of use case.

What’s the most creative way brands have leveraged this technology so far?

Marketing is definitely where I’ve seen the most effective use cases. Especially when movies are being released, there are interactive experiences where the user feels involved with the project. When you as a user become connected to something, you feel more entitled to be a part of the journey. You also feel more supportive and you feel part of the brand, which is what I think brands like to do. They want to make sure their users, their customers, are always a part of them. So from a marketing perspective, creating interactive, engaging experiences is the way forward.

It also brings some specific challenges, because no brand or company wants to have a “bad PR” chatbot or skill. So how do you prevent it from happening?

Definitely, when you start to design for voice, making chatbots and conversational experiences, hire people who are very knowledgeable in that area. Lots of the mistakes people make happen when they hire developers without hiring a conversational expert who is able to guide and create that high-quality experience. Because at the end of the day, a developer wouldn’t have the experience to create a conversational flow. So when creating something like that, they should definitely think about a linguistics person, a conversational designer. Then just keep building out that team, instead of thinking that development is the only thing we need.

Outside of buying your mom a Christmas gift – if somebody wants to enter this field of developing for smart speakers, of conversational AI, what other tips or advice would you have for this person?

Personally, the best advice I was given was to network and reach out to other people in the field. Doing that, I realized that the community we’re in at the moment is very helpful, very welcoming, very open. People are willing to share their advice and experiences. So it’s just a matter of not being afraid to reach out and get involved in meetups, community projects, talks and speeches. So I would say, yeah, just network. There are so many resources out there as well. Also, we should highlight that there’s no right or wrong way of approaching this; I feel conversational AI and design is something that’s about “the human”. There’s no one perfect human, there’s no way to define THE human. The same goes for conversational design.

As an experienced person, give us one forecast for the future of smart speakers, conversational AI, and where it can go.

It’s so funny you say that, because if you had asked me that question last year, it would have been a different story. That shows how fast-moving and volatile this industry is. Now, it’s definitely to do with Web3; this is the future, as you’ve seen. It’s something I didn’t even know about this time last year, and yet it’s quickly becoming an interesting piece of emerging technology that people are really looking into, and brands are actually investing heavily in. I’ve always said that voice is something that will be a companion to other tech, so the multimodal focus is something that will continue to thrive. So voice is not at the forefront, but it will definitely complement these other technologies. Web3 is definitely something to watch out for in voice; that is where voice will actually take off the most and be heavily used, with regard to the web too.