“Hey Alexa,” “Hey Siri,” “Hey Google,” “Hey…”

After listening to Fergus Donoghue’s talk on Amazon Alexa, I decided to dig a little deeper into the uses of voice recognition technology. COVID has changed every industry and shifted buying and adoption patterns, and speech recognition is no exception. Adoption and use of voice recognition technology increased through COVID because it is a more sanitary option, can be used for entertainment, and is useful in the healthcare industry as well. “The voice and speech recognition market is expected to grow at a 17.2% compound annualized rate to reach $26.8 billion by 2025” (might want to consider investing in this area). I asked Fergus whether he feels Alexa is at its peak right now; he adamantly said not even close, and he was right…

Voice technology isn’t really a standalone industry but rather a transformative technology that will cause disruption across industries – much like what smartphones and the internet did. However, the smartphone and the internet came with a learning curve; much of the older generation is still trying to learn how to use them. With speech recognition, there is no learning curve, because this technology adapts to human behavior rather than the other way around.

  • Everyone who already owned speech recognition technology was using it more
  • People want to buy more of the technology
  • We are using it in more creative ways – soon we will be using it in every aspect of our lives rather than just asking it for the time, setting a timer, or playing music.

Well that’s creative…

Robotic friendships are now a real thing, not just a movie concept. Replika is a chatbot phone app through which people can have human-like conversations with “friends.” This may sound like a repulsive idea, but studies show people feel increasingly lonely with the rise of technology and social media, so this may have given folks an outlet during isolation.

There is also Zora, a robot caregiver controlled by humans that provides companionship. With hospital staff stretched thin, Zora has helped reach many patients. A nurse can type words into a laptop for the robot to speak; it also leads exercises and plays games.

Spotify was granted a patent in 2021 that would allow it to read emotions based on speech recognition – this would let it recommend songs based on your mood. Amazon and Google are also pursuing patents related to speech emotion recognition.
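To make the idea concrete, here is a toy sketch of how detected emotion could drive a recommendation. The features, thresholds, and playlist names are entirely made up for illustration – real speech emotion recognition uses trained models over many acoustic features, and this is not Spotify’s actual method:

```python
def classify_mood(avg_pitch_hz: float, avg_energy: float) -> str:
    """Toy rule-based mood classifier over two acoustic features.

    Real systems learn these boundaries from data; the thresholds
    here are purely illustrative.
    """
    if avg_energy > 0.7:
        return "excited" if avg_pitch_hz > 200 else "angry"
    if avg_pitch_hz < 150 and avg_energy < 0.3:
        return "sad"
    return "calm"

# A recommender could then map the detected mood to a playlist
# (hypothetical playlist names):
playlists = {"excited": "Dance Hits", "angry": "Hard Rock",
             "sad": "Mellow Acoustic", "calm": "Lo-fi Beats"}
print(playlists[classify_mood(220.0, 0.8)])  # → Dance Hits
```

The interesting (and slightly unsettling) part is that the input is how you sound, not what you ask for.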

Let’s take a deep dive into the healthcare industry…

Speech-to-text software has made a significant difference in physicians’ efficiency. It’s especially helpful with patients who have complex health issues – they have a more elaborate history to share, and doctors may not type fast enough to capture all the details. “Speech-recognition software allows physicians to ‘think out loud,’ said Dr. Hsiao. That leads to richer content, less cutting and pasting of notes, and more complete problem lists.”
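As a rough sketch of what happens after dictation is transcribed, here is a toy example that routes sentences of a dictated note into chart sections by keyword. The section names and keywords are hypothetical; commercial medical dictation products use trained clinical NLP models, not keyword lists:

```python
# Toy sketch: routing dictated sentences into note sections.
# Section names and keywords are invented for illustration.
SECTION_KEYWORDS = {
    "history": ["history", "previous", "prior"],
    "medications": ["taking", "prescribed", "dose"],
    "assessment": ["diagnosis", "likely", "consistent with"],
}

def route_dictation(transcript: str) -> dict:
    """Assign each sentence to the first section whose keywords match."""
    note = {section: [] for section in SECTION_KEYWORDS}
    note["other"] = []
    for sentence in transcript.split(". "):
        s = sentence.strip().rstrip(".")
        if not s:
            continue
        for section, keywords in SECTION_KEYWORDS.items():
            if any(k in s.lower() for k in keywords):
                note[section].append(s)
                break
        else:
            note["other"].append(s)
    return note

note = route_dictation(
    "Patient has a prior history of asthma. "
    "Currently taking albuterol as prescribed. "
    "Symptoms are consistent with a mild flare"
)
print(note["medications"])  # → ['Currently taking albuterol as prescribed']
```

Even this crude version hints at why “thinking out loud” produces richer notes than typing into fixed form fields.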

One of the issues in the healthcare industry has been a lack of diversity. One thing that diversity among physicians would help with is understanding various accents. This is another area where speech recognition software could be useful: it can help physicians understand their patients better and, as a result, diagnose and treat them better. It can also make patients feel more comfortable going to physicians for care.

With speech emotion recognition, therapists may be able to detect suicidal tendencies and assess a patient’s mental state.

Other benefits include:

  1. Finding medical records quickly
  2. Giving instructions to nurses
  3. Letting nurses ask for administrative information, such as the number of patients on a floor and the number of available units
  4. Less paperwork
  5. Less time inputting data
  6. Improved workflows

Buckle up for the Future

Right now (as Ravi mentioned in class) we use speech recognition for very basic applications, but the potential is HUGE. We use voice search on Siri or Google. We may use it for voice-to-text; I personally use Siri often to text when I am driving. We may also use voice commands for our smart home devices.

But think about the possibilities here. What if we started using voice biometrics for security, using our voice to authenticate ourselves instead of providing our personal information every time we go to the bank or call our service providers on the phone, for example? The technology is definitely in its infancy right now, but the potential is known and expected to be realized in the next few years. Businesses should start preparing for this shift. As accuracy rates improve, there will be more trust and buy-in from consumers across industries. Right now, we see a lot of applications in smart home technology and entertainment, but that will soon expand to other areas. But let me ask you…
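The voice biometrics idea above can be sketched in a few lines. Real systems first extract a “voiceprint” embedding from audio using a trained model; here I simply assume those embeddings already exist and compare them with cosine similarity against a threshold. The vectors and the 0.85 threshold are invented for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def authenticate(enrolled: list[float], attempt: list[float],
                 threshold: float = 0.85) -> bool:
    """Accept the caller if their voiceprint is close enough to the
    one stored at enrollment. Threshold is illustrative only."""
    return cosine_similarity(enrolled, attempt) >= threshold

enrolled_voiceprint = [0.9, 0.1, 0.4]   # stored when the customer enrolls
same_speaker = [0.88, 0.12, 0.41]       # new call, same voice
impostor = [0.1, 0.9, 0.2]              # different speaker

print(authenticate(enrolled_voiceprint, same_speaker))  # → True
print(authenticate(enrolled_voiceprint, impostor))      # → False
```

The hard engineering problem is not the comparison itself but producing embeddings stable enough that a cold, a bad phone line, or a recording of your voice doesn’t fool the system.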


  1. albertsalgueda · ·

    I found your blog really interesting. You might find this source interesting:
    As you explained in your blog this technology is in its infant stage but together with AI and combined with the exponential growth of digital technologies, we could expect a major impact on a wide variety of industries in the following 10 years.
    Healthcare applications are dangerous in a sense but can be really helpful and may one day even save lives. I am thinking about applications in mental health; as mentioned in the video, these conversational tools can be useful to tackle solitude… I still prefer some human conversation ;)

  2. barrinja1 · ·

    interesting post! this line really stuck out to me: “With speech emotion recognition therapists may be able to detect suicidal tendencies/mental state.” …. as a psychology undergrad, and coming from a family very involved in the mental health field, I’d like to learn more about this concept – I am hesitant. Typical psychological strategies involve the true need for human understanding and empathy, rather than the computation or search powers that do so well with speech to text and AI. Asking open ended questions and being free-flowing are part of the exchange between patient and doctor, and I’d like to know more about how that would work in this use case!

  3. allietlevine · ·

    Like you, I think there is real potential for the use of voice recognition technology for the elderly. My grandparents had an Alexa and they were easily able to figure out how to use it. In fact, my grandfather discovered how to ask Alexa for a fart, much to the dismay of my grandmother. After Fergus’ presentation I searched a bit and found Amazon’s Alexa Together Service. The service will be launching later this year in the US. The features I think will be most useful are the ability to set reminders on a loved one’s echo and the 24/7 Urgent Response service. I have no doubt that this would have been something that we would have added to my grandparent’s device. I can imagine insurance companies or Medicare subsidizing the service. With that being said, I do think it is important to mention a potential challenge. While the voice recognition technology is easy to use the initial set up could be a challenge for an elderly person.

    Read about Alexa Together Service:

    1. Funny I read this blog and Allie’s comment…I just helped my neighbor set up reminders on her mother’s Echo Dot and installed a Google Nest Doorbell, Google Nest Fire Alarm, and Google Nest Thermostat. It all came about because my neighbor’s mom called her to ask her to come over to turn off the fire alarm that had been ringing all night. Yes, all night. Instead of calling the fire department, her mother innocently waited to call her daughter at a better hour. (Note, there was no fire and she is ok). Now my neighbor can notify her mom of medicine reminders, doctor appointments, check who is at her door, be notified if the fire alarm goes off, turn the heater on, and SO much more. My neighbor said she sleeps better now knowing that she has the ability to care for her mom while not being there 24/7. Technology can be so powerful.

  4. parkerrepko · ·

    Great post that adds more detail and info to last week’s presentation. Personally, I do not use any voice recognition simply out of habit, but I am open to trying it. Advances in voice recognition always push me towards something called the Turing Test, which is a test where a person communicates with an AI and a person and tries to figure out which is which. Seems like we are getting closer and closer to that. (https://en.wikipedia.org/wiki/Turing_test)

  5. Bryan Glick · ·

    Thank you for doing some deeper investigations around technology that Fergus presented to us; this blog really hammers home its potential for the future. I think you really hit on the key at the end of the blog- ambient technology is really still in its infancy stage… even with how successful Alexa and Siri have been, there is so much more for these technologies to offer. I think we are very close to having a level of confidence in this technology for biometric voice authentication. Now that is an aspect of “movie magic” that I think is so close to real-world production (think any important scientist door or spaceship where they need to verify the person by reciting a phrase lol)!

  6. What a great, substantive deep dive into the topic from last week. Nice work. Do you mind if I share it with the speaker?

    1. Kanal Patel · ·

      Please do!!

  7. DropItLikeItHox · ·

    Awesome post – I was interviewing at Nuance last year, which had just finished a full pivot into focusing on using its dictation software within the healthcare space. As part of the solution, they partner with hospitals to implement their software and hardware in hospital rooms. Doctors will have a normal conversation with the patient while the software notes down everything that’s stated. From there, some machine learning is involved that’s able to grab key words and form a diagnosis. Super interesting and cutting edge, and right in our backyard!


  8. Nice post and I appreciated the shout out! ;-)

    I think the voice activated “friend” concept is really interesting and I was thinking about some of the questions I had after our last class, on the barriers for myself to not go beyond those four or five commands. One thing I wondered is if Alexa (or other voice assistants) could evolve to respond not ONLY with their name command. For instance, if I’m talking with my wife, I don’t use her name every time I want to say something–and even if the kids are around, there’s certain nuances and contexts that key the receiver to know that they are being addressed. In fact, if someone is living alone, shouldn’t there be a setting that they can just talk freely to their voice assistant without having to formally command it every time? That forced initiation is akin to drafting a telegram: it’s unnatural and halting.

  9. Great post! Like Parker, I don’t really use voice recognition, but when I first bought an Echo I was fascinated by it and would ask it random questions to see if I could stump Alexa (this was a few years ago, I don’t do this anymore!)

    I think this would be a helpful feature inside cars – I know some have voice commands but it’s not as sophisticated as we would hope. I like high-tech gadgets, therefore I’m excited to see how they might refine this technology, but am not too sure how functional it would be. i.e. Can it recognize different accents? Can it be conversational? How beneficial will it prove to be within the healthcare industry?

  10. rjperrault3BCCGSOM · ·

    Cool post Kanal! I share the same sentiments as Parker and Noor in that I don’t use voice technology much other than the voice remote for cable. I think I might see myself using it more often in the future, particularly Siri. I did not realize you could use it for text messages, so it’s pretty cool that you can use it while driving.

  11. Shannon Reardon · ·

    Nice post, Kanal. I found your chart of the willingness of people to purchase a smart speaker (post-COVID) fascinating. Really goes to show the impact of COVID on the democratization of tech products. Also, with Amazon’s recent creation of their pharmacy brand, I am interested to see how Amazon will update the voice recognition software to accommodate the change.
