From Spell Check to Samantha

Comedy great Bill Burr has a bit where he says, “Did you ever spell a word so bad that your spell check has absolutely no clue what you’re trying to spell? What do you end up getting? You end up getting, like, a question mark. You got a million dollars of technology just looking back at you like, You got me, buddy. Which is pretty amazing because I have all the words, and that doesn’t look like any of them.”

This joke prompted me to research how basic spell check works. It’s a long answer, but the main ingredient is called the Minimum Edit Distance Algorithm. The minimum edit distance between two strings is the smallest number of single-character edits (insertions, deletions, and substitutions) needed to turn one string into the other. In regard to spell check, the algorithm is calculating the distance between the misspelled word you typed and the word you probably meant to type. Now, I could stumble my way through an explanation, but instead I choose to preserve my thin veil of credibility and will forward you to a YouTube video from Stanford University Professor Dan Jurafsky.
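For the curious, here is a minimal sketch of the idea in Python. It is not how any particular spell checker is actually built. It assumes every insertion, deletion, and substitution costs 1 (some presentations charge 2 for a substitution), and the `suggest` helper and its tiny word list are hypothetical, made up purely for illustration.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target.

    Assumption: insertions, deletions, and substitutions each cost 1.
    """
    # previous[j] holds the distance between the first i-1 characters of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into an empty string takes i deletions
        for j, t_char in enumerate(target, start=1):
            delete = previous[j] + 1
            insert = current[j - 1] + 1
            substitute = previous[j - 1] + (s_char != t_char)  # free if the letters match
            current.append(min(delete, insert, substitute))
        previous = current
    return previous[-1]


def suggest(typo: str, dictionary: list[str]) -> str:
    """Hypothetical helper: return the dictionary word closest to the typo."""
    return min(dictionary, key=lambda word: edit_distance(typo, word))


print(edit_distance("kitten", "sitting"))  # 3
print(suggest("speling", ["spelling", "spending", "selling"]))  # spelling
```

A real spell checker would probably also weigh how common each candidate word is and which keys sit next to each other on the keyboard, but the distance calculation above is the core ingredient.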

After years of patiently fixing all of our grammatical mistakes, the algorithms behind spell check have branched and evolved beyond simple word suggestions. They have become digital wordsmiths, such as Bloomberg’s AI journalist named Cyborg. Cyborg excels at ingesting quarterly earnings reports and churning out readable stories for a broad audience. In contrast, this same technology powers AI that can churn out fake news articles faster than any human moderator can flag them.

So how/when did spell check get so smart? Part of the answer is nonchalantly written in Microsoft’s Diagnostics, feedback, and privacy statement. “If you choose to turn on Improve inking & typing, Microsoft will collect samples of the content you type or write to improve features such as handwriting recognition, autocompletion, next word prediction and spelling correction, and we use this data in the aggregate to improve the inking and typing feature for all users.” In case you were wondering, a fresh install of Microsoft Windows has this feature enabled by default. The end-user has to disable it manually.

To be clear, I am in no way attempting to bash Microsoft for collecting this data. I think whoever reads this should be aware of the mechanisms behind the machines that have morphed from useful to indispensable. The motivation is simple. To make these algorithms better, they need lots of data and reinforcement. Here’s an interesting factoid: even hitting the ignore button teaches the AI something. It’s like saying thanks but no thanks, but to the AI it’s an excellent reason to ask its programmers, “Why did user X9174BG ignore my suggestion?” Considering that Microsoft Office has 1.2 billion users, the data is bountiful.

All this data is a treasure trove for natural language processors (see image above). Natural language processing (NLP) refers to the branch of computer science, and more specifically the branch of artificial intelligence (AI), concerned with giving computers the ability to understand text and spoken words in much the same way human beings can. And because I can’t write about tech without thinking about sci-fi: ladies and gentlemen, I present to you a clip from the film Her (2013).

So, where are we today? While we are not yet at the autonomous level of Samantha from the film referenced above, look no further than Unfortunately.io. Unfortunately.io is a startup that provides customers with AI-generated rejection emails. Oh, and when I say startup, I really mean startup. The domain name for the company was literally registered last week (3/1/21). One of the steps to becoming a member of the service is to submit a rejection email that you have composed. Gotta feed that AI machine. Unfortunately.io wants to solve the problem of telling other would-be startups “No,” which, while not as deflating as receiving a rejection notice, can still put mental stress on the bearer of bad news. So why not outsource this emotional burden to a bot with zero feelings? Sounds like a solid business model. The way I see it, it won’t be long before we see similar features incorporated into dating apps like Bumble or Match.com.

If you’ve gotten this far into this blog post, I would like to say: Don’t Panic. The robots that live in the cloud are not ready to be your pen pal or tell your ex to lose your number, but they are watching, and they are learning with each and every keystroke.

7 comments

  1. conoreiremba · ·

    Awesome post Andrae. “You mind if I look through your hard-drive?”… if ever there was a request to put the fear of God into even the calmest of individuals. But I cannot tell you how many times I have been impressed by Microsoft’s ability to correct even the most badly-spelled words I have attempted to type (which is more often than it should be), so it was great to see how the technology works and how it is continually improving. Also, I have discovered that my “English (Ireland)” setting in Microsoft Office spells many words differently from how they are spelled here in the US (labor vs labour, recognize vs recognise, etc.). It is something that just seems part of our everyday lives now but if it wasn’t for the magic of spellcheck, I do not think I would ever have been able to write a legible paper in my life.
    I think the Unfortunately.io idea is fantastic. Not only is it reducing time spent on an unpleasant task that can be pretty time-consuming for companies but I think we all take for granted the fact that giving bad news is not an enjoyable experience, and for the bearer of said news it can be a pretty mentally-draining process for them also. Great solution to remove a pain point for companies who have to send these letters regularly.

  2. This post was eye-opening. I did not know that we are by default opting into sharing our keystrokes with Microsoft just by having the product. I’m sure that was buried in the terms and conditions that the vast majority of people just blow past. I think this type of data sharing does not bother me as much if it is being used for a product that I actually use. Having spell check helps me every day, so I am getting the reward of the data that I do share with Microsoft. It would have been nice if they had just made that more well known.

  3. sayoyamusa · ·

    Informative post! I had no idea that every time I make a typo, I nurture the AI to be more accurate, and that even the ignore button works as feedback…! Since language is a living thing that grows and changes over time, it makes perfect sense to teach AI constantly so it learns new vocabulary.
    I’ve come to appreciate my computer correcting my English misspellings, as it is extremely helpful when writing, especially in a non-native language. On the other hand, however, I sometimes feel irritated by Microsoft’s suggestions when I deliberately choose a word that is not in the dictionary. This reminds me of the lesson that productive machines are not good at creative work (at least not yet).
    I also like the idea of Unfortunately.io! It’s a fascinating concept to have machines deal with negative emotions. It seems based on the human insight that people want to avoid meaningless conflicts with someone who is not so important to them.

  4. changliu0601 · ·

    Interesting post! I feel really excited to learn the technology behind spell check because, as a non-native speaker, it saves my life!!!! I rely on Grammarly to fix my grammar mistakes. I feel like I contribute a lot of data to the AI.

  5. courtneymba · ·

    Awesome post! Informative, entertaining, and scary all at the same time ha! I’m a big fan of spell check and the likes, but for my bi-lingual husband, they are absolutely maddening. I wonder at what rate the technology is improving for multi-lingual users.

  6. Amazing post. What I thought was particularly eye-opening was Unfortunately.io. What I think has more legs is having this incorporated into staffing companies or offered to hiring managers who have to interview an inordinate number of candidates. While staffing agencies already have automated responses at the higher end of the review process, they could still benefit from this service at the 2nd round. I also see potential for this tech to take hold for the interviewee, who often has to send 4-5 thank-you emails to respective employers.

  7. Chuyong Liu · ·

    I have always wondered how the spell check works since, as a non-native speaker, the spell check constantly gives me question marks. I have noticed that recently Grammarly, the spell check I count on all the time, has improved at recognizing words. But as a user, I can feel they still have a lot to improve in terms of grammar check and understanding the sentence to help me best express what I am trying to say.
    However, for apps like Grammarly to improve at grammar checking, they’ll have to fuel their algorithms with our data and really understand our inputs to spit out meaningful, even poetic sentences. So I definitely agree that they are watching, and they are learning from us all the time.
