2021

Microsoft Cortana: Conversational AI


Easier Said Than Done

 
 

People are juggling more meetings, emails, chats, and information than ever. Microsoft’s 2021 Work Trend Index annual report showed that the digital intensity of workers’ days has increased substantially and shows no signs of slowing down. As the world moves to a hybrid working model, with people working together across a wide range of scenarios, it’s critical that we find new ways to help people connect and get tasks done on mobile devices. Those experiences should help people be as productive as they are on a PC while being designed for the unique capabilities and limitations of a mobile device.

Microsoft is investing in natural language and voice capabilities that transcend the challenges of small screens and tiny keyboards to help people communicate and manage tasks when they’re working on the go.

 

 
 

My Role

I am a design lead on the Microsoft Search Assistant and Intelligence team, where I manage a team of designers working on Cortana and embedded intelligence experiences in M365 products, from incubation to completion. On this conversational AI project, my role evolved over time. I started off as the sole designer during incubation and early development, then grew my team as the project progressed.

  • Product/experience incubation, strategy and envisioning

  • Defined and articulated the scope of the MVP in collaboration with PM and engineering partners

  • Designed the information architecture framework, user flows, wireframes, prototypes, visuals, and motion, and oversaw development for most feature areas

  • Reviewed and provided direction for the team’s designs

  • Developed partnerships with M365 product teams and built strong working relationships through collaboration and clear communication

  • Collaborated with client and platform engineering teams to ensure all designs were feasible, making adjustments as necessary throughout the development process


 
 
 

The Journey from Science to Product

In 2018, Microsoft acquired Semantic Machines Inc., a company that had developed a new approach to building conversational AI. Their approach orchestrates user input, conversational and on-screen context, and real-world APIs into a single machine-learned conversational system that is richly contextual and highly grounded. By combining the Semantic Machines technology with Microsoft’s world-class products, the team aims to democratize access to technology by delivering a more productive and natural user experience that will take conversational computing to the next level. I was asked to help productize their technology in the M365 suite, starting with Outlook mobile.

A New Interaction Model

Being built into a product meant we needed to consider how conversational experiences should integrate with the app. We established new design principles based on what we learned from our Teams mobile integration, and developed a conversational UI (CUI) interaction model that augments the app UI, layering on top of it and altering information on behalf of the customer.

Threaded Chat CUI - Microsoft Build 2020 Demo

Augmented CUI - Early 2021 release in Outlook Mobile

 

Build off what’s familiar
Leverage the app UI whenever possible; don’t duplicate it. Only create new UI patterns where none exist.

Optimize for flexibility
The experience should work even when the system fails. Design for resiliency and reduce friction between modalities (voice/tap/type).

Leverage context
Use the app UI and content as context to provide intelligence that shortcuts steps for users wherever possible.

Coherence is key
Use the same interaction patterns and components across endpoints so customers only need to learn them once.

 

Defining Capabilities & Aligning Patterns

For the initial release on iOS and Android, we strove to make core Outlook mobile customer scenarios simpler for everyone using voice, including: creating and finding email, creating and finding meetings, finding information about people, and changing select settings. These capabilities are powered by Microsoft Graph, which aggregates and organizes billions of signals across your organization as people work in Microsoft 365 all day. The true magic happens when we layer AI on top of the Graph to get really valuable insights. When you use your voice to request “find the 2022 budget report,” the correct document appears, and when you say “email Kim’s boss,” the right name appears in a snap.

 

These new AI-enhanced voice capabilities quickly contextualize your voice requests and provide a rapid response, making it easier and more natural to work on the go.

 

This Is Just the Beginning

We’re continuing to refine and iterate on this initial release: learning from customer feedback, building out additional platform capabilities and support for a more complete customer experience, improving coherence across products, and bringing this experience to Teams mobile.