Working with Voice – Voice First not Voice Only

The discussions within the Verbz team are firmly in the Voice First camp. We have long believed that voice first will rise in the consumer market and eventually in the workplace.

Voice input is one of the most natural and intuitive means of user interaction we have. Using one’s voice to make a request or convey a thought is instinctive when directed at another person and requires no effort or attention. The first challenge for any Voice First application like Verbz is to seamlessly capture that simplicity without adding effort or demanding thought. The second challenge is to allow the recipient to consume these delivered actions in just as effortless a manner.

We don’t mind recording actions and thoughts one at a time and sending them off. We might be dealing with different issues and multitasking in our problem solving, but we tend to execute those actions one at a time, so using our voice is intuitive and effortless.

When we look to consume content, however, we need to prioritize what we resolve first. Listening to a stream of audio messages one at a time is both inefficient and as infuriating as suffering through the list of options of an automated call centre.

The ongoing challenge is to ensure that any voice-driven solution is not itself a source of Cognitive Load. First mentioned in John Sweller’s 1988 study on problem solving, Cognitive Load theory highlights the impact of distractions and user experience on perceived mental effort.

Cognitive Load refers to the demand on the user’s Working Memory, which is both short-lived and limited in capacity. If your Long Term Memory is your brain’s hard drive, then your Working Memory is more like your Clipboard.

Throughout our design process, we’ve been worried about this and have been asking ourselves questions such as:

  • Does our workflow create more busy work for the user?
  • Does the interaction feel natural and use existing human skills to minimize any learning curve?
  • Are we expecting the user to remember command words or complicated gestures?
  • Do we need the user to recall previous chunks of information?
  • Are there too many steps or unreasonable complexity to achieve something productive?
  • Have we maintained the sense of immediacy or does the user feel latency?
  • Can we adapt to the increasing skills of our users and be more efficient and brief?
  • Can we remove any more distractions or shiny objects?
  • Are we able to anticipate what our users need next to accelerate the workflow?
  • Lastly, is it much better than existing alternatives?

At Verbz, we’re building to offer the effortless capture of thought and action via the user’s voice whilst taking advantage of displaying transcribed content for easy selection and consumption. Being Voice First doesn’t mean Voice Only.
