The PJC Voice Thesis

Rob May
Published in Inside PJC
Jun 2, 2020

For about a year, I’ve been looking at the voice space. I have no idea if voice will become a ubiquitous user interface, but as an investor it’s a good bet to figure out what kinds of things will be valuable IF that happens. The best overview I’ve read so far is James Vlahos’ “Talk to Me,” about how voice computing will transform the world. Most of the rest of these ideas were formed by talking to entrepreneurs and investors in the space.

We are still digging into voice, but our current working hypothesis consists of the following ideas and assumptions:

  • Voice will be in many, many devices, but voice control will be simple, similar to what your TV remote does.
  • Language is too complex and the AI tools for understanding longer instructions and conversations are still quite a way off. We don’t expect broad voice applications, or applications that require many conversational turns, to be successful just yet.
  • NLP/NLU work best in narrow domains where the expected context around the utterance allows us to make assumptions about the meaning.
  • This means that broad-based assistants will be difficult to build. Indeed, in my opinion, the performance of Alexa has actually declined as it has tried to do more.
  • Covid, and increased fear of pandemics, may increase the use of voice: the more we can control the world through voice, the less often we have to touch things and possibly transmit disease.
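
To make the narrow-domain point concrete, here is a minimal Python sketch of how knowing the device context lets a very simple system assign meaning to an otherwise ambiguous utterance. Everything in it is an illustrative assumption on my part (the domains, the phrases, the intent names, and the `resolve_intent` helper); it is not how any particular voice product works.

```python
# Hypothetical sketch: the device context narrows what an utterance can mean.
# The domains, phrases, and intent names below are illustrative assumptions.
from typing import Optional

DOMAIN_INTENTS = {
    "tv": {
        "turn it up": "volume_up",
        "turn it down": "volume_down",
        "next": "channel_up",
    },
    "thermostat": {
        "turn it up": "temperature_up",
        "turn it down": "temperature_down",
    },
}


def resolve_intent(utterance: str, domain: str) -> Optional[str]:
    """Map an utterance to an intent using only the phrases known to one domain."""
    phrases = DOMAIN_INTENTS.get(domain, {})
    # Lowercase and collapse whitespace so minor transcription noise still matches.
    normalized = " ".join(utterance.lower().split())
    return phrases.get(normalized)


# The same words resolve differently depending on the device context.
print(resolve_intent("Turn it up", "tv"))
print(resolve_intent("Turn it up", "thermostat"))
```

"Turn it up" means nothing on its own, but inside a TV it can only sensibly mean volume, and inside a thermostat it can only sensibly mean temperature. A broad assistant has no such context to lean on, which is part of why it is so much harder to build.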

We divide the voice world into Infrastructure, Applications, and Add-Ons, for now.

Infrastructure: This consists of technology to capture voice, convert voice, transmit voice, and navigate pieces of voice infrastructure. The companies built here will be scalable, horizontal businesses that other companies can use to build and manage voice. But as investors, we are focused on the things that are new, not things that can be done with existing telephony technology. This is about putting voice in places it wasn’t before. We are thinking about voice app discovery, voice clouds, and the voice chips and sensors that make voice at the edge possible. What is the DNS equivalent for a voice world? Who owns search and discovery?

Applications: Digital assistants are the obvious use case in this category, but more generally it covers any application that is built to be voice driven from the ground up. We expect that 5G, combined with edge processing, will make more things voice controlled, and that in some markets re-segmenting the market with a voice application will make sense. The key here will be to understand how defensible the voice position is in the market, which means we have to look for use cases where doing a good job on voice means doing a bad job (in the strategic sense) on whatever the status quo was before (probably a mobile app). If there isn’t a strategic advantage and a different value chain, then a voice version of an app will lose to the incumbent apps.

Add-Ons: These are add-ons that make other things voice activated or voice controlled, and this will no doubt be the largest category of voice applications. Unfortunately, while we believe voice will be a powerful interface, we think that making an app voice controlled has limited economic value, and furthermore that more of that economic value will accrue to the existing application than to the voice provider. If this is your vision, it is better to play at the infrastructure level than to build, say, a Salesforce voice add-on.

Those are the areas we are looking at. Now the question arises: what do we think current investors are getting wrong about voice? The biggest thing, I believe, is the fear of Google, Microsoft, and Amazon. While they have strong voice capabilities from a technical perspective, and will probably be strong providers of voice-related computing, I don’t think they will dominate voice applications.

The economic incentive of those companies is to use voice to improve their existing businesses, not to explore new areas. Even when their voice products are positioned initially as market expansions, they will ultimately face strategic pressure to reinforce their core value chains. It always happens.

This was written in June 2020, so if you are reading this more than 6 months later, it’s most likely wrong, and our theses have probably moved on a bit. We will try to keep this post updated if/when our voice theses change significantly. We have made some investments here already that haven’t been announced. If you are working on a voice-related technology, particularly one with strong AI components or strong distribution partnerships, we would love to talk to you.

Rob May is CTO/Founder at Dianthus, author of a Machine Intelligence newsletter at inside.com/ai, and former CEO at Talla and Backupify.