Tech Profile: SpokenLayer

Tarah Benner




What is SpokenLayer, and how did you and your team get started in this field?

SpokenLayer is a platform that works with publishers of great content. It takes their already great content and turns it into audio using human voices so people can consume it on the go — while they’re exercising, while they’re too busy, when they’re just too tired of staring at a screen or when they aren’t able to stare at a screen.

We say that I started the company when I was about 3 or 4 years old. When I was really young, my dad used to always read stories to me, but he was a TV commercial director, and he was on the road a lot jumping to Vancouver and Los Angeles. When he was on the road, he would record books, and at each page turn he would hit a tea cup with a spoon to turn the page. He would mail those cassettes back to me, and actually we have a bunch of them still at the office with the original books.

When I got into middle school, I found out that I was dyslexic, so I started using Recording for the Blind & Dyslexic audio books and other forms of audio to learn and get through school. Then, when I went to graduate school to study computer engineering, I thought, “Well, I care about reading on the Web, not books anymore,” so that was kind of the spark of the idea for SpokenLayer. Jumping to the more current timeline, I spent about a year and a half on two ventures, building recording and audio tools both for musicians and for general use on the iPhone, iPad and the Web. So I have a lot of expertise in audio recording and distribution.

My own personal experience with reading, learning differences and reading issues, coupled with my expertise in audio, audio engineering and distributed recording: those two came together, and SpokenLayer was born. And it so happened that the team that came around it was a mix of people who have worked in radio, digital publishing and blogging. We pulled in all those skill sets because we needed to know what worked and what didn’t work in all those areas to make SpokenLayer the best platform it can be.

What trends and changes in the market led you to realize that SpokenLayer would fill a void? Describe the void.

The concept of SpokenLayer and turning written content into audio has been tried before. It was tried in the mid '90s, it was tried in the early 2000s, but there are a few big market shifts that are really making it possible. To almost sound like a cliché, one of them is the iPhone, mobile consumption and turning what was an iPod into an Internet-connected iPod. I still really view the iPhone as predominantly an audio device, even though everyone has their fingers on it and they’re looking at the screen all the time. The predominant accessory, the most common one that every person has with an iPhone, is a set of headphones. One of the huge market drivers is that there is an on-demand audio thing in your pocket 24/7.

The other is the ability to create content — voice content — in a distributed manner. And that’s both a mix of being able to leverage crowdsourcing and distributed work models as well as speech synthesis and text-to-speech advances. Both of those are ways to distribute and improve the quality and quantity of content you can produce in a given time for a given amount of money.

Cloud-served content has become the norm now. Music radio to Pandora was this amazing shift. Blockbuster to Netflix was this amazing shift, yet that same shift is still in the process of happening, and I think SpokenLayer is catalyzing that shift: written content to on-demand consumable audio content. Putting it in your pocket or putting it in your hand in a cloud-consumable, on-demand experience is the other consumption pattern that has changed.

Why do you think audio content has exploded in the last few years?

The shift to audio is a lifestyle change. Audio has this really interesting relationship to humans and individuals since it is the original way to consume information. Text came after speaking, and photos came after text. You look at this evolution over time, and sound as the oldest form of communication is still incredibly powerful.

In the beginning of computers and computing, text was the easiest thing to convey since it can be stored in one of 26 characters. That’s very easy to hold onto in individual form. Photos, again, can be represented in pretty clean formats in digital and still convey a lot of their information. Sound took longer to get to a point that we can appreciate as humans.

I think that was kind of a latent trend. Now, everything has really caught up. Text is at maturity, video is at maturity, and the photo is at maturity and — oh wait! We’re actually wired as humans to consume sound, and it resonates with us emotionally more so than any other medium.

What barriers have you encountered with streaming, bandwidth and data rates for the mobile experience?

We work in a spectrum that is a fraction of the size of even music because we understand the human voice and the “mono” experience. We’re a quarter or a sixth or a tenth of the size of Pandora in terms of streaming audio because we understand the medium we’re working in.

So the bandwidth costs and the streaming costs and data rates and buffering and caching — all that becomes much more advantageous financially and experience-wise. We understand our medium so much better because we’re focused on a single type of content. We understand that whole audio continuum of different experiences in the audio space, which gives us the ability to provide that in a much more cost-effective way than would otherwise happen.

Tell us about your experience with Matter One. What’s it like working intensively in an incubator with other media startups for four months?

The cool thing is we’re the first class, so we get to figure out a lot of things together as a community and as a program, which is incredibly exciting. We spent the first week doing an incredibly intensive distillation. We did a semester of the Stanford —an exclusive, integrated, interdisciplinary program — in a week. Everyone who is in the Matter program is incredibly intelligent and can handle that kind of information exchange.

We all pull together and leverage our own knowledge and our different viewpoints on the ecosystem of media to help each other, critique each other and move each other forward in ways we wouldn’t be able to if we were just working in our own bubble. And everyone’s so invested in the community. Being able to jump around and get 15 or 20 people’s mindshare 100 percent focused on what you’re doing: that’s pretty awesome.

How does SpokenLayer hope to change media for good?

We really see that we have the ability to unlock a body of content that has been held in text form. We want to bring that content to life and make it consumable for anyone and everyone. But what most motivates me is the 40 or 50 million text-impaired people in the U.S. alone who are dyslexic, blind, have learning disabilities or physical disabilities that prevent them from holding a book or computer, and being able to get that content into people’s heads. If you’re driving, you’re blind, too; you’re blind to content at that point. It’s not a disability, but there’s no reason you can’t bring that content to someone in a different form. These things drive me every day.

How should publishers utilize SpokenLayer to make their content more accessible, and how can marketers apply this sort of technology to get more mileage out of the content they create?

In designing the platform, we realized that publishers are busy and don’t have time to think about this stuff. The publishers have very little work to do with integration, and they also don’t need to figure out how to distribute their content or how to promote it. Right now, we’re focusing on getting content flowing through the SpokenLayer system and into their native experiences, whether that’s web, mobile or mobile web.

We know what’s next, and we know how the publishers fit into that ecosystem. And one of the reasons we have a lot of value is we understand that, and we spend every waking moment understanding that ecosystem, where it’s going and how to take advantage of it in every way possible.

How do you see the way we consume content evolving in the next three years?

Content will be consumable at the user’s discretion and at the user’s beck and call. If you’re in front of a computer, it’s on your computer. If you switch halfway through an article to a mobile device, it’s on your mobile device. If you’re reading it in one place, you can be listening in another.

If you read an article that you found interesting, other ones will be brought to you in a way that’s actually intelligent. There will be a much higher-level understanding of what content is and how it relates to each other, including consumption patterns across not only individual sections or verticals within a publisher but even across publishers and industries.

It’s about understanding how all of that content moves around, how advertising fits into that, how sponsorship fits in, and how all those cogs move together. Everything’s still a bit siloed, whether by publisher or author or by sponsorship. I think those silos will break down; a ubiquitous consumption method will spread across those, and I think SpokenLayer is something that’s starting to break down the walls of those silos in really interesting ways — and not in a way that’s devaluing the content. If anything, it’s actually adding value and bringing life to the content that exists already. We’ve got to plant a flag and really believe in content, which we do, and I do. And I think everyone does, but understanding how to make it something that can move to that model is something that many startups are vying for — even many of the people at Matter.

Will Mayo started AUDIOis Inc. during the last year of his M.S. in computer engineering and ergonomics at Lehigh. AUDIOis developed SoundPipe, a web-based technology that enables users to simply record and share audio content in the cloud, which morphed into SpokenLayer.

Will loves creating, manipulating and listening to everything audio. He sings in a large NYC men’s choir, which recently performed in Carnegie Hall, and he also created and runs a small mixed singing group, VocalCraft. Will is also into cooking and the new social dining experiences. He enjoys playing piano, guitar or some other instrument daily. Seven years ago, Will stopped typing in QWERTY and has been using Dvorak ever since. You’ll also catch him rocking a kilt on a regular basis.
