January 24, 2006

They told me

I was really impressed by the pitch from Tellme earlier today. (Although, despite them being great voice usability experts, their web site is a pile of Flash-infested crud.) I’d assumed IVR stuff was dull. On the basis that you get the best insights from the session that is superficially least attractive, I picked this one. Turns out it was a good choice.

I really liked the idea of telephony as the most intimate medium: a whisper in your ear. But what Tellme have really cracked is making the IVR experience much closer to interacting with a human, rather than a string of audio files tacked together with some shell scripts. They played lots of examples of really, truly awful IVR experiences, and then what they did to them.

This is important because good experiences drive real business. The customer’s impression of your business and brand is derived right from the experience they have. The example Tellme gave was UPS. Their old IVR system was very, very slow. The messages went on and on, slowly read. Do you really want UPS to be associated with “slow”? Thought not. Most of all, a good experience creates trust in your brand.

The first thing they’ve cracked is making the voice experience more seamless. They’ve created vast libraries of all sorts of clever combinations of phrases which get blended together by cognitive psychology and linguistics experts. And the result is super-impressive.

Their voice libraries go beyond what’s known as “single prosody”, the old-style IVR where you heard broken-up phrases glued together like “departing | Saturday. | July. | 22nd.” Instead they have multiple prosody: “departing | Saturday, | July 22nd” etc. (Note the comma after Saturday.) It works. But they’ve had to record over 37,000 WAV files just to read back numbers!
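To make the multiple-prosody idea concrete, here is a minimal sketch of how a playback list might be assembled when each phrase has been recorded in several intonations. The variant names ("initial", "medial", "final") and the file-naming scheme are my own assumptions for illustration, not anything Tellme described.

```python
def build_playlist(segments):
    """Return one WAV filename per segment, choosing a prosody variant by position.

    Hypothetical naming scheme: "<slug>.<variant>.wav".
    """
    playlist = []
    last = len(segments) - 1
    for i, text in enumerate(segments):
        if i == last:
            variant = "final"    # assumed: falling pitch that closes the phrase
        elif i == 0:
            variant = "initial"  # assumed: opening intonation
        else:
            variant = "medial"   # assumed: "comma" intonation, more to come
        slug = text.lower().replace(" ", "_")
        playlist.append(f"{slug}.{variant}.wav")
    return playlist


print(build_playlist(["departing", "Saturday", "July 22nd"]))
```

Because every segment needs a recording for every position it can occupy, the file count multiplies quickly, which is presumably part of why the number library alone runs to tens of thousands of files.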

They’ve also cracked “points of co-articulation.” You can’t record every possible combination of terms. So you record the first term followed by an example second term, once for each of the 40 phonemes in English that the second term could start with: “Hi John”, for instance. Then you record all the possible second terms: “James”, “Jim”, etc. Finally you splice the right second term in place of the example one, using the take whose example began with the same sound. Again, the result is impressive.
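The splicing step can be sketched roughly as follows. This is only an illustration under my own assumptions: the pronunciation table, the ARPAbet-style phoneme labels, and the file names are all hypothetical, and a real system would cut and join actual audio rather than return file names.

```python
# Toy pronunciation table (hypothetical): second term -> its initial phoneme.
INITIAL_PHONEME = {
    "John": "JH", "James": "JH", "Jim": "JH",
    "Mary": "M", "Mike": "M",
}

# Takes of the first term, one per initial phoneme of the example word that
# followed it in the studio (hypothetical file names).
CARRIERS = {
    ("Hi", "JH"): "hi__before_JH.wav",  # e.g. cut from a recording of "Hi John"
    ("Hi", "M"): "hi__before_M.wav",
}


def splice(first_term, second_term):
    """Pick the take of first_term that was spoken before the same initial
    sound as second_term, then follow it with second_term's own recording."""
    phoneme = INITIAL_PHONEME[second_term]
    carrier = CARRIERS[(first_term, phoneme)]
    return [carrier, f"{second_term.lower()}.wav"]


print(splice("Hi", "Jim"))
```

The point of the design is that the number of recordings grows as (first terms × phonemes) + (second terms), rather than first terms × second terms.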

You really can tell the difference in terms of comprehension and memory retention.

They also did a great pitch on optimising the usability of IVR systems. The phone is a linear presentation, and it taxes short-term memory. You don’t have a 2D screen with bold text, drop-down boxes, etc. The boundaries are also invisible. It’s not like the Web. There’s a strong “recency effect”: the last thing said (“press 0 for operator”) is the first thing remembered.

So they have a bag of tricks. Personalise. For example the sports team “squeaked by” if you support that side vs. “lost a close one to” if you don’t. They “instruct as you go”, deferring navigation instructions to the time they’re needed. (Lazy evaluation always deserved a comeback…) They use “progress markers” - “First, tell us…” “Next,” “Lastly”. Adopt colloquial language, not written English. Optimise to meet user goals, not sub-tasks. And so on.
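As a small illustration of the personalisation trick, a score read-out could be phrased from the listener's point of view. This is a sketch under my own assumptions: the function name, the team names, and the neutral fallback wording for listeners with no stated allegiance are all invented for the example.

```python
def describe_result(winner, loser, supports=None):
    """Phrase a close result according to which side the caller supports."""
    if supports == winner:
        return f"{winner} squeaked by {loser}"
    if supports == loser:
        return f"{loser} lost a close one to {winner}"
    # Assumed neutral wording when the caller's allegiance is unknown.
    return f"{winner} beat {loser}"


print(describe_result("Giants", "Dodgers", supports="Giants"))
print(describe_result("Giants", "Dodgers", supports="Dodgers"))
```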

I’ve glossed over a lot of interesting detail, and good stuff. If only they could put up a few corporate blogs and share their cool innovations and work on an everyday basis!

Posted by Martin Geddes at 10:13 PM

Comments

Martin,

Thanks for this info. About 2 months ago I interacted with the UPS system and I was so astonished at how human-like it was that I in fact said my goodbyes to an automated system!

Well I didn't know at that time who developed it.

Carlos

Posted by: at January 25, 2006 12:39 PM

Glad to see you're blogging again! I have referenced your last two posts in my SOA blog, as examples of Context-Aware services. (What's happened to your trackbacks?)

http://www.veryard.com/so/2006/01/personalization-and-presence.htm

I'm intrigued by the claim that TellMe can "Optimise to meet user goals, not sub-tasks." Perhaps I'm reading too much into that, but it sounds brilliant if they can do it. How do they know (or infer) what the user goals are?

Posted by: at January 26, 2006 02:20 PM

I'm afraid it's manual, not automatic. The example they gave was of someone getting a stock quote and then trading. In the old system they went down the "quote" menu, and then had to go back up and start again to trade. Now it offers a trade right away.

Posted by: at January 26, 2006 03:24 PM

As a refugee from the speech industry consolidation wars earlier this decade, I can certainly attest to the fact that IVR is most certainly not boring. It's fun as all heck. But the big problem with speech is getting it to pay. There is no shortage of speech scientists, user interface gurus, and software engineers doing really cool stuff. But developing that cool stuff costs A LOT of money, and it's not clear to me that enterprises can justify the HUGE expense of creating a "good experience" replete with the latest TellMe, VoiceGenie, and/or Nuance technology when nobody's yet losing business with their reliable old "good enough experience" of touch-tone hell.

I can't wait for the day when companies start losing business by not providing a great IVR experience. Back in my speech days in the late 90s and early 00s, my bosses always seemed to think such a day was right around the corner. It's been 10 years since I first broke into the speech industry and those promises have yet to be fulfilled.

Posted by: at February 2, 2006 10:29 PM