Archive for the 'speech' Category

IBM and the Future of the Mobile Phone

Hear what Big Blue has to say about the future of mobile phones. This discussion between an IBM mobile learning executive and a Motorola director touches upon some interesting directions such as text-to-speech and location-based technologies. Technology companies realize that this is the time to define their territory and are eager to share their visionary thinking. One point worth mentioning from this conversation: phones have to fit well with people’s lives and not the other way around.

Over the coming years, mobile phones are expected to get “smarter”, adjusting to our usage patterns. Innovations such as larger, projectable screens, along with e-commerce and social networking features, will enable the phone to provide much more value in our lives.

Future of the Mobile Phone - IBM

Speech Recognition on Mobile Phones

Here’s one more company claiming to fix the still-unsolved problem of speech recognition on mobile phones. I’d like to see their speech-to-text solution in action or hear from someone who has tried it out. Read the complete article about Vlingo’s voice-recognition interface at the Tech Review site. The key differentiators are a) that it uses Hierarchical Language Models and Adaptation techniques and b) that you can train the software easily by fixing the text it gives you. By the way, the article is available in audio as well (registration required). Here are some excerpts:

Vlingo, a startup in Cambridge, MA, is coming to market with a simple user interface that provides speech recognition across mobile-phone applications. “We are not developing the core speech-recognition engine,” says cofounder Michael Phillips, a former MIT research scientist and founder of SpeechWorks, which developed call-center speech interfaces. “We don’t need to do that again.” Instead, Vlingo takes speech, turns it into text, and provides a simple way to correct errors using the phone’s navigation keys, helping the system “learn.” The user’s spoken words travel over a mobile Internet connection for analysis on Vlingo’s server, sparing the phone the heavy computational work; the transcription appears less than two seconds later.

“Small platforms need speech, and search is a powerful way to find information,” says James Glass, head of the spoken-language systems group at MIT’s Computer Science and Artificial Intelligence Laboratory. “The combination of the two is very powerful,” he says, adding that Vlingo is working at that frontier.

Mazin Gilbert, executive director of natural-language processing at AT&T Labs in Florham Park, NJ, says others, including AT&T, are also developing speech interfaces for mobile phones; he thinks one problem will be “providing the right user experience in a cost-effective, scalable way.”

Ideas to Beat Phone Snatchers

We have talked about fighting mobile phone snatching previously in this post and mentioned the negative impact of this problem in a few other posts. This topic keeps coming up elsewhere, and I wanted to share this post from Karachi Metro Blog. The discussion there is actually more interesting. For instance, Malaika suggests shooting the snatchers on sight. Just in case you miss that shot (don’t tell me you step out without a gun …), here are some other ideas, including mobile insurance.

It is generally agreed that the PTA-sponsored IMEI approach is not effective. Let’s start off with a software-based solution. Mantissa writes:

For my actual phone (which I’d really hate to lose), I use a program called PhoneGuardian (runs on Symbian phones and cannot be uninstalled), that will communicate back critical location information as well as subscriber information (secretly) in case it is ever stolen. I can also invoke a siren remotely as well! It also gives me the option of remotely immobilizing the phone completely (to prevent the party from getting into your address book/media etc.) At least with the above, it improves my chances of locating my phone as well as kill the value of the phone down to just its scrap parts (so no one benefits from my loss) - note however, that once repossessed, I can restore it easily too.

My thought is that if such an anti-theft program were bundled with the phone (installing apps is not for everyone), it would be a good deterrent.

Another reasonable option is insuring your mobile phone against theft or loss. Mobile Zone, a mobile handset retail chain in Pakistan, offers an insurance service. Here are some comments from Kashif, from the above post, about their service:

MZ doesn’t charge anything extra for insurance. It’s just that their rates are a bit higher than the open market. I was told that in case a set is snatched/lost, I have to register an FIR and will get a replacement within 10-15 days.

Rate Comparison (Nokia N70):
MZ: 15,800
Shophive: 16,190
Beliscity: 14,930
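
To put the commenter's "a bit higher" into numbers, here is a small Python sketch that compares each quote with the cheapest one. The prices are the three figures quoted above; the currency is not stated in the comment, and the function name `premium_over_cheapest` is just an illustrative choice.

```python
# Nokia N70 prices exactly as quoted in the comment above (currency not stated).
prices = {"MZ": 15_800, "Shophive": 16_190, "Beliscity": 14_930}


def premium_over_cheapest(quotes, retailer):
    """Return (absolute difference, percentage difference) versus the cheapest quote."""
    cheapest = min(quotes.values())
    extra = quotes[retailer] - cheapest
    return extra, 100 * extra / cheapest


for name in prices:
    extra, pct = premium_over_cheapest(prices, name)
    print(f"{name}: {extra:,} above the cheapest quote ({pct:.1f}%)")
```

On these three quotes, MZ comes out 870 (about 5.8%) above the cheapest price, while still being cheaper than Shophive. So the bundled insurance looks like a modest markup at most, at least in this small sample.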

On the question of whether a particular phone attracts thieves, here is a response: “I don’t think a phone snatcher would pick n choose the model of the phone. They will more likely select the target based on how easy or hard it would be to snatch/steal. If the person is an easy target it would not really matter if they r holding an iPhone or some old crappy model.”

Then there’s the issue of the second-hand phone market in which most of these stolen phones end up. Is there a way to stop it? One idealistic recommendation is below. With newer phones, however, security is expected to be much better.

Each phone must have a serial number. That number should be listed (submitted) to all service providers (via police or a 3rd party online database - this could be a great business opportunity too, by the way). So when the thieves steal a phone and sell it on the black market, the purchaser of that phone goes to the phone service provider, will not be able to get the service, and will be told, “Sir/Madam, you have a stolen item, we cannot provide u service.” Besides that, buyers/consumers will have the option to check/verify “before” purchasing a used cell phone whether it is stolen or not.
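
As a rough illustration of the commenter's idea, here is a minimal Python sketch of such a shared registry. Everything in it is hypothetical: the class `StolenPhoneRegistry`, its method names, and the made-up serial numbers are assumptions for illustration, not a description of any real operator or police system.

```python
class StolenPhoneRegistry:
    """Hypothetical shared database of serial numbers reported stolen."""

    def __init__(self):
        self._stolen = set()

    @staticmethod
    def _normalize(serial):
        # Ignore spaces, dashes and letter case so the same serial always matches.
        return "".join(ch for ch in serial if ch.isalnum()).upper()

    def report_stolen(self, serial):
        """Called by the owner (via police or the database operator)."""
        self._stolen.add(self._normalize(serial))

    def mark_recovered(self, serial):
        """Remove a serial once the phone is back with its owner."""
        self._stolen.discard(self._normalize(serial))

    def is_stolen(self, serial):
        """Check a used buyer can run *before* purchasing a handset."""
        return self._normalize(serial) in self._stolen

    def request_service(self, serial):
        """Check a service provider runs before activating a handset.

        Returns True if service may be provided, False if it must be refused.
        """
        return not self.is_stolen(serial)


# Made-up serial numbers, purely for demonstration.
registry = StolenPhoneRegistry()
registry.report_stolen("SN-0001-A")

for serial in ("sn 0001 a", "SN-0002-B"):
    if registry.request_service(serial):
        print(serial, "-> service can be provided")
    else:
        print(serial, "-> reported stolen, service refused")
```

The same lookup serves both uses the comment describes: the provider refuses service to a listed handset, and a buyer can check a phone before paying for it. The sketch leaves out the hard parts, such as who is trusted to add or remove entries and what happens when a serial number is tampered with.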

CTIA Wireless 2007

The annual conference of CTIA, the wireless association, was held in the last week of March 2007 in Florida. The CTIA annual event is said to be “the world’s largest technology event dedicated exclusively to wireless, broadband convergence and mobile computing technologies”. It is a great place to showcase new technologies, network and socialize with top decision makers, and create buzz about upcoming work. For example, the much-awaited voice search applications from Google and Yahoo were announced here.

The event covers the entire industry from network infrastructure to microprocessors to applications to content to end-user hardware. More info at the CTIA site.

The topics covered at the 2007 conference include:
- The Quadruple Play
- Mobile Enterprise
- Mobile Payments
- Mobile Entertainment
- Social Networking & Mobile Communities
- Advertising
- Globalization
- WiMAX

See the webcasts of the conference here.

One of the major highlights was the Emerging Technology Forum at the conference. The forum focussed on the following four tracks:

  • Wireless IP - Media, data and voice applications and services, including architecture and platform requirements for handsets.
  • Multimedia Trends - Displays, chips, power, MDTV, wireless standards, spectrum, antennas, and more.
  • Handset Processors - Silicon trends and requirements for future wireless media driven handsets.
  • Mobile Software Integration - Operating systems, application software, content, DRM, services, user interfaces, and more.

For a detailed roundup of the conference, see this post.

    Voice Search On Mobiles

    Voice search for mobile phones is one of the killer applications: just think of all the people for whom a mobile is the main tool for connectivity and who prefer not to, or simply cannot, type. How nice would it be if the phone understood your speech and provided accurate information based on your voice commands? Well, we are not there yet, but there are some positive developments. When Google and Microsoft step in, things move fast and people pay attention. Here’s a summary based on a recent WSJ article.

    Google released a free experimental service last week called Google Voice Local Search. It allows users to dial 1-800-GOOG-411 (in the US) and search for businesses in specific cities, using technology that recognizes what callers say. It will connect you to the business, or you can get the results via SMS.
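
If you are wondering which keys to actually press for a lettered number like this one, the letters map to digits on the standard phone keypad (ABC on 2, DEF on 3, and so on). Here is a tiny Python sketch of that conversion; the helper name `vanity_to_digits` is my own, not anything Google provides.

```python
# Letters printed on the standard telephone keypad.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def vanity_to_digits(number):
    """Replace each letter with its keypad digit; leave digits and dashes as they are."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in number)


print(vanity_to_digits("1-800-GOOG-411"))  # prints 1-800-4664-411
```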

    Google’s test announcement comes a few weeks after Microsoft announced plans to buy Tellme Networks for a price that people familiar with the matter put at $800 million. The closely held Silicon Valley company specializes in services that combine voice-recognition technology with the Web, and already provides automated directory-assistance services for AT&T Inc. and Verizon Wireless, a joint venture of Verizon Communications Inc. and Vodafone Group PLC.

    Yahoo is also planning to enter the voice search race, which is largely driven by the huge opportunity to sell ads that will run on mobile phones, and by the fact that Google doesn’t dominate that business, as it does for searches that use computers. Yahoo officials say spoken queries could eventually become an option; two executives from Tellme recently joined Yahoo.

    The Journal adds:

    Until recently, voice recognition has mainly been used by telephone carriers and companies to lower their costs by reducing the need for live operators. Recently, that technology also has been used by some new entrants to provide free, ad-supported alternatives to paid directory assistance, such as Jingle Networks Inc.’s 1-800-FREE411 service.

    The latest push by technology companies is also designed to make voice-based searches better, not just less expensive.

    Google’s experimental service, like the Web, can work even if callers don’t know the name of a business they want. A user can ask about a type of business, such as a coffee shop, and specify an intersection or ZIP Code. The service will read off a list of nearby businesses that fit the criteria.

    Another step, being pushed by Tellme in a service it has been testing, is to let users start with a spoken query, but display the results from that question on the display screen of their handset. Besides the name of a pizza shop, for example, a user could instantly see a map to it. That capability, which requires software downloaded to a handset, could also ultimately help the user complete a transaction, such as order a pizza.

    “Voice is a great way to input information,” said Angus Davis, a Tellme co-founder. “It’s not always the best way to get output.”

    Combining other kinds of information also can improve searches. Verizon, using technology from start-up Medio Systems, allows users to speak the name of ringtones, games or other things they want to buy. The technology can guess whether callers are interested in, say, the weather in Seattle or a band called Weather in Seattle by analyzing their past searches, said Brian Lent, Medio founder and chief executive.

    Microsoft, besides mobile search, says it plans to use Tellme technology to add voice input for many products, including computers and hand-held devices. A spokeswoman for Google said, “having quick, free access to local business information over the phone may prove to be very valuable to our end users.”