By: Matt Weinberger
Source: Business Insider
Microsoft just announced a bunch of updates to Project Oxford, a set of online services that help developers build more intelligent apps with complicated features like the ability to recognize faces.
The promise of Project Oxford, which was announced last spring at Microsoft's Build event, is that Microsoft has done a ton of research on machine learning that most companies couldn't do themselves. Microsoft also has the computing power in its data centers to do the processing necessary to carry out these tasks. Put the two together, and Microsoft can help developers do a lot of interesting things they would never be able to do on their own.
The updates announced at the Future Decoded conference in the United Kingdom include new capabilities for speech recognition, even in loud stadiums or busy streets; for identifying the person who's speaking; for stabilizing video; and for spell-checking.
But the coolest update is to the Project Oxford facial-recognition service: It can now "look" at photos and rate how the people in them are feeling, scoring them on emotions like happiness, anger, or disgust. It's like Pixar's "Inside Out," but in real life.
Right now, the Project Oxford site has a simple demo that lets you upload a photo and see how the service rates you on emotional scales. If there are multiple people in the shot, it'll give you its best guess as to each person's feelings.
The whole point of Project Oxford, which launched earlier this year, is to let programmers do their own thing with these services.
Through the Project Oxford Face API (an API is programmer jargon for the "hooks" that programs use to talk to one another and the web), apps could put this capability to work for their own ends.
So for instance, programmers might start to make iPhone apps that let you sort all your shots by how happy your girlfriend looks. Or, conversely, you could filter social networks so they don't show you any sad pictures.
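For the programmers in the audience, here is a minimal sketch of what those two ideas might look like once an app already has emotion scores for each photo. Everything in it is an assumption made for illustration: the `Photo` class, the emotion names, the 0-to-1 score scale, and the sample numbers are hypothetical, and no call to the actual Project Oxford service is shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Photo:
    filename: str
    # Hypothetical per-emotion scores for one face, each between 0 and 1.
    emotions: Dict[str, float]


def sort_by_emotion(photos: List[Photo], emotion: str = "happiness") -> List[Photo]:
    """Return photos ordered from the highest to the lowest score for `emotion`."""
    return sorted(photos, key=lambda p: p.emotions.get(emotion, 0.0), reverse=True)


def hide_sad(photos: List[Photo], threshold: float = 0.5) -> List[Photo]:
    """Keep only photos whose sadness score is below `threshold`."""
    return [p for p in photos if p.emotions.get("sadness", 0.0) < threshold]


# Made-up scores standing in for whatever an emotion service might return.
photos = [
    Photo("beach.jpg", {"happiness": 0.91, "sadness": 0.02}),
    Photo("airport.jpg", {"happiness": 0.10, "sadness": 0.72}),
    Photo("dinner.jpg", {"happiness": 0.55, "sadness": 0.08}),
]

happiest_first = [p.filename for p in sort_by_emotion(photos)]
no_sad_photos = [p.filename for p in hide_sad(photos)]
```

In this sketch, sorting is just a ranking on one score and filtering is a cutoff on another; a real app would have to decide how to handle photos with several faces or none at all.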
The business applications are equally interesting. For instance, market research firms could build software that takes pictures at key moments during, say, a commercial, to get a more scientific look at how a focus group is reacting.
"People are doing sentiment analysis on text, but not data," says Ryan Galgon, a senior program manager with Microsoft Research.
The other new Project Oxford capabilities have similar potential for making smarter apps.
Video editing. The video features, coming to developers in beta by year's end, can automatically edit video, letting you do things like trim smartphone videos so they show only the moments when people are moving in your shot. They can also smooth out footage from shaky hands.
Recognizing who's talking. The speaker-recognition feature, also available by the end of December, could be used as an extra security measure in the enterprise. The new CRIS, or Custom Recognition Intelligent Service, offers a way to train speech recognition to better handle the acoustics of difficult settings like loud public spaces. That one will be available in an invite-only beta later this year.
Spell-checking. But the most weirdly useful addition is a spell-checking API service. Galgon acknowledges that it's weird, but consider that any spell-checker developed before, say, 2012 or so probably doesn't recognize "Lyft," the famed car-hailing startup, as a word. With this new spelling service, Microsoft can constantly update a dictionary with new slang and brand names, and let developers automatically have the latest version in their apps, no intervention required. It means a smarter dictionary.
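To make the "Lyft" problem concrete, here is a toy sketch of a spell-checker whose word list can be refreshed after it ships. It is a local stand-in written for illustration only: the `SpellChecker` class and its methods are hypothetical and are not Microsoft's API, and a hosted service would presumably refresh the word list on the server rather than in the app.

```python
from typing import Iterable, List


class SpellChecker:
    """Toy checker: a word counts as misspelled if it is not in the word list."""

    def __init__(self, words: Iterable[str]) -> None:
        self.words = {w.lower() for w in words}

    def update(self, new_words: Iterable[str]) -> None:
        """Add new slang or brand names to the word list."""
        self.words.update(w.lower() for w in new_words)

    def misspelled(self, text: str) -> List[str]:
        """Return the tokens in `text` that are not in the word list."""
        return [
            token
            for token in text.split()
            if token.strip('.,!?"').lower() not in self.words
        ]


# A frozen word list, like one that shipped before the brand existed.
checker = SpellChecker(["i", "took", "a", "to", "work"])
before = checker.misspelled("I took a Lyft to work.")

# The same checker after the word list is refreshed.
checker.update(["Lyft"])
after = checker.misspelled("I took a Lyft to work.")
```

With the frozen list, "Lyft" is flagged; after the update, the same sentence passes. The appeal of a hosted service is that the update step happens without the app developer doing anything.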
Galgon says his division has indeed collaborated on this service with the Microsoft Office team, so a smarter Word spell-checker could be coming.