Microsoft Corp. has made yet another big bet in its quest to
help lead the development of artificial intelligence with the release of
a new dataset containing 100,000 questions and answers.
Called MS MARCO, or Microsoft Machine Reading Comprehension, the dataset is being made
available for researchers wishing to train their AI systems. The company
says the anonymized data is based on real-world queries typed into its
Bing search engine, and that the aim is to make AIs better able to
understand questions in a conversational context than they are now.
Microsoft explains that while virtual assistants like Cortana and
Siri are already quite adept at reciting facts and figures like the
population of certain cities or previous World Series winners, they’re
not quite so comfortable with more complex or ambiguous questions. For
example, if someone asks about the current state of the war in
Syria, most virtual assistants will simply return search engine results
that the user then has to comb through to find the answer.
That simply isn’t good enough for Microsoft, which believes its
dataset can be used by virtual assistants to provide more definitive
answers to such questions. The idea is that instead of simply providing a
page of search query results, AIs might be able to analyze those
results themselves and come up with an actual answer to the question.
“In order to move toward artificial general intelligence, we need to
take a step toward being able to read a document and understand it as
well as a person,” said Rangan Majumder, a partner group program manager with
Microsoft’s Bing search engine division who is leading the effort. “This
is a step in that direction.”
Microsoft said the MS MARCO dataset contains questions that its
researchers found “interesting.” The answers were based on existing web
pages and verified as accurate by human reviewers, with the aim of teaching
AIs to derive accurate answers from web pages themselves. Microsoft said
the dataset is available to researchers free of charge.
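To make the idea concrete, here is a minimal sketch in Python of what one question-and-answer record and a naive "read the results" step might look like. Everything in it is an assumption made for illustration: the `Example` class, its field names, the `pick_passage` helper, and the sample question and passages are invented here and are not taken from the MS MARCO files or from Microsoft's systems. Real reading-comprehension models are far more sophisticated than word overlap.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical record: a question, candidate web passages, a verified answer."""
    question: str
    passages: list[str]
    answer: str


def tokenize(text: str) -> set[str]:
    # Lowercase the text and keep only alphanumeric word tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_passage(example: Example) -> str:
    # Naive baseline: return the passage sharing the most words with the question.
    question_words = tokenize(example.question)
    return max(example.passages, key=lambda p: len(question_words & tokenize(p)))


# Invented sample data, not drawn from the actual dataset.
example = Example(
    question="what is the capital of france",
    passages=[
        "Paris is the capital and most populous city of France.",
        "The Loire is the longest river in France.",
    ],
    answer="Paris",
)

best = pick_passage(example)
print(best)
```

A system trained on a dataset like this would aim to go one step further than the sketch and produce the answer itself rather than just the most relevant passage.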
The release of MS MARCO came at the end of a busy week on the AI
front for Microsoft. Last Monday, the company made headlines with the
announcement of a new fund for AI startups, which has already taken a startup called
Element AI under its wing. Element AI, which is based in Montreal, is working to build
commercial-grade AI systems and support the work of local startups
trying to apply neural networks in new fields.
Also last week, Microsoft announced a preview of the Cortana Skills
Kit and Devices SDK, which are designed for manufacturers that want to
integrate Cortana into various smart hardware devices, from cars to home
appliances.
With the Cortana Devices SDK, Microsoft is hoping
to take on Amazon.com, Inc.’s Alexa-powered Dot and Echo devices, and
also Google Inc.’s smart home speaker, Google Home. To do so, Microsoft is
collaborating with Harman Kardon, a brand under Harman International
Industries Inc., to create an Amazon Echo-like device that’s integrated
with Cortana’s AI capabilities.
Source:
http://siliconangle.com/blog/2016/12/18/microsoft-releases-ms-marco-dataset-train-ai-systems/