Recently there has been a lot of news on the Web about Watson, IBM’s natural language processing system. As you might have heard, Watson is currently challenging two of the best Jeopardy players in the US. Many news magazines compare Watson with Google, which is the reason for this article. Even though the algorithms behind Watson and Google are not open source, a lot of estimates and guesses can still be made about how both systems give intelligent answers to the questions people ask them. Based on these guesses I will explain the differences between Google and Watson.
Both systems have a lot in common (natural language processing, apparent intelligence, machine learning algorithms, …). Still, I will compare the intelligence behind Google and Watson to show how they differ and which limitations both systems still have.
Google is an information retrieval system. It has indexed a huge number of text documents and uses heavy machine learning and data mining algorithms to decide which document is most relevant for any given keyword or combination of keywords. To do so Google uses several techniques. The main concept when Google started was the calculation of PageRank and other graph algorithms that evaluate the trust and relevance of a given resource (which means the domain of a website). This is a huge difference from Watson. The same hypertext document hosted on two different domains will most probably end up with completely different Google rankings for the same keyword. This is quite interesting, because the information and data within the document are completely identical. So in order to decide which hypertext document is most relevant, Google does much more than study that particular document. Backlinks, neighborhood and context (and maybe some more?) are metrics that count in addition to formatting, term frequency and other internal factors.
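To make the PageRank idea a bit more concrete, here is a minimal sketch of the textbook power-iteration version. It is only an illustration and certainly not Google’s production ranking: the function name `pagerank`, the damping factor of 0.85, the fixed number of iterations and the tiny example link graph (with the made-up page names `hub1`, `doc_on_domain_a` and so on) are all my own assumptions. The example shows the effect described above: two copies of the same document end up with different scores just because different pages link to them.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative sketch only).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small base score (the "random jump" part).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini web: the same document hosted on two domains.
example_links = {
    "hub1": ["doc_on_domain_a"],
    "hub2": ["doc_on_domain_a"],
    "hub3": ["doc_on_domain_a", "doc_on_domain_b"],
    "doc_on_domain_a": [],
    "doc_on_domain_b": [],
}

if __name__ == "__main__":
    scores = pagerank(example_links)
    for page, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph the copy on domain A collects more backlinks than the copy on domain B and therefore gets the higher score, although the content of both copies is identical.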
Watson, on the other hand, doesn’t want to justify its answer by returning the text documents in which it found the evidence. Watson also doesn’t want to find the documents that are most suitable for a given keyword. For Watson the task is rather to understand the semantics behind a given key phrase or question. Once this is done, Watson uses its huge knowledge base to find the correct answer. I would guess that Watson uses a lot more artificial intelligence algorithms than Google, especially supervised learning as well as prediction and classification models. If anyone has evidence for these statements, I will be happy if you tell me!
An interesting fact worth mentioning is that both systems rely first of all on collective intelligence. Google does so by using the structure of the Web to calculate the trust of information. It also uses the set of all text documents to calculate synonyms and other things specific to the semantics of words. Watson uses collective intelligence as well. It is provided with a lot of information that human beings have published in books, on the Web or probably even in knowledge systems like ontologies. The systems also have in common that they use a huge amount of computing power and caching in order to provide their answers at a decent speed.
So is Google or Watson more intelligent?
Even though I think that Watson uses many more AI algorithms, the answer should clearly be Google. Watson is highly specialized for one certain task, which it solves amazingly accurately. But Google solves a much more universal problem. Also, Google has (as does IBM, of course) some of the best engineers in the world working for it. The Watson team might have worked for around 5 years with 40 people, whereas Google has been at it for more like 10 years and nowadays has over 20,000 employees.
I am excited to hear your opinion!