In a WSJ article last week, Google announced plans to add semantic capabilities to its search algorithm. Google has prospered for many years by providing dozens of links and ads in response to any search request. Yet a top Google executive admitted that for many searches today, “we cross our fingers and hope there’s a web page out there with the answer.” Soon, Google will start blending “relevant results” into that long list of links, to get closer to what the user was actually searching for. Hooray!
Still, semantic search isn’t enough. The aim of the “semantic web,” pioneered about a dozen years ago, was to convert unstructured documents into a “web of data.” The approach was to connect terms into groupings of a subject, a predicate and an object, called “triples.” Each triple expresses a relationship that computers can ascertain and reason with. The problem with semantics, as Google surely knows by now, is that web data is still riddled with redundancy and ambiguity, and is therefore virtually impossible to reason with.
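To make the idea of a triple concrete, here is a minimal sketch in Python using the open-source rdflib library; the namespace and the drug-and-disease terms are purely illustrative. It also shows why raw triples are hard to reason with when the same concept shows up under two different names:

```python
from rdflib import Graph, Namespace

# Hypothetical namespace, for illustration only
EX = Namespace("http://example.org/")

g = Graph()
# One triple: subject, predicate, object
g.add((EX.aspirin, EX.treats, EX.headache))
# The same drug asserted under another name -- a redundancy the
# machine cannot resolve on its own
g.add((EX.acetylsalicylic_acid, EX.treats, EX.fever))

# A reasoner sees two unrelated subjects here, not one concept
for subject, predicate, obj in g.triples((None, EX.treats, None)):
    print(subject, predicate, obj)
```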
Code-N has joined the crusade to improve semantic capabilities by extending keywords into concepts. The Concept Web eliminates ambiguity, inconsistency and redundancy by employing an open-source thesaurus and assigning a unique universal ID to each term in a triple. Context and provenance metadata are then attached to create a mini-file of distilled knowledge, sometimes called a nano-publication. Synonyms can then be folded into each key term to form a “cardinal assertion.” Now here is where the magic happens: thousands of nano-publications can be linked together to form Concept Clouds, which can be searched easily to extract key insights to reason with. Voilà!
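To show roughly how those pieces fit together, the simplified Python sketch below uses made-up thesaurus entries and concept IDs (it is not Code-N’s actual implementation): synonyms are mapped to a universal ID, each disambiguated triple is wrapped with provenance into a nano-publication, and nano-publications that share a cardinal assertion are grouped into a tiny Concept Cloud.

```python
from collections import defaultdict

# Hypothetical thesaurus: every synonym maps to one universal concept ID
THESAURUS = {
    "aspirin": "CUI:0001",
    "acetylsalicylic acid": "CUI:0001",   # synonym folded into the same ID
    "headache": "CUI:0002",
    "cephalalgia": "CUI:0002",
    "treats": "REL:treats",
}

def canonical(term: str) -> str:
    """Replace a raw term with its unique universal ID."""
    return THESAURUS[term.lower()]

def nano_publication(subject, predicate, obj, source):
    """A mini-file of distilled knowledge: one disambiguated triple
    plus context/provenance metadata."""
    return {
        "assertion": (canonical(subject), canonical(predicate), canonical(obj)),
        "provenance": {"source": source},
    }

# Two differently worded statements collapse into one cardinal assertion
pubs = [
    nano_publication("Aspirin", "treats", "headache", source="paper-A"),
    nano_publication("acetylsalicylic acid", "treats", "cephalalgia", source="paper-B"),
]

# Linking nano-publications by their shared assertion forms a Concept Cloud
cloud = defaultdict(list)
for pub in pubs:
    cloud[pub["assertion"]].append(pub["provenance"]["source"])

for assertion, sources in cloud.items():
    print(assertion, "supported by", sources)
```

Searching such a cloud for a concept ID surfaces every source that backs the same cardinal assertion, which is the kind of distilled, reasoning-ready insight described above.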
Using Concept Clouds, Code-N will provide breakthrough business solutions that help multiple industries extract intelligence and reason with Big Data, starting with the pharma industry to speed up drug discovery and repurposing.