As previously reported, I was lucky enough to host a panel at Alternative Search Engine Day last Monday. I thought I’d type up some of my notes from the panel and talk about what I thought was most interesting.
For the demos, we got a sneak peek at TrueKnowledge, which is still in closed beta. TK currently houses over 106M facts and is constantly growing. TK does inference, which means that it can derive new facts from the knowledge it has cataloged and the knowledge it has about the world. For example, TK might not have an explicit fact for Queen Elizabeth II’s granddaughters, but it knows her children and their children and can derive the fact. Users can query the system in natural language and, when a fact isn’t returned, can add that knowledge to TK.
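To make the inference idea concrete, here is a minimal sketch of deriving a "granddaughter" fact from stored parent and gender facts. This is not how TrueKnowledge is implemented; the triple format, the relation names (`parent_of`, `is_a`), and the function names are all assumptions made up for illustration, and the fact set is a small hand-entered sample.

```python
# Toy fact store of (subject, relation, object) triples.
# The schema and relation names are hypothetical, not TrueKnowledge's.
FACTS = {
    ("Elizabeth II", "parent_of", "Charles"),
    ("Elizabeth II", "parent_of", "Anne"),
    ("Elizabeth II", "parent_of", "Andrew"),
    ("Elizabeth II", "parent_of", "Edward"),
    ("Charles", "parent_of", "William"),
    ("Charles", "parent_of", "Harry"),
    ("Anne", "parent_of", "Peter"),
    ("Anne", "parent_of", "Zara"),
    ("Andrew", "parent_of", "Beatrice"),
    ("Andrew", "parent_of", "Eugenie"),
    ("Edward", "parent_of", "Louise"),
    ("Edward", "parent_of", "James"),
    ("Zara", "is_a", "female"),
    ("Beatrice", "is_a", "female"),
    ("Eugenie", "is_a", "female"),
    ("Louise", "is_a", "female"),
}


def children_of(person, facts=FACTS):
    """Return everyone the store lists as a child of `person`."""
    return {obj for subj, rel, obj in facts if subj == person and rel == "parent_of"}


def is_female(person, facts=FACTS):
    """True only if the store explicitly says `person` is female."""
    return (person, "is_a", "female") in facts


def granddaughters_of(person, facts=FACTS):
    """Derive granddaughters by chaining two parent_of facts plus a gender fact."""
    return {
        grandchild
        for child in children_of(person, facts)
        for grandchild in children_of(child, facts)
        if is_female(grandchild, facts)
    }


print(sorted(granddaughters_of("Elizabeth II")))
# ['Beatrice', 'Eugenie', 'Louise', 'Zara']
```

Note that no triple in the store mentions granddaughters at all; the answer only exists because two stored relations were chained together.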
Truevert is a vertical search engine for the green market (a popular topic at Alt Search Engine Day) which is built on technology from OrcaTec. When a user types an ambiguous term like “palm” into Truevert, the results are centered around “palm oil,” not the PDA. OrcaTec claims that it can build a vertical from a small training set of a few thousand documents in just a few hours on a single machine. Their search is based on Yahoo BOSS.
I was especially excited to see TextDigger, which I’d never seen demoed live before. TD is currently in private beta with about 1000 users and a long waiting list. Launch is anticipated in Q3 of this year. TextDigger is meant to be a high-end research tool where users have a lot of power over the semantics of their search. TextDigger shows all of the possible senses of the words in a given query, and the user can select which sense they really meant, which filters the search results accordingly. I found this especially interesting because it’s a feature we always considered at Powerset, but ended up leaving out of the final product because of concerns about confusing users (and because we thought that the heavy lifting should be done by the engine, not the end user). I’m really excited to get an invite to the beta and I’m sure I’ll have a more detailed analysis once I get in and play around.
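The sense-selection interaction can be sketched in a few lines. This is only a toy under stated assumptions: the sense inventory, the `Result` type, and the idea that each result already carries a sense label are all hypothetical, and say nothing about how TextDigger actually assigns senses.

```python
from dataclasses import dataclass

# Hypothetical sense inventory; a real system would draw on a much larger lexicon.
SENSES = {
    "palm": ["palm (tree)", "palm (handheld device)", "palm (part of the hand)"],
}


@dataclass(frozen=True)
class Result:
    title: str
    sense: str  # assumed to be assigned upstream by some disambiguation step


def senses_for(query):
    """List the candidate senses for each ambiguous word in the query."""
    return {word: SENSES[word] for word in query.lower().split() if word in SENSES}


def filter_by_sense(results, chosen_sense):
    """Keep only the results tagged with the sense the user picked."""
    return [r for r in results if r.sense == chosen_sense]


results = [
    Result("Palm oil production", "palm (tree)"),
    Result("Palm PDA review", "palm (handheld device)"),
    Result("Growing palm trees indoors", "palm (tree)"),
]

print(senses_for("palm care"))
for r in filter_by_sense(results, "palm (tree)"):
    print(r.title)
```

The trade-off is visible even here: the filter itself is trivial, but the user has to read and choose among the senses, which is exactly the burden we worried about at Powerset.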
During the Q&A session, a number of interesting topics came up. TextDigger and Truevert both rely on results from a bigger search engine and rerank them based on semantic algorithms. TrueKnowledge has built its own index of facts, which is focused on structured information as opposed to free text.
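As a rough picture of the "fetch from a bigger engine, then rerank" pattern, here is a minimal sketch. It assumes the upstream results are already in hand as plain dictionaries, and it swaps in simple vocabulary overlap for the semantic algorithms the panelists described, whose details were not disclosed. The vocabulary, the field names, and the function names are all made up for illustration.

```python
# Hypothetical domain vocabulary for a "green" vertical.
GREEN_TERMS = {"oil", "plantation", "sustainable", "deforestation", "biofuel"}


def domain_score(snippet, vocabulary):
    """Count how many distinct vocabulary terms appear in the snippet."""
    words = {w.strip(".,;:!?").lower() for w in snippet.split()}
    return len(words & vocabulary)


def rerank(results, vocabulary):
    """Order results by domain score, highest first.

    sorted() is stable, so results with equal scores keep the
    upstream engine's original order.
    """
    return sorted(
        results,
        key=lambda r: domain_score(r["snippet"], vocabulary),
        reverse=True,
    )


# Stand-in for what a general-purpose engine might return for "palm".
upstream = [
    {"title": "Palm PDA review", "snippet": "A handheld organizer with a stylus."},
    {"title": "Palm oil and forests", "snippet": "Sustainable palm oil and deforestation concerns."},
    {"title": "Palm reading", "snippet": "Lines on the hand."},
]

for r in rerank(upstream, GREEN_TERMS):
    print(r["title"])
```

The point of the pattern is that the vertical never has to crawl or index the web itself; it only needs a way to score what the larger engine hands back.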
When the discussion turned to how “semantic search” should be defined, Dr. Musgrove of TextDigger delineated a difference between “soft” and “hard” semantic search (reminiscent of strong vs. weak AI and the hard vs. easy problem of consciousness). Better query suggestions, similar to what Google recently announced, count as soft, whereas actually trying to understand queries and read text is hard. The distinction still isn’t completely clear in my mind, though my instinct is that it would be useful to pin down. I’m going to noodle on that and see if I can come up with something.
Overall, great panel and I look forward to the upcoming Semantic Demo Session on 4/14 at Microsoft in Mountain View, Web 3.0, and the Semantic Technology Conference.