Report from SemTech 2008 & the SD Forum Semantic Web SIG
Carla already did a great writeup of the panel I was on yesterday, so I won't bother typing out my disjointed thoughts. However, I was at the SD Forum Semantic Web SIG panel last night (hosted at the Semantic Technology Conference) entitled "Will Semantics Give Web Search a Face Lift."
A common theme at this conference seems to be the semantics of "semantic," and I found the problem to be especially apparent on this panel. Oddly enough, there were no demos, but here's a summary of the panelists' opinions:
- Healthline (Dr. AJ Chen) - since Dr. Chen was moderating, his brief slides focused on what a semantic search solution looks like under the hood. Specifically, Healthline has health-specific ontologies that they're using for refinement, improved results, and improved ads. Though I think Healthline's solution is very interesting, this is pretty much the standard ontology implementation for vertical search companies. Semantic Web = vertical ontologies used in a vertical domain.
- Google (Dr. Fernando Pereira) - Dr. Pereira seemed to have been sufficiently brainwashed by Google's Borg culture. Though he admitted that Google is looking into next-generation technologies, his slides were a laundry list of why-not reasons: lexical information always lags behind real world usage, ontologies are messy, categorization schemes are difficult (impossible?) to create, annotations are noisy and potential spam channels. He cited utilitarian philosophy as Google's mantra: the greatest good for the greatest number of people. Semantic web = interesting ideas, but I don't care unless it shows a marked improvement in horizontal results and can scale immediately to the entire Web.
- Yahoo (Dr. Peter Mika) - this talk was the closest to the standard W3C version of the Semantic Web. Through Search Monkey, Yahoo is enabling content publishers to produce rich abstracts in their search results. The idea is to provide a better user experience in search results, allowing users to get from "do to done faster." Semantic Web = encouraging publishers to use standards and using them to improve search result listings.
- Hakia (Dr. Christian Hempelmann) - Dr. Hempelmann's slides began with a sigh about needing beer. Fair enough. His slides then talked about what he doesn't consider semantics: proximity, syntax, 80% solutions, small incremental improvements. In contrast, he argued that Hakia's extensive ontology was the only way to get to true semantics. Semantic Web = our way or the highway.
Wow! Four people on a panel, essentially talking past each other. On one end of the spectrum, Google seems to pooh-pooh semantics and NLP generally. On the other end, Hakia seems to be promoting a dogmatic view of what semantics is supposed to be. Yahoo sits somewhere in the middle, trying to use approved technologies to improve results incrementally. Healthline uses targeted ontologies to improve their vertical, but with little chance of improving Web search generally.
Luckily, Dr. Ron Kaplan, our beloved Chief Science Officer from Powerset, made some important comments. First, he noted that his (admittedly biased) version of the Semantic Web could be called the "Syntactic Web," i.e., that structural relationships are a necessary condition for finding meaning. He worried that some of the panelists were saying "Search is good enough. Semantics can't get us to God's Own search engine. Therefore, Semantics isn't worthwhile." Ron pointed out that if semantics can get us better search results, the real question is whether it's good enough for users to appreciate the difference without them rebelling because it isn't perfect. My favorite quote related to this point was: "We know that stupidity scales. That's not the problem. The question is: can we do better than that?" (thanks to Uldis for capturing this on Twitter)
My Powerset-influenced opinion is that either extreme is wrong; that a general solution over incremental vertical improvements is key to overhauling Web search; that being open is better than being dogmatic; that a "semantic" web search will include semantics, syntax, statistics, and god knows what else; and that none of the major players seems outwardly focused on this direction.
*Update* - Reposted on AltSearchEngines.
May I republish this post - with full attribution - on the ReadWriteWeb network blog AltSearchEngines.com?
Thanks,
Charles Knight, editor
Charles@ReadWriteWeb.com
Posted by: Charles Knight | May 21, 2008 at 12:02 PM
You know it's funny, the Google POV on all this as you've outlined above was brought up in almost every conversation I had today. He definitely ruffled a few feathers!
Posted by: Josh Dilworth | May 21, 2008 at 08:23 PM
Thanks for blogging this panel discussion. It's healthy to see different approaches to applying semantics to web search. Which one is more effective will ultimately be determined by users' experience. I'm all for semantic search. However, completely abandoning full-text search may not be wise, because some of the critical advantages of full-text search just cannot be replaced by a pure semantic approach. This is true for a vertical and especially for the whole web.
In my brief presentation, I highlighted the hybrid strategy: semantic search on top of full-text search. Improving on the current state of full-text search engines using semantics and NLP is probably a winning bet. Google or Yahoo search is the benchmark, whether you like it or not. If your new search engine cannot beat Google or Yahoo for your target audience, it's hard to call it a success.
Posted by: AJ Chen | May 22, 2008 at 10:25 AM