Earlier today, I commented on a blog post that had wild speculative remarks about Powerset. The author sent me an e-mail back, noting that the piece was actually satirical. I would have left the comment even if I had noticed: what about other people who stumble upon the article and don't realize that it's satire? As a steward for Powerset's brand, I want to make sure that people know what's true and what's false.
But, I was reminded that the line between truth and fiction on the Web is very fuzzy. People's doctored profiles oftentimes bear little, if any, resemblance to their real-world referents. One might suggest that reference is foundational to truth, but then one would be adhering to a much-maligned philosophical tradition. There are also facts that are only true in some contexts. For example, in the Wikipedia article about Saddam Hussein in South Park, it's true that Saddam was killed by Satan, but we know that in the real world, Saddam was hanged. The "right" answer to "Who killed Saddam Hussein?" depends on what you're talking about, and both answers might be interesting, regardless of their truth value. Satire presents another problem. Though something might be written full of "incorrect" information, the piece overall might have a lot of meaning and value. Normative statements are especially hairy. The truth of "George Bush is an idiot" depends on what your definition of "idiot" is. Truth seems to be elusive.
Inbound links, anchor text, and hundreds of other signals might be useful in determining the popularity of a page, but it's not clear to a dumb marketing guy like me how to construct a signal for truth. The source isn't much of a help, because sites like Valleywag report truths, half-truths, and fantasy all under a single umbrella. A fact's frequency might be some kind of help, but what happens when a fictional story of a historical character gets more internet play than his real story?
Part of the problem might be our inability to give an adequate theory of truth, but the same might be said about popularity in a pre-Google world. Google helped to shape our version of popularity by giving popularity a formal definition, but only because that definition agreed with our common-sense notion of popularity. When you see bad search results, Google can't scold you and tell you that your idea of popularity is wrong according to its definition. Rather, it has to figure out how to conform its definition to that of the majority of people.
From that angle, maybe my worries are unfounded. People seem to have a good eye for differentiating fact from fiction, even with (or especially because of?) all of the information available on the Web. My selfish concern, of course, is to make sure that Powerset returns the "right" answers to questions, but the definition of "right" is the crux of the problem.
No answers here, just a bunch of questions. That's why I love philosophy!