2006-12-28

"Wikipedia Distance"

I found the website I was talking about earlier; it's available here: http://www.omnipelagos.com. It seems very rough around the edges, and its Wikipedia content is very outdated. I might just make my own version of that site soon, if I find the time for it.

Wikipedia seems incredibly well-linked: everything seems to be within a distance of 4 or 5 links of everything else. With distances that uniform, the raw link count says very little, so measuring links qualitatively becomes important. A very simple first step is to treat the language-links as having higher "connectedness" than random linked words in sentences. The next step could be simple template-based recognition of common high-valued links such as "X is a Y", "X is a sort of Y" and so on. Doing more than this, though, very quickly becomes an academic exercise in implementing general AI.
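To make the idea a bit more concrete, here is a rough Python sketch of what a weighted "Wikipedia distance" might look like. Everything in it is my own assumption rather than how omnipelagos works: the cost values, the two link kinds ("is_a" and "plain"), the regex templates, and the function names are all made up for illustration. Links whose surrounding sentence matches an "X is a Y" style template get a low cost, every other link gets a higher one, and the distance is the cheapest path found by Dijkstra's algorithm.

```python
import heapq
import re

# Hypothetical costs: a lower cost means a "stronger" link.
LINK_COST = {"is_a": 1.0, "plain": 3.0}

# Templates for high-valued links; {x} and {y} are replaced by page titles.
IS_A_TEMPLATES = [
    r"\b{x} is an? {y}\b",
    r"\b{x} is a (?:sort|kind|type) of {y}\b",
]

def classify_link(source, target, sentence):
    """Return "is_a" if the sentence matches a template, otherwise "plain"."""
    for template in IS_A_TEMPLATES:
        pattern = template.format(x=re.escape(source), y=re.escape(target))
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            return "is_a"
    return "plain"

def build_graph(links):
    """links: iterable of (source, target, sentence containing the link)."""
    graph = {}
    for source, target, sentence in links:
        kind = classify_link(source, target, sentence)
        graph.setdefault(source, []).append((target, kind))
    return graph

def weighted_distance(graph, start, goal):
    """Cheapest total link cost from start to goal (Dijkstra), or None."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, page = heapq.heappop(queue)
        if page == goal:
            return cost
        if cost > best.get(page, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, kind in graph.get(page, []):
            new_cost = cost + LINK_COST[kind]
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None

# Made-up sample data, not real Wikipedia text.
links = [
    ("Cat", "Mammal", "A cat is a mammal."),
    ("Mammal", "Animal", "A mammal is an animal."),
    ("Cat", "Animal", "The cat is often kept with some other animal."),
]
graph = build_graph(links)
distance = weighted_distance(graph, "Cat", "Animal")
```

In this toy data the direct Cat-to-Animal link is a "plain" one, so the two-step path through Mammal ends up cheaper than the single hop. Other link kinds would simply get their own entry in the cost table.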

Oh, oh, this thing would be so much easier if Wikipedia started implementing semantic tags à la Semantic MediaWiki and ontoworld.
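As a small illustration of why that would help: Semantic MediaWiki lets authors type links as [[property::value]], so the kind of link could be read straight out of the wikitext instead of guessed from the sentence. The sketch below is only an assumption of how one might pull such annotations out with a regular expression; the function name and the sample text are made up, and a real parser would have to handle many more cases.

```python
import re

# Matches [[property::value]] and [[property::value|display text]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")

def extract_annotations(wikitext):
    """Return a list of (property, value) pairs found in the wikitext."""
    return [(prop.strip(), value.strip())
            for prop, value in ANNOTATION.findall(wikitext)]

# Made-up sample text; plain links like [[Europe]] are ignored.
sample = ("Berlin is the [[capital of::Germany]], a [[is a::City|city]] "
          "in [[Europe]].")
annotations = extract_annotations(sample)
```

Each extracted pair is already a typed link, so it could be turned directly into a link kind with its own weight.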