From Visual Developer Magazine #56, July/August 1999


Virtual Encyclopedia: The Long View



After a recent furious research session during which I added two hundred links to my Netscape bookmark.htm file in a single week, I realized that I now had close to three thousand links in the file, gathered and classified into a folder hierarchy that ran, in places, five levels deep. My frustration in dealing with Netscape's gimplegged bookmark support led me to begin work on my own bookmark manager, which I call Aardmarks. Now, having given it some thought, I realize that Aardmarks could become what I have long hoped for and called the Virtual Encyclopedia.

Long-time readers of this magazine will remember my articles encouraging content tagging systems and a "knowledge explorer" using a hierarchical classification scheme not unlike the Dewey Decimal System in a window, minus decimals. Content classification seems to be an acquired taste. I'm obsessed with it, but most people consider it a pointless nuisance, and the crackpots on the left condemn it as paternalistic oppression of innocent ideas or something. Bottom line: Nobody bothered.

So classification is best left to people, like me, who like to classify. Suppose I opened the Aardmarks design up, and allowed other users to contribute their lists of bookmarks, classified using the Knowledge Explorer hierarchy I've been tinkering with for almost five years now. I lay down thirty or forty bookmarks a week these days without even trying. It's become part of the way I use the Web. I read, I learn, I classify, I snag the bookmark. Get enough people doing that, and the list grows very quickly. Like debugging, classification scales very nicely. I won't pull a Tim O'Reilly faux pas and call it "open-sourcing," but the idea is plainly to harness the collaborative spirit of Open Source development. It's like those little butter dishes by the cash register at Samurai Sam's Teriyaki: Give a penny, take a penny. Give a link, take a link. Or fifty. Or a thousand.

The current Aardmarks design does more than just store bookmarks. It returns to the Web (in the background) and parses a bookmarked document for title, keywords, and description metadata, and—if I can figure out how to do it—a small graphic thumbnail of the page. All of this is stored in a database. The database will be freely downloadable, but in short order it'll have to be shipped on CD-ROM. (My bookmarks alone are, today, over 300KB in size.)
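The metadata pass described above might look something like the following. This is only a minimal sketch, assuming Python and its standard `html.parser` module rather than whatever Aardmarks itself is built with. The names `PageMetadataParser` and `extract_metadata` are hypothetical, and both the background fetch and the thumbnail are left out.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Hypothetical helper: collects <title> text and keywords/description <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = {name: (value or "") for name, value in attrs}
            meta_name = attr.get("name", "").lower()
            content = attr.get("content", "").strip()
            if meta_name == "keywords":
                # Keywords are conventionally comma-separated; drop empty entries.
                self.keywords = [k.strip() for k in content.split(",") if k.strip()]
            elif meta_name == "description":
                self.description = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in more than one chunk, so accumulate it.
        if self._in_title:
            self.title += data


def extract_metadata(html_text):
    """Return the title, keywords, and description found in one HTML document."""
    parser = PageMetadataParser()
    parser.feed(html_text)
    parser.close()
    return {
        "title": parser.title.strip(),
        "keywords": parser.keywords,
        "description": parser.description,
    }
```

Each dictionary returned by `extract_metadata` would become one database record, stored next to the bookmark's URL and its place in the folder hierarchy.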

But why stop there? DVD-ROM technology specs out at 4.7 GB to 17 GB capacity. 17 GB! That's a lot of bookmarks, guys. Or…I could set up an indexing robot, and create a keyword index of all the links in the Aardmarks database, and include the index right on the DVD. Sure, bookmarks degrade, but we've learned how to live with that. I could reissue the DVD on a quarterly basis…or a bimonthly basis…or even a monthly basis. It would be like Yahoo magazine without the ads and the blather, and I'd pay real money for that.
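A keyword index of the kind an indexing robot would produce is, at its simplest, an inverted index: a table from each word to the bookmarks whose text contains it. The sketch below assumes Python, and it assumes the page text has already been fetched and keyed by a numeric bookmark ID. The function names `build_keyword_index` and `search` are hypothetical, and a real index would need stop words, stemming, and an on-disk format.

```python
import re
from collections import defaultdict

WORD_PATTERN = re.compile(r"[a-z0-9]+")


def build_keyword_index(bookmarks):
    """Map each lowercase word to the sorted list of bookmark IDs containing it.

    `bookmarks` is assumed to be a dict of {bookmark_id: page_text}.
    """
    index = defaultdict(set)
    for bookmark_id, text in bookmarks.items():
        for word in WORD_PATTERN.findall(text.lower()):
            index[word].add(bookmark_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return the IDs of bookmarks that contain every word in the query."""
    words = WORD_PATTERN.findall(query.lower())
    if not words:
        return []
    matches = set(index.get(words[0], []))
    for word in words[1:]:
        matches &= set(index.get(word, []))
    return sorted(matches)
```

Because the index is built once and shipped on the disk, a search touches only the word table and never the pages themselves, which is why stale links do no harm to the lookup.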

Note well: It's unclear that I'm actually going to do all this. The point I want to make is that I'm designing the Aardmarks system so that if I chose to, I could. There will be no arbitrary limits. It starts out as a bookmarks manager that comes with a hierarchy of useful links to reference sites. It can grow to include a massive, collaboratively generated database of hundreds of thousands or even millions of links. Or, as our machines improve in speed and capacity, and with DVD-ROM to store it, Aardmarks can include the index itself and grow to be, in a sense, Alta Vista on a disk—minus all the porn sites, false hits and doctored rankings. I'm taking the long view on this one. 64-bit integers all the way around! Billions are wimp stuff these days. 2 to the 64th power bytes has no name—but Aardmarks will be ready for it when it happens.
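To put numbers on the 64-bit point: a signed 32-bit integer tops out at 2,147,483,647, so "billions" is precisely where it gives up, while 2 to the 64th power is 18,446,744,073,709,551,616, or 16 times 2 to the 60th. The sketch below assumes Python, whose own integers never overflow, so the fixed widths are shown with the standard `struct` module. The helper names are hypothetical, not part of any Aardmarks design.

```python
import struct

MAX_INT32 = 2**31 - 1    # 2,147,483,647: the ceiling for a signed 32-bit field
MAX_UINT64 = 2**64 - 1   # the ceiling for an unsigned 64-bit field


def fits_signed_32(value):
    """True if the value could be stored in a signed 32-bit integer."""
    return -2**31 <= value <= MAX_INT32


def pack_record_id(value):
    """Pack a record ID or byte offset as a little-endian unsigned 64-bit field."""
    return struct.pack("<Q", value)


def unpack_record_id(raw):
    """Inverse of pack_record_id."""
    return struct.unpack("<Q", raw)[0]
```

A 17 GB disc already makes the case: a byte offset near the end of it is far past `MAX_INT32`, but fits in the eight bytes that `pack_record_id` writes.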