Category Archives: discovery

“The Internet is a Garbage Dump”

A strong article in John Dvorak’s typical style.

The Internet is full of broken links and accrued junk.

Let’s archive it and start over again.

The Internet is a giant garbage dump filled with abandoned images, blogs, Websites—abandoned everything. And no one cares enough to clean any of it up, hoping instead that it will magically fix itself after years of neglect and server shutdowns.

I have joined or tried out most of the online products and ideas that have sprung up since AOL first introduced a convoluted tool to let people design hokey pages, back in the 1990s. Most recently, I tried Posterous, one of the hottest up-and-coming sites in the country right now. Essentially you e-mail something to these folks and they post it—whatever it might be—on their servers and give you a URL that you can pass around. It’s pretty similar to sites like Drop.io, except for the e-mail gimmick.

I have no idea how backed up Posterous is, but the assertion that the site replies “instantly” after you send an e-mail could not be further from the truth—that is, unless you have a very liberal definition of the word “instant.” I tried the service using my private e-mail system. I gave up after getting no response for half an hour. I tried Gmail next. That took 20 minutes. Here’s the link, if you’re curious.

That photo is now on a server someplace, languishing, like most things on the Internet. I once joined Facebook under an assumed name and never bothered going back. It’s wasted junk that still exists on a server. I must have a half-dozen blogs that I’ve started and since forgotten about.

Yahoo did the right thing when it decided to shut down Geocities, close down servers, and take all of the junk offline. Of course some important sites were probably shuttered in the process, but thanks to all of the junk the service had accrued over the years, it was impossible to save them.

This brings up a parallel problem. People create canonical one-shot Websites and post them on various blogging platforms. They generally get very light traffic, but they may be referenced by a link someplace. So you’ll read something and run into a link to Thomas Jefferson’s unique formula for wine preservation. You click on the link, and the site has been taken down for one of any number of reasons.

I know some of my Blogger sites disappeared after Google bought the company. I lost a complete backup of all of my contact information when an “always free” Website went out of business. And I can’t access my Flickr photos ever since Yahoo bought the site. It’s one thing after another, and the end result is a collection of junk, missing pages, and dead ends. And all the while, sites like Posterous, Reddit, and Twitter come and go. Does anyone even use LiveJournal anymore?

The usefulness of the Internet—the Web in particular—has peaked, thanks to the limitations of search engines, a problem I’ve addressed before. Missing or moved pages, combined with an accumulation of crap dumped on the Internet for no particular reason, don’t bode well for the future. There’s no evidence that the junk accumulation and missing pages are going to stop any time soon.

So instead of just complaining, we need to start the clean up—in a way that works. Personal responsibility alone won’t do it. I think the cache of information should be archived in a closed Internet—an elaborate version of Archive.org’s Wayback Machine, only without the history. Just close the Internet as we know it today. Archive it and start over. Make the current Internet read-only, and search and study it, so it can be organized properly. Everything from now on can be fluid, but let’s start over from scratch. Now that would be an interesting solution.


The Implicit Web: an exploration

Here’s a presentation from USID 2008.

The Implicit Web–what it means for us

The implicit web is a fascinating and, of late, practicable idea. Here’s Brad Feld’s blog post:

I’ve been fascinated with the notion of the Implicit Web since I determined that I was tired of my computer (and the Internet in general) being stupid. I wanted it (my computer as well as the Internet) to pay attention to what I, and others, were doing. Theoretically “my compute infrastructure” should learn, automate repeated tasks (automatically), figure out what information I actually want, and make sure I get it when I want it.

Open data is the future of web discovery | VentureBeat

Twitter cofounders have talked about the importance of discovery in interviews and at conferences over the last several months. This week a new design for Twitter.com went live featuring top tweets and a search box to find more of what you want, but Twitter and many other web companies could improve discovery much more by incorporating other players’ data.

Also, a year and a half ago, Google vice president Marissa Mayer said that social search is part of the future of search. Now, the question is what data can help make social search and discovery advance faster.