<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>your browsing library</title>
	<atom:link href="http://hooeeywebprint.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://hooeeywebprint.wordpress.com</link>
	<description>build your web</description>
	<lastBuildDate>Wed, 25 May 2011 07:14:57 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='hooeeywebprint.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/d4b1a1fd754c8f8b88ae1b5d5c94aefd?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>your browsing library</title>
		<link>http://hooeeywebprint.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://hooeeywebprint.wordpress.com/osd.xml" title="your browsing library" />
	<atom:link rel='hub' href='http://hooeeywebprint.wordpress.com/?pushpress=hub'/>
		<item>
		<title>hooeey webprint 2.0 released!</title>
		<link>http://hooeeywebprint.wordpress.com/2011/05/25/hooeey-webprint-2-0-released/</link>
		<comments>http://hooeeywebprint.wordpress.com/2011/05/25/hooeey-webprint-2-0-released/#comments</comments>
		<pubDate>Wed, 25 May 2011 07:14:56 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=58</guid>
		<description><![CDATA[We&#8217;ve revamped hooeey webprint, the website as well as the subscription service&#8211;hooeey webprint Plus. The  new offerings are available through the main page (www.hooeeywebprint.com) from today. What has changed? hooeey webprint: A completely redesigned UI which is optimised for users &#8230; <a href="http://hooeeywebprint.wordpress.com/2011/05/25/hooeey-webprint-2-0-released/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=58&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve revamped hooeey webprint, the website as well as the subscription service&#8211;hooeey webprint Plus. The  new offerings are available through the main page (www.hooeeywebprint.com) from today.</p>
<p>What has changed?</p>
<p><strong>hooeey webprint:</strong> A completely redesigned UI which is optimised for users to interact with their browsed pages is just the beginning. Robust search, multiple filtering methods, easy navigation, faster upload to cloud services, etc. are some of the improvements that bear mention. More  details at <a href="http://www.hooeeywebprint.com/faq.html">&#8216;What&#8217;s new?&#8217;</a>  or better still <a href="http://www.hooeeywebprint.com/download.html">download</a> the new hooeey webprint and check it out yourself!</p>
<p><strong>hooeey webprint Plus:</strong> Sporting a fresh, light look, it&#8217;s now faster and more robust. All core functions such as search, page retrieval, topic creation, automatic meta-tag lists, etc. are more responsive. See a <a href="http://upload.hooeeywebprint.com/hwpplusdashboard">live demo</a> or better still <a href="http://www.hooeeywebprint.com/signup.htm">subscribe</a>!</p>
<p>We are confident that current hooeey webprint users will find the new version even more indispensable than the first, since we have incorporated many, many suggestions (and yes, complaint resolutions) into Version 2.0. Many thanks to our users for their feedback.</p>
<p>A big thank you to every team member for their unstinting efforts over the last 4 months in bringing out hooeey webprint Version 2.0.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2011/05/25/hooeey-webprint-2-0-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>Welcoming a million new users</title>
		<link>http://hooeeywebprint.wordpress.com/2011/02/03/welcoming-a-million-new-users/</link>
		<comments>http://hooeeywebprint.wordpress.com/2011/02/03/welcoming-a-million-new-users/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 12:14:42 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=54</guid>
		<description><![CDATA[hooeey webprint has emerged as the most downloaded application during a 3-week promotion run by Adobe in Dec&#8217;10-Jan&#8217;11. Thank you, all 1,207,783 of you. And thank you, Adobe. &#160;<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=54&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>hooeey webprint has emerged as the most downloaded application during a 3-week promotion run by <a href="http://www.adobe.com" target="_blank">Adobe</a> in Dec&#8217;10-Jan&#8217;11.</p>
<p>Thank you, all 1,207,783 of you. And thank you, Adobe.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2011/02/03/welcoming-a-million-new-users/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>&#8220;&#8230;web of infinite info&#8221;</title>
		<link>http://hooeeywebprint.wordpress.com/2010/12/30/web-of-infinite-info/</link>
		<comments>http://hooeeywebprint.wordpress.com/2010/12/30/web-of-infinite-info/#comments</comments>
		<pubDate>Thu, 30 Dec 2010 10:07:28 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=48</guid>
		<description><![CDATA[Twitter&#8217;s co-founder, Evan Williams ruminates about the ever-expanding nature of not just information, but rather, an explosion of relevant information. He says that the current generation of tools or &#8220;information managers&#8221;  will prove to be inadequate as there is a &#8230; <a href="http://hooeeywebprint.wordpress.com/2010/12/30/web-of-infinite-info/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=48&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Twitter&#8217;s co-founder, Evan Williams ruminates about the ever-expanding nature of not just information, but rather, an explosion of <em>relevant</em> information. He says that the current generation of tools or &#8220;information managers&#8221;  will prove to be inadequate as there is a limit to how much relevant information people can actually consume.</p>
<p>The rise of intelligent, &#8220;thinking&#8221; apps/machine/web is inevitable and is already happening on Facebook, Twitter and YouTube where the crowd data is used to direct a single user&#8217;s attention to the most relevant &#8216;relevant&#8217; data. <span id="more-48"></span>Read on for the <a title="Evan Williams GigaOm interview" href="http://gigaom.com/2010/12/29/evan-williams-on-web-of-infinite-information/" target="_blank">full interview</a>:</p>
<h1>Evan Williams: The Challenges of a Web of Infinite Info</h1>
<div id="post-meta-281281">By <a title="Posts by Om Malik" rel="nofollow" href="http://gigaom.com/author/om/">Om Malik</a> Dec. 29, 2010, 9:00am PDT</div>
<div>
<p>Evan Williams and I have known each other for a long time. From a  struggling entrepreneur who started Blogger, to a successful founder who  got liberal funding for his podcasting start-up Odeo, to the accidental  launch of Twitter — to me, he has been pretty much the same person. He  prefers to stay out of the limelight, leaving (most if not all the media  duties) to his co-founder Biz Stone. And even in crowds he is quiet.</p>
<p>But occasionally he speaks freely. A few weeks ago, he and I  discussed the future of the Internet, Twitter and the curse of too much  information. It was a long conversation, sometimes rambling, but quite  enjoyable. I have edited it down to make it a quick read for you folks.  As we enter 2011, Ev’s comments can help you understand what he calls  the web of infinite information.</p>
<p><strong>Om Malik: </strong><em>Ev, when you look at the web of today,  say compared to the days of Blogger, what do you see? You feel there is  just too much stuff on the web these days?</em></p>
<p><strong>Evan Williams:</strong> I totally agree. There’s too much  stuff. It seems to me that almost all tools we rely on to manage  information weren’t designed for a world of infinite info. They were  designed as if you could consume whatever was out there that you were  interested in.</p>
<p><strong>Om: </strong><em>A scaling problem? </em></p>
<p><strong>Ev</strong>:  It was true with browsing web and (that is  when) Google came in. There was too much to browse on the web. We are  thinking the same way about Twitter. Twitter itself isn’t designed for  this world of infinite information.  (But) I want Twitter to be an  antidote to infinite information, not a cause of it.</p>
<p>We can let people follow as many accounts as possible. We just need  to let them find the right stuff. We have been going in this direction.  It is just not necessarily obvious. For example, the <em>native retweet (RT)</em> is a way to share the best stuff more widely than that account’s followers.  It sort of adds an editorial layer. So do top tweets in search. Here’s  what people are saying most about right now. It brings up Twitter in  a different context. It is only possible when we have enough data.</p>
<p>OM: <em>Do you think that the future of the Internet will involve machines thinking on our behalf?</em></p>
<p><strong>Ev:</strong> Yes, they’ll have to. But it’s a combination of  machines and the crowd. Data collected from the crowd that is analyzed  by machines. For us, at least, that’s the future. Facebook is already  like that. YouTube is like that. Anything that has a lot of information  has to be like that. People are obsessed with social but it’s not really  “social.” It’s making better decisions because of decisions of other  people. It’s algorithms based on other people to help direct your  attention another way.</p>
<p><em>OM: If you were starting Twitter today – same service, but in a  world that is very mobile, very multi-touch driven and a very portable  web – what would it look like?</em></p>
<p><strong>Ev:</strong> I’d have to think about that for a while but i  don’t think it looks that different than what we have today. Twitter is a  natural fit for mobile – it has the immediacy. There is nothing  significantly missing, but (we) need to really boost relevancy. If you  can’t read everything, then (what is that) you really do need to know  right now. We are working on location because that’s a signal that will  help us tell you what’s interesting for you right now.</p>
<p><em>OM: There is a lot of talk about the web being dead. When you  look over next five years into the future what does the Internet look  like?</em></p>
<p><strong>Ev:</strong> I think there is misunderstanding of the whole  “web is dead” thing. What’s “dead” is the original model of the web,  which was completely distributed and decentralized. In the beginning, it  was like a million little islands, some of them were bigger islands. If  you create something on the web, you’re your own island and you try to  get people to visit your island.</p>
<p>Websites realized they couldn’t create everything themselves so they  started to import things — advertising, search, and more and more things  that were better created by someone else — especially things that had  network effects. Companies like DoubleClick or Google owned that whole  market. That’s been the case for quite some time. Biggest thing that no  one explored until recently was identity. Facebook was the first to be  successful in exploring identity. It is obvious why that was a big thing  (for them.) On the mobile phone, you don’t have your own island. You’re  renting land. It’s a good deal because there’s infrastructure provided  (like moving into full service condo).</p>
<p>[Today] there is a completely different pitch. ‘Do I build something  on here [iPhone] or on the web?’ There are various options for where to  rent. Facebook, Google App Engine — those things will continue to gain  traction because consolidation has powerful effects. Things get  consolidated because it is more economical and there are network effects in  all these things. The idea of creating something from scratch, which is  independent from the web… no one will ever create something that is  wholly their own.</p>
<p>There is some risk to the Internet becoming more closed (although  it’s not really about closed). It’s that there are fewer players who  own, sort of, the land. And that will have implications long term for  everything.</p>
<p>OM: <em>Do you have any views on the design and user experience over next few years?</em></p>
<p><strong>Ev: </strong>If you think about user interface (UI) paradigms  over the next few years, you have to think of the mobile handset. I  think most of the web still isn’t prepared for mobile in general –  especially when you look at content sites. There are apps — lots of apps  are great — but other than maybe video, there aren’t really great apps  for consuming content.</p>
<p>The way we’ve gotten used to consuming content on the web, it’s a lot  more broken than we realize because there’s so much stuff around it.  Big monitors, multiple tabs… we do that unconsciously now. All that  stuff won’t work on here [picks up his iPhone]. We need a different way  to navigate. People are doing interesting things, especially on the  iPad. I’m interested in all that stuff because they’re trying to figure  out a different way to consume web information and it’s pretty cool. I  don’t think they’re doing that on phones yet though.</p>
<p>OM: <em>How should the technology industry and entrepreneurs be thinking about the information consumption problem that is coming onto us?</em></p>
<p><strong>Ev:</strong> I think we need to design (our products) for a  world of infinite information. Gmail’s priority inbox is a great  example. They’re recognizing we may not read all our email. I don’t know  what the others would be. <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
We should also think about — for the good of society — how do we  actually help people? Google has always wanted people to come to Google  and then go away. They don’t want you hanging out on Google. That’s very  different than lots of other services that measure success by time on  site.</p>
<p>If you’re more of a utility — a site where you come in, get what you  want, then leave. We want to be that. It’s how do we deliver the most  value. Because info is infinite and there’s always somewhere else to go,  delivering more value in less time should always be the focus.</p>
<p><strong>OM: </strong><em>So what does a start-up or even Twitter take into account in this scary new future?</em></p>
<p><strong>Ev: </strong>It’s a really significant decision about what  platforms you’re building for. No one is going to limit themselves to  one platform, which is actually kind of annoying — costs go up because  you have to build for Android, iPhone, web, etc. It’s hard to decide. You  want to be everywhere.</p>
<p>OM: <em>So how do you think people should think about Twitter? Like electricity — you don’t even think about it; it’s just there?</em></p>
<p><strong>Ev:</strong> <em>[Laughs]</em> I would like people to know they’re using Twitter but they shouldn’t have to think about *how* to use Twitter.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2010/12/30/web-of-infinite-info/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>&#8220;The Internet is a Garbage Dump&#8221;</title>
		<link>http://hooeeywebprint.wordpress.com/2010/07/14/the-internet-is-a-garbage-dump/</link>
		<comments>http://hooeeywebprint.wordpress.com/2010/07/14/the-internet-is-a-garbage-dump/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 07:30:23 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[discovery]]></category>
		<category><![CDATA[innovation]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=39</guid>
		<description><![CDATA[Dvorak writes that the Internet is littered with abandoned blogs, web sites and the like. And this prevents search engines from working effectively since users often land up at pages that no longer exist. Archiving the web on a 'closed' Internet is one solution, says Dvorak. <a href="http://hooeeywebprint.wordpress.com/2010/07/14/the-internet-is-a-garbage-dump/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=39&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A <a href="http://www.pcmag.com/article2/0,2817,2366104,00.asp" target="_blank">strong article</a> in John Dvorak&#8217;s typical style.</p>
<p><strong>The Internet is full of broken links and accrued junk. </strong></p>
<p><strong>Let&#8217;s archive it and start over again. </strong></p>
<p>The Internet is a giant garbage dump filled with abandoned images,  blogs, Websites—abandoned everything. And no one cares enough to clean  any of it up, hoping instead that it will magically fix itself after  years of neglect and server shutdowns.</p>
<p>I have joined or tried out most of the online products and ideas that  have sprung up since AOL first introduced a convoluted tool to let  people design hokey pages, back in the 1990s. Most recently, I tried <a href="http://posterous.com/" target="_blank">Posterous</a>,  one of the hottest up-and-coming sites in the country right now.  Essentially you e-mail something to these folks and they post  it—whatever it might be—on their servers and give you a URL that you can  pass around. It&#8217;s pretty similar to sites like Drop.io, except for the  e-mail gimmick.</p>
<p>I have no idea how backed up Posterous is, but the assertion that the  site replies &#8220;instantly&#8221; after you send an e-mail could not be further  from the truth—that is, unless you have a very liberal definition of the  word &#8220;instant.&#8221; I tried the service using my private e-mail system. I  gave up after getting no response for half an hour. I tried Gmail next.  That took 20 minutes. Here&#8217;s the link, <a href="http://john-t5lgf.posterous.com/" target="_blank">if you&#8217;re curious</a>.</p>
<p>That photo is now on a server someplace, languishing, like most  things on the Internet. I once joined Facebook under an assumed name and  never bothered going back. It&#8217;s wasted junk that still exists on a  server. I must have a half-dozen blogs that I&#8217;ve started and since  forgotten about.</p>
<p>Yahoo did the right thing when it decided to shut down Geocities,  close down servers, and take all of the junk offline. Of course some  important sites were probably shuttered in the process, but thanks to  all of the junk the service had accrued over the years, it was  impossible to save them.</p>
<p>This brings up a parallel problem. People create canonical one-shot  Websites and post them on various blogging platforms. They generally get  very light traffic, but they may be referenced by a link someplace. So  you&#8217;ll read something and run into a link to Thomas Jefferson&#8217;s unique  formula for wine preservation. You click on the link, and the site has  been taken down for one of any number of reasons.</p>
<p>I know some of my Blogger sites disappeared after Google bought the  company. I lost a complete backup of all of my contact information when  an &#8220;always free&#8221; Website went out of business. And I can&#8217;t access my  Flickr photos ever since Yahoo bought the site. It&#8217;s one thing after  another, and the end result is a collection of junk, missing pages, and  dead ends. And all the while, sites like Posterous, Reddit, and Twitter  come and go. Does anyone even use LiveJournal anymore?</p>
<p>The usefulness of the Internet—the Web in particular—has peaked,  thanks to the limitations of search engines, a problem I&#8217;ve addressed  before. Missing or moved pages, combined with an accumulation of crap  dumped on the Internet for no particular reason, don&#8217;t bode well for the  future. There&#8217;s no evidence that the junk accumulation and missing  pages are going to stop any time soon.</p>
<p>So instead of just complaining, we need to start the clean up—in a  way that works. Personal responsibility alone won&#8217;t do it. I think the  cache of information should be archived in a  closed Internet—an  elaborate version of Archive.org&#8217;s <a href="http://www.archive.org/web/web.php" target="_blank">Wayback Machine</a>,  only without the history. Just close the Internet as we know it today.  Archive it and start over. Make the current Internet read-only, and  search and study it, so it can be organized properly. Everything from  now on can be fluid, but let&#8217;s start over from scratch. Now <em>that</em> would be an interesting solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2010/07/14/the-internet-is-a-garbage-dump/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>Designing re-search engines</title>
		<link>http://hooeeywebprint.wordpress.com/2010/04/15/designing-re-search-engines/</link>
		<comments>http://hooeeywebprint.wordpress.com/2010/04/15/designing-re-search-engines/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 11:21:10 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[innovation]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=32</guid>
		<description><![CDATA[Greg Linden writes about a topic which is hooeey webprint&#8217;s raison d&#8217;etre: finding stuff that one has searched for or has seen before. Greg quotes a 2010 paper by Microsoft Research: The most obvious way that a search tool can &#8230; <a href="http://hooeeywebprint.wordpress.com/2010/04/15/designing-re-search-engines/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=32&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Greg Linden writes about a <a href="http://glinden.blogspot.com/2010/03/designing-search-for-re-finding.html">topic</a> which is hooeey webprint&#8217;s <em>raison d&#8217;etre</em>: finding stuff that one has searched for or has seen before.</p>
<p>Greg quotes a <a href="http://people.csail.mit.edu/teevan/work/publications/papers/wsdm10.pdf">2010 paper</a> by Microsoft Research:</p>
<blockquote><p><em>The most obvious way that a search tool can improve the user  experience given the prevalence of re-finding is for the tool to  explicitly remember and expose that user&#8217;s search history.</em></p></blockquote>
<p>We believe that a tool that remembers not only the user&#8217;s search history but also the user&#8217;s web history can be more effective. An astonishing 40% of searches are in fact re-searches (attempts to find a page that the user has seen before), so there is a strong case for redesigning search engines or building specialized applications to better support re-finding.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2010/04/15/designing-re-search-engines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>Where does your data live?</title>
		<link>http://hooeeywebprint.wordpress.com/2010/02/18/where-does-your-data-live/</link>
		<comments>http://hooeeywebprint.wordpress.com/2010/02/18/where-does-your-data-live/#comments</comments>
		<pubDate>Thu, 18 Feb 2010 07:26:31 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[innovation]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=27</guid>
		<description><![CDATA[Here&#8217;s an interesting article from New Scientist about long-term personal data storage. The key idea is that while it will become easier and cheaper to store, well, an infinite amount of data, it&#8217;s important that better ways of organising, &#8230; <a href="http://hooeeywebprint.wordpress.com/2010/02/18/where-does-your-data-live/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=27&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s an <a href="http://www.newscientist.com/article/dn18512-innovation-we-cant-look-after-our-data--what-can.html" target="_blank">interesting article</a> from <a href="http://www.newscientist.com" target="_blank"><em>New Scientist</em></a> about long-term personal data storage. The key idea is that while it will become easier and cheaper to store, well, an infinite amount of data, it&#8217;s important that better ways of organising, retrieving and presenting that data be developed.</p>
<blockquote><p>&#8220;Last week <em>New Scientist</em> pondered the fragility  of digital data stores <a href="http://www.newscientist.com/article/mg20527451.300-digital-doomsday-the-end-of-knowledge.html">over  the very long term, in the event of a civilisation-wide calamity</a>.  But anyone worried about civilisation&#8217;s chances would do well to look to  their own data stores first.</p>
<p>Most of us today are blithely heading  for our own personal data disasters. We generate and store vast volumes  of information, but few of us really look after it.<span id="more-27"></span></p></blockquote>
<p>&#8220;Benign neglect&#8221; is how <a href="http://research.microsoft.com/en-us/people/cathymar/" target="ns">Cathy  Marshall</a> of Microsoft Research Silicon Valley in Mountain View,  California, describes the way most people treat their personal archives  of digital material. It&#8217;s a view formed by spending time with computer  users to find out how much people value their accumulated data, how they  try to protect it and whether they&#8217;ve succeeded.</p>
<h3>Infinite U-Store-It</h3>
<p>Most people adopt what she dubs &#8220;the  infinite U-Store-It&#8221; approach, accumulating data haphazardly on various  computers, gadgets, removable disks and online services. &#8220;If you&#8217;ve ever  looked inside a U-Store-It you&#8217;ll realise why this is a bad idea,&#8221; she  says. &#8220;People don&#8217;t realise what they have, they just save everything  and when they do clean up they don&#8217;t do it systematically.&#8221;</p>
<p>When asked, people typically say they  value their data a lot. But they lose it nonetheless, more from  disorganisation than from a technological catastrophe such as a hard  disk failure, Marshall has found. Data can fall prey to online services  or ISPs closing accounts or changing their policies, logins being lost,  or simply forgetting what and where we have in physical or virtual  space.</p>
<p>Web services – &#8220;cloud&#8221; computing – are  becoming the home for much of our data: for example, people often store  their photos on <a href="http://www.flickr.com/" target="ns">Flickr</a> or business contacts on <a href="http://www.linkedin.com/" target="ns">LinkedIn</a>.  Giving stewardship of our data to a third party in the cloud could be a  way to keep it safe from both disaster and disorganisation.</p>
<h3>Night-light storage</h3>
<p>For example, computer scientists led  by <a href="http://users.soe.ucsc.edu/%7Eelm/" target="ns">Ethan Miller</a> at the University of California, Santa Cruz, are developing hardware  for storage services designed to look after data that you have yet to  create.</p>
<p>Their plan, dubbed <a href="http://www.ssrc.ucsc.edu/proj/archive.html" target="ns">Pergamum</a>,  is to use low-power storage &#8220;bricks&#8221; that can each make 1 terabyte of  data available instantly over the web while using just 2 watts of power –  roughly the same as a pair of computer speakers.</p>
<p>The bricks contain digital storage and  processors to manage that store and coordinate with other bricks. They  can be connected together to make as large a store as is necessary with  very little effort, and are designed to prevent future obsolescence:  they connect using standard network switches to allow today&#8217;s bricks,  which are built around hard disks, to work smoothly with tomorrow&#8217;s  flash-based bricks, or those containing storage formats as yet unknown.</p>
<h3>Memory lane</h3>
<p>But as well as developing cheaper,  more cavernous digital U-Store-Its, we need help to explore, organise  and rediscover forgotten, perhaps decades-old data.</p>
<p>Software developed at the library of Stanford University, California, to record stories of pioneers of early computing suggests how this might be done. The <a href="http://stanfordluminaryarchives.googlepages.com/salt" target="ns">Self Archiving Legacy Toolkit</a> can recognise places, names and other organising concepts in a person&#8217;s digital &#8220;papers&#8221;, such as emails, letters and research reports. It then creates a branching &#8220;mind map&#8221; linking items by people, places or ideas that they have in common, forming an interactive digest of a person&#8217;s life.</p>
<p>Such a tool could be of use to any of  us now that diverse, disorganised digital archives are becoming the  norm.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2010/02/18/where-does-your-data-live/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>The Implicit Web-an exploration</title>
		<link>http://hooeeywebprint.wordpress.com/2010/02/16/the-implicit-web-an-exploration/</link>
		<comments>http://hooeeywebprint.wordpress.com/2010/02/16/the-implicit-web-an-exploration/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 09:11:48 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[discovery]]></category>
		<category><![CDATA[innovation]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=23</guid>
		<description><![CDATA[Here&#8217;s a presentation from USID 2008.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=23&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a presentation from <a href="http://hci-hyderabad.org/usid2008/index.htm" target="_blank">USID 2008</a>.</p>
<iframe src='http://www.slideshare.net/slideshow/embed_code/3192176' width='500' height='410'></iframe>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2010/02/16/the-implicit-web-an-exploration/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>The Implicit Web&#8211;what it means for us</title>
		<link>http://hooeeywebprint.wordpress.com/2009/12/16/the-implicit-web-what-it-means-for-us/</link>
		<comments>http://hooeeywebprint.wordpress.com/2009/12/16/the-implicit-web-what-it-means-for-us/#comments</comments>
		<pubDate>Wed, 16 Dec 2009 11:26:33 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[discovery]]></category>
		<category><![CDATA[innovation]]></category>
		<category><![CDATA[vc]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=18</guid>
		<description><![CDATA[The implicit web is a fascinating and, of late, practicable idea. Here&#8217;s Brad Feld&#8217;s blog post: I’ve been fascinated with the notion of the Implicit Web since I determined that I was tired of my computer (and the &#8230; <a href="http://hooeeywebprint.wordpress.com/2009/12/16/the-implicit-web-what-it-means-for-us/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=18&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The implicit web is a fascinating and, of late, practicable idea. Here&#8217;s Brad Feld&#8217;s <a href="http://www.feld.com/wp/archives/2009/04/the-maturing-of-the-implicit-web.html" target="_blank">blog post</a>:</p>
<blockquote><p>I’ve been fascinated with the notion of the <em><a href="http://www.foundrygroup.com/blog/archives/2008/03/theme-implicit-web.php">Implicit Web</a></em> since I determined that I was tired of my computer (and the Internet in general) being stupid.  I wanted it (my computer as well as the Internet) to pay attention to what I, and others, were doing.  Theoretically “my compute infrastructure” should learn, automate repeated tasks (automatically), figure out what information I actually want, and make sure I get it when I want it.<span id="more-18"></span></p></blockquote>
<p>In 20 years, I expect we will snicker at the idea of having to go search for information by typing a few words into a text box on the screen.  It’s way better than 20 years ago, but when you step back and think about it, it’s pretty lame.  I mean, I’ve got this incredible computer on my desk, a gazillion servers in the cloud, this awesome social network, yet I find myself typing the same stuff into little boxes over and over again.  Ok – it’s all pretty incredible given that it wasn’t so long ago that people had to rub sticks together to get fire, but can’t it be amazing and lame at the same time?</p>
<p>Several companies that I’ve got a personal investment in that play in and around the implicit web recently came out with new releases that I’m pretty excited about; each addresses different problems, but does it in elegant and clever ways.</p>
<p>The first – <a href="http://www.oneriot.com/">OneRiot</a> – came out with a new twist on <a href="http://twitter.oneriot.com/">using Twitter for search</a>. OneRiot’s goal is to provide a search engine for the real-time web. To that end, they’ve historically gotten their data on what people are looking at from a collection of browser-based sensors (anonymous, opt-in only). They’ve built a unique search infrastructure that takes into account a variety of factors, including the number of people on a specific URL in a particular time period, freshness of the content, and typical content weighting algorithms. A little while ago they realized that people were tweeting a huge number of URLs, mostly via URL shorteners (which are loathed by some very smart people). <a href="http://search.twitter.com/">Twitter search</a> addresses keywords in the tweet, but it doesn’t do anything with the URLs, especially the shortened ones. So, OneRiot built a pre-processor that grabs tweets from Twitter’s API that include a URL, tosses the shortened URL into OneRiot’s search corpus (which expands the URL and indexes the full page text), and then references it back to the original tweet. It also correlates all tweets with the same URL (including re-tweets) across any URL shortener service. Now, imagine incorporating any URL data that’s real time that has an API, such as Digg. Aha! It’s alpha so forgive it if it breaks – but <a href="http://twitter.oneriot.com/">give it a try</a>.</p>
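<p>To make the pre-processing step concrete, here is a minimal sketch of the idea, not OneRiot's actual code. It assumes tweets arrive as plain strings, and it replaces real redirect resolution with a hypothetical lookup table (<code>SHORT_TO_LONG</code>); the function names <code>expand</code> and <code>correlate</code> are ours, chosen for illustration.</p>

```python
import re
from collections import defaultdict

# Matches http/https URLs up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")

# Hypothetical stand-in for URL expansion. A real system would follow
# HTTP redirects; this table only exists to keep the sketch offline.
SHORT_TO_LONG = {
    "http://bit.ly/abc": "http://example.com/story",
    "http://tinyurl.com/xyz": "http://example.com/story",
}


def expand(url):
    """Return the full URL for a known short URL, else the URL unchanged."""
    return SHORT_TO_LONG.get(url, url)


def correlate(tweets):
    """Group tweet texts by the expanded URL each one mentions."""
    by_url = defaultdict(list)
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            by_url[expand(url)].append(text)
    return dict(by_url)


tweets = [
    "Great read http://bit.ly/abc",
    "RT Great read http://tinyurl.com/xyz",
    "No link here",
]
groups = correlate(tweets)
```

<p>In this sketch both the original tweet and the re-tweet end up under the same expanded URL even though they used different shorteners, which is the correlation the paragraph above describes. Indexing the full page text is left out.</p>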
<p>The second – <a href="http://getglue.com/">AdaptiveBlue</a> – has released their <a href="http://getglue.com/42/brief.php">newest version of Glue</a>. Glue is a contextual network that uses semantic technology to automatically connect people around everyday things such as books, music, movies, stars, artists, stocks, wine, and restaurants. It uses a browser-based plugin to build this contextual network implicitly. When you are on a site such as Amazon, Last.fm, Netflix, Yahoo! Finance, Wine.com, or Citysearch, the Glue bar automatically appears when it recognizes an appropriate object, categorizes it, and lets you take specific action on it if you want. Glue has been evolving nicely and now includes the idea of connected conversations between friends (e.g. talk about whatever you like regardless of the site you are visiting), smart recommendations (e.g. implicit recommendations), and web-wide top lists of the aggregated activity of all Glue users.</p>
<p>In addition, we’ve finally found a company that we think is attacking a wide swath of the problem of the Implicit Web the correct way, at least given today’s technology. We hope to close the investment and start talking publicly about it early next month.</p>
<p>For now, I expect the applications around the Implicit Web to continue to fall into the early adopter / you need to see it to believe it category (where it’s harder to explain than just to show). In the near term, if you are interested in this area, try out <a href="http://twitter.oneriot.com/">OneRiot</a> and <a href="http://getglue.com/">Glue</a> – they are both evolving and maturing very nicely.</p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2009/12/16/the-implicit-web-what-it-means-for-us/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
		<item>
		<title>Open data is the future of web discovery &#124; VentureBeat</title>
		<link>http://hooeeywebprint.wordpress.com/2009/12/14/open-data-is-the-future-of-web-discovery-venturebeat/</link>
		<comments>http://hooeeywebprint.wordpress.com/2009/12/14/open-data-is-the-future-of-web-discovery-venturebeat/#comments</comments>
		<pubDate>Mon, 14 Dec 2009 11:09:56 +0000</pubDate>
		<dc:creator>hooeeywebprint</dc:creator>
				<category><![CDATA[discovery]]></category>

		<guid isPermaLink="false">http://hooeeywebprint.wordpress.com/?p=4</guid>
		<description><![CDATA[Twitter cofounders have talked about the importance of discovery in interviews and at conferences over the last several months. This week a new design for Twitter.com went live featuring top tweets and a search box to find more of what &#8230; <a href="http://hooeeywebprint.wordpress.com/2009/12/14/open-data-is-the-future-of-web-discovery-venturebeat/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hooeeywebprint.wordpress.com&amp;blog=10843279&amp;post=4&amp;subd=hooeeywebprint&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<blockquote>
<div>
<p>Twitter cofounders have talked about the importance of discovery in interviews and at conferences over the last several months. This week a new design for <a href="http://www.twitter.com">Twitter.com</a> went live featuring top tweets and a search box to find more of what you want, but Twitter and many other web companies could improve discovery much more by incorporating other players’ data.</p>
<p>Also, a year and a half ago, <a href="http://www.google.com">Google</a> vice president <a href="http://venturebeat.com/2008/01/31/googles-marissa-mayer-social-search-is-the-future/">Marissa Mayer said that social search is part of the future of search</a>. Now, the question is what data can help make social search and discovery advance faster.<span id="more-4"></span></p>
<p>Think about the data that represents everything you do online, including web visits, searches, ads clicked, purchases, time spent, location, etc. Web products like the Google browser toolbar return data to Google about the websites you visit. Browsers like Chrome, Firefox and Internet Explorer can get even more data about what you do. For this piece, I’m referring to all this toolbar, browser, search and email data as “toolbar data” for short.</p>
<p>What you typically discover on Twitter and <a href="http://www.facebook.com">Facebook</a> is limited to your connections and what you search. More, better data is needed to learn about what you’re missing. You might have a lot of interests – sports, music, technology, books, movies, TV, food, travel, etc. – and things happen around you and around the web related to them that you probably want to know about. Surprise concert by your favorite band tomorrow night? New travel website? Cutting edge phone being released? We don’t even know how much we’re missing until we see it.</p>
<p>With more data, developers could build services or apps with toolbar data to see what’s hot now, this week, month or year for anything, broken down by age, location and more. One app might focus on the most popular content about travel to Asia based on unique visitors to specific web pages and the number of links shared by email or social networks. Another app might cover the most engaging communities online based on growth in time on particular parts of each website compared to peers. The data could look at user session activity across sites and specific content on web pages. In contrast, Google Hot Trends reports only on search terms, and free data from analytics services like <a href="http://www.compete.com">Compete</a> typically reports only by website unless you pay for page-level reports. Entrepreneurs could use the toolbar data to identify unmet needs, then build products and services to meet them. Without a more complete picture of the data, it’s hard for entrepreneurs to know what users really want.</p>
<p>Twitter cofounder Biz Stone recently said that <a href="http://www.techcrunch.com/2009/07/23/biz-stone-talks-twitter-at-fortune-brainstorm/">ranking the authority of tweets is needed to surface the important tweets</a>. Already the Twitter feed can quickly become overrun with fresh content, causing you to miss tweets you may find interesting. Users not on the site all day could benefit from a summary of the best tweets before seeing the real-time stream while on the site. Surprisingly, I haven’t seen a developer using Twitter APIs solve this problem, unless you count Twitter search companies that filter tweets in search results. But some third-party developers for Twitter tell me they could benefit from more data about what content each user cares about, such as the number of impressions and clicks on links by which users, as well as time spent on different pages and locations of the users. At least part of that data is held by Twitter, Twitter clients and URL shorteners.</p>
<p><strong>The data people want access to</strong></p>
<p>With users sharing more and more on Twitter and Facebook, there are billions of statuses/tweets projected to be sent each year. To try to capture this opportunity, dozens of Twitter search start-ups have sprung up over the last year. Big companies like <a href="http://www.yahoo.com">Yahoo</a> and <a href="http://www.microsoft.com">Microsoft</a> have released Application Programming Interfaces (APIs) that let developers access their web index and rerank results. But these services can be limiting. Google APIs, for example, do not permit developers to rerank results yet.</p>
<p>The hundreds of millions of people using Google prove that web content that’s not all real-time is, of course, still very useful. The next step after continuing to improve real-time search is for Twitter search providers to figure out how to do relevance ranking over different amounts of time. Beyond searching Twitter data, there might be an opportunity to search the <a href="http://www.facebook.com/press/info.php?statistics">4 billion pieces of content</a> (web links, news stories, blog posts, notes, photos, etc.) shared on Facebook each month, with degree of access depending on how users transition their privacy settings using its upcoming <a href="http://blog.facebook.com/blog.php?post=101470352130">privacy transition tool</a> and whether <a href="http://www.insidefacebook.com/2009/07/29/apis-critical-to-facebooks-plans-to-dominate-real-time-search/">Facebook makes a search API </a>available like Twitter.</p>
<p>The availability of tweets/status messages to find using real-time search is limited by when and how much people choose to share. Sometimes there’s a delay between when an event happens and when people start to share about it online, but first they’re searching or clicking on content they want to share. Google benefits from seeing what people are searching and browsing around the web in part thanks to the Google Toolbar, Google Chrome, and other data, too, like Gmail and search history. For now, at least, Google has a data advantage and so might be better than most for telling you what’s new around you. That said, this data gap isn’t insurmountable for Facebook and Twitter. Either could eventually acquire toolbar-style reach by building large enough businesses around search or creating services that make users want to share their full data. They could also partner with a company like Yahoo or Microsoft to get the data.</p>
<p>Then the data disadvantage could go away and actually flip to Facebook’s advantage, in particular because of Facebook’s massive and growing user base, network effects and all its data not indexable by Google. Google creating a social graph as popular as Facebook’s is actually less likely than Facebook acquiring toolbar or equivalent data. If Facebook eventually acquires the data, makes the data available to developers and helps commoditize search (assuming that data can be used to replicate and/or improve upon Google’s advances), then <a href="http://www.wired.com/techbiz/it/magazine/17-07/ff_facebookwall">where would that leave Google</a>?</p>
<p>About the Google data advantage, someone close to Facebook tells me: “The framework I usually use for a discussion like this is ‘explicit’ vs. ‘implicit.’ One real danger here is that a lot of the data you will get from implicit browsing is searching and looking without success. While you can help narrow and get a sense of someone’s interests, you can’t necessarily predict the next best matches. The reason that Tweets/Facebook (and even PageRank) are interesting is that they try to actually take an explicit affirmative action such as a share or hyperlink and create useful data and relationships on top of the core data – and especially from real people such as your friends or influencers.”</p>
<p>I’m told Google has a significantly higher number of toolbar users than Yahoo and Microsoft, but Microsoft Internet Explorer still has the largest browser install base. The number of active Google toolbar users is a closely held secret, and I’m told a considerable number of users turn off the tracking feature, but that the number of people with tracking on is large enough to get a picture of what’s happening on most sites. Of course the exact numbers have a big impact on how much data these companies are actually collecting from users. Google made available <a href="http://www.google.com/trends/hottrends">Google Hot Trends</a> a while ago to show trending searches for any given day and clicking a search term shows web and news results, but it’s relatively uninformative compared to all the data Google knows about what’s happening on the web from constantly crawling the web and data coming from services like Chrome and the toolbar. Compared to a Facebook or Twitter feed of content like news stories and status messages, a list of links to popular search terms not necessarily related to your interests just isn’t that interesting. Google Trends for Websites at least shows websites also visited, but it doesn’t show trends for specific web pages.</p>
<p>Already Google indexes the web in near real-time, and some former Googlers say the company could easily have a real-time view into most of the web already. Imagine if Google made it easier to discover all the data it already knows about specific web pages or content. Think of all the different kinds of content out there, and get ready for a long list of possibilities. In terms of content, there are trending web pages and websites, products, news, photos, tweets, status messages, comments, blog posts, books, games, searches or any other web content. You could then potentially see trends for each piece of content, organized by types of data like unique visitors, type-in, upstream and downstream traffic, referrals, purchases, link sharing, clicks, demographics, user profiles, location and proximity to you, audience also searches for/likes/visits, traffic frequency, daily/weekly/monthly unique users/page views/visits, business activity, time spent, etc.</p>
<p>Imagine if you had control over filtering the content you discover using all the types listed above such as trending purchases or time on site. Think of it as data you see on <a href="http://www.comscore.com">comScore</a>, <a href="http://www.quantcast.com">Quantcast</a>, Compete, etc. but a level deeper because it’s about individual pages updated in real-time with the sorts of filtering options above. Google could share what’s happening around you without waiting for users to set more status messages or type searches into the search box. The result could be a feed of what’s happening now or what’s happened ranked over any stretch with most anything online for most anyone around the world. This map of human activity could encompass most of the activity on Twitter and give a clearer picture about what’s done all around the web rather than just the subset of topics that tend to be most represented on Twitter.</p>
<p><strong>How does this apply to everyday life?</strong></p>
<p>What might the pairing of real-time with more traditional search data look like? One area is discovering trends. Maybe you like tech, so you look up the ten most popular web pages (news, start-ups, Twitter accounts, etc.) in the technology or consumer internet category that were viewed or shared by people in Silicon Valley over the last day. Maybe you’re thinking about visiting New York City and want to find the hottest restaurants, so you look up the top trending <a href="http://www.yelp.com">Yelp</a> restaurant pages visited, shared or commented on over the last couple of months by people living there. Maybe you have a favorite blog or Twitter account you follow, but you want to discover content that people like you are checking out, so you look at the top 10 trending web pages viewed over the last few weeks by people who also frequently visit the blog or Twitter account. Maybe you want to find something to do for fun, so you look at the games with the fastest growth in time on site over the last few months. Developers could make any number of such combinations available to users in forms that are easy to use or that offer customization options.</p>
<p>An engineer who has been focusing on search tells me that academics often want search log files because they want all the information to derive their own statistics. This requires rigor and sophisticated techniques the researchers like to use, but the downside is that this data can be noisy. For developers, he said the search provider could aggregate the data into summaries and make those available instead of the raw data. This is not only more usable/efficient, but can protect users’ privacy better than raw data because the data is presented as summaries rather than web history tied to an individual that can still expose identity. If interesting data that’s easily accessible is what’s driving all the developer activity around the Twitter API, then how many more multiples of the developer activity might occur given toolbar data that covers significantly more activity around the web in addition to tweets, which are a subset of what people care to talk about publicly?</p>
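<p>As a rough illustration of the summaries-instead-of-raw-data idea, the sketch below turns raw visit events into per-URL unique-visitor counts and drops any URL seen by too few distinct users. The event format, the function name <code>summarize_visits</code> and the <code>min_users</code> threshold are all assumptions made for this example; the engineer did not describe a specific method, and a real provider would need stronger privacy safeguards than a simple cutoff.</p>

```python
from collections import defaultdict


def summarize_visits(events, min_users=3):
    """Aggregate raw (user_id, url) events into per-URL unique-visitor counts.

    URLs seen by fewer than `min_users` distinct users are dropped, so rare
    and potentially identifying pages never leave the data provider.
    The threshold is a hypothetical safeguard, not a documented policy.
    """
    users_by_url = defaultdict(set)
    for user_id, url in events:
        users_by_url[url].add(user_id)
    return {
        url: len(users)
        for url, users in users_by_url.items()
        if len(users) >= min_users
    }


# Made-up events: three users read the same story, one visits a rare page.
events = [
    ("u1", "http://example.com/news"),
    ("u2", "http://example.com/news"),
    ("u3", "http://example.com/news"),
    ("u1", "http://example.com/news"),
    ("u2", "http://example.com/alice-smith-profile"),
]
summary = summarize_visits(events, min_users=3)
print(summary)
```

<p>Only the widely visited story survives in the summary, and repeat visits by the same user count once. The page seen by a single user is withheld, which is the kind of web history that could otherwise be tied back to an individual.</p>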
<p><strong>What do people in the industry think about the potential of a toolbar data API?</strong></p>
<p><a href="http://www.avc.com">Fred Wilson</a>, the VC at <a href="http://www.unionsquareventures.com">Union Square Ventures</a> who has backed companies including Twitter and comScore, said this about a possible toolbar data API: “It would be great. It’s not likely to come from the analytics companies because they sell their data. Quantcast seemed to be headed in an advertising direction which would have made this approach workable for them. But lately, it seems they are looking more and more like Compete and comScore.”</p>
<p>Othman Laraki, a former Googler who worked on Google Toolbar and is now cofounder and President of <a href="http://www.townme.com">TownMe.com</a>, said: “It’s definitely an interesting area and something that could prove to be an incredibly valuable resource for people developing new services. Particularly, now that both search engines as well as other services are increasingly focused on surfacing real-time information, offering an API that makes it possible to shorten the feedback loop could be a game-changer. Whereas it has become easier to analyze traffic after the fact (i.e. when a service addresses a need, one can more easily understand why), what is still difficult is discovering the untapped opportunities (i.e. when consumers have a need that is not yet satisfied). Moreover, opening up the lower-level data would likely bring a great deal of creativity to the game. An example that comes to mind is Twitter’s API. Twitter’s openness has enabled numerous interesting applications that in turn have made the data more valuable in the first place.”</p>
<p><a href="http://www.venturegeneratedcontent.com">Satya Patel</a>, a former Googler and now a principal at <a href="http://www.battery.com">Battery Ventures</a>, said: “I just don’t see any company that has toolbar data making that data accessible to others. It’s too valuable and too much of a privacy concern. If you think about it, the data that Google has from the Google Toolbar and Google Analytics is incredible and basically can create a real-time map of the web. There are all kinds of interesting applications for this data but I just don’t see any benefit to Google and others to opening up this data. Twitter is different because it has already conceded some of this data, to Bit.ly for example. There are many sources of web usage data so it will be interesting to see how this market evolves and who really creates value for consumers or businesses based on this data.”</p>
<p>A former long-time Googler said: “Toolbar data is super sensitive at Google. I think it would be unlikely that Google would ever share this data. Google doesn’t want additional monetization at the expense of risking user trust/privacy on toolbar data. Larry is rightfully very concerned about defending the user’s privacy (and for Google it is economically advantageous to do so). I think if sharing toolbar data were ever to happen it would be very far in the future and likely only because something about the infrastructure of the Internet changed so dramatically that all of the browsing habits of users became transparent or otherwise generally visible.”</p>
<p>Konrad Feldman, chief executive of Quantcast, said: “Many people have toolbars for discovery, such as StumbleUpon. I think the static data is kind of interesting, but of course real-time trending coupled with collaborative filtering is where you really get the ‘aha’ moment for discovery.” He also noted the company recently launched a <a href="http://webreprints.djreprints.com/2227170150898.html">new media program</a> that lets advertisers make a profile of the type of users they want to reach, and then match this target group in real-time based on the 6 billion real-time media consumption events the company observes every day.</p>
<p>Saar Gur, partner at <a href="http://www.crv.com">Charles River Ventures</a>, which is a Twitter investor, said: “A number of valuable discovery services (measured by CTR) have been built leveraging cookies (e.g., online advertising, personalization and analytic services like Quantcast).  A number of valuable discovery services have been built leveraging social data (e.g., Facebook, Twitter, Yelp, etc.).  That being said, for a number of reasons developers have been unable to leverage the much richer data set that is captured by browsers and toolbars.  With all the attention around bit.ly or ShareThis/AddThis as Digg killers, think of the services that could be built combining social data with actual page-level consumption data (e.g., time on site, pages within a site that are not cookied). As an example, the initial add-on market on Firefox is very interesting but doesn’t really leverage the most interesting data that Firefox can capture.”</p>
<p>Martin Green, Chief Operating Officer at <a href="http://www.meebo.com">Meebo</a>, said: “I think an easy way to think about the ultimate user benefit is to see if social content data APIs can do for web content what Amazon delivers for products. I love shopping at Amazon because it knows my historical interests (purchases) and those of others, and it matches that history with the current inventory and usage trends and associations to make a series of relevant suggestions every time I go to the site or search for a product. I would personally love a content recommendation service for content discovery for the web ‘right now’ that is built from a combination of knowing my interests and taking advantage of the social filtering process from people I know and people who share my interests, and who’ve seen something relevant to me before I have discovered it.”</p>
<p>Vipul Ved Prakash, CEO and founder of <a href="http://www.topsy.com">Topsy</a>, said: “The types of higher order analysis that can be done using the visitation logs for discovery/recommendations/trends can be very valuable. We’ve built Topsy to process streams of events about the web – so we’d likely be one of the consumers of this data. The field of anonymizing network traces is an active area of research these days (motivated by the need for security researchers to share logs without compromising privacy) and that work might be applicable here. A good method for anonymizing traces is considered to be <a href="http://www.citeulike.org/user/martibur/article/2974196">prefix-preservation</a>, which has better privacy than one-to-one anonymous mapping (AOL did one-to-one) and more signal than completely anonymous. That said, completely anonymous is still extremely useful and perhaps more practical.”</p>
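<p>To make the prefix-preservation idea concrete, here is a minimal sketch for IPv4 addresses, the setting in which network traces are usually anonymized. Each output bit is the input bit, optionally flipped by a keyed function of the bits that come before it, so two addresses that share their first k bits still share exactly their first k bits after anonymization. This is a simplified illustration of the general technique, not the construction from the linked paper and not vetted cryptography; the function names and the key are assumptions for the example, and how such a scheme would carry over to URLs in visitation logs is left open.</p>

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(address, key):
    """Prefix-preserving pseudonym for an IPv4 address (illustrative only)."""
    bits = format(int(ipaddress.IPv4Address(address)), "032b")
    out = []
    for i in range(32):
        # The flip decision depends only on the key and the preceding bits,
        # so addresses with a common prefix get identical flips along it.
        prefix = bits[:i].encode("ascii")
        digest = hmac.new(key, prefix, hashlib.sha256).digest()
        flip = digest[0] % 2
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def shared_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses have in common."""
    x = format(int(ipaddress.IPv4Address(a)), "032b")
    y = format(int(ipaddress.IPv4Address(b)), "032b")
    n = 0
    while n != 32 and x[n] == y[n]:
        n += 1
    return n


key = b"hypothetical-secret-key"  # assumption: kept private by the data holder
a, b = "192.168.1.10", "192.168.1.77"
anon_a, anon_b = anonymize_ipv4(a, key), anonymize_ipv4(b, key)
print(anon_a, anon_b)
print(shared_prefix_len(a, b) == shared_prefix_len(anon_a, anon_b))
```

<p>Because the mapping keeps the prefix structure, an analyst can still group traffic by network without seeing real addresses. That retained structure is the extra signal Prakash contrasts with fully anonymous data.</p>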
<p>Gregg Poulin, General Manager of Compete.com, said: “You are right on with toolbar data (we call it clickstream) being opened up and the benefits of ‘crowdsourcing’ it. You should also think about how Deep Packet Inspection could work here. That (currently) is very ‘black box’ and focused on delivering ads to people showing certain behavior, but I think that same technology could be opened up.” The Compete API lets you access data on the website level, but not on the web page level.</p>
<p>Liad Agmon, founder of social search company <a href="http://www.delver.com">Delver</a>, said: “It’s extremely valuable, not only for the developer community, but for the general web community. There has been constant debate on how accurate are Alexa and Compete (which also use toolbar stats for their data), and getting access to raw data could be amazing.”</p>
<p>Adam Boyden, president of <a href="http://www.conduit.com">Conduit</a>, which powers community toolbars used by 200,000 publishers and 60 million monthly active users, said: “We think there is a huge trend to have information dynamically updated in real-time while sitting persistently within a user’s browser. More than that, we think you have picked up on an even bigger trend as users want to be able to customize the information they receive and content owners are finding they have to cooperate with other providers to give the best experience possible. In other words companies need to cooperate with each other increasingly to thrive. We developed the Conduit Open marketplace to help with this trend by allowing any content owner to offer their information to be easily added to other content owners’ toolbars and also allow end users to customize part of their toolbars as well.” The company <a href="http://www.conduit.com/Conduit-Open/Marketplace.aspx">launched the open marketplace</a> a few weeks ago, which lets companies offering Conduit toolbars share features and choose to include features created by others. Boyden said that any developer could create a discovery tool, then the owners of the toolbar could choose to make it available to new users, or give existing toolbar users the option to activate it provided they clearly understand and consent to whatever data is shared.</p>
<p>Mark Cramer, chief executive of <a href="http://www.surfcanyon.com">SurfCanyon</a>, said: “Data is extremely powerful, which is why those who have it are normally reluctant to give it up while those who don’t are excited about all the possible things they could do with the data. We formed Surf Canyon with the goal of building a technology that would re-rank results ‘on the fly’ as people search. Building the underlying search from scratch would have been prohibitive, but we’re able to benefit from residing in the browser as an add-on. We have also repurposed our technology to cull out the more relevant Tweets. We could similarly see integrating data from other sources, such as Compete, Quantcast, etc. and then even exploiting this data when re-ranking. (For example, popularity or freshness could help influence relevancy, in either a positive or negative direction.) The backend is built on Yahoo! Boss and Microsoft SilkRoad.”</p>
<p>Tobias Peggs, GM of <a href="http://www.oneriot.com">OneRiot</a>, said: “OneRiot’s user panel is a key advantage in real-time search. It enables us to deliver broad-based real-time search results by harvesting both explicit social activity on Twitter, Digg and other services in combination with implicit data from over 3 million users who have elected to join our panel. That data helps inform our PulseRank algorithm – PageRank for the real-time web – ensuring that our results reflect what’s relevant right now in relation to your query. OneRiot has a simple search API today that is handling millions of searches a day for partners like Microsoft and Scour who are delivering our real-time search results to their users. Soon we’ll extend the API to offer a deeper view into the data that we have around each piece of content in our index. Opening up this data (while respecting our users’ privacy) will give the developer community lots of opportunity to build some very creative real-time applications. API partners will now be able to get rich social meta-data on content from across the web, including information like ‘dwell times,’ number of visits and the PulseRank on any individual URL, all in real-time.”</p>
<p>Cyril Moutran, cofounder and chief executive of <a href="http://www.twazzup.com">Twazzup</a>, said: “Analyzing data stream from toolbar could indeed provide great insights. More generally any user activity stream that can be sliced either by topic, user segment and/or location can provide powerful insights. Communication tools like Twitter capture not just individual user actions, but also the propagation of links and messages through social graphs. Analyzing this propagation (how, how fast, where, who is involved) is the source of great insights.”</p>
<p><strong>Privacy, and other issues</strong></p>
<p>Facebook and Twitter have grown massively by taking what people share in posts and status messages and making it easy to share and consume content. One question is whether these companies and others might figure out discovery before the companies with the data advantage do. It’s unknown whether building a discovery service is a high enough priority or even on the to-do list at the companies holding the toolbar data. At the least, Facebook, Twitter and companies with useful data about Twitter – like Twitter clients and URL shorteners – could help developers trying to improve discovery and search by making more data available about what users care about, including what they click. Andrew Cohen of <a href="http://bit.ly/" target="_blank">bit.ly</a> tells me the company has made available an API to enable developers to access info about bit.ly links. Given that these links are clicked about 1 billion times per month, it will be interesting to see what people do with its API.</p>
<p>Someone at Google familiar with Google Toolbar told me that in some cases the company has not found much signal in toolbar data and so looked at links being explicitly shared for help. This person also said that link sharing data gathered from the toolbar is somewhat limited. Word is that links shared by Gmail users have been more valuable. Twitter benefits from having a significant number of links being shared.</p>
<p>As Facebook and Twitter users <a href="http://blogs.zdnet.com/BTL/?p=10409">share more content</a>, they’ll benefit from fresh content and data to help users find and discover even more content. Imagine if these companies could access toolbar data from Google, Yahoo or Microsoft and map the information to their graphs of connections. You could get recommendations based on what your connections find interesting around the web. Advanced privacy tools would be required to avoid sharing content based on web browsing data with someone who might connect it back to a friend. For example, a story recommended to you might be about a baseball team. If you have a single friend who is a loyal fan of the team, you could guess that the story was recommended to you because they had read the article somewhere. Until such tools exist, or where privacy concerns remain, content filters could zoom out from the closest connections so you’re unlikely to link the info you see to what people you know are doing.</p>
<p>Privacy issues include legal challenges that have <a href="http://www.wired.com/epicenter/2009/05/nebuad-venture-capital-dispatch-wsj/">stopped companies like NebuAd from using ISP data</a>, problems with users’ identities being exposed when sharing raw data, as in the case of AOL, because users <a href="http://en.wikipedia.org/wiki/AOL%20search%20data%20scandal">tend to search for their own names</a>, and mistaken user expectations about sharing of data, as seen with <a href="http://en.wikipedia.org/wiki/Facebook%20Beacon">Facebook Beacon</a>. The legal issue is an open question, but summaries of data through an API could be a big step towards avoiding the privacy problems of sharing raw data. Google manages to use toolbar data for powering services without creating user mistrust, so maybe other companies can do so as well.</p>
<p>Those privacy challenges could make it hard for Twitter and Facebook to get the full benefit of their social graph data. That reduces the competitive advantage of Facebook and Twitter because Google, Yahoo and Microsoft don’t have popular, explicit social graphs anyway. However, using social filtering could be a key draw for users, such as looking at what friends, friends of friends or people with user profiles and interests similar to yours are doing. This is already evident with the popularity of Facebook and Twitter. Toolbar data aside, Facebook and Twitter have their own user data that could be shared with developers to help them build better ways for users to discover, consume and share on their platforms.</p>
<p>Questions might be raised about why Facebook and Twitter should get complete access to the toolbar data when it’s not feasible to let all developers see personal data about each user, but these companies could agree to uphold the privacy policies put in place by Google, Yahoo and Microsoft. Special agreements to get data access could also apply to Alexa and other companies that together have millions of users with toolbars. Anonymizing data at scale is hard, but trusted partners could get access to the same data these companies hold. Google already is understood to <a href="http://glinden.blogspot.com/2008/07/google-toolbar-data-and-actual-surfer.html">use toolbar data for Google Trends and the Ad Planner services</a> and hasn’t ruled out <a href="http://www.toprankblog.com/2006/04/matt-cutts-on-toolbar-data/">using toolbar data in search rankings</a>. Facebook, Twitter and other developers could make use of the data as well.</p>
<p><strong>How to get more data</strong></p>
<p>It’s not known for sure outside of the companies holding toolbar data exactly how useful it is to search and would be to discovery, but some people I know at data-driven start-ups say they would be thrilled to use the data and that it could be game changing. There’s a long list of companies that have tried or are trying to get this data on their own <a href="http://www.oneriot.com">using toolbars</a> or some other way, including a number of <a href="http://venturebeat.com/2007/08/30/mogad-gets-half-a-million-to-build-a-better-recommendation-site/">start-ups</a>, <a href="http://en.wikipedia.org/wiki/Facebook%20Beacon">Facebook Beacon</a> and many others, but so far their toolbar distribution and/or data access has been a fraction of that of the major web players. OneRiot uses a toolbar to get data to influence search results. OneRiot gets data from hundreds of thousands of toolbar users each day, and the total number of URLs visited each day is five times the number of URLs shared on all of Twitter each day. OneRiot plans to make data including anonymous user activity collected from the toolbar available to developers. <a href="http://www.stumbleupon.com">StumbleUpon</a>, which recently passed 10 billion random website visits, offers an optional toolbar to help discover content around the web. StumbleUpon chief executive Garrett Camp tells me the service has millions of active users and recently passed 600 million ratings set by users across 32 million pages. Word as of last year was that the company was looking at using its data and Yahoo Boss to rerank search results, and in April the <a href="http://kara.allthingsd.com/20090413/stumbleupon-stumbles-out-of-ebays-arms-to-be-reborn-as-a-start-up/">founders and new investors bought back the company from eBay</a>.
Hopefully Yahoo Boss and other <a href="http://battellemedia.com/archives/004970.php">open search initiatives</a> will continue to expand as part of the <a href="http://digital.venturebeat.com/2009/07/29/microsoft-and-yahoo-unite-on-search-in-revolt-against-google-dominance/">recent Yahoo-Microsoft deal</a>. But StumbleUpon is the exception as a service that has attracted a lot of downloads. Twitter search and web search developers have to build up user bases to get usage data, which is slowing the advancement of those services.</p>
<p>For now, we need one of Google, Microsoft and Yahoo to make toolbar data available to see if there’d be massive advances by third-party developers. The first company to do this will open a new market, so who will be the first, and why? These big companies could find a way – like Facebook has – to build a vibrant developer community using their data and distribution. Perhaps the data provider can participate in the value created by charging for accessing and using the data. Or maybe Facebook, Twitter (or Twitter developers) will be the first to share click and other usage data. No longer would you have to build a multibillion-dollar company to get access to data that could be used to make significant advances for users. All these companies would need to feel comfortable that users’ privacy is protected. Googlers tell me it’s highly unlikely Google would release toolbar data because the data is too valuable, there are privacy concerns and users might be surprised to see how much Google knows about them. Maybe the data is too important to Google so the company must hold it close to maintain its competitive advantage, but perhaps Yahoo and Microsoft would be willing to share data with select partners while tightly protecting user privacy for a chance to increase their competitiveness with Google. Or maybe these companies will try to build discovery services themselves.</p>
<p>In the meantime, those potential advances in search, discovery and more are being stifled. Developers using Twitter, Yahoo and Microsoft search APIs could make better services for users with more data from those companies as well as Google, Facebook, Mozilla and others like data analytics companies. Setting aside the odds that these companies will share more data with developers, it’s worth figuring out what would be possible if they did.</p>
<p>[Disclosure:  <a href="http://twitter.com/dougsherrets">Doug Sherrets</a> owns some Facebook shares and he works for <a href="http://www.slide.com">Slide</a>.]</p>
<p>Thanks to Ada Chen, Ashvin Kumar, Azra Panjwani, Chris Messina, David King, Eric Eldon, Eugene Shteyn, Gavin Joughin, Greg Linden, Greg Sterling, Ido Green, Itamar Herzberg, Jack Abraham, Jesse Farmer, Jing Chen, Joe Greenstein, Jon Turow, Josh Elman, Julian Gutman, Justin Smith, Kara Swisher, Keith Rabois, Lars Kamp, Loren Brichter, Rishi Mandal, Ryan Elmore, Sachin Rekhi, Scott Banister, Suhail Doshi, Trip Adler, Vik Singh, Yan-David Erlich, people quoted above and others for reading drafts of this post.</p>
</div>
</blockquote>
<p>via <a href="http://digital.venturebeat.com/2009/07/31/open-data-is-the-future-of-web-discovery/">digital.venturebeat.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://hooeeywebprint.wordpress.com/2009/12/14/open-data-is-the-future-of-web-discovery-venturebeat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1af2b8ce609ac07583d4b9a16ef921b6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">hooeey webprint</media:title>
		</media:content>
	</item>
	</channel>
</rss>
