Weblog Search! Get Yer Fresh Weblog Search Here!
Dave Winer implemented a weblog search using the Google API. He seems really thrilled by it.
’Cept it’s been done. Speaking of prior art, a Google API-based weblog search tool has already been built by Micah Alpern. The way it presents results isn’t nearly as nifty as Dave’s, but Micah’s does integrate with your blogroll to add a “Search Blogs I Read” feature. That is nifty.
However, both Dave’s and Micah’s tools have a fatal flaw: Google.
According to Google’s cache, the last time Google crawled Scripting News was May 16. (Also the last date it crawled 10RW.) If you do a search today for “weblog search” on Dave’s site, the results are incomplete. Google hasn’t yet crawled anything he’s written recently.
Movable Type, on the other hand, has a search tool already built in. (See the little search box over on the side?) I don’t know what Ben and Mena are using to power it, but whatever it is, it indexes every word of my weblog immediately. It’s probably extremely low overhead, as it doesn’t have to “crawl” — it can just index each post as you post it.
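To make the "index each post as you post it" idea concrete, here is a rough sketch in Python. I don't know how Movable Type actually does it, so treat this as an assumption-laden toy: the `PostIndex` class, its method names, and the sample posts are all made up for illustration.

```python
import re
from collections import defaultdict


class PostIndex:
    """Toy inverted index that is updated the moment a post is saved.

    Hypothetical illustration only; not Movable Type's implementation.
    """

    def __init__(self):
        self.posts = {}                # post_id -> full text
        self.words = defaultdict(set)  # word -> ids of posts containing it

    def add_post(self, post_id, text):
        # Index at publish time, so there is no crawl delay at all.
        self.posts[post_id] = text
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            self.words[word].add(post_id)

    def search(self, word):
        # Look up a single word; returns matching post ids in ascending order.
        return sorted(self.words.get(word.lower(), set()))


index = PostIndex()
index.add_post(1, "Weblog search is here")
index.add_post(2, "The freshness of the index matters")
print(index.search("weblog"))  # [1]
```

The point of the sketch is that the work happens inside `add_post`: a post is searchable as soon as it is saved, with no separate crawling step.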
I can even search for “the” and get every post back. (Well, I assume every post; I didn’t check to see if there are posts where I didn’t use “the.”)
Like Dave’s tool, Movable Type’s search presents the results reverse-chronologically. Instead of providing the whole weblog post (which could be overkill and wasted resources, unless you’re as pithy and brief as Dave), it excerpts the post and provides a link to the full post. If you’re logged in to Movable Type, it also provides an Edit link that kicks you right into the editing form. Ooh, did I mention you can also use regular expressions?
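For the curious, the behavior described above (regular-expression matching, newest results first, excerpts instead of full posts) can be sketched in a few lines of Python. This is only my guess at the general shape, not Movable Type's code; `search_posts`, the `POSTS` list, and the 40-character excerpt length are all hypothetical.

```python
import re
from datetime import date

# Made-up sample posts, purely for illustration.
POSTS = [
    {"date": date(2002, 1, 1), "title": "Old post",
     "body": "An early note about weblog search tools and why they matter."},
    {"date": date(2002, 1, 3), "title": "New post",
     "body": "Fresh thoughts on web log freshness, posted minutes ago."},
    {"date": date(2002, 1, 2), "title": "Off topic",
     "body": "Nothing relevant in this one."},
]


def search_posts(posts, pattern, excerpt_len=40):
    """Return (date, title, excerpt) for matching posts, newest first."""
    rx = re.compile(pattern, re.IGNORECASE)
    # Keep only posts whose body matches the regular expression.
    hits = [p for p in posts if rx.search(p["body"])]
    # Reverse-chronological order: most recent post first.
    hits.sort(key=lambda p: p["date"], reverse=True)
    # Return a short excerpt rather than the whole post.
    return [(p["date"], p["title"], p["body"][:excerpt_len]) for p in hits]


results = search_posts(POSTS, r"web ?log")
for when, title, excerpt in results:
    print(when, title, excerpt)
```

The pattern `web ?log` matches both "weblog" and "web log", which is the kind of thing a regular-expression search buys you.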
Most importantly, I don’t have to wait for Google to crawl my weblog. The Movable Type weblog search will return hits on the stuff I blogged minutes ago. As Dave would say, Bing! That’s killer.
It seems to me like Google is overkill for a weblog search. Google’s great because it scales for humongobytes (one humongobyte = a gazillion terabytes) of information. But for the amount of content in an individual weblog, you don’t need the scalability of Google. What you do want (at least what I want) is the freshness of having everything indexed as soon as it’s posted.
Blogger has this functionality, but only on the authoring side: as the author, you can search all your posts. They don’t expose it to readers like MT does. They should.
Why don’t Manila and Radio already have the kind of search capability MT has? Or do they? Or am I missing the sparkliness of Dave’s tool?
Bottom line: when you use Google to search your own weblog, you’re sacrificing freshness for scalability that you don’t need. And freshness is what makes weblogs tasty. [Homer Simpson voice] Mmmmm. Weblogs.