Folks,
After 2 months of painstaking effort, I'm proud to announce the new Technorati infrastructure is up and ready for use.
Please have a look, and tell us what you think:
We focused 100% of our time on completely refurbishing our underlying event engine - essentially taking a Volkswagen engine out and putting a Ferrari engine in. This new engine sports:
1) Much faster indexing - the median time from when someone posts something on their weblog to when it is captured and searchable in our live database is 7 minutes.
2) Much faster querying - our goal is to have every search query take less than a second, even as the database is being continuously updated. We added a query timer at the top of every results page so you can judge for yourself.
3) Much more scalable - We built this distributed database system to scale. As we track more events, we add more machines; as our user traffic increases, we add more machines. This should continue to work for quite some time, so we're eager to test it under load.
4) Much better internationalization support - The database is entirely in UTF-8, a Unicode encoding that covers a significant number (well, just about all) of non-English languages, including Japanese, Farsi, Hebrew, and many others. You can see results in multiple languages all on the same page. Localization should be significantly easier.
5) A new, smarter spider/crawler, which understands weblog posts and blogrolls much better than our old spider. You'll note that on our results pages, many results offer a "Read Full Post" capability, which takes you directly to the entire microcontent post that created the link.
6) A redone results page, which should load faster, and is designed for non-browser usage as well. Lots has been moved to CSS, and we've added a nifty pager widget at the top and bottom of each page of results.
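For anyone curious about what point 4 means in practice, here is a minimal sketch, not our actual code. It only shows that text in several scripts can share one UTF-8 encoding and come back unchanged; the sample strings and the helper name utf8_round_trip are made up for illustration.

```python
# Illustrative only: the sample strings are invented, not real search results.
titles = {
    "English": "Hello, weblog",
    "Japanese": "こんにちは",
    "Farsi": "سلام",
    "Hebrew": "שלום",
}


def utf8_round_trip(text: str) -> bytes:
    """Encode text as UTF-8 and check that decoding restores it exactly."""
    encoded = text.encode("utf-8")
    assert encoded.decode("utf-8") == text
    return encoded


for language, title in titles.items():
    encoded = utf8_round_trip(title)
    # Byte counts differ by script, but one encoding handles all of them.
    print(f"{language}: {len(title)} characters, {len(encoded)} bytes")
```

Because every string goes through the same encoding, results in different languages can sit side by side on one page without per-language handling.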
Please go and use the site - and send us feedback.
Some known issues: There are a few areas where we're still filling out content and fixing bugs and layout, such as the top 100 page, breaking news, current events, and other pages. We're looking to find showstopper bugs or problems before we move this beta infrastructure over to the production site. So don't fret if a page you like is currently missing or if the top 100 is messed up; we're fixing that. You may also see a change in your inbound blogs/links numbers. That is primarily because we're still bringing the new database up to speed, so we already know that some of the numbers are different.
Thanks again for your time and patience. On behalf of the entire Technorati team, thank you for all of your support. We're really looking forward to your feedback.