Technorati Party in SF, Thursday Oct 28

A lot has happened at Technorati in the last few months, including our move to new office space, right near Pac Bell Park in San Francisco. I'd like to invite everyone to an open house at our new office to catch up on what's new with us and to find out what's new with you.

Here are the details:

WHEN: Thursday, October 28, 7 p.m.

WHERE: Technorati, 665 Third Street #207, San Francisco. Map and Directions.

WHAT: A party to catch up, and celebrate the move to our new offices!

RSVP: rsvp@technorati.com. As space is limited, please be sure to RSVP!

UPDATE: Some more details on the party - we're splurging on the food (catered) and drinks (wine, beer, soft drinks).

Oct 2004 State of the Blogosphere: Corporate Bloggers

This is part 4 of a series on the growth of the blogosphere and its impact on individuals, corporations, media, politics, and technology. Part 1 covered the overall growth of the blogosphere, part 2 covered the volume of postings, and part 3 covered the growing influence of bloggers and compared it with the online presence of traditional mainstream media.

Today I'll discuss a small but influential segment of bloggers - Corporate Bloggers. These are people who blog in an official or semi-official capacity at a company, or who are so closely identified with the company where they work that, even though they are not official spokespeople, they are clearly affiliated with it. For example, the folks in SAP's developers program get blogs if they want them, and blogs are available to anyone who joins the (free) SAP developers network. This group also includes folks at Sun Microsystems and at Microsoft, where employees are actively encouraged to blog.

Slide7

The chart above (click on it to see a larger version) shows some of the organizations that are at the forefront of the corporate blogging wave. In addition to the big corporate names and the bloggers at companies involved in the blogging space, there are a large number of individual consultants, small business owners, and individual CxO bloggers - about 3,000 that we have identified as of October 2004 - who fill the “other” category. These are folks who are blogging about what is going on at their businesses, but because either the business itself or its number of bloggers is small, we aggregated them into a single category.

Even though some of the largest technology companies are represented in this graph, to me it shows that we are still near the start of accepted use of blogging as a part of corporate policy. There is still a tremendous opportunity for forward-thinking companies and management to have a significant positive impact on their public perception by adopting an enlightened blogging policy that encourages openness both within and outside of the organization.

Oct 2004 State of the blogosphere: Big Media vs. Blogs

This is part 3 of a series on the growth of the blogosphere and its impact on individuals, corporations, media, politics, and technology. Part 1 covered the overall growth of the blogosphere, and part 2 covered the volume of postings. Today I'll build further on the growing influence and authority of bloggers, and compare some of that online influence to mainstream media sites. Click on the image below to see the full-size chart. (This data is about a month old, so the relative rankings differ a bit from the current Top 100, for example.)

The folks at Google (and IBM, and others) made a fundamental breakthrough in exploiting the concept of “Page Rank” - essentially, that hyperlinks are votes of attention, and that the number of web pages linking to a page reflects the authoritativeness of that page. That use of collective human intelligence to judge relevance revolutionized the search industry.

At Technorati, we've taken the same fundamental realization and extended it to people and organizations. The number of people linking to you is a very powerful measurement of your influence or authority with those people - because if nothing else, those people are spending some attention on you. Documents are the exhaust of our attention streams - they are a tangible reflection of what we are spending our time and attention on. Negative attention (“I hate such-and-such”) runs counter to this theory, but empirical evidence shows that people link to items and objects that they like or endorse far more frequently than to things they disapprove of (e.g. Terveen and Hill, 1998).

Slide6

The chart above shows a graph of the most influential or authoritative blogs as compared with the most authoritative “big media” sites. Certainly, top-quality journalism, interesting articles, and consistency of quality show why the top big media sites are on top. But it also shows that a large number of people are getting news, information, and opinion from outside of the mainstream media, and that these sources are rivaling or exceeding the attention paid to smaller “professional” sites.

Also important are the approximately 8,000 blogs that have between 100 and 1,000 inbound sources, which represent a set of people who are often writing about targeted or niche topics, like PVRBlog (158 sources) or Ross Mayfield (340 sources), and the tens of thousands of blogs with between 50 and 100 inbound sources, which represent smaller communities of conversations going on every day, on a wide range of topics. There is a lot of information and conversation in the tail of the media power curve that goes well beyond what is available from larger media organizations.
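As a rough illustration of the "inbound sources" idea above, here is a minimal sketch that counts how many distinct sites link to each blog and then sorts blogs into the ranges mentioned in this post. It is not Technorati's actual code: the function names, the bucket boundaries (the post does not say which range a blog with exactly 100 or 1,000 sources falls into), and the sample links are all assumptions made up for the example.

```python
from collections import defaultdict


def count_inbound_sources(links):
    """Return {target: number of distinct sources linking to it}.

    `links` is an iterable of (source_site, target_site) pairs.
    Repeated links from the same source count once, and self-links
    are ignored (both are assumptions for this sketch).
    """
    sources_by_target = defaultdict(set)
    for source, target in links:
        if source != target:
            sources_by_target[target].add(source)
    return {target: len(sources) for target, sources in sources_by_target.items()}


def bucket(inbound_count):
    """Place a blog in one of the ranges discussed above (boundaries assumed)."""
    if inbound_count >= 1000:
        return "1000+"
    if inbound_count >= 100:
        return "100-999"
    if inbound_count >= 50:
        return "50-99"
    return "under 50"


# Hypothetical link data, purely for illustration.
sample_links = [
    ("a.example", "b.example"),
    ("a.example", "b.example"),  # duplicate link from the same source
    ("c.example", "b.example"),
    ("b.example", "b.example"),  # self-link, ignored
    ("b.example", "a.example"),
]
print(count_inbound_sources(sample_links))
print(bucket(158), bucket(340))
```

Counting distinct sources rather than raw links means that one enthusiastic blogger linking to you fifty times still counts as a single "vote" of attention.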

Tomorrow: A look at the emerging world of the corporate blogger, and how they are changing the image of some of the most influential organizations, like Microsoft and Sun Microsystems.

Oct 2004 State of the Blogosphere: 4.6 posts per second

This is part 2 of a series of posts describing the growth of blogging during 2003-2004 that is an expansion of the talk I gave at last week's excellent Web 2.0 conference. You can view Part 1, which covers the overall size of the blogosphere.

With the tremendous growth in the number of weblogs comes an increase in the number of posts per day, also known as posting volume. This is an excellent proxy for the amount of time spent on blogging, because greater posting volume means that more people are posting more often, and it also tends to validate the increased number of blogs that are out there. As of October 6, 2004, there are approximately 400,000 posts created every day in the blogosphere, which averages out to about 4.6 posts per second, or over 16,000 posts per hour. Also interesting are the spikes in weblog posting, and the reasons for the spikes, as shown in the graph below (click on the graph for a larger view):

Slide5
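For anyone who wants to check the per-second and per-hour figures quoted above, the arithmetic is simple enough to sketch. The only input is the approximate 400,000 posts per day from this post; the function and variable names are just illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
HOURS_PER_DAY = 24


def posting_rates(posts_per_day):
    """Convert a daily post count into (posts per second, posts per hour)."""
    return posts_per_day / SECONDS_PER_DAY, posts_per_day / HOURS_PER_DAY


per_second, per_hour = posting_rates(400_000)
print(f"{per_second:.1f} posts per second")  # 4.6 posts per second
print(f"{per_hour:,.0f} posts per hour")     # 16,667 posts per hour
```

These are averages over a whole day, so the spikes in the graph represent rates well above them.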

Many of the volume increases were due to political events. Large spikes occurred around the Iowa Caucuses (the Howard Dean scream); around the time of the Nick Berg beheading, when both conservative and liberal bloggers posted prolifically on the new form of terrorist threat; and around both major American political conventions, where bloggers were feted as well.

However, other noteworthy spikes occurred around non-political news events as well. The blogosphere was abuzz over the discovery of a flaw in the basic mechanism of high-end Kryptonite locks, which made them vulnerable to picking with a dime-store plastic pen. The story broke on 9/12/04 on a bicycle forum and quickly spread throughout the blogosphere, where it flew around for 5 days before mainstream media picked it up, with major mentions in the New York Times, the Boston Globe, and others. That coverage caused a secondary spike in posting volume as bloggers discussed the implications. A side note: I wonder how much Kryptonite could have done if their executives had been keeping track of the blogosphere on a regular basis - it certainly could have helped to avert a major PR problem if they had reacted quickly and offered a recall or a fix before the news broke in mainstream media.

Tomorrow, I'll look at the level of authority and influence that various blogs are attaining, and compare that with many traditional media sites.

State of the Blogosphere, October 2004

Things have been incredibly busy over at the day job, so it is nice every once in a while to take a step back and look at the big picture. To prepare for my presentation at last week's Web 2.0 conference, my team ran a number of analyses on the data that we've been collecting since November 2002, when the Technorati service started. We've noted a number of interesting trends over the past 2 years, so I thought I'd take some time out this week and blog about each one of them, accompanied by some charts and graphs showing the underlying data.

First off, let's look at the size of the blogosphere (click on the picture for a larger version):

Slide3

We're now tracking over 4 million weblogs. Regular readers will remember that we tracked the 3 millionth weblog on July 7th, just 3 months ago. The blogosphere has been doubling at a regular pace, and it is now more than 8 times as large as it was in June of 2003. Even at its slowest, the blogosphere has doubled in size once every 5 months.
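To put the growth figures above in perspective, here is a small sketch of the doubling-time arithmetic. It assumes steady exponential growth, which is a simplification, and it takes the span from June 2003 to October 2004 as 16 months; both are assumptions for the example rather than anything measured.

```python
import math


def doubling_time(elapsed_months, growth_factor):
    """Average months per doubling, assuming steady exponential growth."""
    return elapsed_months / math.log2(growth_factor)


# 8x growth is three doublings; spread over an assumed 16 months.
print(round(doubling_time(16, 8), 1))  # 5.3
```

Since the blogosphere is "more than" 8 times as large, 5.3 months is an upper bound on the average doubling time under these assumptions, which is roughly in line with doubling about every 5 months.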

This leads to the second graph, which shows the acceleration of the growth of the blogosphere:

Slide4

This shows the number of new weblogs being created every day. Right now, there are about 12,000 new weblogs being created each day, which means that on average, a new weblog is created every 7.4 seconds. It is important to understand, though, that not all weblogs are regularly posted to - in fact, about 45% of all older weblogs have not had a post in 3 months. This may be due to abandonment, hosting service switches, tire kicking, or other factors. Mary Hodder has a good discussion on these issues.
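The creation rate above can be converted into an interval between new weblogs with one division. Note that the rounded figure of 12,000 per day works out to about 7.2 seconds; the 7.4 seconds quoted above corresponds to roughly 11,700 new weblogs per day, so it was presumably computed from an unrounded daily count (that is an assumption, since the exact count is not given here).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def seconds_between_new_weblogs(new_weblogs_per_day):
    """Average gap in seconds between newly created weblogs."""
    return SECONDS_PER_DAY / new_weblogs_per_day


def weblogs_per_day_for_interval(interval_seconds):
    """Inverse: the daily creation rate implied by a given average gap."""
    return SECONDS_PER_DAY / interval_seconds


print(round(seconds_between_new_weblogs(12_000), 1))  # 7.2
print(round(weblogs_per_day_for_interval(7.4)))       # 11676
```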

Tomorrow: Volume of posts, and what it can tell us about ourselves...

Technorati Hackathon, San Francisco: Wed Oct 6, 2004

We're having a Hackathon at our new offices in San Francisco, on Wednesday, October 6. We're going to have lots of pizza and beer, and lots of outlets and free wifi. The idea is to actually do some real web services hacking that night after talking about it all day at the looks-to-be-great Web 2.0 conference. If you're a hacker who knows about our API, or you're interested in learning more about coding web services, this is going to be a chance to hang out with our core developers as well as with other leading web services developers, with the goal being to foster great new applications and tools using those APIs. We're also really interested in sparking further conversation and getting feedback from you so that we can make Technorati more valuable for you. You don't have to be a Web 2.0 attendee to come to the hackathon - but there are sure to be lots of attendees there.

OK, the details:

WHEN: Wednesday October 6, 2004, from 8PM - whenever!
WHERE: Technorati Offices (map) at 665 3rd Street, Suite 207, San Francisco, CA 94107 (between Brannan and Townsend Streets)
WE PROVIDE: Free pizza, beer, soft drinks; Fast WiFi, whiteboards, lots of room to hack as a group or individually, experts, help, and advice.
YOU PROVIDE: Creativity, energy, good humor, great ideas, willingness to teach and learn, readiness to hack!

IMPORTANT NOTE: Space is limited (our offices only hold so many people!), so please RSVP (rsvp@technorati.com) as soon as possible to guarantee your spot! First come, first served.

New Technorati Toolbar for Firefox

Thanks to the cool guys at UltraBar, you can now have Technorati at your fingertips with the Technorati Toolbar! I've been using this for a while now, and it is fantastic - you can type in search queries and get back up-to-the-minute results from around the web, and you can also click on the Technorati talk-bubble from any page you're browsing to see who is talking about that page, and what they're saying - which has made reading the news online a much more satisfying experience. The toolbar works on all versions of Firefox past 0.9 (including the 1.0 PR that is currently available). Fantastic job, guys!

Putting out colo fires (literally!), aka “A weekend of downtime”, grrr.

Ugh. What a horrible weekend. The team and I have spent the entire weekend dealing with massive data corruption caused - of all things - by an electrical fire on the main electrical line coming into our colo here in San Francisco.

The colo fire has led to a cascade of failures that has caused the Technorati service to be down for most of the weekend. It's also giving me a lot more respect for people who build and maintain 100% uptime of services, the trials and tribulations they go through, and also the cost of being operationally excellent.

What Happened

At about 9:30PM PST Friday, there was an electrical fire on the power main inside our colocation center, where our entire server infrastructure is housed. This caused our battery backup power supplies to kick in, but the independent power generator at the colo never kicked in - possibly because the problem was a fire inside the building rather than a general power blackout of the neighborhood. Well, the fire was only problem #1. We weren't expecting or planning for an outage of that kind. It caused a cascade of other problems that made the rest of the weekend a huge PITA. Problem #2 was that we didn't have a good enough emergency plan in place that would shut our systems down cleanly when power ran out like that. Unfortunately, that meant that when the batteries died, our server farm went down quite ungracefully - causing problem #3, which was data corruption due to the unclean shutdown.

The rest of the weekend has been spent recovering from these failures - we've had to do consistency checks and then rebuilds of the data sets that got corrupted, and we're doing that for over a hundred machines. Bad bad bad. At least we've been performing regular daily backups, and we're able to use those as starting points on our road to recovery. The current ETA to get services back up and running is Monday morning, which will mean a weekend of unplanned downtime.

What we're doing about it

Clearly, this is unacceptable, but the damage is already done, so the big question on my mind is how to learn from this outage and make sure that it never happens again. One of the important things learned is that there's a reason why some colocation centers are called “Tier 1” (and priced that way) and others are not. Tier 1 means that everything is overprovisioned, and there's plenty of infrastructure backup already built into the place - electrical, network, fire suppression, environmental, security, etc. We have been planning a move to a new colocation center, and this most recent incident just underscores the need to move as soon as possible. Second, it illustrates a hole we had in our emergency plan - we had built the plan around a threat level of a short outage followed by a quick electrical recovery. We had planned for a shutdown of critical systems if battery power fell below a certain threshold (upsd for you techies out there), but we hadn't gotten it implemented, given that we were planning the move to the new colo. That, of course, led to the data corruption that is keeping the team up all weekend.
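For the techies, the threshold rule described above boils down to a very small decision. The sketch below shows only that decision logic, under stated assumptions: the `UpsStatus` type, the 20% default threshold, and the function name are all hypothetical, and it is not our actual configuration. In a real deployment the status would come from the UPS monitoring daemon and the result would trigger a clean shutdown of the databases and servers.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of a UPS reading (names are illustrative)."""
    on_battery: bool       # True when mains power is lost
    charge_percent: float  # remaining battery charge, 0-100


def should_shut_down(status, threshold_percent=20.0):
    """Shut down cleanly only when running on battery AND charge is low.

    The 20% default is an assumed value; the right threshold depends on
    how long a clean shutdown takes and how fast the batteries drain.
    """
    return status.on_battery and status.charge_percent <= threshold_percent


print(should_shut_down(UpsStatus(on_battery=True, charge_percent=15.0)))   # True
print(should_shut_down(UpsStatus(on_battery=True, charge_percent=80.0)))   # False
print(should_shut_down(UpsStatus(on_battery=False, charge_percent=5.0)))   # False
```

The point of acting on a threshold rather than waiting for the batteries to die is to leave enough power for an orderly shutdown, which is exactly what would have prevented the data corruption.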

Once we get past this crisis and get the service back up and running again, I guarantee that we'll be doing a post-mortem analysis to see where the failure points were, and how we can avoid them in the future. I have learned a lot from this experience about the value (and implicit cost) of planning and building systems around unreliable components, and doing everything you can to eliminate risk - and also about planning for quick recovery when the unthinkable happens. I have a lot of respect for the folks at Google, Yahoo, eBay, and the like for their ability to build and maintain a solid world-class infrastructure, have it scale, and also innovate with new applications as well.

Planning for Murphy's Law

Count me as one of the humbled. To our users and customers: Ouch, this hurts, and we're working on making sure that outages like this never happen again. To the folks down in ops and engineering at companies around the globe who keep these systems running and useful, no matter what Murphy throws at you: you've got my appreciation.

I'll post more updates on service status as the day progresses.

UPDATE: As of 10:00 PST Monday, the infrastructure is back online, and we're live again, but we're still making sure that everything is back 100%. Searches should work, and we're monitoring our response time.

Sputnik releases Control Center 3.0, and hosted service: SputnikNet

The folks at WiFi management software and services company Sputnik have just released a major software and services upgrade. Sputnik Control Center is the easy-to-use, easy-to-buy software that allows you to manage hundreds of WiFi access points as a single system, manage access control, create and deploy captive portals, track usage by AP and user, set up network policies, and much much more. Check it out: there's a Sputnik Hotspot Kit for only $599 that includes two Sputnik AP 160s and two Sputnik Control Center licenses. It makes it really easy to become a WiFi access provider, or to install secure wireless across a company or campus.

SputnikNet enables you to run a managed wireless network without having to set up or run your own server. With SputnikNet, you get a hosted Sputnik Control Center set up just for you. Just plug Sputnik-Powered APs into broadband Internet, and manage your wireless network. You can manage as many access points and wireless networks as you like for only $19.95 per access point per month.

Congratulations, Sputnik folks! Full disclosure: I'm a founder and advisor for Sputnik, so don't just take my word for it - go and see what others are saying. Daily Wireless has a good review, as does WiFi Networking News and WiFi Planet.