Going to Paris - Photowalk anyone?

I'm heading to Paris for Le Web 3, and I'm going to be arriving on Saturday, December 9th, which gives me 2 days before the conference starts to get in some sightseeing. Anyone up for a meeting/get-together and photowalk on Saturday afternoon the 9th or on Sunday the 10th? I've only been to Paris once before, and I fell in love with the beautiful city. I must admit however that I haven't seen much of it.

I am an avid photographer, so I'm really looking for beautiful places, especially hidden gems that only the locals know about...

Leave a comment here or drop me an email at dsifry AT technorati DOT com. I'll be staying at the Sofitel Paris Porte de Sevres near the conference...

Here's a shot from my last trip to Paris:

L'Etrange Incident

Make Magazine adds Link Count Widget in 3 minutes

I'm incredibly excited tonight because I've been showing off the new Live Link Count Widget to a number of bloggers, and the response has been quite positive. Phil Torrone, Senior Editor at Make, and keeper of the MakeBlog got it installed and working in just under 3 minutes. Right after putting it up, he posted about it:

Ok Makers we added a cool new link to all the blog posts on MAKE, it's called the "Technorati Link Counter Widget" - it will show other makers and sites that are linking to the post you're reading. Just click "View blog reactions" - It's a great way to find other maker-friendly sites, tech sites and people/orgs that are likely interested in similar topics - so check it out, let us know what you think, consider this an experiment... and if you have a site, post a link to MAKE so we can see you! Link & Technorati Link Counter Widget.

Here are a couple examples...

When you view these on posts, you'll see the # of people talking before you click, so far it's pretty impressive and spam free!

I'm totally excited that we could be of service to Phil and to Make. If you're not a regular subscriber to Make already, you should definitely check it out and make it a favorite! There's some of the most interesting original content and hacks up on the site, and the magazine is wonderful as well!

Don't forget to check out their Open Source Gift guide - perfect for all the geeks in your life - or sign up for the Maker's Bill of Rights. I can't wait to see which major manufacturer will be the first to put one of those in a box...

Add Real-time Link Counts to your blog posts

Ever wanted to be able to show your readers the reactions from around the blogosphere on your blog posts? Something that is updated live, whenever someone links to you, that you can integrate on your blog, just like Comments and Trackbacks, but without the spam (well, at least most of it)?

We just launched a widget for you. It's called the Technorati Link Counter Widget, and you can see it running on my blog (check the bottom of this post in the comments and trackbacks section) or on other bloggers' blogs, like Hugh MacLeod's Gaping Void, Paolo Valdemarin's weblog, and Jeff Veen's blog. Right now, you have to be a bit technical in order to install it - you need to be able to edit your templates - but if you can do that, it's easy to install. If you use one of the larger blog publishing tools like Movable Type, WordPress (.org only, so far no support for .com), Blogger, or TypePad, just cut and paste the code from the linkcount page into your index and post templates.

We've been talking to the different tool vendors about getting this installed as an easy option for non-technical users, but in my experience, the thing that they react to the fastest is requests from their users! So if you want this on your blog, let your tool vendor know!

It's been pretty exciting to see the early reaction in the blogosphere, and we've been getting a lot of great feedback from people. Please leave your feedback in comments here, or even better, write a post about it and link to this entry: your reaction will show up in my link count widget at the bottom of this post. Here's the post on the Technorati weblog with a list of a bunch of other people who have installed the widget to give you some ideas as well...

State of the Blogosphere, October, 2006

Hey, it's that time of the year again! We’re well into the crisp days of autumn and it's time for the quarterly State of the Blogosphere report.

The State of the Blogosphere continues to be strong.

The last few months have prompted a great deal of thought amongst the team here about the maturation of the blogosphere since I wrote the first algorithms that led to the creation of Technorati nearly four years ago, and I'll be going into a lot more depth below.

OK, let's start with the overall numbers:

Currently Tracking More than 57 Million Blogs and Counting.

Slide0002-7

As you can see, the number of blogs tracked by Technorati continues to grow briskly. While the doubling of the blogosphere has slowed a bit (every 236 days or so, here's the historical data), interest in blogging remains considerable. About 55% of all blogs are active, which means that they have been updated at least once in the last 3 months.

Slide0003-9

To get another view, let's look at the number of new blogs tracked each day:

Slide0004-10

As of October 2006, about 100,000 new weblogs were created each day, which means that on average, there was a slight decrease quarter-over-quarter in the number of new blogs created each day.

As we’ve said in the past, some of the new blogs in our index are Spam blogs or 'splogs'. The good news is Technorati has gotten much better at preventing these kinds of blogs from getting into our indexes in the first place, which may be a factor in the slight slowing in the average of new blogs created each day.

The spikes in red on the chart above show the increased activity that occurs when spammers create massive numbers of fake blogs and try to get them into our indexes. As the chart shows, we’ve done a much better job over the last quarter at nearly eliminating those red spikes. While last quarter I reported that about 8% of new blogs that get past our filters and make it into the index are splogs, I’m happy to report that the number is now more like 4%. As always, we’ll continue to be hyper-focused on making sure that new attacks are spotted and eliminated as quickly as possible.

My gut feeling is that since we're better at dealing with spam now, even some of the blue areas in last quarter's graph were probably attributable to spam, which would mean that rather than the bumpy ride shown above, we're actually seeing steady (but slower) growth of the blogosphere. Hopefully we'll be able to have a more detailed analysis of these issues next quarter.

Daily Posting Volume

Slide0005-12

First off, the total posting volume of the blogosphere has leveled off somewhat, showing about 1.3 million postings per day, which is a little lower than what we were seeing last quarter but still about double the volume of this time last year. This leveling off may be the result of more aggressive and mature spam fighting capabilities as discussed above, but we'll have to see how the next three months progress to determine if this is the case or if some other trend is at work.

Along with the aggregate posting volume information, we’ve put in some annotations of the events that occurred at the time of the spikes, showing that the blogosphere continues to react strongly to various world events. It is important to note that these spikes are relative to the posting volume at that time. For instance, the big spike in July is related to the Israeli / Hezbollah conflict as well as other escalating tensions in the Middle East. I similarly would expect to see a spike beginning today and throughout this week in response to the upcoming U.S. elections.

Blogs and Mainstream Media

The integration of blogs and traditional media sites on the web continues. We've put together the top 100 sites that make up "the short head" (as opposed to "the long tail"), which is still predominantly made up of traditional media sites, like The New York Times, Yahoo! News, CNN, and MSNBC.

Slide0007-5

However, as we move down the curve, blogs become more widespread in the list. There are 12 blogs in the top 100 combined list. 3 are in the top 50, and 9 of the 12 are in the slide below:

Slide0008-8

By the time you reach the top 5000, blogs have essentially taken over, with very few well-funded mainstream media sites listed. This is partially because of the nature of the medium - that is, the traffic of sites further down the curve makes significant staffing and revenue difficult. However, lower cost structures make individual or small group blogs operating at little cost quite efficient at these revenue levels.

The Medium Matures

As I mentioned earlier, we’ve been doing a lot of thinking about the maturation of the blogosphere and the blogging phenomenon in general. We asked ourselves, "What are the common characteristics of top bloggers? Do they behave differently? What can we learn from them?"

So, we broke down some basic posting behavior for bloggers that have different Technorati rankings, with the level of influence or authority increasing as you go from left to right in the chart below:

Slide0006-8

The Low Authority Group (3-9 blogs linking in the last 6 months)

The average blog age (the number of days that the blog has been in existence) is about 228 days, which shows a real commitment to blogging. However, bloggers of this type average only 12 posts per month, meaning that their posting habits are generally dedicated but infrequent.

The Middle Authority Group (10-99 blogs linking in the last 6 months)

This contrasts somewhat with the second group, which enjoys an average age not much older than the first at 260 days and which posts 50% more frequently than the first. There is a clear correlation between posting volume and Technorati authority ranking.

The High Authority Group (100-499 blogs linking in the last 6 months)

The third group represents a decided shift in blog age while not blogging much more frequently than the last. In keeping with the theme of the maturation of the blogosphere, it seems evident that many of these bloggers were previously in category two and have grown in authority organically over time. In other words, sheer dedication pays off over time.

The Very High Authority Group (500 or more blogs linking in the last 6 months)

In the final group we see what might be considered the blogging elite. This group, which represents more than 4,000 blogs, exhibits a radical shift in post frequency as well as blog age. Bloggers of this type have been at it longer – a year and a half on average – and post nearly twice a day, an increase in posting volume of over 100% from the previous group. Many of the blogs in this category, in fact, are about as old as Technorati and we’ve grown up together. Some of these are full-fledged professional enterprises that post many, many times per day and behave increasingly like our friends in the mainstream media. As has been widely reported, the impact of these bloggers on our cultures and democracies is increasingly dramatic.

A note on Ranking

For those of you who are new to Technorati's ranking systems, we establish a blog’s authority (or influence) by tracking the number of distinct blogs that link to it over the past 6 months. In this chart, we’ve looked at folks with at least 3 links and grouped them into four separate categories. In total, we’re looking at about 1.5 million blogs of the 57 million total. Even though I labeled the first group as the "Low Authority" group, these people are in the top 2% of all of the blogs that exist, so the concept of "low" is purely in relation to the other groups above.
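The grouping described in this note can be sketched in a few lines of Python. This is only an illustration, not Technorati's actual code: the function names, the `(source_blog, target_blog, seen_on)` link records, the example domains and dates, and the use of 180 days to stand in for "6 months" are all assumptions made for the sketch.

```python
from datetime import date, timedelta

# Assumption: "the past 6 months" is approximated as 180 days.
WINDOW = timedelta(days=180)


def authority(links, blog, today):
    """Count the distinct blogs that linked to `blog` within the window.

    `links` is a hypothetical iterable of (source_blog, target_blog, seen_on)
    tuples; several links from the same source still count only once.
    """
    cutoff = today - WINDOW
    sources = {
        source
        for source, target, seen_on in links
        if target == blog and seen_on >= cutoff
    }
    return len(sources)


def authority_group(count):
    """Map a distinct-linking-blog count to the four groups in this post."""
    if count >= 500:
        return "Very High Authority"
    if count >= 100:
        return "High Authority"
    if count >= 10:
        return "Middle Authority"
    if count >= 3:
        return "Low Authority"
    return None  # fewer than 3 linking blogs: not included in the chart


# Invented example data.
today = date(2006, 11, 6)
links = [
    ("a.example", "mine.example", date(2006, 10, 1)),
    ("a.example", "mine.example", date(2006, 10, 2)),  # same source, counted once
    ("b.example", "mine.example", date(2006, 9, 15)),
    ("c.example", "mine.example", date(2006, 8, 1)),
    ("d.example", "mine.example", date(2005, 1, 1)),   # outside the window
]
count = authority(links, "mine.example", today)
print(count, authority_group(count))
```

The point of using a set is that authority counts linking blogs, not links, so one enthusiastic blogger linking ten times moves the number by one.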

Blogging is Global

As we reported last quarter, English and Japanese remain the two most popular languages in the blogosphere. There were, however, some interesting shifts among those languages less well represented in the blogosphere. Holding steady in the number three spot is Chinese, although it has dipped slightly to 10% of the total posting volume. A notable change, however, is that Farsi has pushed its way into the top 10 languages in use in the blogosphere, bumping Dutch, which had held the number 10 spot over the last couple of quarters, into the number 11 spot.

Note that some important caveats apply to this language data.

Slide0009-6

Posts by language by hour, shown below, look very similar to last quarter.

Slide0010-3

We delved a little deeper to see if we could understand any other interesting per-language trends. If you look at the top 4 languages side-by-side and standardize them to their relative posting levels throughout the day, recognizable posting patterns by language begin to emerge. While Japanese and Chinese language posts have a daily pattern that indicates heavily localized posting, both English and Spanish language posts indicate more globalized posting patterns.
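For anyone who wants to try this kind of comparison on their own data, here is one possible way to standardize hourly counts, sketched in Python. It is an assumption about the method, not a description of how the chart was built: each language's 24 hourly counts are turned into shares of its own daily total, and the gap between its busiest and quietest hour is used as a rough "how localized is this" signal. The counts below are invented.

```python
def hourly_shares(hourly_counts):
    """Convert 24 hourly post counts into each hour's share of the daily total."""
    total = sum(hourly_counts)
    if total == 0:
        return [0.0 for _ in hourly_counts]
    return [count / total for count in hourly_counts]


def peak_to_trough(shares):
    """Gap between the busiest and quietest hour; larger suggests one time zone."""
    return max(shares) - min(shares)


# Invented counts: one language posts mostly in a single block of hours,
# the other is spread fairly evenly around the clock.
localized = [10] * 8 + [100] * 8 + [40] * 8
globalized = [40] * 8 + [60] * 8 + [50] * 8

localized_spread = peak_to_trough(hourly_shares(localized))
globalized_spread = peak_to_trough(hourly_shares(globalized))
print(localized_spread > globalized_spread)
```

Dividing by the daily total is what lets a language with a few thousand posts a day sit on the same chart as one with hundreds of thousands.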

Interesting to see how much blogging goes on during work hours!

Slide0011-3

In Summary:

  • Technorati is now tracking more than 57 Million blogs.
  • Spam-, splog- and sping-fighting efforts at Technorati are paying dividends in terms of the reduction of garbage in our indexes, even if it does seem to impact overall growth rates.
  • Today, the blogosphere is doubling in size approximately every 230 days.
  • About 100,000 new weblogs were created each day, again down slightly quarter-over-quarter but probably due in part to spam fighting efforts.
  • About 4% of new blogs that get past Technorati's filters are splogs, even if it is only for a few hours or days.
  • There is a strong correlation between the aging and post frequency of blogs and their authority and Technorati ranking.
  • The globalization of the blogosphere continues. Our data appears to show that English and Spanish are more universal blog languages than the other two most dominant languages, Japanese and Chinese, which seem to be more regionally localized.
  • Coincident with a rise in blog posts about escalating Middle East tensions throughout the summer and fall, Farsi has moved into the top 10 languages of the blogosphere, indicating that blogging continues to play a critical role in debates about the important issues of our times.

As always, I'm very interested in your comments and feedback.

Getting everything ready for my Opening

It's been a nutty week, that's for sure. With all the cool stuff going on at Technorati - continued growth, new features (have you checked out our new search results yet?) and much more in the hopper - I've also had to put together everything for my photo opening tomorrow at the 3rd Street Grill in San Francisco, and I didn't get to start until way after closing tonight. With lots of thanks to Teresa, Tantek, and Mike, we got everything up and on the walls. Right before closing up, I took a quick video.

Come on by and join us for a fun party tonight (10/13/2006) from 5-9pm!...

Friday The 13th Party in SF: You're Invited!

Vernal Falls Rapids, Yosemite

Some of you may know that when I'm away from the computer, I am often following my other passion, photography. I've been invited to show and sell some of my recent photographs at the new Third Street Grill in San Francisco, and they're throwing an opening party on the 13th, from 5PM - 9PM. The Cafe is providing food and drink, which should be fun - they use all natural organic ingredients, very yummy. The folks from Technorati and a bunch of other South Park web companies will be there to start the weekend off right, with food and drink aplenty.

I've already heard back from lots of interesting people who will be coming, so I'm really excited to show off my art and have a great get-the-weekend-started party to boot.

Come join us! Leave a comment if you're planning on making it, and if you're too cool to comment, come on by and help us celebrate anyway...

When: Friday, October 13th, 2006 from 5PM-9PM
Where: The Third Street Grill, 695 3rd Street, San Francisco CA.
Phone: 415 538-0804
Upcoming.org Event

My Flickr page, where you can see some of my work (especially the Yosemite set, London set, Paris set, and Southwest set...)

Technorati Spam? Actually, we just messed up. Sorry!

Whenever someone comes to Technorati and signs up as a member (to get our watchlist services, or to claim a blog, etc) we have an opt-in checkbox to subscribe to the Technorati Newsletter. You can see the form here.

We have been doing this ever since Technorati started, and after about a year hiatus, the Newsletter is back! We're calling it "The Buzz Monitor", and of course it'll have an RSS feed as well for folks who don't want it via email but want it via RSS.

Since it's been a long time since we sent out a newsletter, we wanted to make sure that people who had forgotten about their Newsletter preferences had a chance to opt out again just in case they weren't interested any more in getting a newsletter from us.

So, we sent out that email this morning.

Now here's the whoops - we messed up when we sent out that email. We tried to do everything the right way - we used a third party provider that handles unsubscribes, handles full mailboxes, and makes sure to put in an "Unsubscribe" link that is personalized for each person getting the email. The idea is to make it really easy for people to stop getting the newsletter.

The only problem was that we messed up the link. The first batch of email had a link that sent people over to their Technorati Profile page, where they could un-check the "Subscribe me to the Technorati Newsletter" box on their profile page. That wasn't well thought out, and I apologize. You see, if you're not logged in to Technorati at the time, it will end up taking you to the Technorati signup page, and ask you to sign up as a new member, or sign in! So of course, if you had forgotten your Technorati username or password, you'd be screwed. Bad bad bad on us, and I am sorry for that.

We're fixing that right now, and I'm going to make sure that no new emails go out from us without all of these simple unsubscribe procedures being fixed. This is an issue that we take very seriously here at Technorati.

By the way, I want to thank Adam Kalsey for pointing out the problem, and for his kind suggestions. He's got a number of examples of companies that have done this both right and wrong, and I want to thank him for his help, criticism, and feedback.

UPDATE: This is now fixed, with the unsubscribe link now doing the right thing.

TiVo Feature Request: Play all programs in a group

So I finally got a VCR+DVD Recorder combo in order to put those "Dora the Explorer" episodes onto DVD for the kids to watch while on a plane, and after getting everything all connected up, I am placed in a baffling situation.

You see, TiVo has this very nice "Save to VCR" capability, where you can hit record on your VCR (or DVD recorder) and the program will get a nice title sequence and play out to the recording device. TiVo also has this nice program grouping capability, so I can see that I have 18 Dora The Explorer episodes saved on the box. What I'd really like is a way to have the TiVo play all 18 of those episodes back-to-back without having to waste my entire Sunday babysitting the machine to see if it has finished each 30 minute episode. I'd really like to actually get out of the house, you see.

Anyone have any ideas or solutions?

Weekend Edition Saturday

I'm on NPR's Weekend Edition Saturday, discussing the State of the Blogosphere report. On Thursday I went down to the studios at KQED here in San Francisco, and chatted with host Scott Simon for about 10 minutes.

As of this writing, NPR has put up a permalink to the interview, but hasn't put up the link to the audio itself - I believe that the audio will go up at 1PM EST. UPDATE: You can now listen to the audio, in Real or Windows Media formats.

I'm a HUGE NPR fan (it is basically the only live radio I listen to anymore), and it was very exciting and a bit surreal to put on the headphones in the sound-proof booth and talk with Scott. Being such an NPR fan, I was really nervous. I hope they edited the interview well, and that I was of service to you - helping to make the blogosphere a bit more understandable to folks who don't already know about it...

State of the Blogosphere, August 2006

Three months have passed since my last State of the Blogosphere report, so time for an update on the numbers. For those of you who just want the most interesting tidbits, I've tried something new this time around - I've put in boldface the most significant information. There's also a summary at the bottom of the post for those of you who just want the significant details.

50 Million Blogs and Counting.

On July 31, 2006, Technorati tracked its 50 millionth blog. The blogosphere that Technorati tracks continues to show significant growth. The chart below (click to get a full-sized version) has the details:

Slide0002-6

Technorati has been tracking the blogosphere, or world of weblogs, since November 2002, and I'm constantly amazed at the growth over the years. The blogosphere has been doubling in size every 6 months or so. It is over 100 times bigger than it was just 3 years ago.

Whenever I write about these statistics, I'm always asked by people, "Can it continue to grow this quickly?" Frankly, I can't possibly imagine it continuing to grow at this pace - after all, there are only so many human beings in the world! It has to slow down.

Rather than just postulate on this, we now have enough data to actually look at the real numbers - The rate at which the blogosphere has doubled over time, as shown in the chart below:

Slide0003-8

As this chart shows, back in November of 2003, the blogosphere had doubled in size in 40 days - probably because Technorati was new and was just picking up all of the blogs that were out there in the world. In January of 2004, the blogosphere was doubling at a rate of once every 120 days, which is about once every 4 months. By July of 2004, the blogosphere was doubling every 180 days, or about once every 6 months. Today, the blogosphere is doubling in size every 200 days, or about once every 6 and a half months. That means things have slowed somewhat - the time it takes to double has grown by about half a month since mid-2004.

What I found so interesting in these numbers is that the graph has stayed so flat in the range of 150-200 day doublings for so long. From January 2004 until July 2006, almost two and a half years later, the number of blogs that Technorati tracks has continued to double every 5-7 months.

Can this possibly continue? Will I be posting about the 100 Millionth blog tracked in February of 2007? I can't imagine that things will continue at this blistering pace - it has got to slow down. After all, that would mean that there will be more bloggers around in 7 months than there are bloggers around in total today. I shake my head as I am writing this - the only thing still niggling at my brain is that I'd have been perfectly confident making the same statement 7 months ago when we had tracked our 25 Millionth blog, and I've just proven myself wrong.
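If you'd like to play with the doubling arithmetic yourself, here's a rough Python sketch. It assumes steady exponential growth with a constant doubling time, which is exactly the assumption I'm doubting above, and the function names are made up for the example. The inputs are the figures from this post: 50 million blogs on July 31, 2006, doubling every 200 days.

```python
import math
from datetime import date, timedelta


def doubling_time_days(count_start, count_end, elapsed_days):
    """Doubling time implied by two counts, assuming steady exponential growth."""
    return elapsed_days * math.log(2) / math.log(count_end / count_start)


def projected_count(count_now, doubling_days, days_ahead):
    """Count after `days_ahead` days if the doubling time stays constant."""
    return count_now * 2 ** (days_ahead / doubling_days)


# 50 million blogs tracked on July 31, 2006, doubling every 200 days.
fifty_millionth = date(2006, 7, 31)
hundred_millionth = fifty_millionth + timedelta(days=200)

print(hundred_millionth)                       # 2007-02-16
print(projected_count(50_000_000, 200, 200))   # 100000000.0
```

That lands in mid-February 2007, which is where the "100 Millionth blog in February" question comes from.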

Let's look at the number of new blogs tracked each day, to get another look at the numbers:

Slide0004-9

As of July 2006, about 175,000 new weblogs were created each day, which means that on average, there are more than 2 blogs created each second of each day.
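The per-second figure is just the daily count divided by the number of seconds in a day. A quick Python check, with a helper name invented for the example:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day):
    """Average rate per second for a given daily count."""
    return per_day / SECONDS_PER_DAY


# About 175,000 new weblogs per day, from this post.
print(round(per_second(175_000), 2))  # 2.03
```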

Surely some of these new blogs in Technorati's index are Spam blogs or 'splogs'. The spikes in red on the chart above show the increased activity that occurs when spammers create massive numbers of fake blogs and try to get them into our indexes. This is going to be a fight that is going to continue as long as people find the web useful, and there's really no way to make sure that we catch every single spam blog before it goes into our indexes. We've been working extremely hard on understanding these spam patterns, and

  1. eliminating the spam from our indexes as quickly as possible, and
  2. making sure that these identified sources of spam (and spam creation patterns) never even make it into the index when they attempt to do so in the future.

What we have found, after lots of analysis and spam elimination, is that about 8% of the new blogs that get past our filters and make it into the index are splogs, even if it is only for a few hours or days. In other words, we're always going to pay a price to make the blogosphere as open a place as possible, and Technorati will always have some results that are spammy. We're going to have to continue to be extremely vigilant to make sure that new attacks are spotted and eliminated as quickly as possible. About 70% of the pings Technorati receives are from known spam sources, for example, but we're able to drop them before we even send out a spider to go and index the splog.
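To make the "drop the ping before spidering" idea concrete, here is a deliberately simplified Python sketch. It is not how Technorati's pipeline is actually built: the shape of a ping (a dict with `source` and `url` keys), the blocklist, and the example values are all hypothetical.

```python
def pings_to_spider(pings, known_spam_sources):
    """Yield only the pings that are worth sending a spider out for.

    Pings from a known spam source are dropped here, before any
    (comparatively expensive) crawl of the blog takes place.
    """
    for ping in pings:
        if ping["source"] in known_spam_sources:
            continue
        yield ping


# Invented example: 10 pings, 7 of them from blocklisted sources.
known_spam_sources = {"spam-1.example", "spam-2.example"}
pings = (
    [{"source": "spam-1.example", "url": "http://splog.example/"}] * 4
    + [{"source": "spam-2.example", "url": "http://splog2.example/"}] * 3
    + [{"source": "blog.example", "url": "http://blog.example/"}] * 3
)

kept = list(pings_to_spider(pings, known_spam_sources))
dropped_share = 1 - len(kept) / len(pings)
print(len(kept), round(dropped_share, 2))
```

The cheap check comes first on purpose: a set lookup costs almost nothing, while every spider run that is avoided saves a fetch and an indexing pass.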

Of course, we're also going to make some mistakes - so if you think your blog is possibly misclassified, go and have a look at your blog profile (here's mine, for example) - simply type in your blog homepage URL to see what Technorati thinks it knows about your blog. If you don't see your newest posts showing up, make sure that you've claimed your blog. If all else fails, please let us know about it, and we'll try to fix it for you. If you have multiple URLs for your blog (e.g. Typepad users often have multiple URLs for their blogs, as do some other services), please try the alternative URLs as well before dropping us a support ticket.

OK, back to the fun. Here's a look at the daily posting volume in data that Technorati tracks:

Slide0005-11

First off, the total posting volume of the blogosphere continues to rise, showing about 1.6 Million postings per day, or about 18.6 posts per second. This is about double the volume of about a year ago. Along with the aggregate posting volume information, we've put in some annotations of the events that occurred at the time of the spikes, showing that the blogosphere continues to react strongly to various world events. It is important to note that it is the relative increase in posting volume rather than the absolute increase that is most relevant here. In other words, because more people are blogging now, the total number of posts on a particular day doesn't tell the whole tale of the impact of an event - for example, The National Spelling Bee was not as large an event in the blogosphere as Hurricane Katrina. What is important to note in these charts is the relative size of the spike in relation to the posting volume at that time.
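Here is the relative-versus-absolute point as a tiny Python sketch. The function name and all of the volumes are invented for illustration; they are not read off the chart.

```python
def relative_spike(day_volume, baseline_volume):
    """Size of a spike as a fraction of the normal volume at that time."""
    return (day_volume - baseline_volume) / baseline_volume


# Two made-up events, each adding 200,000 posts over that day's baseline.
earlier_event = relative_spike(1_000_000, 800_000)    # smaller blogosphere
later_event = relative_spike(1_800_000, 1_600_000)    # blogosphere twice the size

print(earlier_event, later_event)  # 0.25 0.125
```

Both days gained the same number of posts, but the earlier event moved the blogosphere twice as much relative to its size at the time, which is the comparison that matters when reading the spikes.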

Another interesting item to note is the level of influence that blogs are having, especially compared with the mainstream media (MSM). This chart is somewhat biased towards western sources of the MSM, and if you see a source that is missing from this (or the next) chart, please let me know.

What is interesting is that some of the most influential weblogs are being treated in much the same way as traditional MSM, as measured by the number of bloggers who are linking to them, as shown in the chart below:

Slide0006-7

The blogs are in red, MSM in blue. What becomes more interesting to me, however, is that as you continue down the long tail of media sites, the number of blogs starts to grow - to 11 of the top 90 sites, or 12.2% of the total, especially given the budget differentials, as shown below:

Slide0007-3

Next, let's look at the language distribution of the blogosphere. One of the most interesting statistics that has changed since the last State of the Blogosphere is that English has retaken the lead as the #1 language of the blogosphere. However, it's not by much - the Japanese blogosphere has grown substantially as well.

In April, English edged out Japanese with 34% of all postings to 33% of all postings, with Chinese taking the #3 spot with 14% of all postings.

Slide0009-3

In May, English extended its lead to 41% of all postings in the blogosphere, to 31% in Japanese and 10% in Chinese.

Slide0010-2

In June, Chinese caught up somewhat, with 39% of all postings tracked by Technorati in English, 31% in Japanese, and 12% in Chinese. It is important to note, as in the report in April, that there are some significant underreporting issues, especially in Korean and in French, as described in that report.

Slide0011-2

Finally, I thought it would be interesting to look at what times of day show significant posting volume by language. The chart below shows this information using Pacific time (Technorati is located in San Francisco, so we're biased towards that time zone) as our base:

Slide0008-5

It is interesting to note that the most prevalent times for English-language posting are between the hours of 10AM and 2PM Pacific time, with an additional spike at around 5PM Pacific time. Japan, which is 17 hours ahead of San Francisco, shows a different pattern - more posting occurring during the evening hours into the night, as well as the early morning hours before work begins. I'm not entirely sure what to make of these numbers, but it would appear that English-speaking people are more likely to blog during work hours and early evening in the USA, while people in Japan are more reluctant to blog during work time. More research is definitely needed to understand when and where people are blogging. Perhaps a more experienced cultural anthropologist or sociology researcher can provide better insight here. If you're interested, drop me a line at dsifry AT technorati DOT com.

In summary:

  • Technorati is now tracking over 50 Million Blogs.
  • The Blogosphere is over 100 times bigger than it was just 3 years ago.
  • Today, the blogosphere is doubling in size every 200 days, or about once every 6 and a half months.
  • From January 2004 until July 2006, the number of blogs that Technorati tracks has continued to double every 5-7 months.
  • About 175,000 new weblogs were created each day, which means that on average, there are more than 2 blogs created each second of each day.
  • About 8% of the new blogs that get past Technorati's filters are splogs, even if it is only for a few hours or days.
  • About 70% of the pings Technorati receives are from known spam sources, but we drop them before we have to send out a spider to go and index the splog.
  • Total posting volume of the blogosphere continues to rise, showing about 1.6 Million postings per day, or about 18.6 posts per second.
  • This is about double the volume of about a year ago.
  • The most prevalent times for English-language posting are between the hours of 10AM and 2PM Pacific time, with an additional spike at around 5PM Pacific time.

As always, I'm very interested in your comments and feedback.
