Friday, May 13, 2011

Personal Lessons Learned from The Blogger Outage

For nearly 24 hours, Blogger was in read-only mode. This period, combined with the announced lockout for the maintenance that led to the data corruption problem, has been frustrating for me as well as for numerous other Blogger users. In this post, I look at some lessons learned and observations made as a result of this experience.


It's Easy To Take Things for Granted

I've been blogging via Blogger since late 2007. Although this lockout from Blogger for nearly a full day was frustrating, it reminded me of how well Blogger typically runs. I had become accustomed to few and very short outages before this and really took Blogger's availability for granted. We all would like the products we deliver to customers to have zero downtime and 100% availability, but this was a reminder to me that many customers (including myself) can tolerate a one-time glitch if there is a proven track record of high availability. Of course, it's still best to avoid these outages as much as possible and to keep them as short as possible when they do occur. However, we have a better chance of keeping customers loyal through a longer-than-anticipated outage if they have enjoyed high availability most of the time.
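
To put "high availability" in rough numbers, here is a minimal Python sketch of the arithmetic. It rests on assumptions that are mine and not Blogger's: it treats the read-only period as a full 24 hours of downtime, it measures over a single 365-day year, and the function names are purely illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year as the measuring window


def availability(downtime_hours: float, period_hours: float) -> float:
    """Return the fraction of the period during which the service was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 1.0 - downtime_hours / period_hours


def allowed_downtime_hours(target: float, period_hours: float) -> float:
    """Return how many hours of downtime a target availability permits."""
    if not 0 <= target <= 1:
        raise ValueError("target must be a fraction between 0 and 1")
    return period_hours * (1.0 - target)


if __name__ == "__main__":
    # Assumption: the whole read-only day is counted as downtime.
    print(f"One 24-hour outage in a year: {availability(24, HOURS_PER_YEAR):.4%}")
    # "Three nines" (99.9%) would leave under nine hours of downtime per year.
    print(f"Budget at 99.9%: {allowed_downtime_hours(0.999, HOURS_PER_YEAR):.2f} hours")
```

Under these assumptions, a single day-long outage still leaves a service above 99.7% availability for the year, which is part of why an otherwise reliable service can survive one bad day.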


It's Difficult to Complain about Free Service

I don't make a dime off of this blog and I don't have any advertisements. I enjoy writing the blog and gain non-monetary benefits from writing it. One of the luxuries of this is that it is easy to tell any unsatisfied reader that he or she at least got his or her money's worth. The Blogger staff has done nothing like this to my knowledge, but the hard truth is that many of us typically get high availability from Blogger without paying a dime for that service. "Given Blogger's Service Issue today - Would you Pay for Service?" is a Blogger forum thread that includes discussion about whether users would be willing to pay for Blogger in an effort to get better support (including telephone or direct support beyond the general forum support).

I liked whoffkne's forum comment: "Yeah, I mean, seriously, what kind of service do we pay for...oh...yeah, right. =)" That summarizes it pretty well.


It's My Readers I'm Worried About!

In the Blogger "Something is Broken" forum thread "Blogger Service Issues," ronniesaunders states, "I have a lot readers who depend on my blog as their primary source of news and information" and implies that this came at a particularly frustrating time: "A military compound in Pakistan was just bombed and 80 people have been injured and my blogger service is down."

My thoughts were along a similar line as ronniesaunders's, but to a lesser degree and about less important news. I wasn't so much worried about letting down my two readers, but I did want to publish mention of the just-announced JavaOne 2011 session submission tips while the news was still hot. At least with the read-only mode, I slept well knowing that my loyal readers could still read about interesting posts as of May 10. I had not realized how much I enjoyed being able to write a quick post about a new event as I learned about it and how frustrating having to wait another day to do so could be.


Maybe It's Time to Start Being Superstitious?

In the previously mentioned Blogger forum thread "Blogger Service Issues," themusesguild brings up something I had not previously considered: "Hey Guys,Three Simple Words...Friday The 13th."


Any News is Good News

One of the most difficult things about this issue was not knowing when service might be restored, how complex the underlying problem was, or what would become of posts and comments published in the vulnerable period between the onset of the problems and the application of the read-only setting. This uncomfortable uncertainty about when service would be restored and what, if anything, bloggers should start doing is expressed in the Blogger forum thread "Recent Blogger service issues." JMan360's comment likely articulates what others were thinking: "They really need to tell us more. I don't know whether to start redoing everything or not. I'll be irritated either way. Why don't they tell us more details?"

In another "Blogger: Something is Broken" thread, poingly asked, "Why do you refuse to give an ETA for restoration of Blogger service?" The discussion that followed covered the challenges of coming up with such estimates and the distractions that producing them can cause. On the one hand, users want an estimate. On the other hand, users don't necessarily want the ultimate solution put off even longer in order to provide the estimate, and users especially don't want to be given repeatedly changing estimates. The lesson that might be learned here is that a best-guess estimate might be provided with a whole bunch of associated caveats.


It Could Happen to Me!

Many of the bloggers who use Blogger are not developers and will never be in a situation where they are responsible for creating, delivering, or maintaining a software product used by numerous users. They don't have to be concerned with issues like availability, redundancy, and scalability. For us developers, however, this is a reminder that we do need to care about such issues. With heavyweights such as Amazon and Google experiencing downtime recently, it's a reminder that it can happen to any of us. We have to do our best to avoid it and need to be prepared to deal with the unexpected when even our best preventative efforts fail.


Backup!

No matter who is storing your data "on the other end," it is important to keep your own backup of that data as well. This event was a reminder to me that I had not exported my blog via Blogger's export mechanism for some time. I'll be doing that this coming weekend.


Conclusion

Personally, I'm not ready to abandon Blogger over this single issue. It was frustrating at times, but I also was reminded of how good the service has normally been and appreciate it more now. I definitely hope not to see something similar for some time and believe that nothing has been lost.

3 comments:

DTH Rocket said...

I STILL can't log into my dashboard and it's driving me CRAZY!!!

@DustinMarx said...

DTH Rocket,

I'm sorry to hear about that. I know it was frustrating for me to be without the ability to post for a day, but it's got to be far worse to be one of the few still unable to post. I can see something is wrong with your account because when I click on your profile link, I see a "We're sorry, but we were unable to complete your request." error.

Hopefully it will be fixed soon.

Dustin

@DustinMarx said...

Although not as large or widespread as the April 2011 outage, it appears that at least the US-EAST-1 portion of the Amazon EC2 cloud suffered another outage early this morning. The good news is that it was smaller, more confined, and more quickly resolved than the April incident.

Dustin