Still Life with Drobo
About 6 months ago I bought a Firewire Drobo with wide eyes and big expectations. I’ve had a growing video and audio library that was getting tedious to keep all on one drive. The idea of having a machine that automatically expands by popping in a new drive was awfully sexy.
In the past I’ve messed around with various flavors of RAID and even went so far as to build a Linux RAID file server. And while that wasn’t an awful experience, I wanted to spend less time administering a storage solution and instead just have one work.
Thus begins our tale of woe.
For the first few weeks, all was happy. I slapped in a 1TB drive along with a few 300GB or so drives and loaded it up with data. I was immediately concerned about how slow it was. I had read reports of people happily getting 30-40 MB/sec read speeds and I was seeing nothing of the sort. I was languishing in the 8 to 12 range.
Now I run a hackintosh rig, and there have been some issues with FW speeds on them in certain configurations (EDIT: This issue is now fixed on my rig, but it wasn’t when I was first setting up the drobo), but speeds seemed okay on my MacBook Pro. So I just moved the Drobo over to the USB bus and let it live there, where it seemed to stay pretty solid at 12MB/sec reads and writes. That was fast enough to serve up my iTunes library and back up my home folder, and I just left it at that.
Until 2 weeks ago when the drobo started blinking its lights yellow and green. For some reason it was doing a re-layout. Okay. So that completed. Then it just started randomly dropping off of the USB bus when copying files. Awesome.
So I finally sent in an e-mail to drobo tech support. They had me send in a diagnostic file and some information and told me they would get back to me.
In the meantime, the drobo was in an unhappy place, randomly doing rebuilds. I decided to just pull everything off of it and reformat it. Pulling off the data took 3 days between slow speeds and random bus drops.
After a few days I called up Data Robotics to check on the ticket. Apparently it was with tier 3 support and they hadn’t got to it yet. The guy on the phone promised to send them an IM that I had enquired. Okay.
So I reformat the thing, and start putting data back on a little at a time. I notice excellent speeds from the firewire bus, i.e. 28MB/sec writes. Coolness! Except after about 4 hours or so, it was right back to lousy speeds. I listened to the unit and noticed that even with no access from the computer, the hard drives were doing something. But the drobo gave no indication that anything was happening. All green lights and happy times. I had noticed this before, but just wrote it off as wonkiness. Turns out this was the cause of my previous slow speeds.
So I call Data Robotics again after 6 days of no status reports. Again I get the “tier 3 has it, and hasn’t looked at it and all I can do is send them an IM” song and dance. I ask to talk with a manager. Apparently all the managers have gone home (it was 6pm their time) so they promise me a call back on Friday morning.
Well, lo and behold, the very next day tier 3 has a chance to look at my ticket. Go figure. So I get a new unreleased firmware to prevent drops off of the USB bus, and a recommendation to remove a 320GB drive that is in the unit, as apparently it’s been disappearing to the drobo and coming back randomly.
From talking with the support rep, apparently the drobo won’t mark a drive as bad unless it meets some specific conditions, yet it will continuously re-layout the drobo when a drive disappears, reappears, disappears, etc. He said a failing drive controller can cause the issue. I of course have no way of knowing whether or not this is accurate.
I pulled the drive and wrote zeros across the whole thing with no issues, but maybe there is some weirdness between the drobo and the drive. For 320GB, I don’t care.
Without the drive, the drobo is no longer constantly clicking away, and speeds are pretty great with 3 1TB drives. 28MB/sec writes and 40MB/sec reads (on some files, around 30MB/sec on others). It did hard freeze one time while moving all of my data back to it, but it hasn’t since so I’m keeping my fingers crossed. I dumped another diagnostic file and let DR know.
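For anyone who wants to sanity-check their own numbers, here is a minimal sketch of how a read speed like the ones above could be measured. This is not how I got my figures; the function name and chunk size are made up for illustration, and the OS file cache can inflate the result if the file was read recently, so use a large file that hasn’t been touched since mounting.

```python
import time


def read_throughput_mb_per_sec(path, chunk_size=1024 * 1024):
    """Read the file at `path` start to finish; return (bytes_read, MB/sec).

    Hypothetical helper for illustration. MB here means 1,000,000 bytes.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    megabytes = total / 1_000_000
    # Guard against a zero-length timing on very small files.
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Point it at a big video file on the drobo volume and compare the second value against the 30-40 MB/sec range people report.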
Some thoughts and reflections on Drobo troubles:
The drobo diagnostic file is either encrypted or an unreadable binary, which means the end user has no way to troubleshoot problems with the drobo themselves without involving DR. This sucks.
The drobo can have a problem with a drive that massively affects its performance without reporting it in any way, either by lights or by the dashboard. DR could easily set a threshold for “edge events” that mark a drive as questionable to the user without calling it failed.
The drobo will do re-layouts (or some kind of drive maintenance magic) without blinking any lights or giving the user any indication of what it’s doing. Generally speaking, this wouldn’t be a big deal if it didn’t hurt performance, but if the drobo is moving stuff around to protect the data, I probably want to know that it’s happening and why.
Drobo tier 3 support is really slow. While I had no data on the drobo that I couldn’t lose, I could imagine being in big freakout mode if my data was at risk and my vendor didn’t even start working on my support ticket for a week. This might not have been a huge deal if I was given any indication up front of how long it would take, or the phone reps offered anything other than a shrug of the shoulders when I called.
Any time you have your data tied up in a black box, you’d better have a lot of trust in that box and the people who control its secrets. When you buy a drobo, you’re making a choice to hand over a lot of control of how your data is handled in exchange for a stress-free experience.
A quick search for drobo + yourexpletivehere in Google will yield quite a few sundry tales of mysterious drive failures and spotty support response. And while this is hardly a representative sample, it makes it hard to trust the magic black box or the druids who control its secrets.
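To make the “edge events” threshold idea from above concrete, here is a rough sketch of what I have in mind. I have no idea how the drobo firmware actually tracks drives, so everything here (the class name, the slot numbers, the limit of 3 dropouts per hour) is an assumption for illustration only: count how often a drive disappears within a sliding time window, and flag it as questionable, not failed, once it crosses the limit.

```python
from collections import deque


class DriveHealthMonitor:
    """Hypothetical tracker that flags drives which keep dropping out."""

    def __init__(self, max_events=3, window_seconds=3600.0):
        self.max_events = max_events          # dropouts tolerated per window
        self.window_seconds = window_seconds  # sliding window length
        self._events = {}                     # slot -> deque of timestamps

    def record_dropout(self, slot, timestamp):
        """Note that the drive in `slot` disappeared at `timestamp` (seconds)."""
        events = self._events.setdefault(slot, deque())
        events.append(timestamp)
        self._prune(events, timestamp)

    def status(self, slot, now):
        """Return "questionable" if the slot hit the limit recently, else "ok"."""
        events = self._events.get(slot)
        if not events:
            return "ok"
        self._prune(events, now)
        return "questionable" if len(events) >= self.max_events else "ok"

    def _prune(self, events, now):
        # Forget dropouts that are older than the sliding window.
        cutoff = now - self.window_seconds
        while events and events[0] < cutoff:
            events.popleft()
```

A drive that vanishes three times in an hour would get a yellow light and a note in the dashboard, while a single blip would age out of the window and be forgotten.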
I would be hard-pressed to recommend the Drobo to anyone at this point, but I could probably be convinced given some significant changes at Data Robotics:
Human-readable diagnostic file, or a utility for reading the file that shows the common issues the Drobo has to correct: drive failures, read errors, write errors, etc. This doesn’t have to come in the box if we’re wanting to sell the mystique of a “just works” solution, but it should be available for those who run into problems.
Better feedback. The drobo, or at least the dashboard should indicate when the drobo is doing any non-user-initiated work on the drives. This can be a hidden preference in the dashboard.
Responsive support. Responsive support is not acknowledgement of your ticket and asking for a diagnostic file. Drobo could easily have saved us time and effort by asking all the up-front questions, like “when did you purchase the unit, what drive models and sizes are in the unit,” on the web support form and including a spot to upload your diagnostic file. It could then set an expectation, based on workload, of when your case will be looked at. Everyone will have different views on what is an acceptable timeframe for a solution, but I would expect that someone has looked at my diagnostic file within 2 days and we have a resolution within 5.
EDIT 2009-07-01: Drobo called me this morning and has decided to replace the unit. I’ve been okay with it since removing the bad drive, aside from the one bus hang I saw. But apparently the drobo is negotiating small packet sizes with my machine, and they want to replace the drobo, power supply and firewire cable just to be sure. I’ll update if I see any difference in speeds.
