Category: Work

  • When good drive arrays go bad.

    There I am, settling in for a quiet Saturday of housekeeping, websurfing and bookreading. And the phone rings.

    The main Enco server is down. This is bad news ordinarily, but it becomes exceptionally bad when you remember that the standby server went down a week ago and you haven’t yet received the replacement hard drive you need. Uh oh.

    So I hop in the shower, hurry to the office, and discover that one of the drives in the external chassis has gone south. Oh no. I grab the spare (yes, we do keep a spare for the main server, just not the standby), put it into an enclosure and swap it into place, all the while expecting a long overnight as I babysit the restoration of files to the new Netware volume I’m doomed to have to create.

    And the new drive exhibits exactly the same problem as the old. Aw, hell…

    (changing verb tenses, just a moment please.)

    It took Gary and me about three hours to get everything running again. How could we possibly have rebuilt a RAID 0 array and restored the data in such a short time? Piece of cake. Turns out the drive itself didn’t die, just the receive bay in the hot-swap drive chassis.

    And the boxed spare also turned out to be flakey. We tried every combination of enclosure, receive bay and LVD add-on board we had… except one. In a flash of desperate inspiration I decided to look up on one of the shelves in the engineering shop. Under a pair of old hard drives and other assorted detritus I found one more receive bay. We attached an LVD add-on board and set the SCSI drive ID to match the old bay so the RAID controller would hopefully recognize the original drive and spare us the need to create a new array. Lo and behold, it worked!

    Yay, we got our array back. The main Enco server is once again alive and kicking. We made a list of spare parts we need to order, since it’s just a matter of time before that slot fails again. (Turns out that we’ve lost two receive bay units in the same chassis position since putting the Enco system into service. This does not instill confidence.) I then turned my attention to the standby server for which we’d received the replacement drive yesterday, naturally on the day I couldn’t make it to the office.

    There’s a standard principle followed by almost every RAID-controller manufacturer in the business: All drives in an array will be treated as if they were the same size as the smallest drive in the array. It’s difficult to replace a single dead drive with an exact duplicate, especially two years down the road, so RAID controllers (usually) allow you to use a replacement drive slightly larger than the original. Yet, for some asinine reason, the folks at 3Ware decided that all drives on one of their IDE RAID controllers must always be exactly the same size to be included in a single array.

    Of course, the replacement drive we purchased, while from the same manufacturer (IBM) and of the same basic type (IDE, 7200 RPM), was just a wee bit larger than the others, and therefore different enough that the 3Ware controller refused to include it in the new array. And so, we cannot bring The Beast back online until we either find another DTLA-307075 or buy six or seven identical replacement drives for the new array.
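
    For the curious, the sizing rule above can be sketched in a few lines of Python. This is only an illustration of the two policies as described here, not anything taken from a real controller's firmware. The function names, the exact_match flag and the drive sizes in the usage note are all made up for the example.

```python
def raid0_capacity_gb(drive_sizes_gb):
    """Usable size of a striped array: every member is treated as if it
    were the size of the smallest member, and RAID 0 sums the members."""
    return min(drive_sizes_gb) * len(drive_sizes_gb)


def accepts_replacement(existing_sizes_gb, replacement_gb, exact_match=False):
    """Decide whether a replacement drive may join the array.

    exact_match=False models the common rule (same size or larger is fine).
    exact_match=True models the strict rule described above (hypothetical
    flag; real controllers expose no such setting under this name).
    """
    smallest = min(existing_sizes_gb)
    if exact_match:
        return replacement_gb == smallest
    return replacement_gb >= smallest
```

    With made-up sizes, accepts_replacement([75, 75, 75], 80) is accepted under the common rule and rejected with exact_match=True, and raid0_capacity_gb([75, 75, 80, 75]) comes out as 300 because the 80 only counts as 75.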

    I suppose you can’t win ’em all.

  • Ready for the weekend? You bet your sweet bippy.

    So why did my Friday go directly to hell at noon?

    1) The Entercom WAN went completely down at about ten minutes past noon, Pacific time. It didn’t come back until nearly 5:00, at which time we finally learned that a WorldCom router had failed and caused the whole mess. For five crucial hours on a Friday afternoon we couldn’t get commercials downloaded for air, let alone emails in and out of the building. Imagine the joy.

    (Geeky side note: My proxy server scheme doesn’t work without a local DNS server, since everyone is told via DHCP to get DNS lookups from the Corporate office… an impossibility when the frame cloud is down. So I spent a couple of hours installing a dinky little caching nameserver. Next time I’ll be ready… if I can figure out where in Netware 5’s DHCP system you configure the nameserver info.)

    2) The Beast, our standby Enco fileserver, died. Or, rather, one of its hard drives died. This happened a few days ago, but I only routinely check up on the box about once a week. We have to order a replacement IDE drive in the 75 gigabyte range, then strap it in, power up the machine and configure a brand new RAID 0 array. Then I get to spend the following 48 hours copying data from the main Enco server. Yippee, ha ha, whee. Right now The Beast is on my workbench, missing a hard drive. So much for getting anything else done, like prepping the new PD Streaming box for Kansas City… or rolling out another Compaq (I’ve only done five so far!) or… well, much of anything.

    (Side note: I did get to spend about an hour playing “spot delivery boy,” since I had one of the few computers with a working Internet connection. Proud to serve, I am. And it was kind of fun. Call me weird.)

    And now I think I’ll go home and vegetate. And eat. And play games. Then I’ll come back on Sunday and get the work done I was supposed to do today…

    (Another side note, for the hell of it: So, is that enough posting for one day? Does it make up for the “lack thereof” during the rest of the week? I sure as hell hope so. Really now.)

    There, I’m done. See you tomorrow. Or Sunday.

    (One last side note: I added two new links to the blogroll. Moody In The Rain is Celina’s journal, and I put Hey! in as well. I’m all about Oregon thingie-ers. What, you want me to call ’em “bloggers?” Anyway… Visit and enjoy.)

  • The feeling of a job well done

    Yesterday afternoon I shipped out the last two streaming servers for the PD Streaming project. The project itself won’t be completed until all of the markets are online, but the biggest part of my involvement is now over.

    Hallelujah. I can now commence rolling out the Compaq WinXP machines just as fast as I can.

    And yes, I got to shake hands with the Big Boss Man a few minutes ago. He thanked me for all of my hard work on the aforementioned project, then insisted he was pressed for time and couldn’t chat. *shrug* I get that a lot…

  • Thank God It’s Tuesday

    Today I’m getting things done. I completed the next-to-last of the PD Streaming machines. I did some administrivia and maintenance. I even have plans to do some more… things.

    Yesterday, not so much. It was one of those crazy, never a moment’s rest kind of days. Only small fires, but there were an awful lot of them.

    And that was just at work.

    The hard drive in what we call The Big Computer at home went click-of-death Sunday night, so I went home yesterday to face the task of completely rebuilding the box. No stress, it’s only the sole Internet access computer in the house. This time I went with Windows 2000 Pro instead of the venerable (and finicky) Windows 98 SE. In the process I discovered that the DVD-ROM drive and the (crappy Acer) CD-RW drive refuse to peacefully coexist. So we’ve gone from two hard drives and two CD drives in that computer to one of each. Bleah.

    Oh yeah, and half an hour into the process my back went out. It went out in a big, bad, hurtful way. It’s now 18 hours later and my shoulder’s still twinging. Ouch.

    So I’m glad it’s Tuesday and not Monday The Thirteenth anymore. The only good to come out of yesterday was dinner at Chang’s with Wendi. Yum.

    On a completely unrelated note, I’ve picked out my camera. It’s only a month away…

  • Everything’s Coming Up Rosey

    We would like to announce that the ROSEY105.COM domain was renewed this evening. Regular service at that domain will resume within the usual standard time period.

    That will be all.

    And please, let’s not get into how the domain was allowed to lapse in the first place.

  • Bizzee Bizzee Bizzee

    Where to begin? Where to end?

    • I spent the entire back half of yesterday’s workday adding personal information to the user objects in our NDS trees. For instance, Corporate has wanted phone numbers to show up in the Groupwise address book since, oh, two years ago. I made that happen yesterday. Yes, I’m a loser.
    • Today has largely been spent completely rebuilding The Lab, the website that started out as a grand vision and is now something of a sad testament to the fact that I really didn’t know what I was doing three years ago. (One could argue that I still don’t, I suppose.) And yes, I did strip down and recycle the more useful bits of the current greyduck.net code. You can call it laziness, I call it expediency.
    • The first two of the new Compaq computers (you know, the WinXP beasties we received a month ago?) have been placed. Every last technical hurdle has been cleared, including the Norton AntiVirus upgrade and the wonderful ZENworks Dynamic Local User trick. I can now place these machines with complete impunity. But that project is on hold until…
    • …I complete the last four PD Streaming servers and ship them to Norfolk, Rochester and Wichita. I have until the end of next week to do so, and the last four will be the most time-consuming as they’re former LMiV machines that are running RedHat 6.1 and need a complete OS reinstallation. Fun, wot?


    I’m not neglecting my updating duties. I’m just placing a higher priority on keeping my job. Call me strange…

    Update: Here’s the funny part. This journal entry is number 404. Oddly enough, I spent over an hour arguing with IIS and PHP about how to handle 404 errors. The trick is to make sure that the PHP ISAPI handler is set to check for the existence of the document. Otherwise a missing document is presumed to exist (!) and the site visitor sees a bunch of PHP error gibberish. I won’t even go into the trouble I had upgrading to the latest PHP version. Feh.
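
    The 404 behaviour is easier to see in a toy model. What follows is a minimal Python sketch of the idea, not IIS’s or PHP’s actual API. The function name, the check_file_exists flag and the injectable exists parameter are all assumptions made up for the illustration.

```python
import os


def route_request(file_path, check_file_exists=True, exists=os.path.isfile):
    """Toy model of a web server deciding what to do with a script request.

    With the existence check on, a missing document gets the server's own
    404 response. With it off, the request is handed to the script handler
    regardless, and the handler is left to fail on a file that isn't there.
    """
    if check_file_exists and not exists(file_path):
        return "404"
    return "handler"
```

    Calling route_request on a missing file returns "404" with the check enabled and "handler" with it disabled, which is the case where the visitor ends up staring at the handler’s error output instead of a proper not-found page.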