Friday, 12 June 2015

Exchange 2003, where did my database go?

It's a classic story: an old SBS 2003 server is running low on space. It's 2013, so they need to buy a new server anyway. Then the usual: no IT budget means no server. Just stick an extra hard drive in and job's a good'un.

Well, at least it's better than buying a NAS and using it as a second server :)

So, hard drives are in. RAID-5 has been extended and we have an extra 300GB of space. We create a new partition for Exchange; now it's just a matter of moving the database and we're good to go. Piece of cake.

Exchange 2003 makes it fairly painless to move the database: you just edit the location, point it where you want the database to sit, and Windows does the rest. It takes a while, so just leave it running and go make a coffee.

Unfortunately, the engineer who was performing the move overlooked the fact that the terminal server session times out after 30 minutes. Easy thing to miss, but it would have dire consequences.

I'm still not quite sure how it happened. The engineer told me he kicked off the move, left it running for a while, came back to it and the session had been disconnected. He logged back in, move had failed. Then I think he tried it again and it failed a second time.

After that, nothing. Exchange store wouldn't mount, engineer asked me to have a look the next morning. Checked the event logs, found the following error:

Information Store (15376) First Storage Group: An attempt to write to the file "E:\Program Files\Exchsrvr\rmdbdata\pub1.edb" at offset 0 (0x0000000000000000) for 4096 (0x00001000) bytes failed after 0 seconds with system error 21 (0x00000015): "The device is not ready. ". The write operation will fail with error -1022 (0xfffffc02). If this error persists then the file may be damaged and may need to be restored from a previous backup.

That's not good. Looking at the server, there were now two databases! The original one and another copy on the new partition. The database in the new partition didn't work, but worse still, the original refused to mount as well!

Okay well, the original database should still be intact right? Surely it will have copied the database to the new location and then removed the original once the copy has completed? I honestly don't know, but it went wrong somehow.

I ran eseutil to try and repair the database. I ran a soft recovery, which does a sweep of the database and replays the transaction logs to bring it into a clean state. This failed. I can't remember the exact error, but the gist was that the database was corrupt. I think I even called Microsoft and opened an emergency case. They confirmed my fears, and basically said "yeah it's fucked, and don't bother doing a hard repair, it won't be worth the trouble". So the only course of action was to restore from a backup.
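For anyone in the same spot: before attempting any recovery, it's worth checking what state the database thinks it's in. eseutil /mh dumps the database header, and one of its lines reports the shutdown state. Below is a rough sketch of pulling that line out with Python. Python wasn't part of the original incident, the function names are made up for illustration, and I'm assuming eseutil.exe is on the PATH, the store is dismounted, and the header contains a line of the form "State: ...".

```python
import re
import subprocess


def parse_db_state(header_dump):
    """Return the value of the 'State:' line from a header dump, or None.

    Assumes the dump contains a line like 'State: Dirty Shutdown'.
    """
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, re.MULTILINE)
    return match.group(1) if match else None


def database_state(edb_path):
    """Run 'eseutil /mh <edb_path>' and return the reported state.

    Hypothetical helper: assumes eseutil.exe is on the PATH and the
    store is dismounted so the file can be read.
    """
    result = subprocess.run(
        ["eseutil", "/mh", edb_path],
        capture_output=True,
        text=True,
    )
    return parse_db_state(result.stdout)
```

If the state comes back as anything other than a clean shutdown, the transaction logs still need replaying before the store will mount, which is exactly where things fell apart here.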

Okay, don't panic. That's why we have backups. Company had Symantec System Recovery, nice full image of the server we can restore from. Time for Symantec to shine.

So open up the software. Last backup... 8 days ago. Waste of time, can't afford to lose a week's worth of emails. So I created a blank database on the new partition, and started backing up everyone's Outlook (fortunately they were all cached). Then went through the fun of importing all the emails back into the server. I used Stellar Exchange Recovery and managed to recover some of the emails that weren't cached (couple of people on leave).

Fortunately it was a small company, only around a dozen users, so by the end of the day we were pretty much there. The customer was understanding and appreciated that we managed to recover almost all their email.

Moral of the story? Run a backup before you move the Exchange database! Also, initiate the move from the console of the server, not from Terminal Services.
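If you want to turn that moral into something you can't forget, here's a minimal pre-flight sketch in Python. It isn't what we ran at the time; the function name, the one-day backup threshold and the 20% free-space headroom are all assumptions picked for illustration. It simply refuses to give the go-ahead if the last backup is too old or the destination doesn't have room for the database files.

```python
import os
import shutil
from datetime import datetime, timedelta


def preflight_move(db_files, dest_dir, last_backup, now=None,
                   max_backup_age=timedelta(days=1), headroom=1.2):
    """Return a list of reasons NOT to start the move (empty means go ahead).

    All thresholds are illustrative assumptions, not Exchange requirements.
    """
    now = now or datetime.now()
    problems = []

    # A backup older than the threshold is treated as no backup at all.
    age = now - last_backup
    if age > max_backup_age:
        problems.append("last backup is %d day(s) old" % age.days)

    # Require room for every database file plus some headroom.
    needed = sum(os.path.getsize(f) for f in db_files) * headroom
    free = shutil.disk_usage(dest_dir).free
    if free < needed:
        problems.append("not enough free space on destination")

    return problems
```

With the 8-day-old backup from this story, the check would have flagged a problem before anyone touched the database. It can't stop a terminal server session timing out, though, so the advice about running the move from the console still stands.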

Many more Exchange stories to come :D

 
