Wednesday, 29 May 2013

Apologies and Explanations

[Keywords: Plesk, migration, Mailman, Horde, Spamassassin, Rackforce, frustration]

Last week my hosting provider moved my Web sites and e-mail to a new server:

We will be migrating your domain data to a brand new server that will be setup prior to the migration date. This server will contain the most up to date packages and patches for your operating system as well as the newest version of the Plesk control panel. This migration will be done live and should not require any intervention on your part.

Unsurprisingly, it did require intervention on my part: about three full days of work.

Before I could fix the problems introduced by the botched server migration, some readers of my weekly e-mail newsletter got confusing and spurious automatically generated e-mail messages from an out-of-control robot acting in my name, and one or two messages from people trying to unsubscribe from what they thought was a new and unknown list were improperly forwarded to the whole list.

There wasn't much, if anything, I could have done to avoid this. Nevertheless, I apologize, especially to the two subscribers to my newsletter whose "unsubscribe" messages were forwarded erroneously to other subscribers.

I'm writing this article to explain what happened and what I have done to minimize the chances that it will happen again to me or to other Rackforce customers. I believe I have fixed the problems on my server, and at my insistence my hosting provider has changed their procedures. I hope these explanations may also help other people searching for solutions to similar Plesk migration problems.

If you don't care about any of this, you can stop reading this article now, and move on to other things. For those interested in the gory details, read on.

For more than ten years, Hasbrouck.org has been hosted on a succession of servers at Rackforce in Kelowna, B.C. Most foreign visitors to British Columbia focus on the coast, and while the Okanagan is relatively well-known to Canadian tourists and retirees as having the warmest climate in the country, it's largely off the U.S. tourist map. But Kelowna is a pleasant small city, with a strong food and wine culture among other attractions. It's also a center for Canadian server farms, of which Rackforce is one of the largest. Many of the reasons, including cheap hydroelectric power, are the same ones that have led companies like Amazon and Google to put their largest U.S. server farms in central and eastern Oregon and Washington, across the border to the south.

Rackforce mainly serves larger businesses. I'm in their smallest tier of customers, but even that costs about US$100 a month. (They offer a choice of billing in US or Canadian dollars.) That's about ten times what I would pay for the cheapest hosting in the U.S., but it's worth it to me to have my data in Canada (a remarkable number of "Canadian" hosting companies actually have mirror servers in the U.S., vulnerable to U.S. government surveillance, often without clearly disclosing this fact) and to deal with a company that outsources neither operations nor support. Rackforce offers "cloud" hosting on redundant servers at its own data centers in Kelowna and elsewhere in Canada, and when I have a problem, I can talk to people in those buildings, who can walk over to the racks and check the hardware if need be, 24/7.

So far, so good. Until last week.

I have a "virtual server" at Rackforce, with the virtualization and the server managed through Parallels Plesk.

A couple of weeks ago, Rackforce told me that as part of some upgrades to their cloud architecture they would need to migrate my virtual server to a new installation of the same Linux version, although, "This migration ... should not require any intervention on your part."

Just in case, I rescheduled the migration for last Friday, so I'd have the Memorial Day long weekend (not a holiday in Canada, so Rackforce would be at full staff) to recover from any problems.

On Friday morning, when Rackforce booted up my new server, I immediately began receiving a flood of spam at my own e-mail address (Spamassassin obviously wasn't enabled), mixed in with first a message welcoming me to my own e-mail newsletter, and then a couple of requests from subscribers to take me off this "new" list.

I immediately shut down the mail server, purged the queue of outgoing unsent messages, and spent most of the long weekend figuring out what had happened, why, and how to fix it.

It turned out that Rackforce had used the Plesk Migration Manager, which transfers user "data" and recreates accounts but doesn't migrate many account and server application settings. The migrated accounts and applications are reset to defaults or nulls, which is what caused my problems.

There's a Plesk Pre-Transfer Checker which would have identified the most serious of the problems, and identified workarounds, but Rackforce hadn't told me about it or given me a chance to act on its warnings and recommendations before they booted up the new (and misconfigured) virtual server.

Because the settings had not been migrated, and were reset to defaults, Spamassassin was disabled server-wide and for each e-mail account on the new server, and Horde webmail filtering rules (including spam filters) did not exist. Far worse, although my mailing lists had been recreated in the Mailman list management application, they had been reset to default unmoderated discussion lists (rather than moderated announcement lists). All individual subscribers' settings had been reset to allow any subscriber to post to the list, replies were reset to go to the entire list rather than only to the list owner (me), and the lists were treated as "new", so cryptic default "welcome" messages were immediately generated to all subscribers.
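Resets like these could in principle be caught before any mail goes out by dumping each list's configuration and checking the handful of settings that matter for an announcement list. The following is only a rough sketch, not something used in this migration. It assumes the dump is the Python-style "name = value" text that Mailman 2.1's config_list -o writes, and that the attribute names (send_welcome_msg, default_member_moderation, reply_goes_to_list, generic_nonmember_action) and their numeric meanings match your Mailman version; check them against a real dump first. The function names are hypothetical, and per-subscriber moderation flags are not covered.

```python
import ast


def parse_dump(text):
    """Collect simple `name = literal` assignments from a config dump."""
    settings = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            try:
                settings[node.targets[0].id] = ast.literal_eval(node.value)
            except ValueError:
                pass  # skip anything that is not a plain literal
    return settings


def audit_announcement_list(settings):
    """Return a list of problems for a list meant to be announce-only.

    Missing settings fall back to what are assumed to be Mailman's
    defaults (welcome on, members unmoderated, replies to poster,
    non-member posts held).
    """
    problems = []
    if settings.get("send_welcome_msg", 1):
        problems.append("welcome messages are enabled")
    if not settings.get("default_member_moderation", 0):
        problems.append("new members are not moderated by default")
    # Assumed meaning: 0 = poster, 1 = this list, 2 = explicit address
    if settings.get("reply_goes_to_list", 0) == 1:
        problems.append("replies go to the whole list")
    # Assumed meaning: 0 = accept, 1 = hold, 2 = reject, 3 = discard
    if settings.get("generic_nonmember_action", 1) == 0:
        problems.append("non-member posts are accepted")
    return problems
```

An empty result means none of the four checks fired; it does not prove the list is safe, only that these particular defaults are not in effect.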

When recipients of these "welcome" messages hit "reply" to unsubscribe, two of their replies were forwarded to some portion of the list before I could shut down the mail server.

Here's what I found out should have been done, and what you need to do if you are migrating a Plesk server and have used Mailman, Horde webmail, and/or Spamassassin:

  1. Run the Plesk Pre-Transfer Checker on the old server before the migration, and heed its warnings.
  2. If any Mailman lists have been created on the server (or if you aren't sure if they might have been), turn off list welcome messages on the new server (server-wide setting) after setting up the new server and before using the Plesk tool to migrate your data.
  3. Manually migrate Mailman settings for lists and subscribers by copying /var/lib/mailman/lists/ from the old to the new server. Make sure to do this before starting mail services on the new server, or data may leak through spurious messages or list administrivia messages broadcast to all subscribers instead of sent only to the list owner.
  4. Before migrating your data, check that Spamassassin is installed on the new server. If it isn't installed as part of the default Plesk installation, install it from the Plesk management console.
  5. After migrating your data, re-enable Spamassassin for all users before starting mail services on the new server. The migration will disable Spamassassin checking even if it was installed and in use on the old server and is installed on the new server. It cannot be re-enabled from Plesk. To enable or re-enable it, change the database flags for all e-mail users. These database commands worked for me, but obviously I can't warrant they will work for others.
  6. Before trying to use Horde, use mysql dump and restore commands to manually migrate the database containing Horde webmail user data and settings.

Again, do all of this before turning on mail services (especially SMTP or Mailman) on the new server.
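For step 3, it may be worth confirming that the copied Mailman list data really matches the old server before mail services are started. The sketch below is one possible check, under several assumptions: that a copy of the old server's /var/lib/mailman/lists/ is available at some local path, that each list lives in its own subdirectory, and that Mailman 2.1 keeps a list's settings in a file named config.pck inside it. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_list_dirs(old_root, new_root):
    """Compare per-list config files under two Mailman lists directories.

    Returns (missing, differing): names of lists whose config.pck is
    absent on the new side, and names whose config.pck content differs.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    missing, differing = [], []
    for old_list in sorted(p for p in old_root.iterdir() if p.is_dir()):
        old_cfg = old_list / "config.pck"
        new_cfg = new_root / old_list.name / "config.pck"
        if not old_cfg.is_file():
            continue  # not laid out like a list directory; ignore it
        if not new_cfg.is_file():
            missing.append(old_list.name)
        elif file_digest(old_cfg) != file_digest(new_cfg):
            differing.append(old_list.name)
    return missing, differing
```

If either returned list is non-empty, the lists named in it are still carrying default (or no) settings on the new server, and mail services should stay off until the copy is repeated.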

Rackforce has confirmed that they will no longer tell their customers that, "This migration ... should not require any intervention on your part", and that:

I have updated the Plesk migration process documentation in light of the pitfalls that you have raised.

1. "The migration will be done live..." has been changed to "The data migration will be done live..." to reflect the fact that we are referring specifically to the copying of content (not the process as a whole).

2. We are now to run the Plesk pre-migration checker after the initial provisioning of the new VM [virtual machine].

3. The output of the Plesk pre-migration checker is to be included with the access/credentials information (which is all to be released to the client at least 24h prior to the scheduled migration).

Again, I apologize to my newsletter subscribers.

Posted by Edward on Wednesday, 29 May 2013, 22:57 (10:57 PM)