Backups and distributed data

Two rules. (1) Every piece of data you own should have at least one backup, whether you think you need it or not. (2) Every piece of data that you own on a machine that you do not control should be stored encrypted; otherwise you do not own it…

Remember, it's not just the irretrievable data that will be a problem when you lose data. It's also the time lost restoring state that you can recover.

I just read an article about someone who kept their email on a popular webmail site. Their email disappeared without notice and could not be recovered. 300MB of email.

Of course, that could be the best thing that happened to them. After all, webmail is not normally stored encrypted, or if it is, the site software has the key. Even if you think you have nothing private, in a large email archive there will be things you wouldn't want published.

Backups are the most important thing of all to encrypt. Here in Portland we recently had a medical IT worker leave unencrypted backups that were being transported off-site in their car, from which they were stolen.

I needed a file off my home desktop box the other day, but couldn't get in because its net interface was down. Fortunately, the file was sitting right there on the backup drive on my gateway machine. Unencrypted, too.

Care is golden, folks. Be careful out there. Fob


So what about the Amazon S3 service rolled out today? As a mutual friend observed, when you upload your data to somebody else, they own it.


As I've said publicly before, I can't use GMail. I have a whole bunch of student info and other personal stuff whose privacy I can't risk. I'm sure Google is very careful with GMail content; more careful than I am, frankly. But between the possibility of government subpoenas and the large number of semi-trustworthy folks who can inevitably access stuff stored at a large central site, I am much more comfortable controlling my own content with the help of my own staff.

At first, I thought S3 might be different. If you can upload/download encrypted data, then at $0.15/GB/month it's an OK, but not cheap, global storage solution. A large drive is around $0.04/GB/month now, assuming a 2-year lifetime. Add in the cost of admin and backups, and the S3 pricing doesn't look so bad. Note that if you upload encrypted data, no one owns it but you.
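For anyone who wants to check the arithmetic, here is a rough sketch in Python. Only the $0.15 and $0.04 figures and the 2-year lifetime come from the numbers above. The function names are made up for illustration, and treating a backup as simply a second drive (`copies=2`) is an assumption, not a claim about what backups really cost.

```python
# Back-of-the-envelope comparison of the per-GB monthly costs quoted above.
# From the post: $0.15/GB/month hosted, $0.04/GB/month for a drive, 24 months.
# Assumption (not from the post): a backup is modeled as extra drive copies.

S3_PER_GB_MONTH = 0.15      # quoted hosted price, $/GB/month
DRIVE_PER_GB_MONTH = 0.04   # quoted amortized drive cost, $/GB/month
LIFETIME_MONTHS = 24        # the 2-year drive lifetime


def implied_price_per_gb(per_gb_month, lifetime_months):
    """Purchase price per GB implied by an amortized monthly cost."""
    return per_gb_month * lifetime_months


def overhead_budget(hosted_per_gb_month, drive_per_gb_month, copies=1):
    """Monthly $/GB left for admin before owned disks cost as much as hosting.

    `copies` is the number of drives holding the data (2 = one backup drive).
    """
    return hosted_per_gb_month - copies * drive_per_gb_month


if __name__ == "__main__":
    price = implied_price_per_gb(DRIVE_PER_GB_MONTH, LIFETIME_MONTHS)
    single = overhead_budget(S3_PER_GB_MONTH, DRIVE_PER_GB_MONTH)
    mirrored = overhead_budget(S3_PER_GB_MONTH, DRIVE_PER_GB_MONTH, copies=2)
    print(f"implied drive price:      ${price:.2f}/GB")
    print(f"admin budget, one drive:  ${single:.2f}/GB/month")
    print(f"admin budget, two drives: ${mirrored:.2f}/GB/month")
```

With those figures the drive works out to roughly $0.96/GB up front, which leaves about $0.11/GB/month for admin and backups before the disk costs as much as S3, or about $0.07 if the backup is a second drive.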

However, this particular item in the TOS disturbs me.

5) You agree to provide such additional information and/or other materials related to your Application as reasonably requested by us or our affiliates to verify your compliance with this Agreement. If your Application is available as an online solution, you acknowledge and agree that we (and/or our affiliates) may crawl or otherwise monitor your Application for the purpose of verifying your compliance with this Agreement, and that you will not seek to block or otherwise interfere with such crawling or monitoring (and that we and/or our affiliates may use technical means to overcome any methods used on your Application to block or interfere with such crawling or monitoring). If your Application is a desktop solution, you agree to furnish a copy of your Application upon request for the purpose of verifying your compliance with this Agreement.

Not clear that all of this applies, but the general flavor is reinforced by other clauses.
All in all, I think I'll just buy me another disk somewhere, but thanks anyway, Amazon.