A senior colleague told me yesterday that in the old days PhD candidates kept a hard copy of their thesis in the freezer – in the hope that it would survive should their house burn down. These days making sure your work is not lost due to some unfortunate event, like your computer crashing or your house flooding, is much easier – but is it really? No one should ever suffer the same fate as one poor Rutgers student who had his laptop stolen together with his only copy of five years of doctoral research. As a PhD Candidate in Finance, I have devised my own system of precautions. Doing research in finance routinely calls for a large amount of data, so I have literally gigabytes of data to store.
I purchased a portable external hard disk, which I carry with me between school and home, to store my massive datasets. I periodically back it up to my home computer. How often? Whenever I remember. I also back up my external drive to my office computer. How often? Not so often, as I prefer to work on the external drive itself. Simply put, unless the external hard disk and the home computer both malfunction at the same time, I am pretty okay.
This past week, out of the blue, my home computer stopped working. There was no warning; it just froze one night right after dinner. Fine, I thought, I am safe. I have everything, including my larger data files, saved on the external drive.
After spending hours trying unsuccessfully to restore my computer, I was only able to recover a previous version of my work from a few months before. Frustrated, I decided to copy the data from the external drive back onto the desktop. At this point, I realized that the external drive was the only full copy of my data. With a sense of urgency I plugged the external drive in. The copying process was going unusually slowly and then suddenly came to a halt, as did my heart. I had potentially lost several weeks of work. Eventually it restarted and, after hours upon hours, I managed to get most of the files onto the desktop. Soon after that the external drive stopped working entirely. I had come within minutes of losing my work, after my multiple fail-safes had failed to keep me safe.
Although I ended up not losing any of my data, for a couple of days I was shaken. How could both the desktop and the external drive fail at the same time? What are the odds?
I have yet to find the perfect backup system for my files, but I am working on it. I feel that what I have been doing so far works to some extent, but it is not foolproof. Keeping multiple copies of my datasets takes up space and creates a lot of redundancy, and the copies can drift into differing versions of the data. Putting everything in a web service like Dropbox, SugarSync or another cloud storage service provides some version control, and cloud software is pretty easy and straightforward to use. However, these services can be prohibitively expensive, especially when you need to store a large amount of data on a PhD student’s budget. So from now on, I plan to keep at least three copies of my files. Or maybe even four.
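As a rough illustration of the three-copies idea, here is a minimal Python sketch that copies a data folder to several backup locations and then checks each copy against a SHA-256 checksum of the original, so that a copy that stalled or got corrupted along the way is flagged rather than trusted. The function names `sha256_of` and `back_up` are made up for this example, and it assumes every backup location is simply a folder the computer can write to (an external drive, a second internal disk, a synced folder); it is a sketch under those assumptions, not a complete backup tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to handle large datasets."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir: Path, destinations: list[Path]) -> list[Path]:
    """Copy every file under source_dir into each destination folder.

    Returns the list of copied files whose checksum does not match the
    original; an empty list means every copy verified correctly.
    (Hypothetical helper for illustration only.)
    """
    failed = []
    source_files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    for source_file in source_files:
        expected = sha256_of(source_file)
        relative = source_file.relative_to(source_dir)
        for destination in destinations:
            target = destination / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source_file, target)  # copies contents and timestamps
            if sha256_of(target) != expected:
                failed.append(target)
    return failed
```

Calling `back_up(Path("thesis_data"), [Path("/mnt/external/thesis_data"), Path("/mnt/office/thesis_data")])` with your own folders (the paths here are placeholders) would give the original plus two verified copies. It recopies everything on each run and keeps no older versions, so it addresses the "only one full copy" problem but not version control.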