Igor Kromin |   Consultant. Coder. Blogger. Tinkerer. Gamer.

One of the projects I'm maintaining makes heavy use of Google App Engine Cloud Storage for uploading and serving files, mainly images but also some occasional binary content. Recently I've been wondering how I could perform a full backup of these files. It turns out that it's very simple and requires no new code or additional tools beyond what is already included with the App Engine SDK.

The tool that makes this possible is gsutil, specifically its rsync command. It works very much like the usual Unix rsync command, but lets you perform bulk transfers/synchronisation of files between a Cloud Storage bucket and your local machine.
The gsutil rsync command makes the contents under dst_url the same as the contents under src_url, by copying any missing files/objects (or those whose data has changed), and (if the -d option is specified) deleting any extra files/objects. src_url must specify a directory, bucket, or bucket subdirectory.


So for my specific case where I wanted to have everything from Cloud Storage copied to my laptop, I would use a command like this...
 Command
gsutil -m rsync -d -r gs://my-app.appspot.com/ /Backups/backup-files


This makes a mirror copy of my app's default bucket on my laptop in the local /Backups/backup-files directory. The -m option makes transfers run in parallel, speeding up the entire backup. The -d option deletes any local files that are not in the Cloud Storage bucket, so it's worth double-checking the source and destination order before running it. The -r option makes the sync recurse into all of the bucket's subdirectories.
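Since -d deletes files, it can be reassuring to preview a sync before running it for real. gsutil rsync has a -n option that performs a dry run, reporting what would be copied or deleted without changing anything. Below is a minimal sketch of a wrapper for that; the function name preview_backup and the GSUTIL variable are my own inventions for illustration, not part of gsutil.

```shell
#!/bin/sh
# Hypothetical helper (not part of gsutil): preview a mirror backup.
# GSUTIL may be overridden, e.g. to point at a different install;
# by default it is the gsutil binary found on the PATH.
GSUTIL="${GSUTIL:-gsutil}"

preview_backup() {
    bucket="$1"   # source bucket, e.g. gs://my-app.appspot.com/
    dest="$2"     # local destination, e.g. /Backups/backup-files

    # -n : dry run, only report what would be copied or deleted
    # -d : (would) delete local files that are not in the bucket
    # -r : recurse into subdirectories
    "$GSUTIL" -m rsync -n -d -r "$bucket" "$dest"
}

# Example usage (commented out so that sourcing this file runs nothing):
# preview_backup gs://my-app.appspot.com/ /Backups/backup-files
```

If the dry run output looks right, running the same command without -n performs the actual sync.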

The output looks something like this (truncated)...
 Terminal Output
Building synchronization state...
Starting synchronization...
Copying gs://my-app.appspot.com/xxxxx...
...
| [5.9k/5.9k files][325.9 MiB/325.9 MiB] 100% Done 76.6 KiB/s ETA 00:00:00
Operation completed over 5.9k objects/325.9 MiB.




The operation was quite fast too. I had almost 6000 files to copy, and it completed in a few minutes. Mind you, it was only around 300 MB of data, and transfers of many small files tend not to make full use of the available bandwidth.


This operation also works in reverse, that is, making the Cloud Storage bucket a mirror image of the local file system, effectively restoring data that was backed up earlier. Doing that requires swapping the source and destination arguments, and possibly leaving out the -d option so that nothing in the bucket is deleted.
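As a rough sketch of what the restore direction might look like, here is the same command with the arguments swapped and -d left out, wrapped in a small function. The restore_backup name and the GSUTIL variable are hypothetical, and I haven't shown a real restore run here, so treat it as a starting point rather than a tested procedure.

```shell
#!/bin/sh
# Hypothetical helper (not part of gsutil): copy a local backup back
# into a bucket. GSUTIL defaults to the gsutil binary on the PATH.
GSUTIL="${GSUTIL:-gsutil}"

restore_backup() {
    src="$1"      # local backup directory, e.g. /Backups/backup-files
    bucket="$2"   # destination bucket, e.g. gs://my-app.appspot.com/

    # No -d here: objects that exist only in the bucket are left alone.
    # -r recurses into subdirectories of the local backup.
    "$GSUTIL" -m rsync -r "$src" "$bucket"
}

# Example usage (commented out so that sourcing this file runs nothing):
# restore_backup /Backups/backup-files gs://my-app.appspot.com/
```

Leaving out -d means the restore only adds or updates objects. Adding it back would make the bucket an exact mirror of the local directory, deleting anything uploaded since the backup was taken.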

Of course this only backs up the files and doesn't touch the Cloud Datastore; that is a topic for another post.

-i
