mysqldump diff MySQL Differential Backup Shell Script

mysqldump diffmysqldump diff is a shell script that I wrote to take some of the pain out of differential MySQL backups.

Download mysqldump diff

After changing the appropriate values, this will dump your database daily and create a unified diff patch so you’ll only need to keep the original mysqldump intact and patch files describing the subsequent differences between each future dump. It is a simple form of version control for MySQL database dumps without the overhead of managed methods. Although results will differ based on on the frequency of record-modification and the interval between backups, using this method on weekly backups (over a 2-month period), my patches were each roughly a tenth of the size of the original dump and of course, the benefits are exponential as time goes on while allowing you to increase the frequency many-fold while still saving space.

mysqldump diff

Alternative MySQL Backup Methods

Generally, I’d advocate the following standardised (read: supported) methods but the clients’ databases I use this for are not interested in redundancy, clusters or cloud servers so the following aren’t ideal:

  • mysqldump and gzip would have me running out of storage space and/or bandwidth quickly
  • MySQL binary log is less flexible and also recording more than needed
  • MySQL Replication (master/slave) would mean running another instance: unwanted costs and maintenance

Comments

  • I do so :mysqldump quote-names add-drop-table add-locks no-data routines exednted-insert -ulogin -ppass default-character-set=latin1 -f dbname > dbname_structure.sqlcopy /Y /V dbname_structure.sql D:\BackUp\dbname\dbname_structure.sqlfor /F tokens=1-4 usebackq delims=. %%1 in (`date /t`) do set mydate=%%4_%%3_%%2_%%1ren E:\BackUp\dbname\dbname_structure.sql dbname_structure%mydate%.sqlmysqldump quote-names add-locks no-create-info -ulogin -ppass default-character-set=latin1 -f dbname > dbname_data.sqlcopy /Y /V dbname_data.sql D:\BackUp\dbname\ren D:\BackUp\dbname\dbname_data.sql dbname_data%mydate%.sql

    • I had no idea mysqldump was available for Microsoft Windows. Your comment does not display correctly but regardless of that, it creates a full backup split into two files (structure and data). The point of my script is that it’s differential so that you do not have duplicated data: You only store the changes in data between each backup. For example:

      With a database of 50MiB, the first dump would be 50MiB then every subsequent day might only add 1MiB of data whereas with daily, full dumps: Day 1 would be 50MiB, day 2 would be 51MiB et cetera. That will get unmanagable (and costly) very quickly.

  • Which distribution of diff do you use? I have gnu diffutils and the “-q” option you specify causes the .sql.diff to only have contents such as “Files dbdump-2013-03-20.sql and dbdump-2013-03-21.sql differ”.

    • I’m also using GNU diffutils 3.2 but the -q option is for quiet reporting i.e. if there is a difference, not returning the actual differences. I can’t believe I posted this with that still in there…

      Thanks, Jason! :-)

  • Jackson Cooper

    Interesting, thanks for posting it!

    A bit late to the party, but…

    Say if I have your average WordPress blog, and create a post. The diff will show the INSERT statement, great. But then I modify the post, and it will presumably be an UPDATE. Statement. The diff between the backup and live version will be the INSERT statement. So then if you patched another database, wouldn’t it simply have 2 INSERT statements, one for the original post, and another for the updated?

    • You’re not at all late and you’re very welcome! :v

      The query is an UPDATE at the time of modification, certainly, but that’s inconsequential when comparing MySQL dumps (as created with mysqldump, no other selective/differential tools) by line, because they’d both have complete INSERT statements for each post – just the latter dump would be the modified version of the post.

      Is this hypothetical or are you having issues with your backups?

  • I have thought about doing the same, but I have a WIP based on git (git will only commit the diff between “revisions”). I’m just wondering if you dismissed that intentionally?

    • git uses diff and should have a similar result but I wrote this for use in a server environment and try to keep lock-in/dependencies at a minimum for future flexibility and manipulation. The lovely development features of git (commits, tags, messages, multiple branches…) wouldn’t be of any use, either.

      I didn’t include the gzip compression in this post but I think that could betouch to implement once it’s being tracked by git…? I’d love to “checkout” your script though! :-D

Leave a Comment