
Recipe 191017: Backup your files


Makes backup versions of files

Python
#!/usr/bin/env python

# Backup files - As published in Python Cookbook
# by O'Reilly with some bug-fixes.

# Credit: Anand Pillai, Tiago Henriques, Mario Ruggier
import sys, os, shutil, filecmp

MAXVERSIONS = 100
BAKFOLDER = '.bak'

def backup_files(tree_top, bakdir_name=BAKFOLDER):
    """ Directory back up function. Takes the top-level
    directory and an optional backup folder name as arguments.
    By default the backup folder is a folder named '.bak' created
    inside the folder which is backed up. If another directory
    path is passed as value of this argument, the backup versions
    are created inside that directory instead. At most
    MAXVERSIONS versions of each file are kept.

    Example usage
    -------------
    
    The command
    $ python backup.py ~/programs
    
    will create backups of every file inside ~/programs
    inside sub-directories named '.bak' inside each folder.
    For example, the backups of files inside ~/programs will
    be found in ~/programs/.bak, the backup of files inside
    ~/programs/python in ~/programs/python/.bak etc.

    The command
    $ python backup.py ~/programs ~/backups

    will create backups of every file in ~/programs inside the
    ~/backups/programs folder. No .bak folder is created. Instead,
    backups of files in ~/programs will be inside ~/backups/programs,
    backups of files in ~/programs/python will be inside
    ~/backups/programs/python etc.
    
    """
    
    top_dir = os.path.basename(tree_top)
    tree_top += os.sep
    
    for dir, subdirs, files in os.walk(tree_top):

        if os.path.isabs(bakdir_name):
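            # Strip only the leading tree_top prefix; str.replace would
            # also remove later occurrences of the same substring
            relpath = dir[len(tree_top):]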
            backup_dir = os.path.join(bakdir_name, top_dir, relpath)
        else:
            backup_dir = os.path.join(dir, bakdir_name)

        if not os.path.exists(backup_dir):
            os.makedirs(backup_dir)

        # To avoid recursing into the backup folders themselves
        subdirs[:] = [d for d in subdirs if d != bakdir_name]
        for f in files:
            filepath = os.path.join(dir, f)
            destpath = os.path.join(backup_dir, f)
            # Check existence of previous versions
            for index in range(MAXVERSIONS):
                backup = '%s.%2.2d' % (destpath, index)
                abspath = os.path.abspath(filepath)
                
                if index > 0:
                    # No need to backup if file and last version
                    # are identical
                    old_backup = '%s.%2.2d' % (destpath, index-1)
                    if not os.path.exists(old_backup): break
                    abspath = os.path.abspath(old_backup)
                    
                    try:
                        if os.path.isfile(abspath) and filecmp.cmp(abspath, filepath, shallow=False):
                            continue
                    except OSError:
                        pass
                
                try:
                    if not os.path.exists(backup):
                        print('Copying %s to %s...' % (filepath, backup))
                        shutil.copy(filepath, backup)
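                except (OSError, IOError) as e:
                    # Report the failure instead of silently ignoring it
                    sys.stderr.write('Could not copy %s: %s\n' % (filepath, e))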

if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("Usage: %s <directory> [backup directory]" % sys.argv[0])
        
    tree_top = os.path.abspath(os.path.expanduser(os.path.expandvars(sys.argv[1])))
    
    if len(sys.argv)>=3:
        bakfolder = os.path.abspath(os.path.expanduser(os.path.expandvars(sys.argv[2])))
    else:
        bakfolder = BAKFOLDER
        
    if os.path.isdir(tree_top):
        backup_files(tree_top, bakfolder)

Discussion

I find this script useful in my development work. It can also be used for non-Python source code.

03/04/05: Updated with the modifications made to the script as published in the Python Cookbook, 2nd edition, plus a couple of bug fixes.

[Edit] (Apr 08 2008) - Modified the command-line handling, enhanced the backup folder option to accept another folder as an argument, and added a detailed function docstring.
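
The two backup-folder modes described in the docstring can be sketched in isolation. This is only an illustration, not the recipe's own code: the helper name `backup_dir_for` is hypothetical, and it uses `os.path.relpath` where the recipe strips the prefix by hand.

```python
import os


def backup_dir_for(tree_top, current_dir, bakdir_name='.bak'):
    """Return the folder that should hold backups of files in current_dir.

    Hypothetical helper, written only to illustrate the two modes.
    """
    if os.path.isabs(bakdir_name):
        # Absolute backup folder: mirror the tree under
        # <bakdir_name>/<name of tree_top>/<path relative to tree_top>
        top_name = os.path.basename(os.path.normpath(tree_top))
        rel = os.path.relpath(current_dir, tree_top)
        return os.path.normpath(os.path.join(bakdir_name, top_name, rel))
    # Relative name: a backup folder inside each directory of the tree
    return os.path.join(current_dir, bakdir_name)
```

With POSIX-style paths, `backup_dir_for('/home/u/programs', '/home/u/programs/python', '/home/u/backups')` gives `/home/u/backups/programs/python`, while omitting the third argument gives `/home/u/programs/python/.bak`.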

Comments

  1. At 4:24 p.m. on 28 Mar 2003, Tiago Henriques said:

    Nitpicking. Neat and simple recipe. I have a rather minor suggestion: since you don't need the "x" index, why not simply use "for file in files:" instead of "for x in range(...): file=files[x]"?
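
A small illustration of the suggestion, using a made-up file list (the names are hypothetical). Both loops visit the same items, but the second needs no index variable.

```python
files = ['a.py', 'b.py', 'c.ht']  # hypothetical file names

# Index-based loop, as in the first version of the recipe
by_index = []
for x in range(len(files)):
    by_index.append(files[x])

# Direct iteration, as suggested in the comment
direct = []
for name in files:
    direct.append(name)
```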

  2. At 3:33 a.m. on 29 Mar 2003, Zhed Bolyshevin (the author) said:

    Re: Nitpicking. Nice suggestion. Actually, the "for x in range(0, len(..))" comes more out of habit than anything else :-)

    Thanks

    Anand Pillai

  3. At 7:07 p.m. on 29 Mar 2003, Tiago Henriques said:

    Change recipe to avoid unnecessary backups. I added a few lines to the original recipe to avoid backing up files that have not been modified since the previous backup. Here's the complete code:

    #! /usr/bin/env python
    #backup.py - backup versions of python source files
    import sys, os
    from shutil import copy
    from filecmp import cmp
    
    targetdir=""
    try:
        targetdir=sys.argv[1]
    except IndexError:
        targetdir="."
    
    files=os.listdir(targetdir)
    
    #Backup directory
    for file in files:
        ext=os.path.splitext(file)[1].lower()
    
        if ext in ('.py', '.ht'):
            abspath=os.path.abspath(os.path.join(targetdir, file))
            print('Backing up file %s...' % file)
            #check for existence of previous versions
            index=0
            while os.path.exists(abspath + '.bak.' + str(index)):
                index += 1
            if not index==0:
                #no need to backup if file and last version are identical
                if cmp(abspath, abspath + '.bak.' + str(index-1), shallow=False):
                    continue
            copy(abspath, abspath + '.bak.' + str(index))
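
The version-selection step above can also be isolated in a small helper. The following is a sketch under stated assumptions, not code from the recipe: the name `next_backup_path` is hypothetical, and it uses the two-digit suffix (`.00`, `.01`, ...) of the main recipe rather than the `.bak.N` naming of this comment.

```python
import filecmp
import os


def next_backup_path(source, dest_base, max_versions=100):
    """Return the path for the next backup of source, or None.

    Hypothetical helper. Backups are assumed to be named
    dest_base.00, dest_base.01, and so on. None is returned when the
    newest existing backup is identical to source, or when all
    max_versions slots are already taken.
    """
    for index in range(max_versions):
        candidate = '%s.%02d' % (dest_base, index)
        if not os.path.exists(candidate):
            if index > 0:
                latest = '%s.%02d' % (dest_base, index - 1)
                # Compare contents byte by byte, not just the stat info
                if os.path.isfile(latest) and filecmp.cmp(source, latest, shallow=False):
                    return None
            return candidate
    return None
```

A caller would then copy only when a path comes back, for example `path = next_backup_path(src, base)` followed by `shutil.copy(src, path)` if `path` is not None.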
    
  4. At 1:52 a.m. on 15 Apr 2003, Zhed Bolyshevin (the author) said:

    Recipe updated. Added the above suggestion to the recipe.

    Anand

  5. At 5:58 a.m. on 22 Apr 2003, Zhed Bolyshevin (the author) said:

    Modified recipe. Modified the recipe to copy files to a directory named 'bak' in the current directory. Also made it work on the entire tree rather than just the cwd.

    Anand

  6. At 1:15 a.m. on 4 Dec 2003, Mario Ruggier said:

    Ability to specify files by name. I have made slight adjustments to this function:

    • made types an optional parameter, with default value None meaning all types

    • added a files parameter to the initial (non-recursive) backup call, to allow specifying files explicitly

    • removed the variable "bakuppath" (I was getting an error) and used "bakup" for copy()

    The new backup(), modified in this way, becomes:

    def backup(dir, types=None, files=None):
        "Back up files or files with extension in passed tuple types"
    
        if files is None:
            files=os.listdir(dir)
    
        # Backup directory
        for file in files:
            abspath = os.path.abspath(os.path.join(dir, file))
    
            if os.path.isfile(abspath):
                ext = os.path.splitext(file)[1].lower()[1:]
                if types is None or ext in types:
                    # check for existence of previous versions
                    index=1
    
                    # create directory named 'bak' in current directory
                    newdir = os.path.join(dir, 'bak')
                    if not os.path.exists(newdir):
                        os.makedirs(newdir)
    
                    while 1:
                        if index > MAXVERSIONS:
                            break
                        bakup = os.path.join(newdir, file + '.bak.' + str(index))
                        if not os.path.exists(bakup):
                            break
                        index += 1
    
                    if index>1:
                        # no need to backup if file and last version are identical
                        oldbakup = os.path.join(newdir, file + '.bak.' + str(index-1))
                        try:
                            if os.path.isfile(oldbakup) and cmp(abspath, oldbakup, shallow=False):
                                print('File %s: file is unchanged' % file)
                                continue
                        except OSError:
                            pass
    
                    print('Backing up file \t%s Version: %d' % (file, index))
                    try:
                        copy(abspath, bakup)
                    except OSError:
                        pass
    
            elif os.path.isdir(abspath):
                # Recurse into sub-directories
                backup(abspath, types)
    
  7. At 7:57 a.m. on 24 Sep 2004, Ann Person said:

    Will break horribly... Will break horribly if any of the backup files are not regular files, e.g. a FIFO.
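
One possible guard against this, sketched here as an assumption rather than as part of the recipe, is to filter the file list before copying or comparing. Opening a FIFO for reading normally blocks until a writer appears, so it is safer to skip anything that is not a regular file. The helper name `regular_files` is hypothetical.

```python
import os


def regular_files(directory, names):
    """Return the entries of names that are regular files in directory.

    Hypothetical helper. os.path.isfile only calls stat on the path, so
    it does not block on FIFOs; it follows symlinks and returns False
    for directories, FIFOs, sockets and device files.
    """
    return [n for n in names if os.path.isfile(os.path.join(directory, n))]
```

Inside the `os.walk` loop of the recipe, this could be applied as `for f in regular_files(dir, files):`.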
