Home of http://sourceforge.net/projects/pyfmf
Contact: danperl@users.sourceforge.net
Downloads: http://sourceforge.net/project/showfiles.php?group_id=116235

pyfmf: a file management framework in Python
Page Changed: 5/13/2006


Pyfmf: A Project for File Management in Python

pyfmf is an open-source Python project for platform-independent file management.  The core of pyfmf is an extensible framework.  Two extensions build on it: a console-based toolkit (zigo) and a GUI platform (zago, a wrapper around zigo).  You can use the framework by itself, the toolkit, or the GUI platform to build applications for archiving files, searching files, measuring disk usage, etc.  For instance, you can build your own back-up utility using existing modules to select which directories and files to back up, and even to detect which files have changed since the last back-up.  You then only have to implement a module (a few lines of Python code) that executes the actual back-up operation (e.g., writing to a tape device).

The framework is based on handlers that can be layered (like streams in Java, protocol layers in the OSI stack, or piped commands in Unix).  Each handler performs different actions on directories and files: filtering them for the rest of the stack, gathering information and possibly storing it in a persistent data structure, logging changes to the files, modifying files, archiving files, and so on.  The framework includes a base class for the handlers and the controller that invokes the handlers while it walks through a tree of directories.
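
The layering idea can be illustrated with a minimal, self-contained sketch.  It is not pyfmf's actual API: the class names, the method names (handle_dir, handle_file, finalize) and the walk function are hypothetical stand-ins, and the convention that a handler returns False to hide an item from the handlers above it is an assumption made for this illustration.

```python
import os


class Handler:
    """Hypothetical base class: every hook has a pass-through default."""

    def __init__(self, name):
        self.name = name

    def handle_dir(self, dir_path):
        # True lets the directory reach the handlers above this one.
        return True

    def handle_file(self, file_path):
        # True lets the file reach the handlers above this one.
        return True

    def finalize(self):
        pass


class SkipDirs(Handler):
    """Filters out directories whose base name is in a given set."""

    def __init__(self, name, base_names):
        super().__init__(name)
        self.base_names = set(base_names)

    def handle_dir(self, dir_path):
        return os.path.basename(dir_path) not in self.base_names


class Collector(Handler):
    """Records every file that reaches it."""

    def __init__(self, name):
        super().__init__(name)
        self.files = []

    def handle_file(self, file_path):
        self.files.append(file_path)
        return True


def walk(root, stack):
    """Walk `root`, offering each item to the stack, bottom handler first."""
    for dir_path, dir_names, file_names in os.walk(root):
        # all() stops at the first handler that rejects the directory.
        if not all(h.handle_dir(dir_path) for h in stack):
            dir_names[:] = []  # prune: do not descend into this branch
            continue
        for file_name in sorted(file_names):
            file_path = os.path.join(dir_path, file_name)
            for handler in stack:
                if not handler.handle_file(file_path):
                    break  # rejected: handlers above never see the file
    for handler in stack:
        handler.finalize()
```

With a stack such as [SkipDirs('skip', {'CVS'}), Collector('collect')], the collector only ever sees files outside the 'CVS' branches, because the filter sits below it.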

Several handlers are implemented:
  • Several filters for directories.  Filter in or filter out directories based on the full path or only on the base names of the directories.  Directories to be filtered can be specified with wildcards.
  • Several filters for files.  Filter in or filter out files based on the full path or on the base names of the files.  Files to be filtered can be specified with wildcards.
  • A handler that stores a persistent data structure of directories, files and file modification times.  It can be used to detect changes in the filesystem.
  • A filter that loads a persistent data structure created by the above-mentioned handler and compares the present filesystem structure to the old one.  It flags new files, deleted files and files that have been modified for the handlers above it in the stack.
  • Handlers that log the directories and files into plain text files.
  • Handlers that act as results viewers for zago.
  • A handler for tar-ing files.  It is most useful in combination with some of the above filters: it can create a tar file of only selected file types and exclude branches in a tree.  I created it to periodically tar all the relevant files in this project from a CVS checkout (for instance, selecting all the '*.py' files but not the '*.pyc' files and excluding the 'CVS' directories in the tree).  It probably just replaces a very complex tar command, but I couldn't resist the irony of using pyfmf to package itself.
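
The change-detection pair in the list above (one handler that stores modification times, one filter that compares against them) can be sketched with the standard library alone.  This is only an illustration of the idea, not pyfmf's code: the function names are hypothetical, and JSON is an assumed storage format chosen for the sketch.

```python
import json
import os


def take_snapshot(root):
    """Map each file path (relative to root) to its modification time."""
    snapshot = {}
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in file_names:
            full_path = os.path.join(dir_path, file_name)
            snapshot[os.path.relpath(full_path, root)] = os.path.getmtime(full_path)
    return snapshot


def save_snapshot(snapshot, path):
    """Persist the snapshot so a later run can compare against it."""
    with open(path, "w") as f:
        json.dump(snapshot, f)


def load_snapshot(path):
    with open(path) as f:
        return json.load(f)


def compare_snapshots(old, new):
    """Return (new_files, deleted_files, modified_files) as sorted lists."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, deleted, modified
```

A back-up run would load the previous snapshot, take a fresh one, act on the new and modified files, and then save the fresh snapshot for next time.
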
New handlers are easy to implement.  The base class has a few hook methods that can be overridden, and it provides defaults for all of them.  A useful handler can be implemented by overriding just one of the hooks.  For example, here is how the tar-file handler could be implemented:
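import os
import tarfile

import baseClass

class Handler(baseClass.Handler):
    def __init__(self, configDict):
        baseClass.Handler.__init__(self, configDict['name'])
        # Open the gzip-compressed tar file named in the configuration.
        fileName = configDict['fileName']
        self.tarFile = tarfile.open(fileName, 'w:gz')

    def handleFileHook(self, fileName):
        # self.parentDir is not set in this class; it is assumed to be
        # provided by the base class or the controller.
        self.tarFile.add(os.path.join(self.parentDir, fileName))
        return True

    def finalizeHook(self):
        # Close the archive when the handler is finalized.
        self.tarFile.close()
        return True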
A few more lines have to be added for configuration.  Simple as it is, this handler shows its power when it is combined with other handlers that filter directories and files.
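
For comparison, here is roughly what such a combination achieves, written as a single standalone function without the framework.  It is a sketch under assumptions: the function name and its parameters are hypothetical, and it hard-wires the choices (one wildcard for files, a set of excluded directory base names) that pyfmf leaves to separate, configurable handlers.

```python
import fnmatch
import os
import tarfile


def tar_selected(root, tar_name, include="*.py", exclude_dirs=("CVS",)):
    """Archive files under `root` that match `include`, skipping excluded dirs."""
    added = []
    with tarfile.open(tar_name, "w:gz") as tar:
        for dir_path, dir_names, file_names in os.walk(root):
            # Directory filter: prune excluded base names before descending.
            dir_names[:] = sorted(d for d in dir_names if d not in exclude_dirs)
            for file_name in sorted(file_names):
                # File filter: '*.py' matches 'a.py' but not 'a.pyc'.
                if fnmatch.fnmatch(file_name, include):
                    full_path = os.path.join(dir_path, file_name)
                    arcname = os.path.relpath(full_path, root)
                    tar.add(full_path, arcname=arcname)
                    added.append(arcname)
    return added
```

The monolithic version works, but every change of policy means editing the function; with layered handlers the same tar handler is reused unchanged while the filters below it are swapped.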

Configuration files specify which handlers are layered, in which order they are layered, and the configuration of each handler.  Creating or modifying a configuration works differently in the console-based toolkit (zigo) and in the graphic platform (zago), but both are based on the same configuration structures, and the same configuration files can be used with either tool.
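
One way to picture such a configuration is as an ordered list of dictionaries, one per handler, from which the stack is built.  The sketch below is an assumption throughout: pyfmf's real file format is not described here, and the 'type' and 'exclude' keys, the registry, and both stand-in classes are hypothetical.  Only the 'name' and 'fileName' keys appear in the tar handler example above.

```python
import json


class DirFilter:
    """Stand-in handler that only stores its settings."""

    def __init__(self, config_dict):
        self.name = config_dict["name"]
        self.exclude = config_dict["exclude"]


class TarHandler:
    """Stand-in handler that only stores its settings."""

    def __init__(self, config_dict):
        self.name = config_dict["name"]
        self.file_name = config_dict["fileName"]


# Hypothetical registry mapping a configuration 'type' to a handler class.
HANDLER_CLASSES = {"dirFilter": DirFilter, "tar": TarHandler}

# List order is the layering order, bottom of the stack first.
CONFIG = [
    {"type": "dirFilter", "name": "skip CVS", "exclude": ["CVS"]},
    {"type": "tar", "name": "archive", "fileName": "project.tar.gz"},
]


def build_stack(config):
    """Instantiate one handler per entry, preserving the configured order."""
    return [HANDLER_CLASSES[entry["type"]](entry) for entry in config]


# The structure is plain data, so it survives a round trip through a text
# format; a console tool and a GUI could both read and write it.
stack = build_stack(json.loads(json.dumps(CONFIG)))
```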



Feedback (use the email address at the top) will be greatly appreciated.  Do you find this framework useful?  What would you like to see added or improved?  Have you tried to implement a handler for your own purposes?



Credits

pychecker - Regularly used it.  Excellent tool! (http://pychecker.sourceforge.net)
epydoc - See it in action: API Docs (http://epydoc.sourceforge.net).
Many thanks to ActiveState for the developers forum that they provide through the Python Cookbook (http://aspn.activestate.com/ASPN/Cookbook).  I submitted a recipe that is the basic idea behind the design of this project (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302422).  Hopefully, the recipe will bring some feedback and get people involved in this project.