Pyfmf: A Project for File Management in Python
pyfmf is an open-source Python project for platform-independent file
management. The core of pyfmf is an extensible framework. A console-based
toolkit (zigo) and a GUI platform (zago, a wrapper around zigo) are
extensions of the framework. You can use the framework by itself, the
toolkit, or the graphical platform to build applications for archiving
files, searching files, measuring disk usage, etc. For instance, you can
build your own back-up utility using existing modules to select which
directories and which files to back up. You can even use existing modules
to detect which files have changed since the last back-up. Then you only
have to implement a module (with a few lines of Python code) that executes
the actual back-up operation (e.g., writing to a tape device).
The framework is based on handlers that can be layered (like streams
in Java, protocol layers in the OSI stack, or piped commands in Unix).
Each handler performs different actions on directories and files:
filtering them for the rest of the stack, gathering information and
possibly storing that information in a persistent data structure,
logging changes to the files, modifying files, archiving files, and so
on. The framework includes a base class for the handlers and the
controller that invokes the handlers while it walks through a tree of
directories.
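The layering described above can be illustrated with a minimal sketch. This is not the pyfmf source: the hook name handleDirHook, the classes FileFilter and Collector, and the walk function are assumptions made for illustration (only handleFileHook, finalizeHook and parentDir appear in the project's own example). The sketch assumes that a hook returning False stops an entry from reaching the handlers above it.

```python
import fnmatch
import os


class Handler:
    """Hypothetical base class: every hook has a pass-through default."""

    def __init__(self, name):
        self.name = name
        self.parentDir = None

    def handleDirHook(self, dirName):
        return True   # True lets the directory continue up the stack

    def handleFileHook(self, fileName):
        return True   # True lets the file continue up the stack

    def finalizeHook(self):
        return True


class FileFilter(Handler):
    """Filters in files whose base name matches a wildcard pattern."""

    def __init__(self, name, pattern):
        Handler.__init__(self, name)
        self.pattern = pattern

    def handleFileHook(self, fileName):
        return fnmatch.fnmatch(fileName, self.pattern)


class Collector(Handler):
    """Records the full path of every file that reaches it."""

    def __init__(self, name):
        Handler.__init__(self, name)
        self.paths = []

    def handleFileHook(self, fileName):
        self.paths.append(os.path.join(self.parentDir, fileName))
        return True


def walk(root, handlers):
    """Walk the tree under root, offering each entry to the stack in order."""
    for parentDir, dirNames, fileNames in os.walk(root):
        for handler in handlers:
            handler.parentDir = parentDir
        # Prune directories rejected by any handler so os.walk skips them.
        dirNames[:] = [d for d in sorted(dirNames)
                       if all(h.handleDirHook(d) for h in handlers)]
        for fileName in sorted(fileNames):
            # all() short-circuits: a False from one handler stops the stack.
            all(h.handleFileHook(fileName) for h in handlers)
    for handler in handlers:
        handler.finalizeHook()
```

Calling walk(root, [FileFilter('py-only', '*.py'), Collector('collect')]) would leave only the '*.py' files in the collector, because the filter sits below it in the stack.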
Several handlers are implemented:
- Several filters for directories. Filter in or filter out directories
based on the full path or only on the base names of the directories.
Directories to be filtered can be specified with wildcards.
- Several filters for files. Filter in or filter out files based on
the full path or on the base names of the files. Files to be
filtered can be specified with wildcards.
- A handler that stores a persistent data structure of directories,
files and file modification times. It can be used to detect
changes in the filesystem.
- A filter that loads a persistent data structure created by the
above mentioned handler and compares the present filesystem structure
to the old one. It flags new files, deleted files and files that have been
modified for the handlers above it in the stack.
- Handlers that log the directories and files into plain text files.
- Handlers that act as results viewers for zago.
- A handler for tar-ing files. It is very useful in combination with
some of the above filters: it can create a tar file of only selected
file types and it can exclude branches in a tree. It was created to
periodically tar all the relevant files in this project from a CVS
checkout (for instance, selecting all the '*.py' files but not the
'*.pyc' files and excluding the 'CVS' directories in the tree). It
probably just replaces a very complex tar command, but I couldn't
resist the irony of using pyfmf to package itself.
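The change-detection idea in the list above (store directories, files and modification times, then compare against the present filesystem) can be sketched as follows. The function names and the JSON storage format are assumptions chosen for illustration; the project's actual persistent data structure is not described here.

```python
import json
import os


def take_snapshot(root):
    """Map each file path (relative to root) to its modification time."""
    snapshot = {}
    for parentDir, _dirNames, fileNames in os.walk(root):
        for fileName in fileNames:
            fullPath = os.path.join(parentDir, fileName)
            relPath = os.path.relpath(fullPath, root)
            snapshot[relPath] = os.path.getmtime(fullPath)
    return snapshot


def save_snapshot(snapshot, path):
    """Persist the snapshot (JSON is an assumption, not pyfmf's format)."""
    with open(path, 'w') as f:
        json.dump(snapshot, f)


def load_snapshot(path):
    """Load a snapshot written by save_snapshot."""
    with open(path) as f:
        return json.load(f)


def compare_snapshots(old, new):
    """Return (new files, deleted files, modified files) as sorted lists."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, deleted, modified
```

A back-up run would load the previous snapshot, take a fresh one, pass both to compare_snapshots, and then save the fresh snapshot for the next run.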
New handlers are easy to implement. The base class has a few hook
methods that can be overridden, and it provides defaults for all of
them. A useful handler can be implemented by overriding as few as one
of the hooks. For example, here is how the tar-file handler could be
implemented:
import os
import tarfile

import baseClass


class Handler(baseClass.Handler):
    def __init__(self, configDict):
        baseClass.Handler.__init__(self, configDict['name'])
        fileName = configDict['fileName']
        # Open a gzip-compressed tar archive for writing.
        self.tarFile = tarfile.open(fileName, 'w:gz')

    def handleFileHook(self, fileName):
        # Add each file that reaches this handler to the archive.
        self.tarFile.add(os.path.join(self.parentDir, fileName))
        return True

    def finalizeHook(self):
        # Close the archive once the walk is finished.
        self.tarFile.close()
        return True
A few more lines have to be added for configuration.
Simple as it is, this handler shows its power when it is used in
combination with other handlers that filter directories and files.
Configuration files specify which handlers are layered, in which
order they are layered, and the configuration of each handler. Creating
or modifying a configuration is done differently in the console-based
toolkit (zigo) and in the graphical platform (zago), but both are based
on the same configuration structures, and the same configuration files
can be used with either tool.
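As a rough illustration of a configuration that names the handlers, their order, and their individual settings, here is a small sketch. The 'type' key, the 'pattern' key, the registry, and the stand-in classes are all assumptions; only the idea of a per-handler configDict with 'name' and 'fileName' entries comes from the tar-file example, and the real pyfmf configuration file format may look quite different.

```python
class LogHandler:
    """Stand-in handler; like the tar example, it is built from a configDict."""

    def __init__(self, configDict):
        self.name = configDict['name']
        self.fileName = configDict['fileName']


class FileFilter:
    """Stand-in filter configured with a wildcard pattern."""

    def __init__(self, configDict):
        self.name = configDict['name']
        self.pattern = configDict['pattern']


# Hypothetical registry mapping a 'type' key to a handler class.
HANDLER_TYPES = {'fileFilter': FileFilter, 'log': LogHandler}


def build_stack(config):
    """Instantiate handlers in the order the configuration lists them."""
    return [HANDLER_TYPES[entry['type']](entry) for entry in config]


# List order is layering order: the filter sits below the logger.
config = [
    {'type': 'fileFilter', 'name': 'py-only', 'pattern': '*.py'},
    {'type': 'log', 'name': 'logger', 'fileName': 'files.log'},
]
stack = build_stack(config)
```

Because the structure is plain data, the same configuration could in principle be produced by a console tool or a GUI and consumed unchanged by either.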
Feedback (use the email address at the top) will be greatly
appreciated. Do you find this framework useful? What would you like to
see added or improved? Have you tried to implement a handler for your
own purposes?
Credits
pychecker - Used it regularly. Excellent tool! (http://pychecker.sourceforge.net)
epydoc - See it in action: API Docs (http://epydoc.sourceforge.net).
Many thanks to ActiveState for the developer forum that they provide
through the Python Cookbook (http://aspn.activestate.com/ASPN/Cookbook).
I submitted a recipe that is the basic idea behind the design of this
project (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302422).
Hopefully, the recipe will bring some feedback and get people involved
in this project.