Home > pyCLiFF

pyCLiFF

PyCLiFF is a project mainly written in Python, based on the MIT license.

A simple framework to get rid of boilerplate crap when writing simple filter utilities for use from a *nix shell.

pyCLiFF, the Python Command Line Filter Framework is a simple tool designed to remove the boilerplate and annoyances from command line filters.

In nix there is a tradition of using pipes to connact many programs together when processing data. The programs are usually very single transformations on text. This text may come in from a file or from stdin, and leave via file or stdout. This pattern is ubiquitous in nix programming, allowing for very powerful command line usage. Such tools frequently (mostly?) follow the this pattern:

  1. Setup/handle arguments
  2. Mainloop: A. Read line from input file (stdin or file) B. Process that line in some way C. Output the result of processing.
  3. Cleanup and exit.

The above pattern results in many programs with a large portion of boilerplate code, and a few dozen lines of unique application code. pyCLiFF is an attempt to cover all the parts of a command line filter that are not unique. (Usually this means step B above).

This module is designed to be used in 2 ways:

  1. As a callback framework:

In this use case, the CLF object is instantiated by the user. During this instantiation, or during runtime, hook functions are provided to the CLF object. These hook functions will be called to handle particular events, or to do application specific tasks, such as handling arguments. This usage is fairly straight forward and very nice for simple modules. It does however tend to result in the use of globals, superfluous state keeping objects, or complicated closure mechanisms. This method, therefore, is reccommended only for short and simple scripts. See examples/head.py to see this usage. (It implements a very limited clone of the standard *nix utility head).

  1. As an inheritable object:

In more complex scripts, or those that need a lot of state kept, it may be more convenient to simply subclass CLF and override the relevant methods, keeping state in the CLF object itself. By doing this the mainloop may be customized if the above pattern is overly simplistic. The CLF class was designed with such inheritence in mind, and as such will be "friendly" to the inheritor: if a subclass implements hook functions, the CLF superclass will not override them, even if hook functions are provided in the call to init.

Overall this project's primary goal is to greatly speed up the process of writing one-off tools and quick data processing scripts. The complexities that do exist (such as optparse args, and the stderr messaging facility) are an acknowledgement to the fact that one-off tools can be difficult, and sometimes are required to grow into "real" programs.

Please see the examples, docs and code for more details. Contributors are of course welcome, please email me or contribute via github standard mechanisms.