I Dream In Code: My Coding Standards

Let me just state up front what I'm trying to achieve any time I'm writing code for use outside my own personal needs. In no particular order, though I've numbered them for later reference again, my goals for code I distribute are:

It should be as easy to use as I can make it;
It should be as hard to misuse as I can make it;
It should be as well- and consistently documented as I can make it;
It should be as general-purpose as I can make it while providing the functionality it was intended to provide;
It should be thoroughly tested;
It should be as unsurprising as I can manage to make it;

There are additional goals that apply to more specific ends that I won't go into detail on at present; these six are global, applying, I think, to any development-effort.

Over the past several years, these have shaken out into a few specific patterns/practices/guidelines that I apply to all my Python code:

The third principle, regarding documentation, is something that I'm going to address with a documentation module in one (or more) of the posts here. I'm also considering ways to integrate documentation-requirements into the thoroughly tested goal (#5), but I don't know how far I'm going to get with that, or if it'll even really be necessary, since I plan on incorporating the documentation-process into my code-templates as soon as I have it worked out.

The balance of these goals, though, have led me to formulate a set of guiding principles that I follow with any code that I write that's going to be available to anyone other than myself. I frequently adhere to them even for code that I'm not planning on sharing or distributing, particularly if I expect that I'll need to set it aside for any length of time — after a while, I'll be as much a stranger to my own code as someone who's never seen it before. At that point, any of the old adages about commenting your code (because the sanity you save may be your own) apply as much to the application of these principles:

Manage/control all public interface entities/members
Raise errors as close to their ultimate source as possible
If specific types are expected, test for those where they're expected
If specific types are expected, assure that some common type-identity is defined or available to use to test the expectation
Restrict inheritance until there is a demonstrable need for it
Leverage inheritance to keep functionality in one and only one place

The rationale (and some discussion about each) seems in order:

Manage/control all public interface entities/members

The public interface of a class is important. Ultimately, it controls all avenues of interaction with each and every instance of the class. Python doesn't have real protected- or private-member capabilities — it's enforced by convention (and name mangling for private members), which does not prevent access to those members. Taken together, these facts open up a whole range of possibilities for bad interactions with instance members, though probably only with respect to instance attributes (properties). In application, what this usually means is that I'll define all public properties of a class using the formal property provided by Python, with getter, setter and deleter methods. The class-templates that I've shown already (the ones that allow for concrete properties, at least) have already hinted at this in their structure. An implemented properties-structure would look something like this for a property named PropertyName:

    #####################################
    # Class attributes (and instance-   #
    # attribute default values)         #
    #####################################

    _propertyName = None

    #####################################
    # Instance property-getter methods  #
    #####################################

    def _GetPropertyName( self ):
        """
Gets the PropertyName property-value of the instance."""
        return self._propertyName

    #####################################
    # Instance property-setter methods  #
    #####################################

    def _SetPropertyName( self, value ):
        """
Sets the PropertyName property-value of the instance."""
        # TODO: Type-check the value argument, and raise an error on a 
        #       failure, or remove this comment.
        # TODO: Value-check the value argument, and raise an error on a 
        #       failure, or remove this comment.
        self._propertyName = value

    #####################################
    # Instance property-deleter methods #
    #####################################

    def _DelPropertyName( self ):
        """
"Deletes"" the PropertyName property-value of the instance by setting it to 
the default value specified in the ClassName attributes."""
        self._propertyName = self.__class__._propertyName

    #####################################
    # Instance Properties               #
    #####################################

    PropertyName = property( _GetPropertyName, None, None,
        'Gets the PropertyName value of the instance.' )

    #####################################
    # Instance Initializer              #
    #####################################
    def __init__( self ):
        """
Instance initializer"""
        # Call parent initializers, if applicable.
        # Set default instance property-values with _Del... methods as needed.
        self._DelPropertyName()
        # Set instance property values from arguments if applicable.

Raise errors as close to their ultimate source as possible;
If specific types are expected, test for those where they're expected; and
If specific types are expected, assure that some common type-identity is defined or available to use to test the expectation;

These tend to be expressed in my code as type- and value-checking of function and method arguments more than anything else. That, in turn, tends to look a lot like emulating static typing of values and arguments, as is found in Java and all of the real C variants. To be brutally honest, there's a fair amount of truth to that observation (accusation?) — but I strongly believe that there's a significant contribution to several of my goals (#2, #5 and maybe #6, though that's often situational).

It's not difficult to imagine a scenario where, say, an instance property-value gets set to some value, other code happens, and somewhere along the line, perhaps a lot further in, an error gets raised because that initial value-setting was invalid. This kind of thing happens all the time, especially during the development phase of a project. Although Python's error-reporting (and traceback) functionality is pretty solid, this sort of bug can be difficult and painful to deal with. If, on the other hand, the initial value-set process actively checks for valid values (by type and/or actual value) and raises an error right then and there, it doesn't have a chance to get buried in other code after silently setting up the imminent failure. While I like Python's dynamic typing, and I like not having to specify a type for each and every variable, property and argument, there are times when it's advantageous to do so, and this sort of scenario is, hands down, one of those times.

There are some trade-offs that have to be made, though, if the more open, dynamic nature of Python code is to be preserved while still actively type- and value-checking potential points of failure. First and foremost among them, I think, is that if type-checking is going to be used, there must be some very abstract types defined for any checks that aren't looking for baseline or primitive types. That is: If a method expects a number for a given argument, checking the type of that argument is easily managed by looking to see if the value is of a numeric type (int, long, float, etc.). If an argument is expecting some other type, say an XML node (whether it's a tag, a text-node, CDATA or a comment), there must be some low-level type or interface (maybe something called IsDOMNode for example) defined that can be used to actually check the incoming value-type. That's not difficult, but it requires planning and/or discipline to make sure it gets done, and I haven't come up with a way (so far) to test for the existence of baseline types during automated testing, so it's a completely manual process.

Restrict inheritance until there is a demonstrable need for it

This is, I gather, a pretty controversial practice, and not just in the realm of Python code. The typical argument against it that I see in Python-oriented discussions is usually some variant of the argument that we're all consenting adults here, so let me subclass as I see fit. OK. Fair enough. As the other consenting adult in this relationship, my concern would be that someone might want to subclass something in my code, even legitimately, that flat-out was not intended to be subclassed. That someone might well be me some time later, after I've long since forgotten why a class wasn't deemed safe or desirable to subclass. Since I'm planning on distributing the actual code, there's nothing to prevent it happening, but if it's going to happen, I'm going to require that doing so requires at least looking at the original code. I'll commit to making sure I explain why it's not intended to be extended in the code itself. After that, and hopefully after actually looking at the explanation, any other consenting adult can make their own decision, hopefully an informed one, with the understanding that they do so at their own risk.

Leverage inheritance to keep functionality in one and only one place

Python is unusual (maybe unique) in that it allows nearly unrestricted multiple inheritance — there are a few restrictions, to be sure, but they are nowhere near as stringent as the ones found in other languages, where, typically, any given class can only inherit from one other class, concrete or abstract, and any number of interfaces. Inheritance can be something of a bugaboo, breaking encapsulation (there are several explanations and examples of how this can happen that can be found easily – take your pick). I have settled on what I believe to be a pretty solid balance between using inheritance and preserving encapsulation by using small abstract classes as mixins for very specific, usually simple, pieces of common functionality. Technically, this still presents the inheritance-breaks-encapsulation risk, but if the functionality is simple, cohesive, and sensible, the ability to keep all of its code in one place is, to my thinking, a good trade-off for that risk.