Let me just state up front what I'm trying to achieve any time I'm writing code for use outside my own personal needs. In no particular order, though I've numbered them for later reference again, my goals for code I distribute are:
- It should be as easy to use as I can make it;
- It should be as hard to misuse as I can make it;
- It should be as well- and consistently documented as I can make it;
- It should be as general-purpose as I can make it while providing the functionality it was intended to provide;
- It should be thoroughly tested;
- It should be as unsurprising as I can manage to make it;
global,applying, I think, to any development-effort.
Over the past several years, these have shaken out into a few specific patterns/practices/guidelines that I apply to all my Python code:
The third principle, regarding documentation, is something that I'm going to
address with a documentation module in one (or more) of the posts here. I'm also
considering ways to integrate documentation-requirements into the thoroughly
tested
goal (#5), but I don't know how far I'm going to get with that, or if
it'll even really be necessary, since I plan on incorporating the
documentation-process into my code-templates as soon as I have it worked out.
The balance of these goals, though, have led me to formulate a set of guiding
principles that I follow with any code that I write that's going to be available
to anyone other than myself. I frequently adhere to them even for code that I'm
not planning on sharing or distributing, particularly if I expect that
I'll need to set it aside for any length of time — after a while, I'll be
as much a stranger to my own code as someone who's never seen it before.
At that point, any of the old adages about commenting your code (because the
sanity you save may be your own
) apply as much to the application of
these principles:
- Manage/control all public interface entities/members
- Raise errors as close to their ultimate source as possible
- If specific types are expected, test for those where they're expected
- If specific types are expected, assure that some common type-identity is defined or available to use to test the expectation
- Restrict inheritance until there is a demonstrable need for it
- Leverage inheritance to keep functionality in one and only one place
Manage/control all public interface entities/members
The public interface of a class is important. Ultimately, it controls all avenues of
interaction with each and every instance of the class. Python doesn't have
real
protected- or private-member capabilities — it's enforced by
convention (and name
mangling for private
members), which does not prevent
access to those members. Taken together, these facts open up a whole range of
possibilities for bad interactions with instance members, though probably only with
respect to instance attributes (properties). In application, what this usually means
is that I'll define all public properties of a class using the formal
property
provided by Python, with getter, setter and deleter methods.
The class-templates that I've shown already (the ones that allow for concrete
properties, at least) have already hinted at this in their structure. An implemented
properties-structure would look something like this for a property named
PropertyName
:
#####################################
# Class attributes (and instance- #
# attribute default values) #
#####################################
_propertyName = None
#####################################
# Instance property-getter methods #
#####################################
def _GetPropertyName( self ):
"""
Gets the PropertyName property-value of the instance."""
return self._propertyName
#####################################
# Instance property-setter methods #
#####################################
def _SetPropertyName( self, value ):
"""
Sets the PropertyName property-value of the instance."""
# TODO: Type-check the value argument, and raise an error on a
# failure, or remove this comment.
# TODO: Value-check the value argument, and raise an error on a
# failure, or remove this comment.
self._propertyName = value
#####################################
# Instance property-deleter methods #
#####################################
def _DelPropertyName( self ):
"""
"Deletes"" the PropertyName property-value of the instance by setting it to
the default value specified in the ClassName attributes."""
self._propertyName = self.__class__._propertyName
#####################################
# Instance Properties #
#####################################
PropertyName = property( _GetPropertyName, None, None,
'Gets the PropertyName value of the instance.' )
#####################################
# Instance Initializer #
#####################################
def __init__( self ):
"""
Instance initializer"""
# Call parent initializers, if applicable.
# Set default instance property-values with _Del... methods as needed.
self._DelPropertyName()
# Set instance property values from arguments if applicable.
Raise errors as close to their ultimate source as possible;
If specific types are expected, test for those where they're
expected; and
If specific types are expected, assure that some common
type-identity is defined or available to use to test the expectation;
These tend to be expressed in my code as type- and value-checking of function and
method arguments more than anything else. That, in turn, tends to look a lot
like emulating static typing of values and arguments, as is found in Java and all of
the real
C variants. To be brutally honest, there's a fair amount of truth to
that observation (accusation?) — but I strongly believe that there's a
significant contribution to several of my goals (#2, #5 and maybe #6, though that's
often situational).
It's not difficult to imagine a scenario where, say, an instance property-value gets set to some value, other code happens, and somewhere along the line, perhaps a lot further in, an error gets raised because that initial value-setting was invalid. This kind of thing happens all the time, especially during the development phase of a project. Although Python's error-reporting (and traceback) functionality is pretty solid, this sort of bug can be difficult and painful to deal with. If, on the other hand, the initial value-set process actively checks for valid values (by type and/or actual value) and raises an error right then and there, it doesn't have a chance to get buried in other code after silently setting up the imminent failure. While I like Python's dynamic typing, and I like not having to specify a type for each and every variable, property and argument, there are times when it's advantageous to do so, and this sort of scenario is, hands down, one of those times.
There are some trade-offs that have to be made, though, if the more open, dynamic
nature of Python code is to be preserved while still actively type- and
value-checking potential points of failure. First and foremost among them, I think,
is that if type-checking is going to be used, there must be some very
abstract types defined for any checks that aren't looking for baseline or primitive
types. That is: If a method expects a number for a given argument, checking the type
of that argument is easily managed by looking to see if the value is of a numeric
type (int, long, float, etc.). If an argument is expecting some other type,
say an XML node (whether it's a tag, a text-node, CDATA or a comment), there
must be some low-level type or interface (maybe something called
IsDOMNode
for example) defined that can be used to actually
check the incoming value-type. That's not difficult, but it requires
planning and/or discipline to make sure it gets done, and I haven't come up with a
way (so far) to test for the existence of baseline types during automated testing,
so it's a completely manual process.
Restrict inheritance until there is a demonstrable need for it
This is, I gather, a pretty controversial practice, and not just in the realm of
Python code. The typical argument against it that I see in Python-oriented
discussions is usually some variant of the argument that we're all consenting
adults here, so let me subclass as I see fit.
OK. Fair enough. As the other
consenting adult
in this relationship, my concern would be that someone might
want to subclass something in my code, even legitimately, that flat-out was not
intended to be subclassed. That someone
might well be me some
time later, after I've long since forgotten why a class wasn't deemed safe
or desirable to subclass. Since I'm planning on distributing the actual code,
there's nothing to prevent it happening, but if it's going to happen, I'm going to
require that doing so requires at least looking at the original code. I'll
commit to making sure I explain why it's not intended to be extended in
the code itself. After that, and hopefully after actually looking at the
explanation, any other consenting adult
can make their own decision,
hopefully an informed one, with the understanding that they do so at
their own risk.
Leverage inheritance to keep functionality in one and only one place
Python is unusual (maybe unique) in that it allows nearly unrestricted
multiple inheritance — there are a few restrictions, to be sure,
but they are nowhere near as stringent as the ones found in other languages, where,
typically, any given class can only inherit from one other class, concrete
or abstract, and any number of interfaces. Inheritance can be something of a
bugaboo, breaking encapsulation (there are several explanations and examples of how
this can happen that can be found easily – take
your pick). I have settled on what I believe to be a pretty solid balance between
using inheritance and preserving encapsulation by using small abstract classes
as mixins
for very
specific, usually simple, pieces of common functionality. Technically, this still
presents the inheritance-breaks-encapsulation risk, but if the functionality is
simple, cohesive, and sensible, the ability to keep all of its code in one
place is, to my thinking, a good trade-off for that risk.