I've been a developer (mostly web-applications) for longer now than I'd care to admit. In the nearly two decades I've been doing this kind of thing, I've acquired several habits that could be considered coding standards that I adhere to, even if they may not be typical best practices in the industry at large. Some of them are, I think, though I've had experience professionally where they were acknowledged but not followed, or set aside because of time/budget priorites. All of the following, though, I consider to be important enough that I plan to follow them while I work through the code I'm presenting here.
Yes, I Know Python's not Java...
Let me get this out of the way first...
Over the last several years, as I've contemplated various programmatic constructs and concepts and how to implement them in Python, as I've Googled them I've run across any number of discussions that follow a pattern like:
- How can I [do something] in Python?
- You shouldn't, It's not
Pythonic.
- How can I [do something] in Python?
- You shouldn't, it violates duck-typing.
- How can I [do something] in Python?
- Why would you want to? Python's not Java
pythonic,trying to figure out what, exactly, that really means is not easy, and is often at least a little bit inconsistent from one person's viewpoint to another. I did, eventually, run across an article that makes a solid attempt to define what
pythonicreally means, and it's worth a read, in my opinion. I noted with some interest, though, that one of the first things stated was that
before going on to actually explain and discuss what some of the idioms of Python actually are, and how they've evolved over time (and, maybe, are continuing to evolve). There is, maybe, enough insight into the then-current idioms discussed there to get a good handle on recognizing other, perhaps newer idioms.Pythonicmeans something likeidiomatic Python,but now we'll need to describe what that actually means.
That same article also mentions The Zen of Python, which can be found by
executing import this
in a Python shell (numbered for later reference):
The Zen of Python, by Tim PetersThese may be good general characteristics for identifying some of the
- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Complex is better than complicated.
- Flat is better than nested.
- Sparse is better than dense.
- Readability counts.
- Special cases aren't special enough to break the rules.
- Although practicality beats purity.
- Errors should never pass silently.
- Unless explicitly silenced.
- In the face of ambiguity, refuse the temptation to guess.
- There should be one – and preferably only one – obvious way to do it.
- Although that way may not be obvious at first unless you're Dutch.
- Now is better than never.
- Although never is often better than right now.
- If the implementation is hard to explain, it's a bad idea.
- If the implementation is easy to explain, it may be a good idea.
- Namespaces are one honking great idea – let's do more of those!
Pythonicidioms. They are cited frequently enough (usually without explanation or context as applied to answering a
How can I [do something] in Pythonquestion) that it appears many believe them to be adequate.
<soapbox>
But, particularly with respect to practicality beating purity,
(#9), there
are other considerations that I believe are at least as important. Perhaps
even more important, if the alternative is code that is unusable, brittle
or difficult to maintain.
</soapbox>
My Coding Imperatives
In general, I believe that all code should be written to meet the following criteria (setting aside any specific functionality or implementation guidelines). They aren't in any particular order, but I've numbered them for reference later:
- It should be as easy to use as possible;
- It should be as hard to misuse as possible;
- It should be as well- and consistently documented as possible;
- It should be as general-purpose as possible, balanced against the functionality it was intended to provide;
- It should thoroughly tested;
- It should be as unsurprising as possible;
The third item (regarding documentation) is something that I'm going to
address with a documentation module in one (or more) of the posts here. I'm also
considering ways to integrate documentation-requirements into the thoroughly
tested
item (#5), but I don't know how far I'm going to get with that, or if
it'll even really be necessary, since I plan on incorporating the
documentation-process into code-templates as soon as I have it worked out.
Guiding Principles
The balance of those items, though, have led me to formulate a set of guiding
principles that I follow with any code that I write that's going to be available
to anyone other than myself. I frequently adhere to them even for code that I'm
not planning on sharing or distributing, particularly if I expect that
I'll need to set it aside for any length of time — after a while, I'll be
as much a stranger to my own code as someone who's never seen it before.
At that point, any of the old adages about commenting your code (because the
sanity you save may be your own
) apply as much to the application of
these principles:
- Manage/control all public interface entities/members
- Raise errors as close to their ultimate source as possible
- If specific types are expected, test for those where they're expected
- If specific types are expected, assure that some common type-identity is defined or available to use to test the expectation
- Restrict inheritance until there is a demonstrable need for it
- Leverage inheritance to keep functionality in one and only one place
The specific shape of these — how they are implemented — may vary significantly depending on the language or platform(s) in use. They may even take some of their final shape from business- or project-level requirements. Here's how I see them applying to what I'm planning to do here, and specifically in Python...
Manage/control all public interface entities/members
The public interface of a class is important. Ultimately, it controls all avenues of
interaction with each and every instance of the class. Python doesn't have
real
protected- or private-member capabilities — it's enforced by
convention (and name
mangling for private
members), which does not prevent
access to those members. Taken together, these facts open up a whole range of
possibilities for bad interactions with instance members, though probably only with
respect to instance attributes (properties). In application, what this usually means
is that I'll define all public properties of a class using the formal
property
provided by Python, with getter, setter and deleter methods.
The class-templates that I've shown already (the ones that allow for concrete
properties, at least) have already hinted at this in their structure. An implemented
(but very basic) properties-structure would look something like this for a property
named PropertyName
somewhere in a class:
_propertyName = None # The internal storage of the property
def _GetPropertyName( self ):
"""
Gets the PropertyName property-value of the instance."""
return self._propertyName
def _SetPropertyName( self, value ):
"""
Sets the PropertyName property-value of the instance."""
# TODO: Type-check the value argument, raising an error if bad.
# TODO: Value-check the value argument, raising an error if bad.
self._propertyName = value
def _DelPropertyName( self ):
"""
"Deletes" the PropertyName property-value of the instance by setting it
to None."""
self._propertyName = None
PropertyName = property( _GetPropertyName, None, None,
'Gets the PropertyName value of the instance.' )
In my expereience, most property-getter methods (_GetPropertyName
)
that don't refer to some other object, or that don't implement some sort of
lazy instantiation
or -loading strategy are very naïve. They almost have to be, though,
since all they usually need to do is retrieve the interal-storage value that the
property wraps, and outside of a few special cases, I rarely find myself doing
more than this example shows. Most of my deleter-methods, barring variations in
the deleted
value being set are also pretty naïve...
The setter-method shown (_SetPropertyName
) is also naïve as shown,
but that naiveté would go away as soon as the type- and value-checks noted in the
TODO
comments were implemented. The implementations of those checks ties
directly into my next guiding principle, so I'll cover that in more detail shortly.
At least one of those checks (checking for types, usually) shows up in almost
every setter-method I write, eventually.
Raise errors as close to their ultimate source as possible;
If specific types are expected, test for those where they're
expected; and
If specific types are expected, assure that some common
type-identity is defined or available to use to test the expectation;
These tend to be expressed in my code as type- and value-checking of function and
method arguments more than anything else. That, in turn, tends to look a lot
like emulating static typing of values and arguments, as is found in Java and all of
the real
C variants. To be brutally honest, there's a fair amount of truth to
that observation (accusation?) — but I strongly believe that there's a
significant contribution to several of my goals (#2, #5 and maybe #6, though that's
often situational). A simple implementation, using the same setter-method shown
above, and checking for a non-empty string or unicode value or None would look
something like this:
def _SetPropertyName( self, value ):
"""
Sets the PropertyName property-value of the instance."""
if value != None and type( value ) not in ( str, unicode ):
raise TypeError( '%s.PropertyName expects a non-empty str or '
'unicode value, or None, but was passed "%s" (%s).' % (
self.__class__.__name__, value,
type( value ).__name__
)
)
if value != None and not value:
raise ValueError( '%s.PropertyName expects a non-empty str '
'or unicode value, or None, but was passed "%s" (%s).' % (
self.__class__.__name__, value,
type( value ).__name__
)
)
self._propertyName = value
It's not difficult to imagine a scenario where, say, an instance property-value gets set to some value, other code happens, and somewhere along the line, perhaps a lot further in, an error gets raised because that initial value-setting was invalid. This kind of thing happens all the time, especially during the development phase of a project. Although Python's error-reporting (and traceback) functionality is pretty solid, this sort of bug can be difficult and painful to deal with. If, on the other hand, the initial value-set process actively checks for valid values (by type and/or actual value) and raises an error right then and there, it doesn't have a chance to get buried in other code after silently setting up the imminent failure. While I like Python's dynamic typing, and I like not having to specify a type for each and every variable, property and argument, there are times when it's advantageous to do so, and this sort of scenario is, hands down, one of those times.
There are some trade-offs that have to be made, though, if the more open, dynamic
nature of Python code is to be preserved while still actively type- and
value-checking potential points of failure. First and foremost among them, I think,
is that if type-checking is going to be used, there must be some very
abstract types defined for any checks that aren't looking for baseline or primitive
types. That is: If a method expects a number for a given argument, checking the type
of that argument is easily managed by looking to see if the value is of a numeric
type (int, long, float, etc.). If an argument is expecting some other type,
say an XML node (whether it's a tag, a text-node, CDATA or a comment), there
must be some low-level type or interface (maybe something called
IsDOMNode
for example) defined that can be used to actually
check the incoming value-type. That's not difficult, but it requires
planning and/or discipline to make sure it gets done.
Restrict inheritance until there is a demonstrable need for it
This is, I gather, a pretty controversial practice, and not just in the realm of
Python code. The typical argument against it that I see in Python-oriented
discussions is usually some variant of the argument that we're all consenting
adults here, so let me subclass as I see fit.
OK. Fair enough. As the other
consenting adult
in this relationship, my concern would be that someone might
want to subclass something in my code, even legitimately, that flat-out was not
intended to be subclassed. That someone
might well be me some
time later, after I've long since forgotten why a class wasn't deemed safe
or desirable to subclass. Since I'm planning on distributing the actual code,
there's nothing to prevent it happening, but if it's going to happen, I'm going to
require that doing so requires at least looking at the original code. I'll
commit to making sure I explain why it's not intended to be extended in
the code itself. After that, and hopefully after actually looking at the
explanation, any other consenting adult
can make their own decision,
hopefully an informed one, with the understanding that they do so at
their own risk.
Leverage inheritance to keep functionality in one and only one place
Python is unusual (maybe unique) in that it allows nearly unrestricted
multiple inheritance — there are a few restrictions, to be sure,
but they are nowhere near as stringent as the ones found in other languages, where,
typically, any given class can only inherit from one other class, concrete
or abstract, and any number of interfaces. Inheritance can be something of a
bugaboo, breaking encapsulation (there are several explanations and examples of how
this can happen that can be found easily – take
your pick). I have settled on what I believe to be a pretty solid balance between
using inheritance and preserving encapsulation by using small abstract classes
as mixins
for very
specific, usually simple, pieces of common functionality. Technically, this still
presents the inheritance-breaks-encapsulation risk, but if the functionality is
simple, cohesive, and sensible, the ability to keep all of its code in one
place is, to my thinking, a good trade-off for that risk.
Less-Imperative Items and Next Steps
I generally like to have templates and/or snippets set up for development, so
that I don't have to spend (read as waste
) a lot of time creating starting-points
for common items like modules and classes. I mentioned incorporating the
documentation-process into code-templates
earlier, but before I can lock down
any of the starting-point templates noted above, I need to work out how I'm
going to provide in-code documentation so that it can be included in those templates.
There's a fair amount of thought that has to go into that, and I'm not sure if I'll
be able to get any real
code interspersed into it, but I'm going to try...
No comments:
Post a Comment