Before diving into the markup module of the idic framework, I wanted to take a quick look at coding standards from an "official" perspective/standpoint. Specifically Python's PEP-8 standards, since they've been a topic of some interest for me recently. I'm jumping the gun a bit — as I'm writing this, most of the markup module is actually mostly complete, but I haven't posted any of it, but bear with me, please...
Today's post ended up being something that felt better as a global page in the blog (I'll likely refer to it from time to time elsewhere/later), so I've copied it as such: PEP-8 vs. My Coding Standards. As I update my thoughts on this topic, that page will get updated…
If you've been reading the code I've been sharing here, and are aware of the PEP-8
standard, you may have noticed that I'm not adhering to all of the PEP-8 guidelines.
A few of them I've consciously chosen to not abide by, for various reasons that I
feel fall into its A
Foolish Consistency is the Hobgoblin of Little Minds
criteria. A handful
(I think) are hold-overs from working in projects that had very specific (sometimes
rigid) standards for variable names. Those habits were formed over the course of
years. In one case that helped form those habits, there was an even expectation that
the build-process for the associated codebase would actively check for compliance
with those style-guidelines, and if they weren't followed, the build would fail.
While that helped reinforce those guidelines, as well as they habits they formed,
I think that took things just a bit too far.
That same hobgoblins
section raises some thoughts for me about application
of rules around coding standards (naming-conventions in particular). Setting aside
my admitted habits, I tend to think along lines that I thought were best expressed
by a favorite author as:
Look, that's why there's rules, understand? So that you think before you break 'em.PEP-8 itself sets some priorities that I feel fall pretty much in line with that:
— Terry Pratchett, from Thief of Time
Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important. (emphasis mine).Yes, there are rules. Yes, they are there for a reason. But they aren't set in stone, and there may be valid reasons to bend or even break them.
Still, style-guidelines are valuable, particularly in an environment where more
than one person is going to be in the position of reading the code. Since I know
I have some habits that would be considered bad
by PEP-8 standards, I'm going
to run down the list, noting which ones I've kept, which ones I've dropped (and why),
and which ones I should probably adopt that I missed somewhere along the line.
Code Layout
Many of the PEP-8 standards about code layout I follow to the best of my ability to do so:
- Indentation of 4 spaces per level Yes
- Using spaces for indentation, not tabs Yes
- Line length of 79 characters max: Usually
- Line-lengths of 72 characters for long/multiline strings: Probably not — but they will conform to the more general 79-character line-length
- Line-breaks before operators
- Blank-line usage
- Imports (though I think I'm pretty consistently adhering, I haven't gone and looked)
- Wildcard imports (probably more consistently not adhering, simply because the IDE I'm using isn't smart enough to add individual import-items in to the code, even though it is smart enough to recognize they exist)
The one remaining item in the Code Layout standards (Module-level "dunder" names), I know I'm not adhering to. They show up in my module- and package-header template files:
#-----------------------------------#
# File metadata #
#-----------------------------------#
__author__ = 'Brian D. Allbee'
__version__ = '0.1'
__copyright__ = 'Copyright 2017, Some rights reserved by Brian D. Allbee'
__license__ = ('This work is licensed under a Creative Commons '
'Attribution-ShareAlike 4.0 International License. '
'(http://creativecommons.org/licenses/by-sa/4.0/)')
__credits__ = ['Brian D. Allbee']
__maintainer__ = 'Brian D. Allbee'
__email__ = 'brian.allbee+module_name@gmail.com'
__status__ = 'Development'
I don't know whether I'll bother with changing those, frankly. On one hand, I find
that I can more easily read these they way they are right now, though so far I
haven't had much need to read them. On the other hand... Well, I don't
really have an on the other handscenario that I can point to.
String Quotes
I suspect that most developers working in Python will have naturally picked up the guidelines here — Using one type of quote (whether single or double) is probably habitual, and switching to the other to avoid having to escape the preferred quote is something that's pretty easy, provided that you know it's going to happen when you start typing.
Me, I tend to use single-quotes out of habit, and I stick to that pretty well. In the code I've shown so far, there's one pattern where I suspect I break from that pretty consistently: The descriptions of arguments and such in my documentation decorators. That's usually going to happen because I don't know if I'll need to use single- or double-quotes within the description-string until I've written it, by which time it's almost always going to be faster to escape the one or two characters in the string (or, sometimes, to just type \ then and there) than to replace the characters at the ends of the string.
Whitespace in Expressions and Statements
Spaces inside (), [] and {} are going to be a pretty consistent offense (if such
it is) on my part. I'm not likely to change this, I'm afraid, at least not soon,
because while readability counts in general (PEP-20/The
Zen of Python [execute import this
in a Python session]), the simple
truth of the matter is that right now my ability to read
it counts a bit more than anyone else's, and I find it easier to read with
those spaces. If things get to a point where anyone but me is actually using
my code, I may set something up in the build-process to strip out those additional
spaces, but I don't expect that to happen for a while.
Separate, but related: Odds are good that I've got several instances of whitespace at the ends of lines, if only because as I'm shuffling code around, I usually just don't think about removing any that sneak in.
Comments
Keeping comments current and relevant is something that I try to do, but I'd be very surprised if there weren't at least a few here and there where something changed but an associated comment didn't get changed accordingly. The longer a codebase is undergoing active development, the more likely that sort of disjoint is, I suspect.
Block-comments are usually going to be OK, I think — though there will, I'm
sure, be chunks of comments that won't conform because they're items I need to
differentiate quickly and easily. Usually that will be chunks of actual code,
commented out because it's not needed for functionality reasons, but could be useful
for testing or logging while code-changes are being made. In those cases, they'll
usually be commented out at the beginning of the line, making it easier to differentiate
between a real
comment and some code that's been commented out but kept for
some reason.
I try not to use inline comments, though there are probably a few here and there in some of my older code and projects. I doubt that there are many in the code here, if only because it's just not old enough for them to start accumulating yet.
Documentation Strings
May be moot, since I plan to use my documentation decorators in all my code, and they will generate results that are more consistent (and more verbose) by design.
Naming Conventions
If there is any single category of guidelines that my code is going to violate,
this is going to be it, but there's a reason behind it. The target audience, such
as there is one, for most of the code I'm expecting to write for this blog could
be described pretty well as web application developers.
To my thinking, that means
there's a pretty good chance that those target-audience users will be familiar with
JavaScript and its conventions. Some classes (particularly in the markup-generation
parts of the framework) I actually want to mirror their JavaScript equivalents
as closely as makes sense. JavaScript uses CapWords
(or CamelCase) in its
class-names, and mixedCase
for object properties and methods (using the names
of the styles as
listed in PEP-8). In portions of the framework, then, that overlap that desire to
mirror a JavaScript API equivalent, I'm using naming-conventions that will be more
familiar to developers with JavaScript experience. That's a conscious choice on my
part, and since I expect/hope that most of this code will be used by members of that web
application development
group, it has leaked
out across to other sections
of code that aren't as directly related...
Outside of those JavaScript-related
cases:
- Short, all-lower-case names for modules and packages — Yes;
CapWords
for classes (which holds true in both contexts) — Yes;- Function and class-member names — No (but there's a reason)
CapWordsnaming, because there will be cases where classes have
JavaScriptand
non-JavaScriptmembers alike, and I'd like to have some sort of readability-level hinting in those member-names indicating whether they're JavaScript equivalents or not. The recommended convention (
lower_case_with_underscores
) I find to be less
readable in general, particularly in cases where those members exist in
a class that also has JavaScript-equivalent members. It's not a great compromise,
to be sure, but I think it's the best compromise under the circumstances.
Given that the framework is all one project, and that
Consistency within a project is more important.that more or less mandates that all of the framework's codebase should at least prefer my non-standard, non-PEP-8-compliant naming conventions for the sake of consistency. Applications built on top of that framework may not share that preference, though, and shouldn't reflexively abide by them.
Programming Recommendations
PEP-8 also contains a pretty substantial list of Programming Recommendations. Many of them are moot for my efforts, at least so far — they relate to capabilities of Python that I'm simply not using (yet). Most of the rest I stick to pretty well, I think, though even in those cases, there may be additional relevant information or concerns:
- Singleton comparison (e.g.,
x is not None
vs.x != None
): Probably moot, since I generally don't use None except in boolean-compatible ways - Deriving custom exceptions from Exception: Yes
(to be honest, I wasn't aware that there was an underlying exception-class
deeper than
Exception
, so I pretty much got lucky in this case) - Catching specific exceptions: Yes,
though I know there are a few instances of
bare
code that I'll need to track down and reconcile by specifying the baseexcept
Exception
- Newer exception-binding (e.g.,
except Exception as error
vs.except Exception, err
). I hadn't seen this structure until I started this review, so I'll probably need to go looking for the older-style cases and correcting them. - Making sure
try
/except
/finally
is used to assure resource clean-up where it's relevant. I'm pretty sure I'm already doing that (out of habit), but another look-through wouldn't hurt
Checking Compliance with pylint
There's a tool called pylint
that can be used to check and rate
Python code. It checks a variety of items across an entire code-base, including many
(if not all) of the various PEP-8 standards, and generates a report of all the various
items that it finds, plus a rating
of the code it analyzed on a 1-to-10 scale.
Knowing that I have a slew of varied violations
likely, I was expecting it
to pretty much report that the code I was writing was somewhere in what might be
called a horrible
state, and that expectation was borne out when I ran it
against the idic
package just before the markup
module
was finished:
% errors / warnings by module ----------------------------- +-------------------+------+--------+---------+-----------+ |module |error |warning |refactor |convention | +===================+======+========+=========+===========+ |idic.markup |56.00 |39.89 |39.53 |49.95 | +-------------------+------+--------+---------+-----------+ |idic.doc_metadata |44.00 |42.13 |44.19 |30.71 | +-------------------+------+--------+---------+-----------+ |idic.unit_testing |0.00 |12.92 |11.63 |14.70 | +-------------------+------+--------+---------+-----------+ |idic.serialization |0.00 |5.06 |4.65 |4.63 | +-------------------+------+--------+---------+-----------+ Messages -------- +-------------------------------+------------+ |message id |occurrences | +===============================+============+ |bad-whitespace |5259 | +-------------------------------+------------+ |bad-continuation |1319 | +-------------------------------+------------+ |trailing-whitespace |1212 | +-------------------------------+------------+ |invalid-name |538 | +-------------------------------+------------+ |protected-access |91 | +-------------------------------+------------+ |unidiomatic-typecheck |88 | +-------------------------------+------------+ |line-too-long |27 | +-------------------------------+------------+ |(There are more) | +--------------------------------------------+The
bad-whitespace
, trailing-whitespace
and invalid-name
items were more or less expected to be widespread, for the reasons I mentioned above.
I'll have to look at the details on the bad-continuation
items, to see
if they are really a problem, or if they're something in my coding-style that I'm
doing for an identifiable reason. The protected-access
items are, I'm
sure, places where I'm directly accessing a protectedmember of one class-instance from within an instance of another class, probably in cases where I need to set a property-value in the first from within the second, and that property is read-only from a public-interface perspective, but I'll take a deeper look at some point later to determine if that's really a problem. The
unidiomatic-typecheck
items will warrant a closer look as well, since I'm not sure what those entail.
At and below the count/level of the line-too-long
items, I'm not terribly
concerned — so long as the code is functional and testable, it seems likely
that they are minor, tweaky little things. That said, there are 45
reported item-types, and 224 individual items, so those will
probably warrant a closer look before I wrap up the modules in the current idic
package-structure. The final rating,using
pylint
defaults
is reported as:
Global evaluation ----------------- Your code has been rated at -18.45/10
On a scale from 0 to 10, -18.45 sounds pretty bad, but I suspect
that a lot of that bad raing will go away once the bad-whitespace
,
trailing-whitespace
and invalid-name
items are ignored.
Doing that is pretty straightforward, and can even be set up so that the rule-set for
the idic
package is all stored in one place, so that a pylint
run that has specific exceptions relating to the idic
code-base
won't contaminate
pylint
runs against other packages
and modules:
- Create a custom pylint configuration-file:
$ pylint --generate-rcfile > pylint.idic.rc
- Add the
bad-whitespace
,trailing-whitespace
,invalid-name
, andprotected-access
messages to thedisable
list (I prepended hem in my case, since there were already quite a few in the list generated in thepylint.idic.rc
file generated bypylint
itself):[MESSAGES CONTROL] # ... # Disable the message, report, category or checker with the given id(s). You # can either give multiple identifiers separated by comma (,) or put this # option multiple times (only on the command line, not in the configuration # file where it should appear only once). # ... disable=bad-whitespace,trailing-whitespace,invalid-name,protected-access # ...
- Re-run
pylint
using the package-specific configuration-file:pylint --rcfile=pylint.idic.rc idic
% errors / warnings by module ----------------------------- +-------------------+------+--------+---------+-----------+ |module |error |warning |refactor |convention | +===================+======+========+=========+===========+ |idic.markup |56.00 |55.17 |39.53 |63.63 | +-------------------+------+--------+---------+-----------+ |idic.doc_metadata |44.00 |26.44 |44.19 |16.95 | +-------------------+------+--------+---------+-----------+ |idic.unit_testing |0.00 |10.34 |11.63 |14.67 | +-------------------+------+--------+---------+-----------+ |idic.serialization |0.00 |8.05 |4.65 |4.76 | +-------------------+------+--------+---------+-----------+ Messages -------- +-------------------------------+------------+ |message id |occurrences | +===============================+============+ |bad-continuation |1319 | +-------------------------------+------------+ |unidiomatic-typecheck |88 | +-------------------------------+------------+ |line-too-long |27 | +-------------------------------+------------+ |(There are more) | +--------------------------------------------+... and a
ratingof 4.38:
Global evaluation ----------------- Your code has been rated at 4.38/10 (previous run: -18.45/10, +22.83)If all of the various
bad-continuation
items are either corrected,
or deemed to be something I'm doing for a reason, and go away (simulated by adding
that message to the configuration-file's disable
list), the results are still better:
% errors / warnings by module ----------------------------- +-------------------+------+--------+---------+-----------+ |module |error |warning |refactor |convention | +===================+======+========+=========+===========+ |idic.markup |56.00 |55.17 |39.53 |48.28 | +-------------------+------+--------+---------+-----------+ |idic.doc_metadata |44.00 |26.44 |44.19 |31.61 | +-------------------+------+--------+---------+-----------+ |idic.unit_testing |0.00 |10.34 |11.63 |17.82 | +-------------------+------+--------+---------+-----------+ |idic.serialization |0.00 |8.05 |4.65 |2.30 | +-------------------+------+--------+---------+-----------+ Messages -------- +-------------------------------+------------+ |message id |occurrences | +===============================+============+ |unidiomatic-typecheck |88 | +-------------------------------+------------+ |line-too-long |27 | +-------------------------------+------------+ |(There are more) | +-------------------------------+------------+and a rating of
Global evaluation ----------------- Your code has been rated at 8.62/10 (previous run: 4.38/10, +4.24)I'm going to leave the
bad-continuation
items in place until I have
a chance to look at them in more detail, though.
From my perspective, knowing that my coding-practices are going to generate
some pylint
errors and warnings, I'm OK with disabling some of them
(at least so far). If I were using a different IDE, one that handled certain tasks
differently (I won't go so far as to say better
), or if the code-base that
I'm writing weren't intended to break some of the rules, it might well
be a very different story.
I'd also give some serious consideration to including a pylint
run
as part of a quality-check during a build or commit process, assuming that I could
figure out a good/useful way to integrate them. For now, I'll just run pylint
periodically to see what the results look like, and as a check for any egregious
omissions or errors on my part...
No comments:
Post a Comment