The argument documentation-metadata creation process is technically usable at this point, though not as bullet-proof as I'd like yet. It is completely feasible to leave it as it is now, and it'd provide the ability to document function- and method-arguments like so:
# An example of documentation decorators on a function
@describe.argument( 'arg1', 'Description of arg1' )
@describe.argument( 'arg2', 'Description of arg2' )
@describe.argument( 'arg3', 'Description of arg3' )
def MyFunction( arg1, arg2, arg3=None ):
    """
    Description of function (original docstring)"""
    # TODO: Generate actual implementation here...
    raise NotImplementedError( 'MyFunction is not yet implemented' )
The resulting _documentation.arguments from that decoration is, presently, a standard Python dict, and would look like this:
{
    'arg1':{
        'description': 'Description of arg1',
        'expects': (<type 'object'>,),
        'hasDefault': False,
        'name': 'arg1'
    },
    'arg2':{
        'description': 'Description of arg2',
        'expects': (<type 'object'>,),
        'hasDefault': False,
        'name': 'arg2'
    },
    'arg3':{
        'defaultValue': None,
        'description': 'Description of arg3',
        'expects': (<type 'object'>,),
        'hasDefault': True,
        'name': 'arg3'
    }
}
My one remaining (major?) concern is that the underlying dict is mutable. That is, it's possible to create new elements in it, or overwrite existing ones, with invalid data. It doesn't feel likely that creation or over-writing of that data would happen with both intent and invalid structure, but invalid accidental changes are always possible, and my gut feeling is that they are much more likely to occur. The concern is that if such a change (intentional or accidental) occurs, any errors that occur as a result will not surface until the invalid data is being read, by which time the cause of the error is potentially quite a long way from where the error would actually surface. This potential is, in fact, another variant of the rationales behind my previously-noted "manage/control all public interface entities/members" and "raise errors as close to their ultimate source as possible" principles.
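To make that concern concrete, here is a minimal sketch of a bad write going unnoticed until a much later read. It uses a plain dict with made-up data, the describe function is a hypothetical stand-in for whatever eventually reads the metadata, and it is written so that it also runs on Python 3:

```python
# A plain dict standing in for _documentation.arguments (hypothetical data).
arguments = {
    'arg1': {
        'name': 'arg1',
        'description': 'Description of arg1',
        'expects': (object,),
        'hasDefault': False,
    },
}

# An accidental, structurally invalid write: no error is raised here...
arguments['arg2'] = 'Description of arg2'


def describe(metadata):
    """Reads the metadata the way a documentation formatter might."""
    return '%s: %s' % (metadata['name'], metadata['description'])


# ...the failure only surfaces later, when the bad entry is finally read.
try:
    describe(arguments['arg2'])
    failure = None
except TypeError as error:
    failure = error
```

The assignment and the failure can be arbitrarily far apart in real code, which is exactly the distance I'd like to eliminate.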
So: How should this kind of scenario be handled? The underlying dict data-structure is, by its very nature, intended to be mutable, and capable of using any immutable type-value as a key and any value-type, immutable or otherwise, as values, with no type- or value-checking performed for either. For the purposes that this argument-dictionary is going to be used, the idea of managing its public interface/members pretty much requires that:
- Any required keys (name and description) that are not provided should raise a KeyError;
- Any supplied key that is not recognized should also raise a KeyError;
- Any value supplied for a recognized key that contains invalid value-types should raise a TypeError;
- Any key that is not supplied and that has a default value (expects only, at present) should set that default value; and
- Any attempt to replace an existing key should raise some kind of error (TBD).
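The last of those items is there because replacing an existing key would most likely mean duplicate describe.argument calls happening for the same argument, and I'd rather such an effort raise an error than potentially corrupt the documentation metadata.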
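As a rough sketch of the first four rules, a stand-alone checking function might look something like the following. The function name, the constant names and the specific type-checks are assumptions made for illustration, not part of the existing code, and it is written for Python 3 (so str only, no unicode):

```python
# Hypothetical constants mirroring the rules listed above.
REQUIRED_KEYS = ('name', 'description')
ALLOWED_KEYS = ('defaultValue', 'description', 'expects', 'hasDefault', 'name')
DEFAULTS = {'expects': (object,)}


def validate_metadata(metadata):
    """Returns a checked copy of metadata, raising on the first violation."""
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise KeyError('Missing required keys: %s' % missing)
    unknown = [key for key in metadata if key not in ALLOWED_KEYS]
    if unknown:
        raise KeyError('Unrecognized keys: %s' % unknown)
    # Start from the defaults, then let the supplied values win.
    checked = dict(DEFAULTS)
    checked.update(metadata)
    for key in ('name', 'description'):
        if not isinstance(checked[key], str):
            raise TypeError('%s must be a string' % key)
    if not isinstance(checked['expects'], tuple):
        raise TypeError('expects must be a tuple of types')
    return checked
```

The fifth rule (no replacement of existing keys) depends on the container rather than on a single metadata item, so it is left out here.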
Knowing what is desired, the question then becomes how to implement it. There are a couple of approaches, as I see it:
- Implement a class that acts like a dict, that performs all of the varied type- and value-checks: This feels better from what might be considered a "purist" OO-design perspective, since it would use composition or aggregation instead of inheritance for the important functionality. The trade-off is that the resulting class becomes slightly harder to work with in certain scenarios (generating a JSON representation of an instance being the item that jumps first to mind, but there may well be others that I haven't thought of).
- Implement a class that is a dict, that performs all of the varied type- and value-checks: Ultimately, this is nothing more than subclassing the built-in dict type and overriding a few methods. This approach should eliminate most (maybe all) of the "harder-to-work-with" concerns of the first approach, because the instances will be dict instances. The trade-off is that keeping the original interface of a dict exactly the same may be difficult, or even impossible, though adding new methods and properties is safe.
The first approach would require emulating a dict. Looking at the collection-emulation methods, there are eight that would probably need to be implemented:
- __contains__;
- __delitem__;
- __getitem__;
- __iter__;
- __len__;
- __missing__;
- __reversed__; and
- __setitem__.
The second approach, subclassing a dict, would likely only require overriding the __setitem__ method. That assumption bears some closer examination, which I'll do in a bit.
I expect that either approach would customize the __init__ method, and would have the same helper-methods to check for validity of various keys and values, so those are likely a wash.
In either case, I'd like for the documentation to be serializable to JSON. I can't put my finger on exactly why just yet; it's just a sneaking suspicion that I'll want to be able to use that JSON data-structure somewhere down the line. Since either implementation would contain values that JSON cannot represent directly (the types in the expects members, in particular), either implementation would also require an explicit serialization method to be created.
So, which of those methods would need to be overridden or implemented for each approach (dict subclass and dict emulation)? Perhaps the simplest dict-emulation approach would be to create an object-structure that stores its data in an internal dict. In that case, most (maybe all) of the emulated methods that would be needed could simply return the results of calling the equivalent method of the internal dict: "wrapping" that method, so to speak. I'll examine the override/wrapping needs from the perspective of that sort of "wrapping" implementation.
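A bare-bones sketch of that kind of wrapping class might look like this. The class name and the string-keys-only check are placeholders of my own, and __missing__ and __reversed__ are written out by hand rather than delegated so that the sketch does not depend on which Python version is running it:

```python
class WrappedDict(object):
    """Hypothetical dict-emulation class that delegates to an internal dict."""

    def __init__(self):
        self._data = {}

    def __contains__(self, key):
        return key in self._data

    def __delitem__(self, key):
        raise TypeError(
            '%s does not support member deletion.' % self.__class__.__name__)

    def __getitem__(self, key):
        if key not in self._data:
            return self.__missing__(key)
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __missing__(self, key):
        raise KeyError(key)

    def __reversed__(self):
        return reversed(list(self._data))

    def __setitem__(self, key, value):
        # Placeholder check; the real thing would validate the metadata too.
        if not isinstance(key, str):
            raise TypeError('Keys must be strings')
        self._data[key] = value
```

Note that an instance of this class is not a dict as far as isinstance is concerned, which is one of the trade-offs weighed below.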
__contains__
- The __contains__ method tests membership/containment, when, for example, 'i' in ( 'a', 'e', 'i', 'o', 'u' ) is executed. In a dict, it checks for membership in the keys of the dict, not in the dict's values.
- dict subclass: Would not need to be overridden.
- dict emulation: Would wrap the method of the underlying dict.
__delitem__
- The __delitem__ method removes a value identified by a supplied key. Since one of the goals of the object is for it to be immutable, both implementations would need to override the method, raising some sort of error to indicate that the instances do not allow member deletion. A rough equivalent would be attempting to delete a member of a tuple, which raises a TypeError ('tuple' object doesn't support item deletion).
__getitem__
- The __getitem__ method is the underlying mechanism for retrieving a member (when dictInstance[ 'key' ] is called, for example).
- dict subclass: Would not need to be overridden.
- dict emulation: Would wrap the method of the underlying dict.
__iter__
- The __iter__ method returns an iterator over the object's data (the keys, in the case of a dict).
- dict subclass: Would not need to be overridden.
- dict emulation: Would wrap the method of the underlying dict.
__len__
- The __len__ method returns a numeric value (the number of members in the collection), and is the underlying process behind code like len( myDict ).
- dict subclass: Would not need to be overridden.
- dict emulation: Would wrap the method of the underlying dict.
__missing__
- The __missing__ method handles __getitem__ calls when the specified key is not present in the collection. The built-in dict does not define it; it is an optional hook that dict.__getitem__ calls on subclasses that provide one.
- dict subclass: Would not need to be overridden.
- dict emulation: Would only need to be implemented if a missing key should do something other than raise a KeyError, since there is no such method on the underlying dict to wrap.
__reversed__
- The __reversed__ method returns an iterator over the instance's data, like __iter__, but in reversed order.
- dict subclass: Would not need to be overridden.
- dict emulation: Would wrap the method of the underlying dict where it has one (the built-in dict only supports reversed() from Python 3.8 onward).
__setitem__
- The __setitem__ method assigns values to specified keys (e.g., myDict[ 'key' ] = 1). Since the data of an instance should be immutable (at least once a member value has been set), both implementations would require some level of override of the method, if only to check for the existence of a key. The same overriding functionality could (should) be used to type- and/or value-check member keys and values when they are being set, but much of that might be able to be handled by implementation of helper methods.
- dict subclass: Full override, performing checks and/or data-structure default-value creation along the way, before calling the parent class' __setitem__ method.
- dict emulation: Perform various checks and/or data-structure default-value creation before calling the __setitem__ method of the underlying dict.
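One caveat on the subclass side is worth a quick sketch. In CPython, the inherited update and setdefault methods write to the underlying storage directly and do not go through an overridden __setitem__, so overriding __setitem__ alone guards item-assignment only. The class below is a hypothetical stand-in for illustration, not the final implementation:

```python
class CheckedDict(dict):
    """Hypothetical dict subclass that only accepts dict values."""

    def __setitem__(self, key, value):
        if not isinstance(value, dict):
            raise TypeError('Values must be dicts')
        dict.__setitem__(self, key, value)


checked = CheckedDict()

# Item assignment goes through the override, so the bad value is rejected.
try:
    checked['arg1'] = 'not a dict'
    rejected = False
except TypeError:
    rejected = True

# The inherited update() and setdefault() bypass the override entirely,
# so the same kind of bad value gets in without complaint.
checked.update({'arg1': 'not a dict'})
checked.setdefault('arg2', 'also not a dict')
```

The same applies in the other direction to pop, popitem and clear, which do not call an overridden __delitem__. Whether that matters depends on how strictly "immutable" needs to be enforced.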
There's not a whole lot of difference between these two approaches, ultimately. I'm going to go with the dict-subclass approach, though, for various reasons:
- To keep as close to built-in types as I can manage: A subclassed-dict instance is still a dict object, so I won't need to worry as much about writing special code to determine whether instances are dict-equivalent objects. A simple isinstance( myObject, dict ) will return True. I'm not even sure it's possible for an instance of the "wrapping-a-dict" class to be identifiable as a dict-equivalent. I don't know that I'll ever really need to be able to make that sort of identification, but if I do, somewhere down the line, I won't have painted myself into a corner.
- For unit-testing reasons: At some point, I'll be planning to write unit-tests for this class, in all probability. When I do, there would be fewer tests to be written, since there'd be no reason to test the methods that were not overridden. That cuts the number of tests for the class down to two (for __delitem__ and __setitem__) from eight: the implications of my previously-stated "thorough testing" goal would require unit-tests for all of the "wrapping" methods of the class. While I don't mind writing unit-tests (much), they are the most tedious part of development, I think, so anything that reduces the need for more is a Good Thing®™ as far as I'm concerned.
- Another aspect of deriving the class directly from a dict is that other built-in and common libraries' functions will work without any significant effort. As a case in point, it is possible to use pprint.pprint to pretty-print an object derived from something that pprint already knows how to deal with.
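A quick illustration of the first and third points, using a do-nothing subclass (MetadataDict is a hypothetical name, used only for this example):

```python
import json
import pprint


class MetadataDict(dict):
    """Hypothetical, do-nothing dict subclass, for illustration only."""


metadata = MetadataDict()
metadata['arg1'] = {'name': 'arg1', 'description': 'Description of arg1'}

# The instance is still a dict as far as type-checks are concerned...
is_dict = isinstance(metadata, dict)
# ...and standard-library functions that understand dicts accept it as-is.
as_json = json.dumps(metadata, sort_keys=True)
as_text = pprint.pformat(metadata)
```

The JSON part only works this easily because the example holds nothing but strings; the types stored in an expects member would still need converting first.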
# json is needed by the ToJSON method
import json

class argdict( dict ):
    """
    Provides a dictionary that is specifically purposed for storing
    argument documentation metadata"""
    #####################################
    # Class attributes (and instance-   #
    # attribute default values)         #
    #####################################
    __allowedKeys = [
        'defaultValue', 'description', 'expects', 'hasDefault', 'name'
        ]
    __defaults = {
        'description':'No description provided.',
        'expects':(object, ),
        'hasDefault':False,
        }
    __optionalKeys = [ 'defaultValue', 'description' ]
    __requiredKeys = [ 'expects', 'hasDefault', 'name' ]
    #####################################
    # Instance property-getter methods  #
    #####################################
The various attributes provide data to support controlling what member-names are allowed, required, and optional during the process of setting metadata for an argument. The __defaults attribute provides default values for metadata members. All of these will be used as checks and/or value-population items later on.
The class has no properties, so the next significant block is the __init__ method:
    #####################################
    # Instance Initializer              #
    #####################################
    def __init__( self, *args, **kwargs ):
        """
        Instance initializer."""
        # argdict is intended to be a nominally-final class
        # and is NOT intended to be extended. Alter at your own risk!
        #######################################################################
        # Probably just a reflexive decision, really: I cannot imagine that   #
        # there'd be any real *use* for extending something this specific, so #
        # it's final just to encourage thinking about inheritance-depth.      #
        #######################################################################
        if self.__class__ != argdict:
            raise NotImplementedError( 'argdict is '
                'intended to be a nominally-final class, NOT to be extended.' )
        # Call parent initializers, if applicable.
        # Set default instance property-values with _Del... methods as needed.
        # Set instance property values from arguments if applicable.
        # Type- and (maybe) value-check inbound arguments
        initCalled = False
        if args:
            # A mapping or iterable: list or tuple of (key, value), or a dict
            if len( args ) > 1:
                if isinstance( args, ( tuple, list ) ):
                    # Check structure of the mapping/iterable:
                    # ( <str|unicode>,<dict> ), ...
                    badItems = tuple( [ ( key, value ) for key, value in args
                        if type( key ) not in ( str, unicode )
                        or not isinstance( value, dict ) ] )
                    if badItems:
                        raise TypeError( self.__class__.__name__ + ' expects '
                            'mappings of ( <str|unicode>, <dict> ), but was '
                            'passed %d mappings that do not conform: %s' % (
                                len( badItems), str( badItems ) ) )
                    # If this point is reached, then the baseline structure is
                    # valid, the instance can be created as an empty dict,
                    # and it should be populated with iterative calls to
                    # __setitem__ to assure that the values supplied are
                    # valid and have the baseline defaults.
                    dict.__init__( self )
                    initCalled = True
                    for key, value in args:
                        self.__setitem__( key, value )
                else:
                    raise TypeError( self.__class__.__name__ + ' expects '
                        'mappings of ( <str|unicode>, <dict> ), but was '
                        'passed %s' % ( str( args ) ) )
            elif isinstance( args[ 0 ], dict ):
                badItems = [
                    key for key in args[ 0 ].keys()
                    if type( key ) not in ( str, unicode )
                    ]
                if badItems:
                    badDict = {}
                    for key in badItems:
                        badDict[ key ] = args[ 0 ][ key ]
                    raise TypeError( self.__class__.__name__ + ' expects '
                        'dictionaries structured as <str|unicode>:<dict>, but '
                        'was passed %d entries that do not conform: %s' % (
                            len( badItems), str( badDict ) ) )
                # If this point is reached, then the baseline structure is
                # valid, the instance can be created as an empty dict,
                # and it should be populated with iterative calls to
                # __setitem__, which applies the baseline defaults and the
                # requirements-check to each metadata dict as it is added.
                dict.__init__( self )
                initCalled = True
                for key in args[ 0 ]:
                    value = args[ 0 ][ key ]
                    self.__setitem__( key, value )
        if kwargs:
            # A dictionary
            badItems = [
                key for key in kwargs.keys()
                if type( key ) not in ( str, unicode )
                ]
            if badItems:
                badDict = {}
                for key in badItems:
                    badDict[ key ] = kwargs[ key ]
                raise TypeError( self.__class__.__name__ + ' expects '
                    'dictionaries structured as <str|unicode>:<dict>, but was '
                    'passed %d entries that do not conform: %s' % (
                        len( badItems), str( badDict ) ) )
            # If this point is reached, then the baseline structure is
            # valid, the instance can be created as an empty dict,
            # and it should be populated with iterative calls to
            # __setitem__, which applies the baseline defaults and the
            # requirements-check to each metadata dict as it is added.
            dict.__init__( self )
            initCalled = True
            for key in kwargs:
                value = kwargs[ key ]
                self.__setitem__( key, value )
        # If we reach this point without any errors and initCalled is False,
        # then it's an empty structure, so all we need to do is call the most
        # basic dict.__init__:
        if not initCalled:
            dict.__init__( self )
The __init__ of a dict accepts a positional argument (a mapping or an iterable of key/value pairs) and/or keyword arguments (another dict), so argdict.__init__ needs to mirror those expectations in order to be a "drop-in" replacement for the original dict that was called for. Once the initialization begins, it checks for the iterable/mapping in *args, and if it exists, it handles it based on whether it's a tuple or list (expecting a mapping), or another dict. In each case, it checks for bad items, and if any are detected, raises an error. If no errors are encountered, a generic dict.__init__ is called to perform basic, empty-dict initialization, then the checked values are provided with the required default values and added to the instance's data-set with the __setitem__ method that it provides. As of this post, I haven't tested the portion of the __init__ method that handles incoming mappings, or keyword-arguments, since they aren't in use during normal argument-decoration use-cases. I will (eventually) get to the point of doing formal unit-testing on the entire thing, though, and those (probably broken) code-branches will get fixed then.
If __init__ is passed a dict in its **kwargs, it performs the same checks and process as are done when a dict pops up in the *args.
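For reference, these are the call forms of the built-in dict that the initializer is trying to mirror. The data is made up for the example, and all four forms behave the same way in Python 2 and Python 3:

```python
# A single mapping as the positional argument
from_mapping = dict({'arg1': {'name': 'arg1'}})
# A single iterable of ( key, value ) pairs as the positional argument
from_pairs = dict([('arg1', {'name': 'arg1'})])
# Keyword arguments only
from_keywords = dict(arg1={'name': 'arg1'})
# A positional mapping combined with keyword arguments
combined = dict({'arg1': {'name': 'arg1'}}, arg2={'name': 'arg2'})
```

The built-in accepts at most one positional argument, so an iterable of pairs arrives as a single item in *args rather than as several.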
    #####################################
    # Instance Methods                  #
    #####################################
    def __assureDefaults( self, argSpec ):
        """
        Assures that the provided argSpec dictionary is populated with the default
        values from self.__defaults *if they are not already members of the
        dictionary*."""
        argSpecKeys = argSpec.keys()
        for key in self.__defaults:
            defaultValue = self.__defaults[ key ]
            if key not in argSpecKeys:
                argSpec[ key ] = defaultValue
        # An argument that has no default should not carry a default value
        if not argSpec[ 'hasDefault' ]:
            try:
                del argSpec[ 'defaultValue' ]
            except KeyError:
                pass
    def __checkRequirements( self, argSpec ):
        """
        Checks metadata member-names against required, allowed names.
        Returns True if the argSpec has all required names and only allowed names,
        False otherwise."""
        argSpecKeys = set( argSpec.keys() )
        requiredKeys = set( self.__requiredKeys )
        allowedKeys = set( self.__allowedKeys )
        # All of the required names must be present...
        meetsRequirements = (
            requiredKeys.intersection( argSpecKeys ) == requiredKeys
            )
        # ... and no names outside the allowed ones may be present
        noExtraKeys = (
            len( argSpecKeys.difference( allowedKeys ) ) == 0 )
        return meetsRequirements and noExtraKeys
The __assureDefaults method does just what it sounds like it should: it assures that the provided argSpec, a dict, has default values (from the __defaults class-attribute). __checkRequirements performs a member-name check on a provided dict, returning True if the required member-names are all present and there are no extraneous member-names, or False if either of those checks fails. The __checkRequirements method leverages Python's set data-type to perform those checks, yielding a simple boolean value for both meetsRequirements and noExtraKeys. That approach should allow for easy expansion if it were to become necessary.
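As one example of the sort of expansion I have in mind, the same set operations could report which names are missing or extraneous instead of a bare True or False. This is only a sketch: the function and constant names are made up, and nothing like it exists in the class yet:

```python
# Hypothetical module-level copies of the class' key-lists.
REQUIRED = set(['expects', 'hasDefault', 'name'])
ALLOWED = set(['defaultValue', 'description', 'expects', 'hasDefault', 'name'])


def check_names(argSpec):
    """Returns ( missing, extra ) member-names for argSpec, each sorted."""
    names = set(argSpec)
    missing = REQUIRED - names   # required names that were not supplied
    extra = names - ALLOWED      # supplied names that are not recognized
    return sorted(missing), sorted(extra)


good = check_names({'name': 'arg1', 'expects': (object,), 'hasDefault': False})
bad = check_names({'name': 'arg1', 'bogus': 1})
```

Here good is a pair of empty lists, while bad reports expects and hasDefault as missing and bogus as extra, which would make for a more helpful error message.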
    def ToJSON( self ):
        """
        Serializes the instance to a JSON string, converting the "expects" key-values
        from their native tuple-of-types to a list-of-type-names along the way."""
        # Copy each member's metadata dict as well as the outer structure, so
        # that the conversion below does not alter the instance's own data
        result = dict( [ ( key, dict( self[ key ] ) ) for key in self ] )
        for key in result:
            if result[ key ].get( 'expects' ):
                result[ key ][ 'expects' ] = [
                    item.__name__ if hasattr( item, '__name__' ) else str( item )
                    for item in result[ key ][ 'expects' ]
                    ]
        return json.dumps( result, sort_keys=True, indent=4 )
The ToJSON method makes a copy of the current state of the instance, then performs some value-changes on the members that will typically contain values that cannot be converted to JSON. Right now, that's only the expects member, which will contain built-in types, None (potentially), and custom types, classes in particular. It does the conversion by looking for a __name__ attribute on the item in question, which will take care of both built-in and custom types, or, if a name isn't available (because it's a single-value type like None), it uses a string representation of the item. I haven't been able to find any other None-like types or values yet that need that sort of handling, but that doesn't mean that there aren't any (or that one or more won't surface later on).
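The name-conversion on its own, pulled out into a small sketch (type_names and Custom are hypothetical names used only for this example):

```python
import json


class Custom(object):
    """Hypothetical custom type, standing in for any user-defined class."""


def type_names(expects):
    """Converts a tuple of types (and None) to a list of JSON-safe names."""
    return [
        item.__name__ if hasattr(item, '__name__') else str(item)
        for item in expects
    ]


# Built-in types and classes have a __name__; None does not, so it falls
# back to its string representation.
names = type_names((int, float, None, Custom))
serialized = json.dumps({'expects': names})
```

The conversion is one-way as written: turning the names back into types would need some sort of lookup, which I haven't had a reason to think about yet.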
    def __delitem__( self, argName ):
        """
        Override of dict.__delitem__ to prevent the removal of members once they've
        been set."""
        raise TypeError( self.__class__.__name__ + ' does not support member '
            'deletion.' )
    def __setitem__( self, argName, argMetadata ):
        """
        Override of dict.__setitem__ that only allows specific keys and type-
        checks the value supplied before adding it to the instance's members."""
        # Type- and value-check argName
        if type( argName ) not in ( str, unicode ):
            raise TypeError( self.__class__.__name__ + ' cannot accept '
                'dictionary keys that are not text types (str or unicode). '
                '%s is a %s' % ( argName, type( argName ).__name__ ) )
        # Set default values in argMetadata
        self.__assureDefaults( argMetadata )
        # Type- and value-check the keys/values of argMetadata
        # Perform the requirements-check and raise an error if it fails
        if not self.__checkRequirements( argMetadata ):
            raise ValueError( self.__class__.__name__ + ' argument-metadata '
                'items require %s members, and cannot have members other than '
                '%s. The %s value passed is invalid.' % (
                    self.__requiredKeys, self.__allowedKeys, argMetadata ) )
        # If everything passed, go ahead and call dict.__setitem__
        dict.__setitem__( self, argName, argMetadata )
The __delitem__ and __setitem__ methods are, I think, pretty straightforward. __delitem__ in particular, since all it does is raise an error if it's called, preventing member deletion as part of the quest for immutability of argdict instances. The __setitem__ method is slightly more complex, performing a few (pretty obvious) type- and value-checks, delegating the creation of default values and structural-requirements checking to the previously-defined helper-methods.
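The "replacing an existing key should raise some kind of error" requirement is still TBD in the code above. One way it might eventually be handled is sketched below. The class name is hypothetical, and KeyError is simply my first guess at a reasonable error type, not a decision:

```python
class WriteOnceDict(dict):
    """Hypothetical dict subclass whose members cannot be replaced or removed."""

    def __delitem__(self, key):
        raise TypeError(
            '%s does not support member deletion.' % self.__class__.__name__)

    def __setitem__(self, key, value):
        # Refuse to overwrite a member that has already been set.
        if key in self:
            raise KeyError('%s already has a member named %r' % (
                self.__class__.__name__, key))
        dict.__setitem__(self, key, value)


metadata = WriteOnceDict()
metadata['arg1'] = {'name': 'arg1'}
try:
    metadata['arg1'] = {'name': 'something else'}
    replaced = True
except KeyError:
    replaced = False
```

In argdict itself, the same key-exists check would sit alongside the existing type-check of argName in __setitem__.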
After altering the existing api_documentation._CreateArgumentMetadata method to return an argdict instead of a dict, it seems to work well. Given the following code:
class Ook( object ):
    """
    Test-class."""
    @describe.argument( 'arg1', 'Ook.Fnord (method) arg1 description', int, long, float )
    @describe.argument( 'arg2', 'Ook.Fnord (method) arg2 description' )
    def Fnord( self, arg1, arg2, *args, **kwargs ):
        """Ook.Fnord (method) original doc-string"""
        return None
    @classmethod
    @describe.argument( 'arg1', 'Ook.Bleep (classmethod) arg1 description', int, long, float )
    @describe.argument( 'arg2', 'Ook.Bleep (classmethod) arg2 description' )
    def Bleep( cls, arg1, arg2=None, *args, **kwargs ):
        """Ook.Bleep (classmethod) original doc-string"""
        return None
    @staticmethod
    @describe.argument( 'arg1', 'Ook.Flup (staticmethod) arg1 description', int, long, float )
    @describe.argument( 'arg2', 'Ook.Flup (staticmethod) arg2 description' )
    def Flup( arg1, arg2, *args, **kwargs ):
        """Ook.Flup (staticmethod) original doc-string"""
        return None

print '-'*80
print Ook.Fnord._documentation
print '-'*80
print Ook.Bleep._documentation
print '-'*80
print Ook.Flup._documentation
print '-'*80
The output is:
--------------------------------------------------------------------------------
Fnord( self, arg1, arg2, *args, **kwargs ) [instancemethod]
Ook.Fnord (method) original doc-string
ARGUMENTS:
self .............. (instance, required): The object-instance that the method will bind to for execution.
arg1 .............. (int|long|float, required): Ook.Fnord (method) arg1 description
arg2 .............. (any, required): Ook.Fnord (method) arg2 description
--------------------------------------------------------------------------------
Bleep( cls, arg1, arg2, *args, **kwargs ) [instancemethod]
Ook.Bleep (classmethod) original doc-string
ARGUMENTS:
cls ............... (class, required): The class that the method will bind to for executions.
arg1 .............. (int|long|float, required): Ook.Bleep (classmethod) arg1 description
arg2 .............. (any, optional, defaults to None): Ook.Bleep (classmethod) arg2 description
--------------------------------------------------------------------------------
Flup( arg1, arg2, *args, **kwargs ) [instancemethod]
Ook.Flup (staticmethod) original doc-string
ARGUMENTS:
arg1 .............. (int|long|float, required): Ook.Flup (staticmethod) arg1 description
arg2 .............. (any, required): Ook.Flup (staticmethod) arg2 description
--------------------------------------------------------------------------------
A couple of weird little tweaky things surfaced in the api_documentation class while I was testing this, but they were easily corrected and those changes are in the current downloadable version. There may be a few small discrepancies between that version and what I noted in my previous post as a result, though.
This seems like a good point to break, since the next documentation-metadata item I'm planning to tackle is the one for argument-lists, and it's going to be at least somewhat different, so there will likely be a fair chunk of discussion beforehand.