I have commented before on how important it is that programmers document the software that they write. Working with code produced by numerous other programmers, as I currently do at the Digital Innovation Lab at UNC, only confirms to me how important this is, and how rarely it is done well, if at all.
Why It Is Important
Just as writing good, clear expository prose is a sign of (and exercise for) good, clear critical thinking, so is writing good, clear documentation a sign of (and exercise for) good, clear software design and architecture.
Programming is almost never a write-once exercise. Documentation is a means to communicate your design and implementation to others who will be maintaining and extending your code, learning from and improving your ideas, and helping to sustain the initial investment in creating the software in the first place.
The worst-case scenario is that a critical programmer leaves because s/he gets a job at another company or institution (programmers tend to enjoy a lot of employment mobility), or dies in a tragic caffeine overdose accident, and someone unfamiliar with the code inherits what s/he left behind, in whatever state of (in)completion or (im)perfection.
However, even the best-case scenario of programmer continuity is challenging: code is complicated, a programmer isn’t going to remember all of the details of how s/he designed something, and his/her code is inevitably going to be seen and used by others. Just because it looks nice and runs doesn’t mean that all is well under the hood.
How The Internet Ruined Software Design
Even in the old days (I’ve been programming on and off for over 30 years!), software engineers needed a way to handle the growing complexities and inter-dependencies of computer systems, and object-oriented programming was a novel solution to managing complexity and encapsulating design details.
An object, in this conceptualization (popularized by Xerox PARC’s Smalltalk language, which was developed in the 1970s), presents a certain set of features and behaviors to the outside, but exactly how those things are implemented is hidden and up to the programmer. The object must be a self-contained black box that acts like a responsible citizen in its local environment, simply making good on the set of contracts it presents to its neighbors, allowing requests to be made on its internals and providing useful information about itself when requested.
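To make the black-box idea concrete, here is a minimal Java sketch. The names WordTally, MapWordTally and WordTallyDemo are hypothetical, invented only for illustration. The interface is the contract the object presents to its neighbors; the map behind it is an implementation detail that callers never see and that could be swapped out without their noticing.

```java
import java.util.HashMap;
import java.util.Map;

// WordTally (hypothetical name, for illustration only)
// PURPOSE: The contract: what the object promises to its neighbors
interface WordTally {
    void record(String word);   // a request made on the object's internals
    int countOf(String word);   // information the object provides about itself
}

// MapWordTally
// PURPOSE: One possible implementation of the contract; the map is hidden
class MapWordTally implements WordTally {
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void record(String word) {
        // Words are tallied case-insensitively
        counts.merge(word.toLowerCase(), 1, Integer::sum);
    }

    @Override
    public int countOf(String word) {
        return counts.getOrDefault(word.toLowerCase(), 0);
    }
}

class WordTallyDemo {
    public static void main(String[] args) {
        WordTally tally = new MapWordTally();
        tally.record("Object");
        tally.record("object");
        System.out.println(tally.countOf("OBJECT")); // prints 2
    }
}
```

The calling code depends only on WordTally, so replacing the HashMap with some other structure would not require changing WordTallyDemo at all.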
Along came the internet, which allowed users on their PCs to interact (exchange information) with servers and other PCs elsewhere. The result has been the fragmentation of code and data, breaking up the neat and tidy conceptualization of objects and delegation of responsibilities for implementation that existed in the OOP world.
Why I Eschew JavaDoc Style
Any system that encourages and facilitates documenting software is probably good, but I’m disappointed that the JavaDoc style seems to have become a de facto industry standard. Why? Although it produces nice results in HTML for looking at interfaces outside of the code itself – which is great for large-scale programming languages, libraries and systems used by large numbers of people – this system is ill-suited to the most common usage scenario: programmers looking at documentation in the context of the software itself in smaller code bases to fix bugs, make changes, and figure out what is going on. JavaDoc is ill-suited to human consumption where it is needed the most: in the context of the code itself.
Here’s an example of JavaDoc to contrast with a human-readable style of the sort I write. JavaDoc:
/**
 * Returns the localized preferences values for the key.
 *
 * @param key the preferences key
 * @param useDefault whether to use the default language if no localization
 *        exists for the requested language
 * @return the localized preferences values, or empty array (if key undefined)
 */
public String[] getPreferencesValues(String key, boolean useDefault);
My style for the same function:
// getPreferencesValues()
// PURPOSE: Returns the localized preferences values for the key
// INPUT:   key = preferences key (PortletPreferences)
//          useDefault = true if use default language if no localization exists
// RETURNS: the localized preferences values, or empty array (if key undefined)
What to Document
- What is the purpose and responsibility of this entity?
- What code libraries is it dependent upon and which versions of those are known to work? (For example, “Uses jQuery 1.7”)
- What side-effects does this entity cause? For example, creating or modifying global variables, effecting or undoing hooks, passing JSON objects, etc.
- What are the variables in the outermost scope, and what are their purposes and designs? This is particularly important where the contents can be dynamically created via JSON and other run-time methods, and no clear, preset list of fields is provided in code.
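As a rough sketch of how the four questions above might be answered at the top of a Java class, consider the following. The SettingsStore class, its keys, and the DEPENDS, SIDE-EFFECTS and FIELDS labels are all hypothetical, made up for this illustration; the point is only that the answers sit right next to the code they describe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// SettingsStore (hypothetical class, for illustration only)
// PURPOSE:      Holds display settings as key/value pairs
// DEPENDS:      java.util only; needs Java 8 or later (Map.getOrDefault)
// SIDE-EFFECTS: put() increments the shared static counter changeCount
// FIELDS:       values = setting name -> setting value; the keys are NOT fixed
//                        in code. Keys used so far: "language" (e.g. "en"),
//                        "pageSize" (digits as text, e.g. "25")
//               changeCount = number of put() calls across ALL instances
class SettingsStore {
    static int changeCount = 0;
    private final Map<String, String> values = new LinkedHashMap<>();

    // put()
    // PURPOSE: Stores or replaces one setting
    // INPUT:   key = setting name; value = setting value as text
    void put(String key, String value) {
        values.put(key, value);
        changeCount++;
    }

    // get()
    // PURPOSE: Looks up one setting
    // INPUT:   key = setting name; fallback = value to use if key is not set
    // RETURNS: the stored value, or fallback if the key has never been set
    String get(String key, String fallback) {
        return values.getOrDefault(key, fallback);
    }
}

class SettingsStoreDemo {
    public static void main(String[] args) {
        SettingsStore store = new SettingsStore();
        store.put("language", "en");
        System.out.println(store.get("language", "?"));  // prints en
        System.out.println(store.get("pageSize", "10")); // prints 10
    }
}
```

Because the keys of values are created at run time, the FIELDS entry is the only place a maintainer can learn which keys to expect.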
A function or method should still be treated as a black box that offers and obeys a strict contract which is implicit in the parameters which come in and the results that come out. However, as I’ve implied above, the black box is more permeable than ever (especially in online contexts) thanks to the fragmentation of code and the use of libraries.
While exactly what you document at the beginning of each function or method will depend upon its language, purpose, context and size, I usually include the following whenever they are appropriate for the function/method’s “contract”:
- What is the purpose of this code?
- What are the input parameters and what is the allowable range of values (or special values with special meanings)?
- What value(s) or structure(s) does the code return?
- What are the assumptions of the code – that is, what values or conditions outside of the black box (particularly global states and variables) does it rely upon?
- What side-effects does it cause outside itself? That is, how does it affect the global context and thus break the black box principle by changing global variables or other aspects of the environment?
- What exceptions does this code trigger and/or handle?
- What is left to do in this code? Explain known bugs, shortcomings, code yet to be optimized or updated.
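Here is a minimal sketch of a single Java method documented along the lines of the list above. The PageSizeParser class and its fields are hypothetical, and the ASSUMES, SIDE-EFFECTS, THROWS and TODO labels are simply one way of extending the PURPOSE/INPUT/RETURNS style; use whatever labels suit your code base.

```java
class PageSizeParser {
    // Shared state that the method below relies upon and changes
    static int maxPageSize = 100;
    static int rejectedCount = 0;

    // parsePageSize()
    // PURPOSE:      Converts a user-supplied page size into a usable integer
    // INPUT:        raw = digits as text, e.g. "25";
    //                     null or "" means "use the default"
    // RETURNS:      a value from 1 to maxPageSize; 10 if raw is null or ""
    // ASSUMES:      maxPageSize has been set to 10 or more
    // SIDE-EFFECTS: increments rejectedCount whenever raw is out of range
    //               and has to be clamped
    // THROWS:       NumberFormatException if raw is not a whole number
    // TODO:         Surrounding whitespace is not trimmed, so " 25" throws
    static int parsePageSize(String raw) {
        if (raw == null || raw.isEmpty()) {
            return 10;
        }
        int size = Integer.parseInt(raw);
        if (size < 1) {
            rejectedCount++;
            return 1;
        }
        if (size > maxPageSize) {
            rejectedCount++;
            return maxPageSize;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(parsePageSize("25"));  // prints 25
        System.out.println(parsePageSize("500")); // prints 100
        System.out.println(rejectedCount);        // prints 1
    }
}
```

Everything a caller needs, including the two ways in which this method reaches outside its own black box, can be read without leaving the source file.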
Digital Humanities Post-Script
One of the interesting branches of the emerging field of Digital Humanities critically examines the socially constructed nature of technology: how, as an artifact, it reflects, reinforces and participates in the privileging of certain groups, epistemologies and ideologies. (Postcolonial Digital Humanities is a great forum for some of these issues; there have also been interesting critiques from the vantage point of feminism, and volumes such as The Cultural Logic of Computation critique the ideology of “computationalism.”)
As a practicing humanities scholar, I’m quite aware of the cultural values and imperatives implicit in the “world-order” of software design: a tightly managed, well-behaved world of contracts, etc. I’m sure there is much more to say about this product and process from the vantage point of cultural criticism – even if it is unlikely to have much effect on the exploitation of computer-based technologies.