I’ve been trying to articulate something which has been bugging me recently – the way we name software.
There’s a well-known quote attributed to Phil Karlton on the subject:
“There are only two hard things in Computer Science: cache invalidation and naming things.”
Naming is hard, but there are some traditions and patterns we can use to avoid bad practice and confusion.
First, I’d like to draw a distinction between naming software and the already well-established practice of using stylistic conventions in code. What I’m talking about is closer to choosing a title for a book than to structuring the text inside.
Naming software is, of course, a subset of general nomenclature, itself quite a complicated (and subjective) field. I think, though, that we can narrow down how we currently name software to a few general models: abbreviation, acronym, noun, proper name and metaphor.
There are plenty of exceptions to each of these, but I think it’s a useful way of understanding usage.
Abbreviation
Attributes: Tends to be used for single-purpose tools. The name encapsulates the utility.
Acronym
Attributes: Used in protocols, standard libraries and aspirant standard tools. Names are composed of standard technical terminology when expanded.
Noun
Attributes: Like acronyms, used for standards, canonical resources and aspirants. The name is standard technical terminology.
Proper name
Attributes: Product name. Can encapsulate a lot of disparate functionality and verbiage. The name doesn’t directly imply utility.
“A word that answers the purpose of showing what thing it is that we are talking about but not of telling anything about it.” (Wikipedia) Ubuntu, for example, is a Nguni Bantu word from southern Africa, often translated as “humanity to others”, but for all intents and purposes in computing it refers to the Ubuntu operating system. Skype is similar, and Git is another. These are names that do not convey utility in anything but the most abstract sense.
Metaphor
Attributes: Based on abstracted analogous relations. Uses real-world objects to imply internal (and sometimes external) relationships. May allow one to infer function from an understanding of the metaphor.
Metaphor involves taking the abstract concepts from a piece of software, finding commonalities with otherwise unrelated “real world” or pre-existing systems, and using these pre-existing terms as a scaffold for fleshing out the abstract software concepts. This is fundamental to userland software – the “desktop” and its associated documents, folders, etc. is probably the best-known metaphor in computing.
Metaphors bug me. They bug me because they seem so elegant in their first draft, and that sort of elegance appeals to software developers. They bug me because they can work well to inform proper names, but make “overmetaphorizing” easy. They bug me because widely used metaphors like the “desktop” are so ingrained that people tend to forget how unintuitive they can be to a newcomer.
Issues that stem from metaphor:
- When the software needs to be extended, it may stretch or break the metaphor (diluting meaning)
- The metaphor may needlessly restrict the function of the software
- It can be meaningless without context
What really bugs me, though, is that metaphors are a promise often broken.
There is the inevitable mixing of metaphors as projects mature. The most usable software projects I know don’t beat about the bush with their naming. Imagine if “git clone” were “git boba” or something equally absurd, tossing out perfectly understandable terminology for a Star Wars reference.
Functional inference is the dragon chasing of software metaphor.
There’s so much software out there that performs mundane or straightforward tasks but has a “clever” name for one reason or another (you probably only have to look at the Gemfile of your nearest Ruby project to find several examples). Python tends not to be too bad for this, but (I don’t wish to single this out, it’s just an example) there are still some silly things you’ll have to hear, like references to “the cheese shop”, which is the Python Package Index. This terminology seems to have fallen out of official usage, but older mentions of it that new users might come across tend to be accompanied by an explanation that it means the package index.
This sort of stuff is a bit of fun at first, but you can imagine that as these things build up and metaphors fracture, or other metaphors make their way in, one ends up with a tangled web of vocabulary which repeatedly requires explanation. It is needless additional cognitive burden. That’s what it boils down to – needless additional cognitive burden in software bugs the hell out of me. It’s one of many subtle “learning curve steepeners” that creep into projects. The individual effect might not be much, but the cumulative effect is frustration.
Things to consider when naming
- If you’re building a simple tool that performs a specific task, try to use an abbreviation or English word(s).
- If you’re building a library, especially one which aims to be a standard in its area, use industry terminology to name it.
- Metaphor should be considered carefully before being used.
The future of naming
While discussing this with Eamon Leonard, he recalled the IUPAC naming scheme for chemicals that we were taught in school. The point of this system was to establish a limited vocabulary that one could use to describe any chemical compound. Biology also has several methods of classification and naming for organisms. While I think the software in its entirety is probably the only analogue to IUPAC naming, I wonder if we might find some inspiration in biology for giving canonical, descriptive names to our software?