Have modern programming languages failed? From the point of view of learnability and maintainability, yes! What would a truly maintainable and learnable programming language look like? This is the second of a six-part series exploring the future of programming languages (read The World’s Most Maintainable Programming Language: Part 1, The World’s Most Maintainable Programming Language: Part 3, The World’s Most Maintainable Programming Language: Part 4, The World’s Most Maintainable Programming Language: Part 5, and The World’s Most Maintainable Programming Language: Conclusion).

Consistency

The most important feature of a language is that it is completely consistent. There should be no inconsistency or nuance or shade of meaning.

Inconsistency is the enemy of understanding. This is true both when a compiler or analysis tool must choose among multiple possible interpretations of a construct and when a person reads a program's source code.

Consistency is partly a notational problem. Consider languages that support object orientation but force the programmer to name the invocant of a method explicitly. There is tremendous potential for confusion! Is the invocant this or self or object, or do all method and attribute accesses have an implicit invocant? How unmaintainable the practice that raises such a question!

To solve this problem at least, while respecting the principle of learnability and avoiding false cognates, a maintainable programming language should name all invocants after the name of their classes. That is, within a class named AlienInvader, all methods will automatically have access to the invocant through the symbol named alien_invader. (Consider how confusing it would be if the language allowed unfettered creativity. Would there be instances such as alien, predator, martian, blob, horta, kudzu, and poisson?)
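As a rough illustration of the naming rule above, the invocant symbol could be derived mechanically from the class name. The sketch below is written in Python purely for illustration; the function name invocant_name and the exact CamelCase-to-snake_case rule are assumptions, not part of any specified language.

```python
import re


def invocant_name(class_name: str) -> str:
    """Derive the invocant symbol for a class (hypothetical naming rule)."""
    # Insert an underscore wherever a capital letter follows a lowercase
    # letter or digit, then lowercase the whole name.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", class_name).lower()


print(invocant_name("AlienInvader"))  # alien_invader
```

Under this rule there is exactly one legal invocant name per class, so alien, martian, and blob are simply unavailable.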

That helps consistency in the small, but what about throughout an application or a problem domain?

Another potential point of inconsistency is in using different symbol names for the same types of items. For example, a database handle may be db or dbh or d or handle. Various parts of the code may refer to the same type of thing with different names, and that inconsistency leads to misunderstandings. This problem resembles the invocant problem but is distinct from it: here the trouble is the existence of separate pockets of jargon. When these pocket communities overlap, their jargon conflicts. (Also, the term “handle” is vastly inappropriate for any community and will not appear in the final language.)

To solve this problem, libraries will enforce the use of one particular identifier for each separate entity in the system. Obviously the library designer knows best about how to use the entities modeled by the library, having carefully considered all of the potential use cases (and taking into account the language’s design principles), so there will be no clearer names than those provided. Some coders used to other, less maintainable systems may object on grounds of “creativity” and “expression”, but an excess of consistency in service of maintainability is certainly no vice, and any good coder can tell stories of irredeemably creative symbol names providing no value to a system.
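One way such enforcement might look, sketched here as a small Python checker rather than a language feature: the library publishes a table mapping each entity type to its one permitted identifier, and a tool rejects any function parameter that uses another name. The table REQUIRED_NAMES, the type DatabaseConnection, and the function find_name_violations are all hypothetical, and the sketch only inspects parameters with simple type annotations.

```python
import ast
from typing import List

# Hypothetical table published by a library: entity type -> the only
# identifier a variable of that type may use.
REQUIRED_NAMES = {"DatabaseConnection": "database_connection"}


def find_name_violations(source: str) -> List[str]:
    """Report annotated function parameters that use a forbidden name."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Only parameters annotated with a plain type name are checked.
        if isinstance(node, ast.arg) and isinstance(node.annotation, ast.Name):
            required = REQUIRED_NAMES.get(node.annotation.id)
            if required is not None and node.arg != required:
                violations.append(
                    f"line {node.lineno}: '{node.arg}' should be '{required}'"
                )
    return violations


creative_code = "def save(dbh: DatabaseConnection):\n    pass\n"
print(find_name_violations(creative_code))
```

A real implementation would live in the compiler and cover every binding, not just annotated parameters.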

The compiler might even go as far as to include a part-of-speech checker to ensure that method and function names are verb phrases, variable and object and entity names are nouns, and aggregate data structures have the proper number, case, and pluralization. (Typos are a significant source of errors.)
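A toy version of the verb check gives the flavor. This Python sketch assumes snake_case identifiers and a tiny hand-written verb list (KNOWN_VERBS, invented for illustration); a real part-of-speech checker would need a full lexicon and would also have to handle nouns and pluralization.

```python
# Toy lexicon, invented for illustration only.
KNOWN_VERBS = {"get", "set", "fire", "move", "destroy"}


def is_verb_phrase(identifier: str) -> bool:
    """Return True if a snake_case identifier starts with a known verb."""
    first_word = identifier.split("_")[0]
    return first_word in KNOWN_VERBS


print(is_verb_phrase("fire_laser"))     # True
print(is_verb_phrase("laser_firing"))   # False
```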

External consistency is a problem with regard to specifications and implementations as well. Not only must maintainable programs be consistent within themselves, but they must be consistent with other programs. External consistency is obviously important for the many programs that end up interoperating, but even elsewhere, allowing subtle linguistic and semantic drift in small pockets will only lead to difficulties in understanding.

Many programming languages, even those with formal specifications, fall afoul of the problem where the specification is ambiguous or an implementation does not implement the specification appropriately. To alleviate this, all implementations must implement the specification appropriately and no implementation will be complete unless it produces the same output for the same file as another implementation.

Put another way, no program will be complete and correct unless multiple implementations have compiled it to the same code. This suggests that the compiler should be a front-end to two or more separate compilers, ideally running on separate platforms. This need not extend the length of the compilation stage significantly if the tools take full advantage of threading and parallelization techniques. Moreover, reducing ambiguity in the language removes many of the difficulties in parsing and optimization, so the extra compilation should represent a small investment for the sake of program correctness.
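The front-end idea can be sketched in a few lines of Python. Here each "compiler" is just a callable that takes source text and returns output; the function compile_with_consensus and the two stand-in compilers are hypothetical, and a real front-end would dispatch to separate implementations on separate machines rather than to threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_with_consensus(source, compilers):
    """Run every compiler on the source in parallel; require identical output."""
    if len(compilers) < 2:
        raise ValueError("at least two independent implementations are required")
    with ThreadPoolExecutor(max_workers=len(compilers)) as pool:
        outputs = list(pool.map(lambda compile_fn: compile_fn(source), compilers))
    # The program counts as compiled only if all implementations agree.
    if any(output != outputs[0] for output in outputs[1:]):
        raise RuntimeError("implementations disagree; program is not complete")
    return outputs[0]


# Stand-in "compilers" for illustration: both merely uppercase the source.
def stand_in_compiler_a(source):
    return source.upper()


def stand_in_compiler_b(source):
    return "".join(character.upper() for character in source)


print(compile_with_consensus("x = 1", [stand_in_compiler_a, stand_in_compiler_b]))
```

Running the compilers concurrently keeps the wall-clock cost close to that of the slowest implementation, which is the small investment the paragraph above has in mind.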