Have modern programming languages failed? From the point of view of learnability and maintainability, yes! What would a truly maintainable and learnable programming language look like? This is the third of a six-part series exploring the future of programming languages (read The World’s Most Maintainable Programming Language: Part 1, The World’s Most Maintainable Programming Language: Part 2, The World’s Most Maintainable Programming Language: Part 4, The World’s Most Maintainable Programming Language: Part 5, and The World’s Most Maintainable Programming Language: Conclusion).
Simplicity
Simple things are easier to learn, so the language will optimize for simplicity, having as few commands as possible.
Because the goal of the language is to be as easy to learn as possible, it must use only a few primitives. Why? Compare decimal math with hexadecimal. People who haven’t already studied programming or higher mathematics find decimal much easier to use. It’s obvious why; it uses almost 40% fewer primitives, lacking A - F!
Consider also the endless homophonic confusion in natural language, where there are multiple possible valid consonant spellings for a single phoneme in differing contexts. Reducing a spoken language to a simple set of separate phonemic representations would undoubtedly make it easier to learn. Yet the ability to combine characters into words in a written language still allows expressibility and extensibility.
The same goes for a programming language.
An example may help. The Latin language is easy to learn (at least in comparison to modern languages) because each letter has a unique sound. (I do not ignore the difference between long and short vowel sounds, as properly written Latin uses the long bar notation to denote long vowels, removing the ambiguity.) If you can pronounce a word in Latin, you can spell it — and vice versa. This is a much better situation than even English, with confusing homophonic pairs and triplets including ghoti/fish, lead/lead, and deer/dear.
It’s possible to go too far in this direction. If you take Turing’s hypothetical universal computing machine and somehow manage to invent the infinite tape necessary to drive it, you only need four primitives. However, the semantic simplicity of such a system is too overwhelming. Perhaps eight to ten primitives is the right number. The goal of simplicity in this context is to create a system where it is impossible to write code containing a construct a newcomer to the language will not recognize.
It’s also much easier to read a one-page guide than a six-hundred page dictionary.
Language design should focus on removing redundant features, options, and choices, to consolidate an essential core of high-level operations into a small, easily learned, unambiguous feature set.
Comprehensiveness
A language with support for different platforms and paradigms and tools is better than a language without, so the standard distribution will include support for everything useful.
If the most important thing you can do with a language is to learn it, the second most important thing you can do is to solve problems with it. Problems come in varying shapes and sizes, so the language designer should only rule out classes of problems to solve if the solution gets in the way of learnability. For example, while it’s possible to support Unicode in an efficient and effective manner, it’s impossible to do so transparently, or at least in a way that makes sense to novice programmers. Thus no language that supports Unicode is truly maintainable.
It’s worthwhile to examine two separate approaches by two existing, imperfect languages to find the right approach.
The Java language has many flaws, mostly related to inconsistency and overcomplexity. Yet people use it primarily because it has a huge standard library. The comprehensiveness of its support overcomes the deficiencies of the core language.
The Haskell language has a very small, simple core based on a few mathematical properties that most people learn in school. Yet it languished in adoption until the most popular implementations adopted a standard mechanism for file IO. Some users who have tried the monadic system might rightly assert that the particulars of this implementation added perhaps too much complexity to a system that already had enough primitives. This only goes to show the tension between simplicity and comprehensiveness and why it’s important to address both while designing the language. Would the Haskell designers have chosen a better set of primitives if they had considered monads at the start? Undoubtedly!
Though the existing literature in programming language design and research often refers to this small set of primitives as a core calculus, the term misleads novices. Where the goal of calculus is to resolve Zeno’s paradox by the recursive application of ever-smaller straight-edged rulers, the goal of designing a programming language is to approximate perfection by the application of ever more perfect language constructs. There is an obvious similarity, but the word “calculus” implies an asymptotic limit for the payoff-to-effort ratio. Because digital computers are completely digital at heart, with no confusion about 1 and 0, it must thus be possible to ratchet a hierarchy of layered abstractions to likewise eliminate all confusion in successively higher-level languages.
One point remains unaddressed: the issue of the language extension mechanism. As mentioned earlier, Java suffers from overcomplexity despite its useful core libraries. One of the reasons for this is an artificial distinction between primitives and extensions. For example, while the language rightly eschews operator overloading in general, as it is difficult to explain and fiendishly difficult to implement in practice without generating homophonic confusion, it allows it for the String classes. Novices must understand that these classes are special and different from all other classes that exist or may exist.
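To make the distinction between primitives and extensions concrete: in Java, `+` works on String values but cannot be defined for a user's own classes. The sketch below is in Python, not Java, and only illustrates the opposite design, in which an operator is an ordinary method and a user-defined type can look exactly like a built-in. The `Money` class is hypothetical and exists only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical user-defined type, used only for illustration."""
    cents: int

    def __add__(self, other):
        # The "+" operator is just a method call: a + b becomes a.__add__(b).
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)


# A built-in type and a user-defined type use the same syntax.
greeting = "ab" + "cd"
total = Money(150) + Money(250)
```

Whether this uniformity helps or confuses a newcomer is exactly the question the paragraph above raises; the sketch shows only what the uniform design looks like.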
The PHP mini-language, at least until version 4, solved this problem by making everything a function. This is the hallmark of simplicity: all extensions produce functions that appear indistinguishable from built-in functions. It is easy for a reader to understand that mysql_connect() does connect to a MySQL database, as does pg_connect() connect to a PostgreSQL database.
Other languages go too far in the opposite direction, layering too much abstraction. For example, in Perl’s non-core database access layer, DBI, connecting to a MySQL database and a PostgreSQL database use the same apparent code: DBI->connect( ... ). Though this appears to use the principle of consistency by reusing an apparent primitive, it actually suffers in that it does not clearly distinguish between different things. connect() is a false cognate because it is a transitive verb, always requiring a direct object.
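The two styles contrasted above can be put side by side in a rough sketch. It is written in Python rather than PHP or Perl, and nothing in it talks to a real database. The `Connection` record, the stub functions, and the `dbi:driver:database` string format are assumptions modelled only loosely on the names mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for a database handle; it opens nothing."""
    backend: str
    database: str


# Style 1: one function per backend, so the name says what it connects to.
def mysql_connect(database):
    return Connection("mysql", database)


def pg_connect(database):
    return Connection("postgresql", database)


# Style 2: a single entry point; the backend is chosen by the data source
# string. The "dbi:driver:database" format here is an assumption.
_DRIVERS = {"mysql": mysql_connect, "Pg": pg_connect}


def connect(dsn):
    scheme, driver, database = dsn.split(":", 2)
    if scheme != "dbi" or driver not in _DRIVERS:
        raise ValueError(f"unsupported data source: {dsn!r}")
    return _DRIVERS[driver](database)
```

In the first style the backend is visible in the function name at every call site. In the second it is visible only in the string argument, which is the loss of distinction the paragraph above complains about.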
Still, even an imperfect language with comprehensive library support has advantages over a perfect language with sparse library support. (Ignore for the moment the revised ontological argument which proves that the most perfect possible language must have good library support.)

'Because the goal of the language is to be as easy to learn as possible, it must use only a few primitives. Why? Compare decimal math with hexadecimal. People who haven’t already studied programming or higher mathematics find decimal much easier to use. It’s obvious why; it uses almost 40% fewer primitives, lacking A - F!'
Hmmm, not so sure about this. I think what you're really saying is that people who haven't already used hexadecimal find decimal easier to understand. This is obvious; it has to do with familiarity. Binary has fewer primitives than either hex or decimal, but do people find it easier to deal with? Of course not.
I admire the principle of writing a series of articles such as this, but I think a lot of your arguments are convoluted, overly wordy and often way wide of the mark.
'''
[Java] rightly eschews operator overloading in general, as it is difficult to explain and fiendishly difficult to implement in practice
'''
Bullshit. Operator overloading is quite simple (it's nothing more than writing a method), and it's the only way to maintain consistency between builtins and user-defined types.
primitive != easy to use. Consider a human language with 100 words. Instead of "train" you have to say something like "the thing that runs on rails". The sentences in that language will be terribly verbose and full of duplication. That's why higher level concepts and abstractions are needed.
mysql_connect() and pg_connect() are also really bad examples. Have you heard of polymorphism?
As I have stated before, the language should grow with the ability of the one who uses it. You don't give first graders Dostoyevsky to read. Some are never able to read him.
So give the beginners some constructs they can do something useful with but also provide all the means an expert would use. If the beginner cannot understand it, it is because he is a beginner, and needs to learn more, not because the expert wasn't able to write it down better (although that can be the case too, but then it wasn't an expert).
I completely agree with Ben: decimal is not easier to understand because of the number of primitives; it is just what people are used to. Limiting the number of commands in your computer language could limit its usefulness. The number of commands should not be limited, but should be set so that the user of the language is capable of creating useful programs.
Eliminating options does not make something easier to learn. Everyone learns differently and everyone has his own preference when it comes to speaking, so why not in programming? I'm not in favor of redundancy, but if there are two ways to do the same thing, a novice still only needs to learn one way, and later he may find one way easier than the other. Eliminating synonyms seems a lot like George Orwell's "Newspeak" from Nineteen Eighty-Four, which we know was not doubleplusgood.
... continued
Here's an example I came up with:
Patient p1;
Patient p2;
if ( p1.isDonorMatch( p2 ) )
{
    p1.scheduleKidneyRemoval();
    p2.scheduleKidneyTransplant();
}
Now is p1 the donor? I hope so. I see this kind of code all the time too. It is not immediately clear which instance is which. I would have named p1 'donor' and p2 'recipient'. That would clear things up. Also, the isDonorMatch function is confusing too. Which patient is the donor: the invoker or the passed-in parameter?
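The renaming suggested above might look something like the following sketch. It is in Python rather than the Java-like syntax of the original snippet, and everything in it is hypothetical: the `Patient` fields, the function names, and especially the matching rule, which is a placeholder and not a medical criterion. Keyword-only parameters force every call site to say which patient plays which role.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    """Hypothetical patient record, for illustration only."""
    name: str
    blood_type: str
    scheduled: list = field(default_factory=list)


def is_donor_match(*, donor, recipient):
    # Placeholder rule (an assumption, not medical logic): same blood type.
    return donor.blood_type == recipient.blood_type


def schedule_transplant(*, donor, recipient):
    """Schedule both procedures if the pair matches; report whether it did."""
    if not is_donor_match(donor=donor, recipient=recipient):
        return False
    donor.scheduled.append("kidney removal")
    recipient.scheduled.append("kidney transplant")
    return True


alice = Patient("Alice", "O")
bob = Patient("Bob", "O")
matched = schedule_transplant(donor=alice, recipient=bob)
```

Because of the bare `*` in the signatures, a positional call such as `is_donor_match(alice, bob)` raises a TypeError instead of silently guessing which argument is the donor.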
I'm not a programmer but I understand the basic concepts. I just think the programmer needs to program more efficiently, whether that means making their code readable with standard naming conventions or using memory efficiently.
I think that as our hardware gets better, programmers are programming inefficiently and wasting memory in their code.
No one is breaking down programming bit for bit anymore, are they? There is more memory and faster IO nowadays, so I assume programmers are getting sloppy compared to programming 15 to 20 years ago. Twenty years ago you either used resources efficiently or you got bad performance. It really doesn't matter to these programmers today, as they can program sloppily and still get pretty good performance.
What I'm getting from this is that you should stick to the fundamentals of programming and think of it bit by bit. I'm not a programmer and I'm only 27 years old so I don't care if this makes sense or not.
Later you code slangers
I take it you guys have never heard of the "One instruction set computer"?
OISC has only one instruction (subtract and branch if negative), but I don't think it makes programming more enjoyable whatsoever!!
http://en.wikipedia.org/wiki/OISC
Have fun in the Turing Tarpit :)
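For readers who have not met a one-instruction machine, here is a minimal sketch of the "subtract and branch if negative" idea in Python. The conventions are assumptions made for this illustration and are not taken from the linked page: each instruction is a triple (a, b, c), program and data are kept in separate lists, and the machine halts when the program counter leaves the program.

```python
def run_sbn(program, memory, max_steps=10_000):
    """Run a 'subtract and branch if negative' program.

    Each instruction (a, b, c) does: memory[b] -= memory[a]; if the result
    is negative, jump to instruction c, otherwise go to the next one.
    Execution stops when the program counter leaves the program.
    """
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        a, b, c = program[pc]
        memory[b] -= memory[a]
        pc = c if memory[b] < 0 else pc + 1
        steps += 1
    return memory


# Data cells: X and Y hold the inputs, Z is a scratch cell that starts at 0.
X, Y, Z = 0, 1, 2

# Adding two numbers takes three instructions. Every branch target equals
# the next instruction, so both outcomes of each test continue the same way.
ADD_PROGRAM = [
    (X, Z, 1),  # Z = 0 - X
    (Z, Y, 2),  # Y = Y - (-X) = Y + X
    (Z, Z, 3),  # Z = 0 again; instruction 3 does not exist, so halt
]


def add(x, y):
    return run_sbn(ADD_PROGRAM, [x, y, 0])[Y]
```

Three instructions and a scratch cell just to add two integers gives a fair idea of why one primitive is not the sweet spot.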