If you want to criticize a language or platform or technology you don’t use, attack its scalability. Though the word could mean that the entity requires a proportionate amount of effort to work for small loads as well as large ones, the common derogatory connotation is that it’s inappropriate for the largest, manliest projects where so-called real developers build serious systems.
(If that offends you, good. It should.)
The problem with that definition is that most projects in the world do not require immediate responses for thousands of simultaneous users. What percentage of projects do? 10%? 5%? 1%? How many projects require multi-machine clusters with high-availability and high-reliability guarantees? (Conversely, how many business-critical projects run off of a spreadsheet with a few macros created by a non-programmer?)
That doesn’t mean that scalability to large or highly-available levels is useless; it’s important for those projects. In general, however, it’s a lousy way to evaluate a piece of technology as it’s an abnormal use of that technology. Besides that, is there any single general purpose solution that fits more than one of these large projects, or do they all require some degree of customization?
Upwards scalability for performance is not the only way to judge a technology. It’s probably not even the best way. There’s also downwards scalability for performance (a million-dollar mainframe is kind of a waste if all you want to do is serve a couple of thousand web pages every day) and, more importantly, upwards and downwards scalability for the people using the technology.
Another way to evaluate a technology is by how well it works for beginners as well as experts. How difficult is it to start using a technology? Does it require experience in related technologies, or is it a useful introduction to the field? Can an expert in a similar technology be productive with a limited introduction to the new entity?
As well, how does the technology work for people with substantial experience in it? Does it reward that experience, or will frustrations or limitations hobble people who use it for large or complex projects? A language or tool that’s easy to learn may be good only for small or simple projects; it may not even allow you to consider approaching larger or more complex tasks.
For example, it’s easy to teach a child basic drawing commands with the Logo programming language. That doesn’t mean it’s easy to draw tessellations. Likewise, it’s easy to explain the notation of prefix addition of two integers in Lisp or Scheme, but explaining a series of deeply nested S-expressions, including macro expansions and special forms, is still complex.
There are plenty of good reasons for these tools to exist, however. If you only need to do something simple, perhaps automating a series of command-line tasks, a language or tool as easy to start with as a shell one-liner may be the right choice, even though creating and maintaining a thousand-line shell script can be an exercise in pain.
As well, there are technologies that are easy to learn but remain powerful and reward continued knowledge.
The places in which a tool or technology applies are very important too. Expecting an engineer to use Matlab productively is very different from expecting an administrative assistant to script a spreadsheet by using embedded C++.
This is more than just the focus and strengths of a language, however. There aren’t a lot of one-liners written in Python or Java for very different reasons.
Despite the (dismissive and logically useless) argument that “language X wasn’t designed from the start for purpose Y”, good, general-purpose languages tend to spread to other areas for reasons often disconnected from the designer’s original goal. Perl’s versatility as a language capable of powerful text processing made it a natural and obvious choice for early CGI programs. Though awk and sed (and shell, using them) are also capable text processors, the combination and unification in Perl made it a better fit than other specialized languages and tools.
Likewise, how many set-top boxes do you see running Java applets these days?
Horizontal scalability matters. How much work is it to solve problems in different domains? You can solve the nine-queens problem or find prime numbers with a backtracking regular expression engine, but is the effort worth the results? I don’t want to discount the value of fun or clever hacks. However, that’s not the normal mode of operation for a language or platform. It may expose new niches into which the technology can spread, but successful mainstream use in that niche may require more work and attention to quality and polish.
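That sort of clever hack is easy to demonstrate. The folklore unary-primality regular expression runs in any backtracking engine; here is a sketch in Ruby (the regex is a well-known trick, shown here as a curiosity rather than a recommendation):

```ruby
# Primality test via a backtracking regular expression.
# A number n, written in unary as n ones, is composite exactly when
# the string splits into two or more equal groups of two or more ones.
def prime_by_regex?(n)
  unary = "1" * n
  # \A1?\z matches 0 and 1; \A(11+?)\1+\z matches any composite.
  !unary.match?(/\A1?\z|\A(11+?)\1+\z/)
end

(2..20).select { |n| prime_by_regex?(n) }
# => [2, 3, 5, 7, 11, 13, 17, 19]
```

It works, but the backtracking engine does exponentially more work than trial division would, which is exactly the essay's point about effort versus results.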
The ease with which you can bring a language to a new niche brings up an often-ignored point.
The Ex Factor
The history of useful programming languages shows clear evidence of increasing capabilities for abstraction. Mark Jason Dominus’s Design Patterns of 1972 explains how this should work in programming language design.
If you think C is low-level, compare it to assembly language (with or without macros). Loops and control flow beyond “compare and jump” are nice. However, if you think C is high-level, try to do string manipulation with only the standard library. A language that allows greater abstraction tends to be more scalable along all of these axes than a language that disallows such abstraction.
(I lay aside the situation where you need extremely low-level access to a piece of hardware, for example, but note that even modern CPUs offer layers of abstraction above the raw transistor layer.)
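To make the string-manipulation contrast concrete, here is a task that takes a page of strtok and malloc bookkeeping in standard-library C but only a few lines in a higher-level language; a sketch in Ruby (the sample data is invented):

```ruby
# Split a comma-separated line, trim whitespace around each field,
# and rejoin with a different delimiter. In standard-library C this
# means manual buffer allocation and destructive tokenizing.
line   = "  apple , banana ,cherry "
fields = line.split(",").map(&:strip)
result = fields.join("|")
# result == "apple|cherry"-style pipelines take one line each here
```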
Points of Abstraction
Consider various points of abstraction for several well-known languages.
C has a macro preprocessor that performs textual substitution. You can change the syntax of the language, if you are careful; for example, a macro such as #define end } lets you write end in place of a closing curly brace.
Lisp and Scheme have more powerful macro systems that let you transform program statements (as nested lists) into different forms. This is more powerful than simple textual substitution for at least two reasons. First, the regularity of language syntax makes it much easier to produce valid code from a transformation than syntax-unaware textual substitutions. Second, the regularity of language syntax means that code using the macros looks just like any other code; there is no obvious difference.
Smalltalk has a regular syntax (though not quite as regular as Lisp or Scheme) and a powerful metaprogramming facility at the heart of the tools provided by an image. The class browser, with its complete integration with the language and the program itself, allows plenty of opportunities to change the way the language and libraries behave.
Perl, Python, and Ruby all support compiling code as the program runs. This provides the opportunity to use normal code to write code without having to restart the program to insert the new code. (This is also a feature of the other languages mentioned previously, but you have to be more clever about doing it in C.)
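A minimal sketch of that runtime compilation in Ruby (the class and method names here are invented for illustration):

```ruby
# Define a new method while the program runs, with no restart.
class Greeter
  def self.make_greeter(name)
    # define_method compiles a method body from ordinary runtime values.
    define_method("greet_#{name}") { "Hello, #{name}!" }
  end
end

Greeter.make_greeter("world")
Greeter.new.greet_world  # => "Hello, world!"
```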
Java has several powerful IDEs that can generate and rearrange code for you at the click of a mouse. Alternately, you can use a data-driven model to generate code from another language such as Perl or Ruby, or even from XML documents.
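A hedged sketch of that data-driven approach: a few lines of Ruby that emit Java accessor boilerplate from a simple data description (the field names and types are invented for illustration):

```ruby
# Generate Java getter declarations from a data description.
fields = { "name" => "String", "age" => "int" }

java_code = fields.map { |field, type|
  "public #{type} get#{field.capitalize}() { return #{field}; }"
}.join("\n")

puts java_code
# public String getName() { return name; }
# public int getAge() { return age; }
```

A real generator would use a template library and handle multi-word names, but the principle is the same: the repetitive code lives in data, not in hand-typed source.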
Lossless Source Code Compression
Peter Scott made the point at OSCON 2006 that an average hundred-line Perl program can accomplish about as much work as an average thousand-line C program, yet developers take the thousand-line program much more seriously than the hundred-line program because it’s ten times the code. Any fool can write and maintain a hundred-line program, but once you get to a thousand lines of code, you’re talking about real programming.
There’s a secret there that deserves further thought.
What if you could turn a million-line project into a hundred-thousand-line project without losing any expressivity or features? (Ignore the cost of duplicating the functions of the project in a new language; that’s a different discussion.) Is the smaller program any less serious because it requires fewer lines of code than the larger? Is it any less valuable? Does it deserve any less rigor in source control or quality assurance or testing or design or refactoring or maintenance discipline?
Which language is more powerful, the one that allows you (or is that requires you) to build million line systems, or the one that allows you to do the same thing with an order of magnitude less code?
That’s an easy question. The more interesting question is to ask “What makes one language more powerful than another?” If answered accurately, perhaps that’s a much better way to compare programming languages than how quickly they can perform IO- or network-bound tasks.
David N. Welton’s The Economics of Programming Languages argues that the productivity benefits of switching to a new language must overcome substantial costs. The potential of the language for producing software in less time, with fewer risks, and with higher productivity and quality governs its value for switchers.
What’s the secret? Expressivity: the facility the language gives you to build abstractions. A very rough way to consider it is the ease with which you can extend the language in ways that feel language-like. Adding a new method to all arrays in Ruby from Ruby itself is easy; adding a new method to all arrays in Java is impossible. Redefining all loop structures in C to avoid fencepost errors is tedious and difficult. Doing so in Lisp or Scheme is much easier, if it’s even necessary.
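The Ruby half of that comparison is short enough to show in full; a sketch in which the method name second is an invented example:

```ruby
# Reopen the core Array class from ordinary Ruby code and add a
# method that every array, past and future, immediately gains.
class Array
  def second
    self[1]
  end
end

[10, 20, 30].second  # => 20
```

Whether reopening core classes is wise is a separate debate; the point here is only that the language permits it without ceremony.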
The distance between a problem and its solution—and the cohesiveness of the solution with the whole project—is an important and subtler point. If all you have are objects, any abstraction you can devise must fit into an object pattern, even if a closure or higher-order functions would be a simpler fit.
I suspect there’s a way to analyze programming languages by examining real projects built by teams of similar skill and experience. Perhaps further research would reveal both whether the expressivity/abstraction factor exists and to what extent it matters in developing high-quality software. Yet that would require thinking about languages and platforms and technologies in terms very different from whose project has more concurrent users.
More’s the pity.