I recently started writing an article about Domain Specific Languages (DSL’s) and Ruby. While conceptually I think I understand what a DSL is, I think a specific definition is elusive. After scrutinizing several articles and the Wikipedia, I can propose a limited definition of a DSL as a custom language designed to solve a specific problem.

Some examples of DSL’s listed are unix mini (or little) languages, such as sed, awk, troff, m4 or make. The list is quite long, and actually, could go on forever, since most any language, computer based or not, could be considered a DSL. But for now, I’ll limit this entry to languages used in the computer domain.

From my work, I see DSLs used for two main purposes. One, is as a friendly way to provide data (configuration or otherwise) to a program. The other is a friendly way to let users write business rules for a particular task. This is usually motivated by the desire to let the end user write code without realizing they are actually coding.

Except for very simple command lists, you see this second type less often because the complexity usually requires one to either build a real mini language or, if using a traditional general purpose language (GPL), the derived DSL is usually more cryptic or complex than just using the GPL itself.

When designing a DSL, the programmer has to weigh the options of taking the effort to build a full featured language (using the likes of YACC or Bison) or to make a simpler language that can usually be parsed with a hand built parser. It’s the choice between ’simple and now’ or ‘full featured and later’.

But the danger is that ’simple and now’ languages, if successful, tend to grow into ugly and complex later.

Consider make. I’m not intimately familar with the origin of this language, but here is my wild guess as to how it came about:

Programmer: Hmm, I’m tired of repeating these build steps over and over. I need to make a control file to do this for me. I also want to be able to take this with me to other platforms, so I’ll need to use a portable language. Hmm, I know C, I’ll use it.

Now lets see, I don’t want to go to the effort of writing a real language, I just need some simple features, so I’ll write my own parser.

I need a simple way to defined dependencies and a target, something like:

  target : dependencies

Yeah, the ‘:’ is good. You don’t see colons used much in filenames.
Now I need to define a list of actions. Hmm, how about:

  target : dependencies
  begin_actions
    action1
    action2
    ...
  end_actions

Wow, this is hard. These blocks are killing me. Hey, wait a minute, I can get rid of these blocks if I just make the user use a tab as the first character on an action line, kind of like Fortran, but more sinister since you can’t see the tab character (woohaahaaha). This way I can do a simple character test in C and don’t have to do any complex parsing. The user shouldn’t mind too much.

Having a critical syntax that depends upon an invisible character is just a horrible design — for the end user. For the programmer, it was pragmatic and reasonable.

Instead of writing a homegrown parser, another alternative is to create a grammar and use a tool like YACC and create a parser. This definitely falls on the complex side of the scale. For someone who doesn’t do this everyday, even simple tasks take a huge amount of brain power and one ends up focusing more on minutia of the DSL, and not on higher level usability issues.

Several years ago I needed to write a description file of geometrical stack. Several vendors had their own format, which were mostly line based, but a couple supported scoping inside a block. None, however, supported variables or constants. Their files were basically glorified configuration files.

I started to write my own using Racc. It was a great learning experience for me. We chose to write our own parser because we wanted to limit what could be done in the file (why, I don’t know). I spent about three weeks on the project and things were progessing nicely. It almost looked like Ruby. But, it was tedious, and other things got prioritized over the project before I could finish.

Later I revisited the project. This time I thought, hey, why should I write my own parser, XML/XSLT and xmlproc will do this for me. So, within a couple of days, I had done what took me three weeks previously. I thought it looked readable and the time and was able to partially convince a colleage that it was readable. About a week later, when I came back and revisited the file, I realized, XML is not readable. Sure, if your brain is in XML mode, then it can filter out the syntax noise. But when one is concentrating on getting a particular job accomplished and thier brain is forced to task switch between their problem domain and mentally parsing XML, overloaded synapses are a certainty.

The third time around, after the Ruby DSL hype had been going around for a while, I decided to use Ruby. This time, I was able to create the DSL in about five minutes. It was readable, and I was able to focus on the end users frame of reference.

The moral of this story is, don’t write a mini language if you don’t have too. And, don’t settle for a simple DSL when a full featured one is needed. Consider extending a GPL into a DSL. Particularly an expressive language that is good at creating a readable DSL — like Ruby.

Back to the original question of a what exactly is a DSL. One can either write a DSL from scratch or use a GPL with a few added functions to create a DSL. But if any GPL can be made into a DSL, doesn’t that make all languages DSLs?