Mailing list membership is a must for anyone who works with
complicated computer technology, such as a programming language, a
server, or a professional software package. On popular mailing lists
for difficult topics, such as Linux distributions, messages stream in
at every hour of the day and night.
How many messages on these lists get satisfactory answers? How long
does it take to resolve the questions? These are just two of the
simpler ways to measure a mailing list’s effectiveness (we will
encounter others as we proceed).
In a series of blog posts, I’ll present the results of a modest
research project of mine that measures the effectiveness of two mailing
lists. I hope it will be the start of a larger study.
Why do this study?
It’s worth measuring the effectiveness of mailing lists for several
reasons. First, anyone who runs a project (and anyone who depends on
the software for critical functions) should care about whether this
central resource is working well.
Second, mailing lists are part of a larger phenomenon of grassroots
information sharing, as seen in wikis, news blogs, the open
Internet-based peer review of scientific articles, social networking
sites, and web mash-ups. Many of these are being closely studied by
social scientists; for instance, Harvard held a conference on
Wikipedia earlier this month. The humble mailing list belongs in the
ranks of these social trends. Just consider that it’s made up of
spontaneous contributions by thousands of people who don’t wait for
experts to provide information, but generate it themselves and share
it in peer-to-peer fashion.
Finally comes the motivation that drove me to do the research in this
article. Mailing lists are just one instance (but one that’s
relatively easy to follow and measure) of what I refer to as
community documentation. In the computer field, it is part of
an information ecosystem that includes reference material by project
developers, web pages, IRC chat rooms, and other places where
non-professionals (often on a volunteer basis) try to educate the
public.
People increasingly turn to this ecosystem in place of formal
documentation created by companies such as O’Reilly Media. Users of
this resource should care about its quality. I’ve already made it the
subject of two major articles: “Splitting Books Open: Trends in Traditional and Online Technical Documentation” and “Rethinking Community Documentation.”
Research questions
The questions I started with while examining mailing lists were:
- How many questions get answered?
- How long does it take to answer questions (measured both in the number
  of messages and in the elapsed time)?
- How many external resources (books, web sites, and standard
  documentation such as Unix man and info pages) are referred to?
- How much noise (unhelpful or irrelevant messages) do the
  lists suffer from?
Research process
I chose to do this preliminary research on two mailing lists I know
well: the Fedora and Ubuntu lists. Fedora and Ubuntu are two of the
most popular Linux distributions, but both require a good deal of
tinkering if you want to do anything not explicitly built in to each
distribution. The lists are very active, and anything is fair game:
hardware, drivers, applications ranging from mail clients to
enterprise-level servers, and related projects such as the SELinux and
AppArmor security systems.
Research on mailing lists is facilitated by the way they organize
messages. When people answer a question (or add to another answer), the
mail software links the messages into a set called a
thread. A mail client can group messages by thread, so they
are easy to segregate from the rest of an active list.
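To make the idea of a thread concrete, here is a minimal Python sketch of how replies can be grouped. It assumes that each reply records the ID of the message it answers, which is the job the standard In-Reply-To mail header typically does. The dictionary layout (`id`, `parent`) and the function names are hypothetical, chosen only for illustration; real mail clients use more elaborate rules.

```python
from collections import defaultdict


def thread_roots(messages):
    """Map each message ID to the ID of the message that started its thread.

    `messages` is a list of dicts with hypothetical keys:
      'id'     - the message's own ID
      'parent' - the ID it replies to (stand-in for In-Reply-To), or None
    """
    parent = {m["id"]: m["parent"] for m in messages}

    def root(msg_id):
        seen = set()
        # Walk up the reply chain until the parent is absent or unknown.
        # `seen` guards against malformed archives that contain a loop.
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    return {m["id"]: root(m["id"]) for m in messages}


def group_by_thread(messages):
    """Return {root message ID: [IDs of all messages in that thread]}."""
    roots = thread_roots(messages)
    threads = defaultdict(list)
    for m in messages:
        threads[roots[m["id"]]].append(m["id"])
    return dict(threads)
```

A reply whose parent is missing from the archive is treated as the start of its own thread, which is one reason a thread does not always correspond neatly to a conversation.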
My goal was to track particular threads from the initial message
(which asked a question) to the end. To choose the threads, I
generated random timestamps during a period of about three weeks, and
examined the first message dated after each timestamp. If the thread
was started by a technical question requiring help from the list, I
read the entire thread and determined where it led. I skipped threads
that were not about technical support, such as threads about what
would come in the next release of the software, or about how many
users the software has. This is because I’m interested in how well
mailing lists serve as sources of information that help people use
their systems more effectively.
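The sampling step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used for the study: the function name `sample_messages`, its parameters, and the representation of the archive as a sorted list of timestamps are all hypothetical.

```python
import bisect
import random
from datetime import datetime, timedelta


def sample_messages(sent_times, start, end, n, seed=None):
    """Pick the first message dated after each of n random timestamps.

    sent_times - message timestamps, sorted in ascending order (assumed)
    start, end - bounds of the sampling period
    Returns a list of indexes into sent_times (duplicates are possible).
    """
    rng = random.Random(seed)
    span = (end - start).total_seconds()
    picks = []
    for _ in range(n):
        # A random moment somewhere in the sampling period.
        stamp = start + timedelta(seconds=rng.uniform(0, span))
        # Index of the first message dated strictly after that moment.
        i = bisect.bisect_right(sent_times, stamp)
        if i < len(sent_times):  # skip if no message follows the timestamp
            picks.append(i)
    return picks
```

Each selected message still has to be read by a person to decide whether its thread begins with a technical question; that judgment is not something this sketch attempts.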
I classified a total of 206 messages in 28 threads. This is not a
large sample, but it’s enough to show some trends and generate issues
for further research. The reading and classification took about five
hours.
Threads did not always correspond neatly to conversations. Sometimes a
new person would post a new question as part of an existing thread. I
ignored this question and any responses, just as if the person had
used a different thread. On the other hand, it’s possible that someone
started a new thread to answer a question I was following, and if so,
I would have missed the discussion (but so might the list members
following the thread).
With the data in hand, I had to classify each thread as resolved or
unresolved. The classification was obvious whenever the person who
posted the original question ended the thread, either to thank people
for solving the problem or to report that the problem still
existed. If the original correspondent did not clearly indicate the
status, I usually gave the mailing list the benefit of the doubt: so
long as answers seemed on-topic and intelligent, I classified the
thread as resolved. In just one case I classified the thread as
unresolved, because the answers looked like wild stabs in the dark and
did not coalesce into a coherent plan of action.
To judge how much noise was on the list, I also classified messages
into the following categories:
- New: This was simply the message that began the thread; the original
  question.
- Helpful: This category covered slightly over half the messages. A
  message does not have to lead directly to a solution to be classified
  as helpful. Often, someone must rule out a possible cause before
  finding the real cause, just as a doctor must run a test to rule out a
  serious disease before sending the patient home to take aspirin. Such
  diagnostic recommendations are also useful in educating other list
  members about possible causes of similar problems they encounter on
  their systems.
- Unhelpful: A few messages take the readers in the wrong direction and
  delay resolution. For instance, if a message directs someone to the
  wrong source for software, I classify it as unhelpful.
- Irrelevant: This covers a wide range of messages that do not make any
  attempt to solve a problem. Common messages of this type include
  “Don’t use capital letters” and “Please put your comments at the
  bottom, not the top, when you quote other messages.” These messages
  may play a useful role in maintaining the health of the list and
  facilitating its use. But one can’t deny that every minute one spends
  paging through such messages is a minute that one is not spending on
  solving the problem.
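Once every message carries one of these four labels, the share of noise is simple arithmetic. The following Python sketch shows one way to compute it, treating unhelpful and irrelevant messages as noise. The function name `noise_share` and the lowercase label strings are hypothetical, and the labels in the usage line are made up for illustration rather than taken from the study's data.

```python
from collections import Counter

# The four categories used to label messages.
CATEGORIES = ("new", "helpful", "unhelpful", "irrelevant")


def noise_share(labels):
    """Return the fraction of messages labeled unhelpful or irrelevant."""
    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample has no noise to report
    return (counts["unhelpful"] + counts["irrelevant"]) / total


# Made-up labels for a four-message thread: one of four is noise.
print(noise_share(["new", "helpful", "helpful", "irrelevant"]))  # 0.25
```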
How many questions were resolved?
This is the fundamental question that determines the value of the
list. Figure 1 shows the distribution between resolved and
unresolved threads.
Figure 1. Fourteen resolved threads and fourteen unresolved threads
As you can see, the 28 threads are perfectly divided: half are
resolved and half are not. Is this result good or bad? It depends on
how you look at it.
Optimist or pessimist?
As a member of the lists, I consider the results good enough to
encourage people to post questions. The chances of receiving the
advice that breaks a logjam are pretty high.
But from a broader standpoint of meeting the needs of the community,
the results are not so good. In my sample, half the people who came to
the list left without the information they needed.
I’ll temper the results by pointing out that topics related to Fedora
and Ubuntu are particularly difficult to answer because they’re so
varied. A mailing list on a programming language or a utility such as
a spreadsheet would probably generate a higher percentage of answers.
In the conclusion to this article I’ll examine the complicated issues
that surround the definition of what information people need.
(To be continued.)


It's unfortunate that some of the most important things about mailing lists are also the hardest to measure. A great thing to track would be the audience's learning rate, for example, and one way to measure that would be to track how long it takes for "the same" question to get answered each time it appears. I put "the same" in quotes because, of course, a human would have to classify when a question is the same as some question already seen on the list.
Related to this number would be the FAQ generation rate: is the information produced on the mailing list being organized into more permanent reference forms, such as a FAQ? If so, how many times must a question appear before it gets into the FAQ? After that, how often is the new FAQ item's URL given as the answer to questions on the mailing list?
As long as random musings are permitted: I wish there were a way to subscribe to a thread, rather than a whole mailing list. Any entity that goes through state changes should be subscribable, whether a mostly static web page, a wiki page, a ticket in a bug tracker, or a thread in a mailing list. If I see a thread on a list, and it doesn't get resolved, I can at least subscribe so that a) if an answer does appear, I'll be notified, or b) if I stumble across the answer later, I can easily post the answer to the interested parties. Of course, b) is already possible, but our interfaces think in terms of mailing lists (which are really just filters to keep certain threads grouped together) and therefore thread-based manipulation is still primitive. The thread, not the list, is the fundamental unit of discussion, but our tools are only beginning to adjust to this.