How many people out there have the following problem (as summarized in the following statement)?
“My application isn’t performing well in production because of some heap settings, and there are some configuration changes I need to make in production, but my operations group won’t give me access to the systems I need access to. I’ll talk about JNDI configuration of the DataSource, the heap size parameters, how to start the JMX console, but it is like my words are hitting a brick wall. Our platform is Java, but our system administrators don’t know the inner workings of the JVM heap or the significance of a stack trace. Every time something happens to a JVM, they escalate it to management as an unacceptable bug, but more often than not it was a misconfiguration problem. To compound the problem, I haven’t been able to find Java programmers who are aware of the concerns of our operations group, and they’ve reached a very inefficient equilibrium. The administrators are always pointing at a stack trace and blaming the developers, and the developers are always blaming the administrators. Every week there is another political blame festival, and I’m getting tired of the conflict.”
It’s a Shared Responsibility
A good system administrator usually knows how to compile something like Apache from source. They might not know how to write a little bit of C, but they probably know enough about “./configure” and “make”, and as time progresses, they’ll learn about custom packaging. Incredible system administrators know all about the kernel, and will apply patches to kernel source. Don’t let a system administrator plead ignorance when it comes to knowing how the JVM’s heap works. And don’t let good admins wiggle themselves out of having to know every single detail about the Java platform. If you use Tomcat, they need to know as much about Tomcat as they know about Apache. Go further than this: a system administrator responsible for anything that has to do with Java *should* know at least the meaning of all the acronyms we Java programmers throw at them.
Ideally, your system administrators might know *more* about the operational side of the JVM - heap size, command line flags, JMX Console, thread dumps, and remote debugging. Chances are good that they don’t. Chances are good that, unlike everything else on the target platform, every time there is a Java problem, they throw it back at the Java programmers and tell them to fix it. It’s the classic “Java’s not really our problem, we just set up the machines” anti-pattern.
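To make that “operational side” concrete, here is a minimal sketch of reading heap figures and producing a thread dump from inside a running JVM through the standard java.lang.management API. The class name JvmSnapshot and its method names are made up for illustration; a JMX console gets at this same kind of information remotely.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, not part of any library.
public class JvmSnapshot {

    /** Summarizes current heap usage in bytes. */
    public static String heapSummary() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // getMax() returns -1 when the JVM reports no defined maximum.
        return String.format("heap used=%d committed=%d max=%d (bytes)",
                heap.getUsed(), heap.getCommitted(), heap.getMax());
    }

    /** Builds a thread dump in-process as a single string. */
    public static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder dump = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            // ThreadInfo.toString() prints only the first few stack frames.
            dump.append(info.toString());
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
        System.out.println(threadDump());
    }
}
```

An administrator who can read this output can usually tell a heap that is simply sized too small apart from a genuine bug, which is the distinction the blame festival tends to skip.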
But, programmers are also not without fault. What happens when you call this on your production platform?
File file = new File("filename");
FileChannel channel = new RandomAccessFile(file, "rw").getChannel();
// Use the file channel to create a lock on the file.
// This method blocks until it can retrieve the lock.
FileLock lock = channel.lock();
If your administrator comes back and tells you that you are running out of file handles, do you know what command to run to figure out what process is out of control? Let’s say you develop on Windows XP and deploy on Linux: do you know how you would go about even finding out what that command was? An experienced programmer knows the answer (Linux: lsof, Windows: <shrug> ) because he or she has been burned by the problem in the past. A good Java programmer knows about technologies that already exist on a platform. For example, let’s say you need to cache or proxy some content. If you were “just a Java programmer”, you wouldn’t know about a solution like Squid or mod_proxy. If you needed to modify an outbound HTTP header, Mr. “Just a Java Programmer” might code a Servlet Filter as opposed to asking your system administrator to configure mod_headers.
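The file-locking snippet above never releases the lock or closes the channel, which is one way a process ends up leaking file handles. A minimal sketch of the same locking with the cleanup made explicit follows. It assumes Java 7 or later (try-with-resources), and the LockedFile class and withLock method are invented names for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
public class LockedFile {

    /**
     * Runs the task while holding an exclusive lock on the file, then releases
     * the lock and closes the underlying file handle, even if the task throws.
     */
    public static void withLock(Path path, Runnable task) throws IOException {
        // Resources are closed in reverse order: lock, channel, then file.
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = file.getChannel();
             FileLock lock = channel.lock()) { // blocks until the lock is granted
            task.run();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("locked", ".tmp"); // scratch file for the demo
        withLock(path, () -> System.out.println("holding lock on " + path));
        Files.delete(path);
    }
}
```

Because the handle is closed on every path out of the block, calling withLock repeatedly on the same file does not accumulate open descriptors.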
The point is that as a programmer, you are going to need to know about platform specific issues. Even though Java is platform-independent, your architecture and code need to be informed by the capabilities of the target platform. Java’s “platform-independence” is not an excuse for not digging into the details.
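As one small illustration of platform-aware Java, the sketch below asks the JVM how many file descriptors it currently has open. It relies on com.sun.management.UnixOperatingSystemMXBean, a JDK-specific extension that is present on OpenJDK-style JVMs running on Unix-like systems; that availability is an assumption about your runtime. On other platforms the method simply returns -1. The FileHandleCheck class name is invented for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper class, not part of any library.
public class FileHandleCheck {

    /** Returns this JVM's open file descriptor count, or -1 where unsupported. */
    public static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The Unix-specific bean is a JDK extension, so check before casting.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1; // e.g. Windows, or a JVM without this extension
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + openFileDescriptors());
    }
}
```

Logging this number periodically gives both teams the same evidence to look at before anyone reaches for lsof.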
Mr. Miyagi says, “There are no ‘Admins’. There are no ‘Programmers’.”
Just because they hire “Programmers” and “System Administrators” doesn’t mean that they actually exist. There are some differences, but we shouldn’t be encouraging people to make an exclusive choice. I’ve seen very few people (no one, in fact) who are both black-belt developers and black-belt system administrators. Maybe there is some super-guru meditating in a machine room somewhere who could prove the following statement wrong, but “You can’t be both”. But, I’ve never seen a superstar developer who didn’t know a lot about system administration. (I’m not discounting the possibility, I’m sure they exist in some huge, ultra-regimented organization at which I would never last.) You can’t be both. You can’t be just one. You should strive to be neither.
Java Makes it a Little Worse
If you wrote that file locking code in C, chances are you’d know exactly what was going on in the target platform.
Java’s “platform-independence” is both its biggest strength and its biggest weakness. Because Java is platform-independent, deliverables don’t need to be packaged in platform-dependent packages (DEB, RPM, and MSI). In other words, to your “platform-dependent” sysadmins, Java applications don’t come in familiar packages. You could teach your sysadmin to run Ant and Maven, but that’s usually not happening. Most of the time, your sysadmin gets a WAR file and a set of instructions, and, compared to an OS distribution package, this is an annoyance.
You end up having to explain the abstraction: “there’s an OS, then there is this Virtual Machine that we run atop the OS called a JVM…the programs we write don’t really know anything about the underlying OS”. Tell a really capable Debian admin that they are going to have to support a bunch of Java applications, and they will subconsciously shudder. Sysadmins are, by definition, OS focused. They enjoy dealing with native applications, and they tend to fear the JVM because it is an abstraction. Abstractions are for developers; admins deal with physical machines. That is the disconnect. If a Java application were packaged just like Apache 2.2 (in an RPM or a DEB), if your application were installed via “./configure”, “make”, “make install”, and you provided some configure flags, they’d happily dig into the details.
(You could say that .NET should have the same problem, but it doesn’t. This is what Microsoft is referring to when they say that .NET has a lower cost of ownership than Java. What they mean to say is: “Windows Administrators have all been hypnotized to follow our lead, and we will make sure that they all know how to support the .NET CLR.” Tell a Windows administrator that you are going to ask them to support some .NET applications and they’ll happily start singing a song about how the Great, Visionary Leader Gates leads us all to Unchallenged Victorious Glory.)
The Short-term Solution: Hold Hands and Sing a Song
If your organization has friction between Operations and Development, a quick short-term fix is to nominate one person from each team to serve as a liaison to the other group. Operations is complaining about buggy, difficult-to-configure code? Developers complaining about lack of access to critical systems? Take the second most vocal members of each team and budget some time for them to sit in on some meetings. Even though it might seem to be a waste of time, have one developer sit with someone in operations while they set up a machine or deploy the application to production. Have a system administrator sit with a developer as they plan out the architecture for a new system, or as they implement a new feature.
The second fix is to ask each team to identify a body of knowledge they would like everyone in the other group to know. Administrators might feel better about granting certain members of the development team access to network resources - “if we’re going to give you access to a production machine, here are the prerequisites. And, if you don’t have these skills, don’t bother asking us for administrative rights to anything.” Likewise, developers might want to know that the System Admins know something about how a Servlet works, or the JVM - “If you are going to be the one we work with, you are going to have to know everything about the JVM heap, hopefully more than we do. If you don’t, then we can’t work with you.” Doing this won’t just ease communication; it will encourage members of both teams to break out of their narrow job descriptions.
Most importantly, when there is a disconnect between the two groups, update these prerequisites and encourage people to self-identify the limits of their own knowledge (which almost never happens, ask five people if they know about “JDLC Closures” [made-up], count how many nod yes and continue to listen to you make stuff up, see how long it takes them to say, “Oh, wait, I thought you meant something else….”)
In short, make sure that some of your developers and some of your admins are allowed to ignore the boundaries, and make sure that everyone in your team knows the prerequisites in the other group.
The Long-term Solution: Stop Developing Applications, Stop Administering Systems
…do away with System Administrators and Application Developers. ;-) No, really, I mean that. And, I’m not saying this as a threat to the world’s System Administrators. In the future, the “System Administrator” who installs an OS, configures the OS, installs the application server, tweaks the container configuration, writes deployment scripts, writes monitoring scripts, and deploys new releases will likely become the “Appliance Administrator” overseeing a number of virtual machines in a system like Xen. It is highly probable that the WARs and JARs that are output by today’s development teams will become entire production networks in the near future.
Building today’s “applications” produces a series of “application components”: Ok, to build this application, run Ant from these four directories, then I’ll need you to read these 30-page documents about configuration and setup: set the Oracle JDBC configuration, configure the JMS parameters, and make sure these five JARs are in the right directories. Then, I’ll need you to stand up, turn around three times, and shout the word “SHAZAM”. After that, put this script in /etc/init.d and, depending on your preference, increase the size of the heap. Ok, when we deploy, we put these 5 WAR files in this directory. Once we’ve tested them and told you they are ready for production, you’ll need to copy them from the deployment machine, stop your Tomcat server, copy the file to the ? directory and then restart your server. After you restart, be sure to tail the log file to see if there were any bad stack traces. (Doesn’t that get boring after a while?)
Building tomorrow’s “applications” will produce an “application ecosystem”: We’ve deployed the system to development and tested it. When you are ready, move the instances to production. We tweaked the memory allocation on the web servers, and we’ve added another two application servers to the mix.
Tomorrow’s virtualization technology exists today. Tools like Xen already allow you to capture a machine in a file and migrate that virtual machine instance between servers. Application development hasn’t caught up with system administration in this respect; we are still building JARs and WARs. Maybe once we evolve beyond this stage, we’ll all stop arguing about turf.
Also, don’t worry, you’ll still have your jobs. Calm down. But, in this long-term solution, both jobs change. Programmers will need to know more about VM technologies and much less about physical machines. System administrators will need to know less about application-specific configuration and more about appliance configuration.