Read part one before continuing.
Ten minutes later, Dave the DNS administrator was in the garage. He joined Kip, Tom, Sally, Douglas, Velma, Suse, and me. “Before we begin, let’s review exactly what this script does,” I said.
Kip looked around and decided to retell his overview. “The script reads database records from a table, and then for each record, it reads an HTML file on the disk, updates the database record, and then loops to the next record.”
“You forgot something,” I said.
“Oh,” he said. “After the database update, it waits before going on to the next record.”
“You didn’t say wait the first time,” I said. “You said rest.”
“Yeah,” Kip said. “Actually, it’s the sleep statement.”
“How long does the script sleep?” I asked.
“Just one second,” Kip replied. He looked over at Sally and Tom. “We were concerned that the script could hammer the database unless we put in the sleep statement.”
I turned to look at Sally and Tom. “So you were concerned about performance?”
“Yes,” Tom said. “We’re dealing with a lot of data and file activity. The script runs 4 times a day.”
“How much data?” I asked.
“The table has sixty thousand records,” Sally said.
I grabbed a piece of chalk from Velma and wrote 60,000 on the garage floor. “We are starting with 60,000 records.” Underneath it, I wrote 1+. “And we know that each loop iteration will take at least one second, because we explicitly sleep for one second.”
Everyone nodded their heads in agreement.
I turned to Tom. “How many seconds are in a minute?”
“Sixty,” he said.
“How many seconds are in an hour?”
“Three thousand, six hundred.”
“And how many seconds are in a day?”
Everybody looked around, quickly trying to do the math in their head. I turned to Dave, the DNS administrator. “How many seconds are in a day, Dave?”
“Eighty-six thousand, four hundred,” he said. I wrote 86,400 in chalk on the floor.
“How did you know that?” Douglas asked.
“Most DNS administrators are familiar with that number, because it’s commonly used to set the Time To Live value of DNS records.”
“And the script only has a window of twenty-one thousand, six hundred seconds to run in, since it runs four times a day.”
Everyone looked at the numbers on the floor. Sally was the first one to speak. “So, in an effort to reduce the impact of the script, we made it worse with the delay.”
“Exactly,” I said. I picked up the source code. “Without thoroughly looking at this code, I would estimate that ninety-eight percent of the time, this script is doing nothing but waiting. On decent enterprise hardware, the file operations and the database updates should only take milliseconds.”
“That’s right,” Velma said.
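The chalk figures can be checked with a few lines of arithmetic. This is a minimal sketch in Python rather than the Perl of Kip's script; the constant names are made up for illustration, and the numbers are the ones quoted in the story. It counts only the sleep time, so the real runtime would be somewhat longer.

```python
# Back-of-the-envelope check of the figures chalked on the garage floor.
RECORDS = 60_000                 # rows in the table
SLEEP_PER_RECORD = 1             # seconds slept on every iteration
RUNS_PER_DAY = 4                 # the script runs four times a day
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

window = SECONDS_PER_DAY // RUNS_PER_DAY     # seconds available to each run
min_runtime = RECORDS * SLEEP_PER_RECORD     # sleep time alone, ignoring real work

print(f"window per run : {window} s ({window / 3600:.1f} h)")
print(f"minimum runtime: {min_runtime} s ({min_runtime / 3600:.1f} h)")
print(f"overrun factor : {min_runtime / window:.2f}x")
```

Even before any file or database work is counted, a single run needs 60,000 seconds against a 21,600-second window.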
“So, how do we fix this?” Kip asked.
“Well, if there is a concern about impacting other systems, just change how often you sleep. The simplest thing to do is to sleep on only a fraction of the iterations instead of on every one. If the script slept for 1 second every 10 iterations, it should finish in under two hours.”
“That’s an easy fix,” Kip said.
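The easy fix might look something like the following. This is a hedged sketch in Python, not the actual Perl script; `process_records`, `handle`, and the `sleep` parameter are hypothetical names, and the sleep function is injectable only so the loop can be exercised without really waiting.

```python
import time


def process_records(records, handle, sleep_every=10, pause=1.0, sleep=time.sleep):
    """Call handle(record) for each record, pausing once every `sleep_every` records.

    Returns the number of pauses taken.
    """
    pauses = 0
    for count, record in enumerate(records, start=1):
        handle(record)                  # stand-in for the file read + database update
        if count % sleep_every == 0:    # throttle on every Nth record only
            sleep(pause)
            pauses += 1
    return pauses


# Dry run with a fake sleep that just records the requested durations.
slept = []
total_pauses = process_records(range(60_000), handle=lambda r: None, sleep=slept.append)
print(total_pauses, sum(slept))  # 6000 pauses, 6000.0 seconds of sleep
```

Six thousand seconds of sleep is 100 minutes, which leaves about 20 minutes of actual work before the two-hour mark.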
Velma took the chalk and wrote down some additional figures. “For the best case, the script should know how many records it has, and how many seconds it has in the given window to process all of those records.”
“Right,” I said. “Turn it into a mathematical function, but decrease the window size by an hour, just to be on the safe side.”
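One way to read Velma's suggestion is as a function of the record count and the window. The sketch below is in Python and is only an assumption about what she had in mind; `per_record_delay` and `safety_margin` are hypothetical names. It ignores the time the real work takes, so the result is best treated as an upper bound on the delay.

```python
def per_record_delay(record_count, window_seconds, safety_margin=3600):
    """Return the longest per-record sleep (in seconds) that still fits the window.

    The window is shrunk by `safety_margin` seconds first (one hour by default).
    """
    usable = window_seconds - safety_margin
    if record_count <= 0 or usable <= 0:
        return 0.0                      # nothing to pace, or no room to sleep at all
    return usable / record_count


delay = per_record_delay(60_000, 21_600)
print(f"sleep at most {delay:.2f} s per record")
```

With the story's numbers that works out to (21,600 - 3,600) / 60,000 = 0.3 seconds per record, which is a fractional sleep.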
Dave looked around and raised his hand. “Uh, is there anything else you needed me for?”
“No,” I said. “That was it.”
Dave walked back to his bike, mumbling about the 10-mile bike ride back to his house.
“How did you know I didn’t use the Perl Time::HiRes module?” Kip asked. “I could have slept for fractions of a second.”
“If you did, the use statement would have been visible in the first page of your code printout,” I replied.
“Well, how did you know I didn’t use threading?”
“Because nobody casually implements threading in Perl scripts.”
There was a somber moment until Tom spoke up. “Wow. I never expected the sleep statement to be the problem with this.”
I picked up all of the graphs and charts from the table. “This is the most obvious solution, given the information you all have told me. You guys had the answer all along, you just never expected it to be that simple.”
“Remember,” I said. “A good systems administration team always works together, and is not afraid to look at the code written by a developer.” I turned to Kip. “At the same time, the developer should not be afraid to discuss performance metrics and code profiling with a systems administrator.”
Douglas was about to say something when a wild-haired CTO walked into the garage. She was carrying a sleek 2 rack-unit server with a hatchet embedded in the casing. “Somebody hacked my Linux server!” she exclaimed.
I looked back at the others. “Sorry guys, I have another case.”
The small team of geeks said their thanks and walked out of the garage. I turned to the CTO and held up a jar full of quarters. “Payment up front, and I can’t guarantee anything if you didn’t preserve the ARP cache.”