March 2006 Archives

chromatic

AddThis Social Bookmark Button

SourceForge, still a center of participation in the open source world, has one more day left of voting for the SourceForge Community Choice Awards. Here’s your chance to give your favorite project a bit more recognition for excellence. (Why not choose something that doesn’t walk away with awards every year?)

Stéphane Faroult

AddThis Social Bookmark Button

All right, in the natural vs surrogate key religious war, I rather feel on the natural key side (I have nothing against using a surrogate key as a shorthand for a complex primary key, but I have seen too many people adding to a table a system-incremented column and calling it primary key without other analysis).
But I must admit that counters, whether you call them identity or auto-increment columns, or simply sequences, are extremely valuable for relating the techy’s to the bean counter’s vision of activity. In most applications, there is at least one such counter that represents fairly acurately the business activity - whether it is an increasing order number, invoice number, transaction identifier or whatever. Rather than execute complex queries with conditions on date columns to collect those so precious “key performance indicators”, it may be easier to check the data dictionary, look for sequences or identity columns, and take a daily (or anything) snapshot of the current highest values, together with statistics about the number of executed statements and whatever your DBMS has to offer.

chromatic

AddThis Social Bookmark Button

Suppose you’re a college student performing a research project. Part of that research is a survey with several types of questions. To make your life easier, you’ve decided to create a small web application to ask the questions and record the answers — this also gives you a SQL database you can query to analyze the results.

(No, I’m not a college student — a friend is and the question stumped me.)

Stéphane Faroult

AddThis Social Bookmark Button

I have been recently involved in trying to improve the performance of an application that primarily handles messages (more and more of them). It is a kind of state machine. Three different types of entities are managed (one of the types happens to be sets of another type), and messages are received that tell what is the most recent status of each entity. So far so good, except that everything revolves around the current status of each entity, and that the main bottleneck seems to be queries that hit repeatedly a table MESSAGE_HISTORY(msg_id, entity1_id, entity2_id, entity3_id, status, changed_by, timestamp), finding out which is the most recent status for a given entity1_id, entity2_id or entity3_id (as you may have guessed, only one of those contains a value in each row), associated with the greatest msg_id for the enity in question.
The solution that looked best was what was presented to me as “denormalizing”, and associating its current status to each and every entity.

Advertisement