After using an Ingres ARUNA database for ten years, the pathology laboratory at the University Hospital of Lausanne, Switzerland decided it was time to replace their software with a more open Oracle DIAMIC package. No easy task for a facility which handles over 50,000 lab samples per year and is charged with managing the medical records of 500,000 patients. No easy task that is, unless you know Perl.
Here’s Marc-Henri Poget’s story about how Perl enabled him to work around some messy stuff and simplify the migration from an Ingres to an Oracle Database.
Using Perl to Migrate Medical Data From an Ingres to an Oracle Database
The University Hospital of Lausanne, Switzerland, is an 850 bed hospital which serves the needs of the 600,000 people living in the French speaking part of Switzerland. The pathology laboratory information system, used by 80 people, handles 50,000 samples per year.
By 2000, the ARUNA software package had been in use in the laboratory for a decade. The need arose to replace it with a package that was more open and had the ability to evolve. In September 2001, the project aimed at replacing ARUNA with the new DIAMIC software package began. As pathologists rely heavily on previous results for their diagnosis, the ARUNA database held more than 20 years of medical data on-line, some data even coming from systems older than ARUNA. This posed a unique challenge of migrating the Ingres ARUNA database into the Oracle DIAMIC database.
The vendor of DIAMIC proposed the following migration strategy. First, an Oracle database with the same structure as the original Ingres ARUNA database is created. Then, flat files are extracted from the Ingres database and fed into the Oracle ARUNA database. Finally, the vendor develops an application to migrate the data from the Oracle ARUNA to the Oracle DIAMIC database.
To support this strategy, the vendor proposed extracting the flat files from the Ingres ARUNA database using Excel and ODBC. This method facilitated processing the lines to modify their format (for instance to remove the non-significant blanks). The last step was to import these files in the Oracle ARUNA, using a tool written by the vendor, which provides bulk load capabilities. The major drawbacks of this approach are the following:
- Both Ingres and Oracle databases are on Unix machines; it doesn’t make sense to use Windows apps as a gateway.
- Excel doesn’t allow the extraction and handling of very large files from the Ingres database (they contains about half a million of patients).
- In our computer department, we already have experience using SQL Loader from Oracle for bulk loads, so it seemed more natural to use that rather than the vendor’s tool.
- An additional issue that the vendor’s solution doesn’t address is how to ensure that the Oracle ARUNA databases’ tables have exactly the same structure as the Ingres ARUNA databases.
To overcome the above-mentioned problems, I chose to work entirely on the Unix platforms and to develop Perl scripts for the necessary tasks.
I used an Ingres utility to generate the DDL (Data Definition Language) SQL statements that were used to create the Ingres tables. I then wrote the Perl “DDL Generation Script” to produce the SQL statements for generating the Oracle ARUNA database using the Oracle SQL Plus utility. This script performs mapping between some data types only found in Ingres into their Oracle equivalents. In order to enhance consistency when using SQL Loader, the script also produces the control files that direct SQL Loader’s operations.
Ingres only provides a way to export a whole table as a flat file, but I prefer being able to extract the data in several steps. For this reason, I chose to install DBI and the Ingres DBD, so that I could write the Perl “Extract script”. The script receives the extraction criteria from its command line and it performs the required data formatting on the output files.
Using both the above-mentioned Perl scripts and the DBI interface, I’ve been able to carbon copy an Ingres database into an Oracle database having the same structure. This approach has allowed me to transfer the 500,000 patients and their associated medical records between the two databases in a few hours.
Marc-Henri Poget is a project manager in the computer department of the University Hospital of Lausanne, Switzerland. He holds a degree in software engineering from the Swiss Institute of Technology (www.epfl.ch) as well as an MBA. He was instrumental in the deployment of DIAMIC, a software package used in the pathology laboratory. His interests include project management, troubleshooting and open source software. He can be reached at Marc-Henri.Poget@hospvd.ch.
To learn how large and small companies are using Perl to meet their goals, check out Perl Success Stories.
If you have a Perl success story of your own that you’d like to share, please let me know. You can reach me at: firstname.lastname@example.org.