Saturday, September 22, 2012

For the love of all things scientific, can we please build better databases??

Many scientists work with animal models to learn about human diseases. While most are far from perfect models, we owe a lot of medical advances to them. I work with an animal model for HIV/AIDS, and we use a database that was developed onsite to track clinical information about our animals. For example, when one of my animals is bled or has a clinical procedure, the veterinarians and technicians will input stats like blood pressure and heart rate, and make notes of any unusual signs or symptoms, like if the blood clotted earlier than usual. Now I need this data from the database so I can make charts of the information about the animals in my study. Unfortunately, this database does not allow exporting data to other programs and doesn't even let you select data to copy and paste into another program. So when I want a chart that plots the animals' weights from birth to present, I can ask the database for a report and tell it to put all of the weight records in chronological order, and it will give me two columns of information: date of record and weight. However (!), I then have to manually enter each data point into Excel or GraphPad in order to get my plot. It's an incredibly tedious task and a huge waste of time, but that's not the biggest shame of having a primitive database system. The thing about the system that gets me is that the software could easily be used to correlate information across data fields, and it isn't. This is a feature I've long lamented that the medical community lacks. The things holding progress back in that sector are mostly privacy issues, but we don't have privacy issues in animal research!

So for each animal we have a wealth of biological data sitting in this database that we can't easily analyze, and on top of that each department has huge amounts of its own data for subsets of the animals. For example, the genetics department has looked at the genotypes of the animals for genes they are interested in, and the virology and immunology departments have information about viral status and immune responses and all sorts of other interesting data. But it is all on separate computers and in separate programs. If we could have one central database where every researcher enters information about each animal they study, then we could easily correlate things like genotype to viral load or immune status. But wait! If this database could automatically look for correlations across fields, then we wouldn't even have to wait until some idea occurred to a scientist! The computer could automatically report any correlations and we could look further into them to weed out the false positives. It would also be a boon for epidemiology within the colonies. If symptoms were spreading through a group of animals, the technicians would make a note of it and we could look for clusters of notes about runny noses or coughing...
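To make the idea a bit more concrete, here is a minimal Python sketch of what "automatically look for correlations across fields" might look like. Everything in it is an assumption for illustration: the animal IDs, the field names (`allele_copies`, `log_viral_load`, `weight_kg`), the numbers, and the helper functions `merge_by_animal` and `scan_correlations` are all made up, and a real system would sit on top of an actual database rather than Python dictionaries. It simply pools each department's records by animal ID and flags any pair of numeric fields whose Pearson correlation is strong.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field cannot correlate with anything
    return cov / sqrt(var_x * var_y)


def merge_by_animal(*sources):
    """Combine per-department records into one record per animal ID."""
    merged = {}
    for source in sources:
        for animal_id, fields in source.items():
            merged.setdefault(animal_id, {}).update(fields)
    return merged


def scan_correlations(records, min_abs_r=0.8, min_animals=4):
    """Return (field_a, field_b, r, n) for every strongly correlated pair."""
    fields = sorted({name for rec in records.values() for name in rec})
    hits = []
    for a, b in combinations(fields, 2):
        # Only use animals that have a value recorded for both fields.
        pairs = [(rec[a], rec[b]) for rec in records.values()
                 if a in rec and b in rec]
        if len(pairs) < min_animals:
            continue
        r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if abs(r) >= min_abs_r:
            hits.append((a, b, round(r, 3), len(pairs)))
    return sorted(hits, key=lambda hit: -abs(hit[2]))


# Invented example data: three "departments", six hypothetical animals.
genetics = {"A1": {"allele_copies": 0}, "A2": {"allele_copies": 1},
            "A3": {"allele_copies": 2}, "A4": {"allele_copies": 0},
            "A5": {"allele_copies": 2}, "A6": {"allele_copies": 1}}
virology = {"A1": {"log_viral_load": 6.0}, "A2": {"log_viral_load": 5.0},
            "A3": {"log_viral_load": 4.0}, "A4": {"log_viral_load": 6.1},
            "A5": {"log_viral_load": 3.9}, "A6": {"log_viral_load": 5.1}}
clinic = {"A1": {"weight_kg": 7.0}, "A2": {"weight_kg": 7.5},
          "A3": {"weight_kg": 6.8}, "A4": {"weight_kg": 7.4},
          "A5": {"weight_kg": 7.2}, "A6": {"weight_kg": 6.9}}

merged = merge_by_animal(genetics, virology, clinic)
hits = scan_correlations(merged)
for field_a, field_b, r, n in hits:
    print(f"{field_a} vs {field_b}: r={r} (n={n})")
```

A fixed cutoff on r is not a significance test, and scanning every pair of fields across hundreds of fields will throw up plenty of chance hits. That is exactly the false-positive weeding mentioned above, so a scan like this would only ever be a list of leads to check, not a list of findings.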

Now, I have a few friends who can program and are big supporters of research. I have no idea how big a task it would be to write such a program, but if we could find enough programmers who also dig science and want to help scientific research, I wonder if this program could be developed on a volunteer basis. I mean, with all of the open source software that people write and improve on every day, surely there are the means to make this incredibly useful software, right?? What's more, if I could set up something like a Kickstarter to get a pool of money to pay some programmers, would there be enough interest from the public to make that pool big enough? And could this program evolve into a functional database for human biological information? That would be incredible! Though the hurdles there would be substantial and the bureaucracy perhaps too daunting for me. At any rate, I'm going to toy with these ideas for a while.
