by Freda Cameron – writer, traveler, blogger, volunteer and supporter of the DLC. Follow her @fredacameron.
October 26, 2012 — Excitement! That’s what I felt when Anne Yoder, Director of the Duke Lemur Center, told us there were forty-five years of lemur data to be consolidated. With our technical backgrounds and combined years working for SAS Institute Inc., my husband, Richard Roach, and I understand the value of integrated data, and we love the lemurs. It was an easy decision for us to volunteer our programming skills to assist Dr. Sarah Zehr with this data project, which is supported by the DLC, the National Evolutionary Synthesis Center (NESCent) and Duke Natural Sciences, because we believe in the mission of the DLC and the value of these data.
The goal is to incorporate data from multiple current systems into a single streamlined system to be used for reporting and analysis. The disparate data sources include research notes, spreadsheets, and the medical and husbandry databases.
Without disrupting the systems already in use at the DLC, lemur data is imported using SAS® Enterprise Guide®. In other words, no one outside this project had to learn a new tool for data entry. The information is consolidated within a master file warehouse. The master file includes longitudinal data of each lemur’s history. Subsets of information, such as medical data, are merged with subsets of the master file to create reports or answer research questions.
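For readers curious what "merging a subset with the master file" looks like in practice, here is a minimal sketch. The project itself uses SAS; this is plain Python for illustration only, and the field names (`lemur_id`, `species`, `weight_g`), the IDs and all the values are made up rather than taken from the DLC's data.

```python
# Hypothetical master-file rows: one per lemur (all values are invented).
master = [
    {"lemur_id": "L001", "species": "ring-tailed lemur", "birth_year": 2005},
    {"lemur_id": "L002", "species": "aye-aye", "birth_year": 2008},
]

# Hypothetical medical subset: one row per exam (all values are invented).
medical = [
    {"lemur_id": "L001", "visit_date": "2012-03-14", "weight_g": 2210},
    {"lemur_id": "L002", "visit_date": "2012-03-15", "weight_g": 2550},
    {"lemur_id": "L999", "visit_date": "2012-03-16", "weight_g": 1900},
]


def merge_on_id(master_rows, subset_rows, key="lemur_id"):
    """Attach master-file fields to each subset row that has a matching ID."""
    by_id = {row[key]: row for row in master_rows}
    merged = []
    for rec in subset_rows:
        base = by_id.get(rec[key])
        if base is None:
            # No master record for this ID; a real pipeline would
            # probably flag the row for review instead of skipping it.
            continue
        merged.append({**base, **rec})
    return merged


report = merge_on_id(master, medical)
for row in report:
    print(row["lemur_id"], row["species"], row["visit_date"], row["weight_g"])
```

Each row in `report` carries both the lemur's history fields and the exam fields, which is the shape a report or research question would typically start from.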
Special coding is often used within the SAS® programs. Examples include matching lemur siblings to their litters by mom or dad, counting births, and tracking lemurs as they are moved around the housing facilities at the DLC. Many new recodes and edits were also added in SAS to make the data more reliable and understandable. These edits are now an automated part of the database and are reapplied each time the database is refreshed, which takes only a few seconds. Sarah adds ancillary data via spreadsheets to further enhance the utility of the master database.
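To give a flavor of the litter matching and birth counting, here is a rough Python sketch (again, not the SAS code we wrote). It assumes, purely for illustration, that a litter is every infant sharing the same mom and the same birth date; the field names `dam_id`, `sire_id` and `birth_date` and all of the records are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical birth records (all IDs and dates are invented).
births = [
    {"lemur_id": "L010", "dam_id": "L001", "sire_id": "L002", "birth_date": "2010-04-02"},
    {"lemur_id": "L011", "dam_id": "L001", "sire_id": "L002", "birth_date": "2010-04-02"},
    {"lemur_id": "L012", "dam_id": "L001", "sire_id": "L003", "birth_date": "2011-05-09"},
    {"lemur_id": "L013", "dam_id": "L004", "sire_id": "L002", "birth_date": "2011-05-20"},
]


def group_litters(rows):
    """Group infants into litters, keyed by (mom, birth date).

    Assumption for this sketch: same mom + same birth date = same litter.
    """
    litters = defaultdict(list)
    for row in rows:
        litters[(row["dam_id"], row["birth_date"])].append(row["lemur_id"])
    return dict(litters)


def births_per_parent(rows, parent_field):
    """Count how many infants each parent has in the records."""
    return Counter(row[parent_field] for row in rows)


litters = group_litters(births)
dam_counts = births_per_parent(births, "dam_id")
sire_counts = births_per_parent(births, "sire_id")

for (dam, date), infants in litters.items():
    print(dam, date, infants)
```

Because steps like these live in code rather than in someone's head, they can be rerun unchanged every time the data is refreshed.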
When new data is imported to the warehouse, Sarah reviews it for accuracy. Any data discrepancies are corrected in the source system so that there is “one version of the truth” whenever subsequent analyses are run against the warehouse.
Programs are created and stored within SAS® Enterprise Guide®, which lets us arrange them in a flow that is set up to run in sequence. This ensures that the results can be replicated with confidence as well as audited for quality assurance.
The project began in January 2012, and after only a few months Sarah was able to generate reports and analyses for DLC staff to use on a daily basis. We started with the more straightforward husbandry data, and after a flurry of programming activity on our part, she is still going through the output to make sure it is accurate. The fact that she rarely asks for our programming assistance these days means that what is in there is working and phase I of the project is a success! We are gearing up for phase II, which will involve the more complicated medical records, so Richard and I will be standing by for new programming requests. Richard and I believe that our volunteer efforts make a difference, not only for the wonderful staff at the DLC, but also for the insight they give into the marvelous lemurs.