http://sascdmonlinetraining.blogspot.com/2015/11/the-problem-with-data-issues-in-clinical-data.html
In general, the clinical data management (CDM) department may not devote
enough resources to checking the quality of the data, because CDM's main
responsibility is to collect and structure the incoming data. Since the
biostatistics department is generally responsible for the final study results,
it must often exercise control over data quality before accepting the raw
clinical data. The problem typically occurs when SAS statistical programmers
and statisticians in the biostatistics department process the original
'unchecked' clinical data and arrive at incorrect results and conclusions. For
example, even simple checks, such as looking for invalid values of the variable
gender, are not performed. This can result in confusion and frustration.
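As a minimal sketch of the kind of simple check mentioned above, the following Python snippet counts invalid values of a gender variable. The record layout, the variable name `gender`, and the set of valid codes (`M` and `F`) are assumptions made for illustration; a real study would take them from its own data specification.

```python
from collections import Counter

# Assumed valid codes; a real study would take these from its data specification.
VALID_GENDER = {"M", "F"}


def invalid_gender_values(records):
    """Count each gender value that is not in VALID_GENDER.

    A missing value shows up under the key None.
    """
    counts = Counter()
    for rec in records:
        value = rec.get("gender")
        if value not in VALID_GENDER:
            counts[value] += 1
    return dict(counts)


# Hypothetical raw records, for illustration only.
raw = [
    {"subject": "001", "gender": "M"},
    {"subject": "002", "gender": "F"},
    {"subject": "003", "gender": "X"},
    {"subject": "004", "gender": "m"},
    {"subject": "005", "gender": None},
]

print(invalid_gender_values(raw))
```

Reporting counts per invalid value, rather than stopping at the first problem, lets the reviewer see at a glance whether an issue is a one-off entry error or a systematic coding difference such as lowercase codes.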
Figure 1, from a 2001 survey by the Data Warehousing Institute, identifies the
sources of data quality problems across all industries. It is interesting to
note that while most data issues are caused by data entry errors, a
substantial share are caused by system-related changes, conversions or errors.
This indicates that similar types of validation checks should be applied
throughout the process of data collection, storage, transfer, conversion and
update. For clinical trials, various studies suggest that up to 5 percent of
raw data values in clinical trial databases are initially erroneous.
Examples of using ‘unchecked’ data that
resulted in significant delays and costs include:
- In February 2003, the U.S. Treasury Department mailed 50,000 Social Security
checks without a beneficiary name. The missing names were due to a software
program maintenance error.
- In September 1999, the $125 million NASA Mars Climate Orbiter, an
interplanetary weather satellite, was lost due to a data conversion error.
Certain calculations were performed in English units (pound-force seconds)
when metric units (newton-seconds) should have been used.
Specifically, this paper reviews an effective method for implementing a
clinical data acceptance testing procedure that checks data quality with each
data transfer, conversion or update. Clinical data issues fall into two main
categories: incorrect data and incomplete data. In general, incorrect data
issues consist of unexpected raw values, invalid raw values, incorrectly
converted raw values, or raw values that are inconsistent with another variable
or record. Incomplete data issues consist of values that are missing when they
are required.
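The two categories above can be sketched as a small acceptance check run on each data transfer. This is only an illustration under stated assumptions, not the procedure reviewed in this paper: the variable names (`subject`, `gender`, `visit_date`, `age`, `pregnant`), the code list, the age range, and the consistency rule are all hypothetical, and checks for incorrectly converted values are left out because they depend on the source and target formats.

```python
# Hypothetical rules; names, codes, and ranges are assumptions for illustration.
REQUIRED = ("subject", "gender", "visit_date")
VALID_CODES = {"gender": {"M", "F"}}
EXPECTED_RANGE = {"age": (18, 90)}


def check_record(rec):
    """Return a list of (category, variable, message) issues for one record."""
    issues = []
    # Incomplete data: a required value is missing.
    for var in REQUIRED:
        if rec.get(var) in (None, ""):
            issues.append(("incomplete", var, "missing required value"))
    # Incorrect data: invalid raw value (not in the code list).
    for var, codes in VALID_CODES.items():
        value = rec.get(var)
        if value not in (None, "") and value not in codes:
            issues.append(("incorrect", var, f"invalid value {value!r}"))
    # Incorrect data: unexpected raw value (outside the expected range).
    for var, (low, high) in EXPECTED_RANGE.items():
        value = rec.get(var)
        if value is not None and not (low <= value <= high):
            issues.append(("incorrect", var, f"unexpected value {value!r}"))
    # Incorrect data: raw value inconsistent with another variable.
    if rec.get("gender") == "M" and rec.get("pregnant") == "Y":
        issues.append(("incorrect", "pregnant", "inconsistent with gender"))
    return issues


def acceptance_test(records):
    """Map each subject with at least one issue to its list of issues."""
    report = {}
    for rec in records:
        issues = check_record(rec)
        if issues:
            report[rec.get("subject")] = issues
    return report


# Hypothetical data transfer, for illustration only.
transfer = [
    {"subject": "001", "gender": "M", "visit_date": "2015-01-10", "age": 45},
    {"subject": "002", "gender": "X", "visit_date": "", "age": 130},
    {"subject": "003", "gender": "M", "visit_date": "2015-02-01", "age": 30,
     "pregnant": "Y"},
]

for subject, issues in acceptance_test(transfer).items():
    for category, variable, message in issues:
        print(subject, category, variable, message)
```

Keeping the rules in plain tables (`REQUIRED`, `VALID_CODES`, `EXPECTED_RANGE`) means the same check can be rerun unchanged after every transfer, conversion or update, with only the rule tables adjusted per study.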