My boss once asked me to review an OpenCDISC report for SDTM dataset packages. I wasn’t sure what to do at first. With the help of my colleagues, I gradually developed a sense of how to review the report.
I’d like to share the review process and some common findings from the OpenCDISC summary report generated by Pinnacle 21.
OpenCDISC report format: an Excel workbook with Dataset Summary, Issue Summary, Details, and Rules sheets, plus general information such as Configuration, Define.xml, Generated Date, Engine Version, MedDRA Version (23.1 here), and Terminology Version.
Dataset Summary: lists Processed Sources, Domain, Label, Class, Source, number of Records, Errors, Warnings, and Notices.
Issue Summary: lists Source dataset, Rule ID, Message, Severity, Found, and Explanation.
This post focuses on explaining the items whose Severity is Error or Warning.
|Rule ID||Message||Severity||Explanation|
|SD1082||Variable length is too long for actual data||Error||Company X has defined a fixed length for standard variables across domains.|
|SD0013||xSTDTC is after xENDTC||Error||Data issue, subjid=xxx, reported to DM.|
| ||Duplicate records||Warning||xSPID is used to make records unique.|
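Narrowing an Issue Summary to the Error and Warning rows is a simple filter once the sheet has been read into rows. The snippet below is only a minimal sketch in Python: the row dictionaries, the `XX0000` rule ID, the `Found` counts, and the `filter_by_severity` helper are all hypothetical, and the keys simply follow the Issue Summary columns described earlier.

```python
from collections import Counter

# Hypothetical rows mirroring the Issue Summary columns.
# Rule "XX0000" and all Found counts are made up for illustration.
issue_summary = [
    {"Source": "AE", "Rule ID": "SD1082", "Severity": "Error", "Found": 3,
     "Message": "Variable length is too long for actual data"},
    {"Source": "AE", "Rule ID": "SD0013", "Severity": "Error", "Found": 1,
     "Message": "xSTDTC is after xENDTC"},
    {"Source": "CM", "Rule ID": "XX0000", "Severity": "Notice", "Found": 5,
     "Message": "Placeholder notice-level message"},
]


def filter_by_severity(rows, severities=("Error", "Warning")):
    """Return only the rows whose Severity is one of `severities`."""
    wanted = {s.lower() for s in severities}
    return [r for r in rows if r["Severity"].strip().lower() in wanted]


# Keep only what needs an explanation, then count rows per severity.
to_review = filter_by_severity(issue_summary)
counts = Counter(r["Severity"] for r in to_review)

for row in to_review:
    print(row["Source"], row["Rule ID"], row["Severity"], row["Found"])
```

Comparing severities case-insensitively is a defensive choice here, in case an export writes `ERROR` instead of `Error`.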
For records where the start date (xSTDTC) is after the end date (xENDTC), we go back to the raw data to identify the affected subject and event.
We can also cross-check other datasets to see whether similar issues occur more than once.
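The start/end date check and the cross-dataset comparison can be sketched in a few lines of Python. This is an illustration under assumptions, not the validator's own logic: it assumes each domain is available as a list of records keyed by variable name, that dates are ISO 8601 strings, and that each record carries `USUBJID`. The helper names and the sample subjects are hypothetical.

```python
from collections import Counter


def start_after_end(start, end):
    """True when an ISO 8601 start date is later than the end date.

    Missing values are skipped. Partial dates (e.g. "2020-03") are
    compared only at the precision that both values share.
    """
    if not start or not end:
        return False
    n = min(len(start), len(end))
    return start[:n] > end[:n]


def find_date_issues(datasets):
    """Collect (domain, subject, start, end) for every record that fails."""
    issues = []
    for domain, records in datasets.items():
        st_var, en_var = f"{domain}STDTC", f"{domain}ENDTC"
        for rec in records:
            start, end = rec.get(st_var), rec.get(en_var)
            if start_after_end(start, end):
                issues.append((domain, rec["USUBJID"], start, end))
    return issues


def subjects_with_repeats(issues):
    """Subjects that show the same kind of issue more than once."""
    per_subject = Counter(usubjid for _, usubjid, _, _ in issues)
    return {s: n for s, n in per_subject.items() if n > 1}


# Made-up sample data for two domains.
datasets = {
    "AE": [
        {"USUBJID": "S-001", "AESTDTC": "2020-03-15", "AEENDTC": "2020-03-10"},
        {"USUBJID": "S-002", "AESTDTC": "2020-03", "AEENDTC": "2020-03-10"},
    ],
    "CM": [
        {"USUBJID": "S-001", "CMSTDTC": "2020-05-02", "CMENDTC": "2020-04-30"},
        {"USUBJID": "S-003", "CMSTDTC": "2020-01-01", "CMENDTC": ""},
    ],
}

issues = find_date_issues(datasets)
repeats = subjects_with_repeats(issues)
print(issues)
print(repeats)
```

A subject that appears in `repeats` has the problem in more than one record or domain, which is a hint to look at how the dates were collected for that subject rather than at a single typo.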
Details: lists each message with its severity, along with the affected variables, values, and record numbers.
Rules: describes each Rule ID and the category it belongs to (Terminology, Presence, Format, Consistency, Limit).