Background Story:
My boss once asked me to review an Open CDISC report for SDTM dataset packages. I wasn't sure what to do at first. With the help of my colleagues, I gradually developed a sense of how to review the report.
I'd like to share the review process and some common findings from an Open CDISC report generated with Pinnacle 21.
Open CDISC Format: An Excel workbook with Dataset Summary, Issue Summary, Details, and Rules sheets, plus general information such as Configuration, Define.xml, Generated Date, Engine Version, MedDRA Version (e.g., 23.1), and Terminology Version.
Dataset Summary: Including Processed Sources, Domain, Label, Class, Source, number of Records, Errors, Warnings, Notices.
Issue Summary: Including Source dataset, Rule ID, Message, Severity, Found, Explanation.
We focus on explaining the items whose Severity is Error or Warning.
For example,
Rule ID | Message | Severity | Explanation |
---|---|---|---|
SD1082 | Variable length is too long for actual data | Error | Company X has defined fixed lengths for standard variables across domains. |
SD0013 | xSTDTC is after xENDTC | Error | Data issue, subjid=xxx, reported to DM |
SD1117, SD1201 | Duplicate records | Warning | xSPID is used to make records unique |
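
Pulling out the Error and Warning items can be sketched in a few lines of Python. This is only an illustration: the rows below are made-up stand-ins for the Issue Summary sheet, the Found counts are invented, `RULE_X` is a placeholder rather than a real rule ID, and the function name `issues_to_review` is my own.

```python
# Hypothetical rows that mirror the Issue Summary columns (values are made up).
issue_summary = [
    {"Source": "AE", "Rule ID": "SD0013", "Message": "xSTDTC is after xENDTC",
     "Severity": "Error", "Found": 1},
    {"Source": "CM", "Rule ID": "SD1117", "Message": "Duplicate records",
     "Severity": "Warning", "Found": 2},
    {"Source": "DM", "Rule ID": "SD1082",
     "Message": "Variable length is too long for actual data",
     "Severity": "Error", "Found": 3},
    {"Source": "VS", "Rule ID": "RULE_X", "Message": "Placeholder notice",
     "Severity": "Notice", "Found": 5},
]


def issues_to_review(rows, severities=("Error", "Warning")):
    """Keep rows with the given severities, Errors first, most frequent first."""
    rank = {sev: i for i, sev in enumerate(severities)}
    kept = [row for row in rows if row["Severity"] in rank]
    return sorted(kept, key=lambda row: (rank[row["Severity"]], -row["Found"]))


for row in issues_to_review(issue_summary):
    print(row["Severity"], row["Source"], row["Rule ID"], row["Message"], sep=" | ")
```

Each remaining row then needs an explanation like the ones in the table above.
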
For records where the start date (xSTDTC) is after the end date (xENDTC), we check the raw data to identify the affected subject and event.
We can also cross-check with other datasets to see if similar issues happen multiple times.
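
A minimal sketch of that date check, assuming the dates are ISO 8601 strings and each dataset is loaded as a list of records. The sample AE and CM records, the subject IDs, and the helper name `start_after_end` are all hypothetical. Partial dates are compared only up to the shorter precision, which is one possible convention, not the validator's rule.

```python
from collections import Counter


def start_after_end(records, start_var, end_var):
    """Return records whose start date is later than their end date.

    Records with a missing date are skipped. Dates of different precision
    (e.g. "2021-04-01" vs "2021-04") are compared on the shared prefix only.
    """
    flagged = []
    for rec in records:
        start, end = rec.get(start_var, ""), rec.get(end_var, "")
        if not start or not end:
            continue
        n = min(len(start), len(end))
        if start[:n] > end[:n]:  # ISO 8601 strings sort chronologically
            flagged.append(rec)
    return flagged


# Hypothetical sample data for two domains.
ae = [
    {"USUBJID": "S-001", "AESEQ": 1, "AESTDTC": "2021-03-10", "AEENDTC": "2021-03-05"},
    {"USUBJID": "S-001", "AESEQ": 2, "AESTDTC": "2021-04-01", "AEENDTC": "2021-04"},
    {"USUBJID": "S-002", "AESEQ": 1, "AESTDTC": "2021-05-02", "AEENDTC": ""},
]
cm = [
    {"USUBJID": "S-001", "CMSEQ": 1, "CMSTDTC": "2021-03-12", "CMENDTC": "2021-03-01"},
]

# Cross-check: count flagged records per subject across both datasets.
per_subject = Counter()
for rows, start_var, end_var in [(ae, "AESTDTC", "AEENDTC"), (cm, "CMSTDTC", "CMENDTC")]:
    for rec in start_after_end(rows, start_var, end_var):
        per_subject[rec["USUBJID"]] += 1

print(dict(per_subject))
```

A subject who shows up in more than one dataset is a good candidate for a single consolidated query to DM.
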
Details: Lists each message with its severity, the variable values involved, and the record numbers.
Rules: Describes each Rule ID and the Category it belongs to (Terminology, Presence, Format, Consistency, Limit).
Happy Studying!