Different types of data in healthcare settings
Depending on how the data are used and what kind of data they are, each data set has properties that affect how it can be queried in a valorisation project. The main distinctions are between primary and secondary data use, and between structured and unstructured data.
Primary and secondary use of data
Primary use means hospitals use data for the purpose they were gathered for: blood pressure, heart rate and similar measurements serve to assess the patient’s condition with a view to treatment. When the same data are used for a purpose they weren’t originally collected for, such as a retrospective study, this is called secondary use. As discussed in this article on the regulations around RWD valorisation projects, whether the use is primary or secondary affects the data governance and the regulations that apply.
Structured and unstructured data
Structured data are well organised: they have been entered into specific fields, often with restrictions, such as a date field that only accepts the format dd/mm/yyyy. These restrictions protect data integrity, i.e., the accuracy and consistency of data in a database. They ensure, for example, that telephone numbers don’t end up in blood pressure fields.
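As a rough illustration of such field restrictions, the sketch below checks one record before it would be stored. The field names (measured_on, systolic_mmhg) and the accepted range of 40 to 300 mmHg are assumptions made for this example, not taken from any particular hospital system.

```python
from datetime import datetime


def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []

    # The date must parse as day/month/four-digit year and be a real calendar date.
    try:
        datetime.strptime(record["measured_on"], "%d/%m/%Y")
    except (KeyError, TypeError, ValueError):
        errors.append("measured_on must be a valid date in dd/mm/yyyy format")

    # The systolic value must be an integer within the assumed plausible range.
    systolic = record.get("systolic_mmhg")
    if not isinstance(systolic, int) or not 40 <= systolic <= 300:
        errors.append("systolic_mmhg must be an integer between 40 and 300")

    return errors


# A telephone number typed into the blood pressure field is rejected.
print(validate_record({"measured_on": "31/12/2023", "systolic_mmhg": "0471234567"}))
```

In a real system these rules would more likely live in the database schema or the data-entry form, but the principle is the same: invalid values are refused at the point of entry.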
Unstructured data are also kept in a database, but typically in free-text fields, such as progress notes written in narrative form. These fields can be queried, but the results are affected by misspellings, typographical errors, and the different synonyms and abbreviations that different authors use.
For example, a query that searches a text field for diabetic patients returns a list of presumed diabetic patients, but you can never be sure the list is correct, as patients ‘with diabetic symptoms’ might also be included.
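The diabetes example can be made concrete with a small sketch. The patient identifiers and progress notes below are invented for illustration, and the keyword pattern is just one plausible way such a query might be written.

```python
import re

# Invented progress notes, keyed by a hypothetical patient identifier.
notes = {
    "P001": "Known type 2 diabetes, on metformin.",
    "P002": "Presents with diabetic symptoms; diagnosis not confirmed.",
    "P003": "No history of diabetes.",
    "P004": "T2DM, well controlled.",
    "P005": "Long-standing diabtes mellitus.",
}

# Naive keyword search: any note containing "diabet" counts as a hit.
pattern = re.compile(r"diabet", re.IGNORECASE)
hits = sorted(pid for pid, text in notes.items() if pattern.search(text))

print(hits)  # ['P001', 'P002', 'P003']
```

Only P001 is a clear match. P002 and P003 are listed although the diagnosis is unconfirmed or explicitly absent, while P004 (abbreviation) and P005 (misspelling) are missed entirely.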
To reduce this ambiguity, standardised coding systems such as the ICD-10-CM classification and the SNOMED CT clinical terminology were created.
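With coded diagnoses, the same question becomes a lookup on codes rather than a text search. The sketch below assumes each patient record carries a list of ICD-10-CM codes; the patient identifiers and code assignments are invented for illustration.

```python
# Invented coded diagnoses per hypothetical patient (ICD-10-CM codes).
diagnoses = {
    "P001": ["E11.9", "I10"],  # type 2 diabetes; essential hypertension
    "P002": ["R73.03"],        # prediabetes, not diabetes mellitus
    "P003": ["I10"],           # essential hypertension only
    "P004": ["E10.9"],         # type 1 diabetes
}

# ICD-10-CM groups diabetes mellitus under categories E08 to E13.
DIABETES_CATEGORIES = ("E08", "E09", "E10", "E11", "E13")


def has_diabetes_code(codes):
    """True if any code falls in one of the diabetes mellitus categories."""
    return any(code.startswith(DIABETES_CATEGORIES) for code in codes)


diabetic = sorted(pid for pid, codes in diagnoses.items() if has_diabetes_code(codes))

print(diabetic)  # ['P001', 'P004']
```

The result no longer depends on spelling or wording, although it is still only as reliable as the coding that was entered.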