Page History
Contributors: Igor Engel, Sebastian Mate, Jan Christoph, Hans-Ulrich Prokosch, Thomas Ganslandt
Note |
---|
TODO TG: content |
Background
The §21 dataset is a German benchmarking dataset mandated by the law on hospital reimbursement (Paragraph 21: "Übermittlung und Nutzung von Daten - des Gesetzes über die Entgelte für voll- und teilstationäre Krankenhausleistungen" (Krankenhausentgeltgesetz - KHEntgG)). It is primarily used to derive the German inpatient billing system based on Diagnosis Related Groups (DRGs) and has to be provided by every German hospital on a yearly basis. It is thus supported by all hospital patient administration and billing systems established in Germany. Due to its standardization and high availability, the §21 dataset has been established as a source dataset for benchmarking projects and other regional and national collaborations. It is thus also an ideal candidate for a "quick win" i2b2 implementation, as its format is well-structured and documented and it can be produced easily by any German hospital, rapidly providing a functional prototype with local data.
Note |
---|
It should be noted, however, that the §21 dataset is primarily reimbursemenetreimbursement-oriented, which may limit its usefulness in the research area. |
The §21 dataset is based on a standard delimited ("CSV" = comma separated values) format with the following files and contents of interest, among others (full specification in German language)
...
Since the §21 files are linked together only by an encounter number, the patient number has to be added in order to achieve full compatibility with the IDRT CSV import module. This addition of patient numbers IDs is carried out in a an integrated pre-processing step before the actual import.
After the initial import, the §21 dataset would be represented in i2b2 in a very low-level format still close to the source CSV files: all coded items would just be displayed with their codes (and not the related names), and items that relate to large, hierarchical terminologies (e.g. ICD diagnosis classification) would be presented as very long lists. In a final post-processing step, the relevant standard terminologies (in this case ICDICD10, OPS and G-DRG) as well as §21-specific codelists (e.g. for admission type, clinical service) are imported and merged with the dataset to provide a well-structured and human-readable hierarchical representation (see figure below).
Raw import | Import after terminology integration |
---|---|
Importing §21 files with the IDRT Import & Mapping Tool
The §21 import functions are fully integrated into the IDRT Import and Mapping Tool (IMT), which provides the following dialogue to configure an import job:
Name | Description |
---|---|
Truncate i2b2 Project? | If you check this, the project will be truncated before the data is uploaded. |
Truncate Previous Queries? | This will truncate the previous queries tables. |
Set patient_count after import | This will fill the c_totalnum column. |
P21 Folder | Here you can select the folder that contains all P21 files you want to upload. |
P21 Version | Here you can select the corresponding version for your P21 dataset. |
Date Pattern | Select the date pattern used in the P21 Files. |
Save Settings? | This will save the settings for the next import. Note: The truncate checkboxes are never saved. |
When you click finish, the upload starts. You can observe the progress in the progress bar of the Import Browser.
Using TOS ETL Jobs for §21 import
The underlying §21 import jobs are provided as part of the IDRT Talend Open Studio distribution. They consist of 3 separate master jobs for the pre-processing of the raw data files, the actual loading and the post-processing of the imported data. As there exist several versions of the §21 dataset (depending on the year of patient discharge), the correct version of the import configuration files need to be copied into the directory containing the raw data files.
Using the TOS jobs instead of the IMT frontend enables the creation of automated ETL scripts (e.g. for nightly updates) or combining several import steps into complex ETL pathways.