Keyword Mapping¶
Keyword mapping is the process of extracting individual keyword values from report data and assigning them to standard data model fields. The configuration for keyword mapping contains the following data fields:
Table type for which the keyword is applicable.
Keyword text to find in the table data.
Data model field that the keyword value should be mapped to.
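As an illustration only, a keyword mapping entry could be represented as a small record with one attribute per configured field. The attribute names `table_type`, `keyword`, and `model_field`, the helper `mappings_for`, and the example values are hypothetical and are not taken from the actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordMapping:
    # Hypothetical attribute names; the real configuration may differ.
    table_type: str   # table type for which the keyword is applicable
    keyword: str      # keyword text to find in the table data
    model_field: str  # data model field that receives the keyword value


def mappings_for(table_type, mappings):
    """Return only the mappings that apply to the given table type."""
    return [m for m in mappings if m.table_type == table_type]


# Illustrative values only.
mappings = [
    KeywordMapping("header", "Sample Pressure", "samplePressure"),
    KeywordMapping("header", "Cylinder No", "sampleContainerID"),
    KeywordMapping("composition", "Total", "total"),
]

header_mappings = mappings_for("header", mappings)
```

Filtering by table type first keeps the per-row search limited to the keywords that can actually apply to the table being processed.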
For each configured keyword:
The process looks for the keyword text in each row of the raw data table.
Note
This search is done in table_data_raw, as table_data_edited will have been trimmed by this point in the process.
The keyword may span several cells, so the process iterates over the cells in each row, joining consecutive cells while checking for the keyword text.
If a keyword is found, the next table cell to the right is assumed to contain the value for the keyword.
If the value is numeric, the neighbouring cell to the right (if any) is assumed to be a potential unit of measure.
Header mapping fields are updated with the extracted values and UOMs.
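The steps above can be sketched roughly as follows. This is a minimal sketch rather than the app's implementation: it assumes the raw table is a list of rows, each a list of cell strings, that matching is case-insensitive and whitespace-insensitive, and that "numeric" means a plain integer or decimal. The function names `find_keyword` and `extract_keyword` are hypothetical.

```python
import re


def _norm(text):
    """Collapse whitespace and lowercase, so 'Sample  Pressure' matches."""
    return " ".join(str(text).split()).lower()


def _is_numeric(text):
    """Assumption: a value is numeric if it is a plain integer or decimal."""
    return re.fullmatch(r"[-+]?\d+(\.\d+)?", text.strip()) is not None


def find_keyword(row, keyword):
    """Return the index of the cell just after the keyword, or None.

    Consecutive cells are joined so that a keyword spanning several
    cells is still found.
    """
    target = _norm(keyword)
    for start in range(len(row)):
        joined = ""
        for end in range(start, len(row)):
            joined = _norm(joined + " " + str(row[end]))
            if joined == target:
                return end + 1
            if not target.startswith(joined):
                break  # this run of cells can no longer match
    return None


def extract_keyword(table, keyword):
    """Return (value, uom) for the first row containing the keyword."""
    for row in table:
        value_idx = find_keyword(row, keyword)
        if value_idx is None or value_idx >= len(row):
            continue
        value = row[value_idx].strip()
        uom = None
        # Only a numeric value is assumed to be followed by a unit of measure.
        if _is_numeric(value) and value_idx + 1 < len(row):
            uom = row[value_idx + 1].strip() or None
        return value, uom
    return None


table_data_raw = [
    ["Report", "A-1"],
    ["Sample", "Pressure", "1500", "psia"],
    ["Cylinder", "No", "C-17"],
]
pressure = extract_keyword(table_data_raw, "Sample Pressure")
cylinder = extract_keyword(table_data_raw, "Cylinder No")
```

In this sketch the first matching row wins; how the real process handles a keyword that appears in several rows is not specified here.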
Note
The keyword matching logic could reuse the same logic as the template matching process, as it is essentially the same operation of searching for a smaller set of strings within a larger set of strings, and finding the matching indices.
Warning
This process could be more robust if the data type of the extracted value were validated against the data type of the data model field it is being mapped to.
If the data model field is sampleContainerID, the process looks up that container ID in the set of previously identified samples to find the corresponding sample ID. This is a common scenario where only the container is referenced on the page.
Note
The inverse could be supported as well, inferring the container ID from the sample ID.
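The container lookup, and the inverse suggested in the note, could look something like the sketch below. It assumes the previously identified samples are available as a list of dictionaries keyed by `sampleID` and `sampleContainerID`; that shape, the `sampleID` key, and the function names are assumptions made for illustration.

```python
def sample_id_for_container(container_id, samples):
    """Find the sample ID recorded for a given container ID, if any."""
    key = container_id.strip()
    for sample in samples:
        if (sample.get("sampleContainerID") or "").strip() == key:
            return sample.get("sampleID")
    return None


def container_id_for_sample(sample_id, samples):
    """Inverse lookup: infer the container ID from a sample ID."""
    key = sample_id.strip()
    for sample in samples:
        if (sample.get("sampleID") or "").strip() == key:
            return sample.get("sampleContainerID")
    return None


# Hypothetical samples identified earlier in the process.
samples = [
    {"sampleID": "S-001", "sampleContainerID": "C-17"},
    {"sampleID": "S-002", "sampleContainerID": "C-18"},
]

found_sample = sample_id_for_container(" C-17 ", samples)
found_container = container_id_for_sample("S-002", samples)
```

Both lookups return `None` when nothing matches, which leaves the caller free to decide whether an unknown container should be reported or ignored.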
Keyword Mapping Configuration¶
Keywords can be configured in the Manage Configuration page, or from the Map Header view.
In the Map Header view shown above, the Map Keywords button has been clicked, which opens the keyword mapping configuration dialog:
Each row in the table is displayed as a series of pills, one for each column except the final one, which is assumed to be the value for a keyword.
The user selects one or more consecutive pills to define the keyword, and the app will display the resulting value.
The user can then select the field in the standard model that the keyword should be mapped to. The table type defaults to that of the current table and cannot be changed.
Any previously configured keywords are preselected in the table.
The user can select Find Headers to run the extraction and view the results.
Keyword extraction is run automatically in the prediction process.
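The pill selection in the dialog can be pictured with a short sketch. It assumes a row is a list of cell strings, that the selection is given as a first and last pill index, and that the displayed value is the cell immediately to the right of the selection. The function name `select_keyword` is hypothetical.

```python
def select_keyword(row, first, last):
    """Build the keyword from consecutive pills and return (keyword, value).

    Every column except the final one is selectable as a pill, so the
    cell to the right of any valid selection always exists.
    """
    pills = row[:-1]
    if not (0 <= first <= last < len(pills)):
        raise ValueError("selection must cover one or more consecutive pills")
    keyword = " ".join(cell.strip() for cell in pills[first:last + 1])
    value = row[last + 1].strip()
    return keyword, value


row = ["Sample", "Pressure", "1500", "psia"]
selection = select_keyword(row, 0, 1)
```

Excluding the final column from the pills guarantees that a selection can never leave the keyword without a candidate value.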
Note
Report pages often contain data that belongs to other table types (typically report information such as field, well, and document ID). The app could let the user define keywords that update other parts of the data model, not just the current table.