Advanced Metadata Import
The Metadata Table Editor can be used to import metadata from an external file. It involves more steps than the basic import tool, but is more flexible and has some basic error checking associated with data types. General features of this importer are:
- Can import from Excel (.xlsx/.xls) or text files with a common delimiter can be used.
- The structure of the metadata table (the columns, their type, and the key column) must be set up before the metadata (contents) are imported.
- It is generally recommended that one column be designated as the key column. Entries in that column must have unique entries.
- The default data type for columns on creation is text, but this can be altered before import commences. When importing the metadata, an error will result if entries are found that do not match the expected data type.
- Association with metadata is done by matching data element names with the entries in the first column of the spreadsheet. Name matching can be based on exact or partial matches.
- Association of data with metadata is done as a separate step from import, providing flexibility. For example, if information in more than one column together uniquely identifies a sample, but the information within a single given column does not uniquely do so.
- Association of data with metadata can be done row by row if key column entries and the names of the relevant data elements are not related.
The Metadata Table Editor can also be used to create and populate a metadata table directly if desired.
Importing metadata using the Metadata Table Editor requires that the table structure is defined first. Information about how to do this is below. However, if the information about the data is in an excel file and the entries in the first column are unique, then the Import Metadata tool described in section 3.2.1 can be used to define the table and import the metadata in one step. That populated metadata table can then be opened in the Metadata Table Editor, where it can be refined, for example setting a new key column (or choosing not to have one), setting data types on columns, associating data as desired, and so on.
To set up a or alter a table structure using the Metadata Table Editor:
- Go to:
File | New | Metadata Table ()
This opens a new metadata table with no columns and no rows.
- Click on the button labeled Setup Table at the bottom of the view. A window appears like that shown in figure 3.12. The table structure will be defined using this tool.
Figure 3.12: Dialog used to add columns to an empty Metadata Table. - Click on the button labeled Setup Structure from File.
A window will appear as shown in figure 3.13.
Figure 3.13: Creating a metadata table structure based on an external file.You need to provide the following information:
- Filename The Excel or delimited text file to import. Column names should be in the first row of this file.
- Encoding For text files only: the encoding used to create the file. The default is UTF-8.
- Separator For text files only: The character used to separate the columns. The default is semicolon (;).
For each column in the external file, a column will be created in the new metadata table. By default the type of these imported columns is "Text". You will see a reminder to set the column type for each column and to designate one of the columns as the key column.
You may modify the following information for each column:
- Name. A mandatory header name or title for the column.
- Description. An optional description of the information that will be held in the column. The description will appear as a tool tip, visible when you hover the mouse cursor over the column name in the metadata table.
- Key column. Put a check in the box in the one column that will be the "key" column. All rows in this column must be populated and all entries in this column must be unique.
- Type. The type of value allowed. The available types are:
- Text Simple text.
- Whole number Integer values, like 42 or -7.
- Decimal number Decimal values, like 3.14 or 1.72e13.
- Yes / No Yes/No or True/False values are accepted. Capitalization is not necessary.
- Date Local dates such as 2015-04-23 for April 23rd, 2015.
- Date and time Local date and time such as 2015-04-23 13:37 for 1:37pm on April 23rd, 2015. Note the use of 24-hour clock and that no time zone information is present.
- Navigate between the columns using the () Prev and () Next buttons, or by using
left/right arrow keys with Alt key held down.
Modifications made to a particular column take effect as you navigate to another column, or if you close the dialog using Done.
The () and () buttons are used undo and redo changes respectively.
- When the columns have been configured, click on the button labeled Done.
The table structure can also be defined manually by clicking on the () button and defining each column in turn.
Columns may be deleted using the () button. After metadata has been imported, additional columns can be added to the table structure. This can be done by importing the altered structure from an external file, where any columns not already in the metadata table will be added. Alternatively, individual columns can be added using the () and () buttons, which insert new columns before and after the current column respectively.
The metadata table can then be populated by importing information from an external file. The column names in the metadata table in the Workbench will be matched with those in the external file to determine which values go into which cell. Only cell values in columns with an exact name match will be imported. If the file used contains columns not in the metadata table, the values in those columns will be ignored. Conversely, if the metadata table contains columns not present in the file, imported rows will have no values for those columns.
- Click on the button labeled Manage Data button at the bottom of the view. A window appears like that shown in in figure 3.14.
Figure 3.14: Tool for managing the metadata itself. Notice the button labeled Import Rows from File.When working with an existing metadata table and adding extra rows, it is generally recommended that a key column be designated first. If a key column is not present, then all rows in the file will be imported. With no key column designated, if any rows from that file were imported into the same metadata table earlier, a duplicate row will be created. With a key column, rows with a new, unique entry for that column are added to the table and existing rows with a key entry in the file will be updated, incorporating any changes present in the file. Duplicate rows will not be created.
- Click on the button labeled Import Rows from File and select the external file of metadata. This brings up the window shown in figure 3.15.
Figure 3.15: Tool to import rows into a Metadata Table.The options presented in that window are:
- File. The file containing the metadata to import. This can be Excel (.xlsx/.xls) format or a delimited text file.
- Encoding. For text files only: The text encoding of the seledcted file. Specifying the correct encoding is important to ensure that the file is correctly interpreted.
- Separator. For text files only: the character used to separate columns in the file.
- Locale. For text files only: the locale used to format numbers and dates within the file.
- Date format. For text files only: the date format used in the imported file.
- Date-time format. For text files only: the date-time format used in the imported file.
The date and date-time templates uses the Java patterns for date and time formatting.
Meaning of some of the symbols:
Symbol Meaning Example y Year 2004; 04 d Day 10 M/L Month 7; 07; Jul; July; J a am-pm PM h Hour (0-12 am pm) 12 H Hour (0-23) 0 m Minute 30 s Second 55 Examples of using this:
Format Meaning Example dd-MM-yy Short date 31-12-15 yyyy-MM-dd HH:mm Date and Time 2015-11-23 23:35 yyyy-MM-dd'T'HH:mm ISO 8601 (standard) format 2015-11-23T23:35 With a short year format (YY), 2000 will be added when imported as, or converted to, Date or Date and time format. Thus, when working with dates before the year 2000 or after 2099, please use a four digit format for the year (YYYY).
- Click the button labeled Finish button when the necessary fields have been filled in.
Row information can also be added manually by clicking on the () button and typing in the information for each column.
The progress and status of the row import can be seen in the Processes tab of the Toolbox. Any errors resulting from an import that failed can be reviewed here. The most frequent errors are associated with selecting the wrong separator or encoding, or wrong date/time formats when importing rows from delimited text files.
Once the rows are imported, The metadata table can be saved.