Metadata
Warning
The columns and values in the following examples are purely fictional, and may be inconsistent.
In the following examples we are looking at a project with the input sections foo, bar and baz.
The sections and their parameters are as follows

| section | parameters |
|---|---|
| foo | foo, bar, baz |
| bar | foo, qux, quux |
| baz | bar, quuz, corge |
In addition, the following sections are always present in all projects:
run: Contains information about the run itself
file_modification: When the underlying files were last changed
split: How the mesh was split
parameters: A link table to the parameters listed above
system_info: Information about the system the run was executed on
This gives the following entity relationship diagram
The metadata of a project is stored in its own schema.
Note
sqlite does not implement schemas the way other databases do.
Hence, each project has its own database file.
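The link-table layout described above can be sketched with Python's built-in sqlite3 module. The table definitions below are a simplified assumption made for illustration (two parameter tables with one column each) and are not the schema that bout_runners actually creates. An in-memory database stands in for a project's database file.

```python
import sqlite3

# In-memory database standing in for one project's database file
connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE foo (id INTEGER PRIMARY KEY, foobar REAL);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, qux REAL);
    -- Link table: one row per combination of section rows
    CREATE TABLE parameters (
        id INTEGER PRIMARY KEY,
        foo_id INTEGER REFERENCES foo (id),
        bar_id INTEGER REFERENCES bar (id)
    );
    CREATE TABLE run (
        id INTEGER PRIMARY KEY,
        name TEXT,
        latest_status TEXT,
        parameters_id INTEGER REFERENCES parameters (id)
    );
    INSERT INTO foo VALUES (1, 0.1), (2, 0.2);
    INSERT INTO bar VALUES (1, 0.2), (2, 0.24);
    INSERT INTO parameters VALUES (1, 1, 1), (2, 2, 2);
    INSERT INTO run VALUES
        (1, 'testdata_1', 'complete', 1),
        (2, 'testdata_2', 'complete', 2);
    """
)

# Follow run -> parameters -> foo/bar to get one flat row per run,
# with columns named "<table>.<column>"
query = """
    SELECT run.name AS "run.name",
           foo.foobar AS "foo.foobar",
           bar.qux AS "bar.qux"
    FROM run
    JOIN parameters ON run.parameters_id = parameters.id
    JOIN foo ON parameters.foo_id = foo.id
    JOIN bar ON parameters.bar_id = bar.id
    ORDER BY run.id
"""
rows = connection.execute(query).fetchall()
connection.close()
```

Each run stores only a single parameters_id, and the link table resolves that key to one row in every parameter table, so runs that share the same parameter values can share those rows.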
Extracting the metadata
To extract the entire content of the database above we will use the MetadataReader class.
from bout_runners.metadata.metadata_reader import MetadataReader
metadata_reader = MetadataReader()
metadata = metadata_reader.get_all_metadata()
The metadata variable above is a DataFrame with the following content
| | run.id | run.file_modification_id | run.latest_status | run.name | run.parameters_id | run.split_id | run.start_time | run.stop_time | run.submitted_time | run.system_info_id | bar.id | bar.foo | bar.quux | bar.qux | baz.id | baz.bar | baz.corge | baz.quuz | file_modification.id | file_modification.bout_git_sha | file_modification.bout_lib_modified | file_modification.project_executable_modified | file_modification.project_git_sha | file_modification.project_makefile_modified | foo.id | foo.bar | foo.foo | foo.foobar | parameters.id | parameters.bar_id | parameters.baz_id | parameters.foo_id | split.id | split.number_of_nodes | split.number_of_processors | split.processors_per_node | system_info.id | system_info.machine | system_info.node | system_info.processor | system_info.release | system_info.system | system_info.version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | complete | testdata_1 | 1 | 1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 13:57:16.997700 | 1 | 1 | 1 | a | 0.2 | 1 | 0.2 | 1 | b | 1 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 1 | 0 | 0 | 0.1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 1 | 2 | 1 | complete | testdata_2 | 2 | 1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 1 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 1 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 2 | 1 | 1 | 0.2 | 2 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 2 | 3 | 1 | complete | testdata_3 | 3 | 1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 16:57:16.997700 | 1 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 1 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 1 | 0 | 0 | 0.1 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 3 | 4 | 2 | complete | testdata_4 | 2 | 1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 1 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 2 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 2 | 1 | 1 | 0.2 | 2 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 4 | 5 | 1 | error | testdata_1_force | 1 | 1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 1 | 1 | 1 | a | 0.2 | 1 | 0.2 | 1 | b | 1 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 1 | 0 | 0 | 0.1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 5 | 6 | 2 | running | testdata_5 | 2 | 2 | 2013-01-01 13:57:16.172488 | | 2029-04-20 14:57:16.997700 | 2 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 2 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 2 | 1 | 1 | 0.2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | x86_64 | MrTrollMan | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 6 | 7 | 2 | submitted | testdata_6 | 2 | 2 | | | 2030-04-20 14:57:16.997700 | 2 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 2 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 2 | 1 | 1 | 0.2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | x86_64 | MrTrollMan | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
Many of the columns (<name>_id) are simply keys.
If we only want to keep the run.id column, we can change drop_id, which is initially set through the constructor argument of the same name
metadata_reader.drop_id = "keep_run_id"
metadata = metadata_reader.get_all_metadata()
metadata now contains
| | run.id | run.latest_status | run.name | run.start_time | run.stop_time | run.submitted_time | bar.foo | bar.quux | bar.qux | baz.bar | baz.corge | baz.quuz | file_modification.bout_git_sha | file_modification.bout_lib_modified | file_modification.project_executable_modified | file_modification.project_git_sha | file_modification.project_makefile_modified | foo.bar | foo.foo | foo.foobar | split.number_of_nodes | split.number_of_processors | split.processors_per_node | system_info.machine | system_info.node | system_info.processor | system_info.release | system_info.system | system_info.version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | complete | testdata_1 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 13:57:16.997700 | 1 | a | 0.2 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 0 | 0 | 0.1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 1 | 2 | complete | testdata_2 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 10 | a | 0.24 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 1 | 1 | 0.2 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 2 | 3 | complete | testdata_3 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 16:57:16.997700 | 10 | a | 0.24 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 0 | 0 | 0.1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 3 | 4 | complete | testdata_4 | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 10 | a | 0.24 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 1 | 1 | 0.2 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 4 | 5 | error | testdata_1_force | 2011-01-01 13:57:16.172488 | 2012-01-01 13:57:16.172488 | 2020-04-20 14:57:16.997700 | 1 | a | 0.2 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2039-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 1900-01-01 10:50:58.000000 | 0 | 0 | 0.1 | 1 | 1 | 1 | x86_64 | f71196cc64ed | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 5 | 6 | running | testdata_5 | 2013-01-01 13:57:16.172488 | | 2029-04-20 14:57:16.997700 | 10 | a | 0.24 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 1 | 1 | 0.2 | 2 | 2 | 2 | x86_64 | MrTrollMan | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
| 6 | 7 | submitted | testdata_6 | | | 2030-04-20 14:57:16.997700 | 10 | a | 0.24 | 0.2 | 1 | b | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2019-12-03 11:05:29.000000 | 2001-01-01 13:57:16.172488 | 7272a99cb05c9f6105b774d4c5cf1061796c4f7d | 2000-01-01 10:50:58.000000 | 1 | 1 | 0.2 | 2 | 2 | 2 | x86_64 | MrTrollMan | | 4.19.76-linuxkit | Linux | #1 SMP Thu Oct 17 19:31:58 UTC 2019 |
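Comparing the two tables suggests which columns count as keys: every column named id or ending in _id is dropped, except run.id. The following is a minimal sketch of that selection rule on plain column names. The helper select_columns is hypothetical and is not part of bout_runners; it only mirrors the difference between the two tables above and covers only the two drop_id values shown on this page.

```python
from typing import List, Optional


def select_columns(columns: List[str], drop_id: Optional[str]) -> List[str]:
    """Return the columns kept for a given drop_id (hypothetical helper)."""
    if drop_id is None:
        # Nothing is dropped
        return list(columns)
    if drop_id == "keep_run_id":
        # Drop "<table>.id" and "<table>.<name>_id" columns, but keep run.id
        return [
            column
            for column in columns
            if column == "run.id"
            or not (column.endswith(".id") or column.endswith("_id"))
        ]
    raise ValueError(f"Unsupported drop_id in this sketch: {drop_id!r}")


all_columns = [
    "run.id",
    "run.name",
    "run.split_id",
    "split.id",
    "split.number_of_nodes",
    "parameters.id",
    "parameters.foo_id",
]
kept_columns = select_columns(all_columns, "keep_run_id")
```

Note that every column of the parameters link table is a key, which is why that table disappears entirely from the second table above.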
Or, we may simply want only the parameters
metadata_reader.drop_id = None
metadata = metadata_reader.get_parameters_metadata()
| | bar.id | bar.foo | bar.quux | bar.qux | baz.id | baz.bar | baz.corge | baz.quuz | foo.id | foo.bar | foo.foo | foo.foobar | parameters.id | parameters.bar_id | parameters.baz_id | parameters.foo_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | a | 0.2 | 1 | 0.2 | 1 | b | 1 | 0 | 0 | 0.1 | 1 | 1 | 1 | 1 |
| 1 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 2 | 1 | 1 | 0.2 | 2 | 2 | 1 | 2 |
| 2 | 2 | 10 | a | 0.24 | 1 | 0.2 | 1 | b | 1 | 0 | 0 | 0.1 | 3 | 2 | 1 | 1 |
Updating the database
To update latest_status, start_time and stop_time, the StatusChecker class can be used.
To do a one-time update we can run
from bout_runners.metadata.status_checker import StatusChecker
db_connector = runner.db_connector # Assuming runner is a BoutRunner object
status_checker = StatusChecker()
status_checker.check_and_update_status()
and for a loop which runs until all statuses are complete
status_checker.check_and_update_until_complete()
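A loop of this kind is essentially a polling loop. The sketch below shows the general pattern only and does not reproduce the StatusChecker implementation: poll_until_finished, get_statuses and check_interval are hypothetical names, and treating "complete" and "error" as the finished statuses is an assumption based on the example tables above.

```python
import time
from typing import Callable, Dict

# Assumption: a run with one of these statuses will not change any more
FINISHED_STATUSES = {"complete", "error"}


def poll_until_finished(
    get_statuses: Callable[[], Dict[str, str]],
    check_interval: float = 5.0,
) -> Dict[str, str]:
    """Call get_statuses repeatedly until every run has finished."""
    while True:
        statuses = get_statuses()
        if all(status in FINISHED_STATUSES for status in statuses.values()):
            return statuses
        # Wait before asking again to avoid a busy loop
        time.sleep(check_interval)


# Hypothetical stand-in for a status check: each call yields the next snapshot
snapshots = iter(
    [
        {"testdata_5": "running", "testdata_6": "submitted"},
        {"testdata_5": "complete", "testdata_6": "running"},
        {"testdata_5": "complete", "testdata_6": "complete"},
    ]
)
final_statuses = poll_until_finished(lambda: next(snapshots), check_interval=0.0)
```

In practice a loop like this blocks the calling process until the last run has finished, so it may be preferable to run it in a separate process or session.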