Data and models are meaningless without good metadata. Metadata should be independently understandable, machine actionable, discoverable, dynamic, and interactive. bllFlow uses consistent labels and metadata throughout the workflow. We explain why in our accompanying [document]. tl;dr: Nature Videos


bllFlow-variables is the data file that describes the labels and metadata used in bllFlow. Add to your own bllFlow-variables file or make a pull request.

Labels and metadata are used in five key areas of research reporting:

  1. Data cleaning and variable transformation – such as applying truncation rules or centering.
  2. Aggregated results – such as Table 1 - Characteristics of study population.
  3. Model description – model coefficients.
  4. Summary statistics – metrics of model performance.
  5. Validation data – example dataset to verify algorithm scoring (for predictive models).
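
As a rough illustration of the first two uses, the sketch below reads a small variables sheet and uses it both to clean values (truncate, then centre) and to produce a display label for a results table. It is a minimal sketch only: the column names (`variable`, `label`, `units`, `min`, `max`, `centre`), the sample values, and the helper functions are hypothetical and are not the actual bllFlow-variables schema or bllFlow API.

```python
import csv
import io

# Hypothetical variables sheet. Column names and values are illustrative
# assumptions, not the real bllFlow-variables layout.
VARIABLES_CSV = """variable,label,units,min,max,centre
age,Age,years,20,102,50
bmi,Body mass index,kg/m2,15,60,25
"""


def load_variables(text):
    """Return a dict mapping each variable name to its metadata row."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["variable"]: row for row in reader}


def clean_value(value, meta):
    """Truncate value to [min, max], then subtract the centring constant."""
    lower, upper = float(meta["min"]), float(meta["max"])
    truncated = min(max(value, lower), upper)
    return truncated - float(meta["centre"])


def display_label(meta):
    """Build a human-readable label, e.g. for a row of Table 1."""
    return f'{meta["label"]} ({meta["units"]})'


variables = load_variables(VARIABLES_CSV)

# Two made-up records; the first age and the second BMI fall outside the limits.
records = [{"age": 110, "bmi": 22.5}, {"age": 35, "bmi": 12.0}]
cleaned = [
    {name: clean_value(value, variables[name]) for name, value in record.items()}
    for record in records
]

for name, meta in variables.items():
    print(display_label(meta), [row[name] for row in cleaned])
```

Because the truncation limits, centring constants, and labels all come from the same sheet, the cleaning step and the reported table cannot drift apart.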