
The first step in any metric management system is to gather the appropriate measurements during a training run. DKube makes it fast and simple to collect and store metrics.
Adding a few MLflow-compatible API calls stores the metric information as part of the training run and model metadata. This information is saved throughout the model life cycle, so it can be reviewed after the training run has completed, or revisited later in the process to guide model improvement.
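As a minimal sketch of what that instrumentation can look like (assuming a standard MLflow-style workflow; the hyperparameters, metric names, and training values below are purely illustrative), the training script logs its inputs and per-epoch metrics by name:

```python
import mlflow

# Hypothetical hyperparameters, for illustration only
params = {"learning_rate": 0.01, "epochs": 3}

with mlflow.start_run():
    # Record the inputs used for this training run
    mlflow.log_params(params)

    for epoch in range(params["epochs"]):
        # Dummy values standing in for a real training step
        train_loss = 1.0 / (epoch + 1)
        val_accuracy = 0.7 + 0.05 * epoch

        # Record each metric by name, with the epoch as the step index
        mlflow.log_metric("train_loss", train_loss, step=epoch)
        mlflow.log_metric("val_accuracy", val_accuracy, step=epoch)
```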
Metrics can be autologged for maximum simplicity, or specific metrics can be logged by instrumenting the code with the names of the required metrics, as in the sketch above.
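For the autologging path, a single call before training is typically enough to capture framework parameters and metrics without naming them individually. The sketch below assumes MLflow's autolog support for scikit-learn; the dataset and model are illustrative:

```python
import mlflow
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Enable autologging; parameters, training metrics, and the model
# artifact are captured automatically for supported frameworks
mlflow.autolog()

# Illustrative synthetic dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

with mlflow.start_run():
    # Fitting the model triggers the automatic logging
    LogisticRegression(max_iter=200).fit(X, y)
```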
The metric information is stored as part of the model version, making it easy to see which combination of inputs (code, data, and hyperparameters) led to the associated outcomes. This provides a convenient way to understand the impact of different inputs, and can serve as a launching point for a new run with different inputs.
The metrics are not restricted to a single user, either. All users in the group have access to them, encouraging collaboration and facilitating incremental progress across the organization.