wandb

class aitoolbox.torchtrain.callbacks.wandb.AlertConfig(metric_name: str, threshold_value: float, objective: str = 'maximize', wandb_alert_level: wandb.sdk.wandb_alerts.AlertLevel = None)[source]

Bases: object

metric_name: str
threshold_value: float
objective: str = 'maximize'
wandb_alert_level: AlertLevel = None
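To illustrate the documented fields and their defaults, here is a minimal sketch using a local stand-in dataclass. Note this mirrors, but is not, the real AlertConfig (which lives in aitoolbox.torchtrain.callbacks.wandb and types wandb_alert_level with wandb's AlertLevel enum rather than a plain Optional):

```python
from dataclasses import dataclass
from typing import Optional

# Local stand-in mirroring the documented AlertConfig fields; the real
# dataclass uses wandb.sdk.wandb_alerts.AlertLevel for wandb_alert_level.
@dataclass
class AlertConfigSketch:
    metric_name: str
    threshold_value: float
    objective: str = 'maximize'
    wandb_alert_level: Optional[str] = None

# Alert configuration for a validation accuracy that should be maximized
# and is checked against a 0.9 threshold
alert = AlertConfigSketch(metric_name='val_accuracy', threshold_value=0.9)
# alert.objective defaults to 'maximize'; alert.wandb_alert_level to None
```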
class aitoolbox.torchtrain.callbacks.wandb.WandBTracking(metric_names=None, batch_log_frequency=None, hyperparams=None, tags=None, alerts=None, wandb_pre_initialized=False, source_dirs=(), log_dir=None, is_project=True, project_name=None, experiment_name=None, local_model_result_folder_path=None, **kwargs)[source]

Bases: AbstractExperimentCallback

Weights & Biases (wandb) logger

Find more on: https://wandb.ai

Note

Before this callback can be used you need to have a wandb account and be credentialed on the machine. Instructions for this process can be found in the wandb GitHub repository: https://github.com/wandb/client

Parameters:
  • metric_names (list or None) – list of metric names tracked in the training history. If left as None, all metrics in the training history will be logged.

  • batch_log_frequency (int or None) – frequency of batch-level logging. If set to None, batch-level (mid-epoch) logging is skipped and only end-of-epoch logging is executed.

  • hyperparams (dict or None) – dictionary of used hyperparameters. If set to None, the callback tries to find the hyperparameter dict in the encapsulating TrainLoop running the callback.

  • tags (list or None) – used for wandb init. From wandb documentation: A list of strings, which will populate the list of tags on this run in the UI. Tags are useful for organizing runs together, or applying temporary labels like “baseline” or “production”. It’s easy to add and remove tags in the UI, or filter down to just runs with a specific tag.

  • alerts (list[AlertConfig] or None) – list of alerts, where each alert configuration is specified as an AlertConfig dataclass. The user should provide the metric_name based on which the alert is triggered. The last calculated value of that metric is then compared with the provided threshold_value. The objective can be either “maximize” or “minimize”.

  • wandb_pre_initialized (bool) – set to True if wandb has already been initialized outside the callback (e.g. at the start of the experiment script). If False, the callback initializes the wandb process itself.

  • source_dirs (tuple or list) – list of source code directories to be stored by wandb. If an empty list is given, the callback tries to get this information from the running TrainLoop. If that is also unavailable, the callback falls back to wandb's default code-saving behavior, which stores only the executed Python script.

  • log_dir (str or None) – save directory location

  • is_project (bool) – set to True if the wandb project folder should be placed into the TrainLoop-created project folder structure, or to False if you want to save to the specific full path given in the log_dir parameter.

  • project_name (str or None) – root name of the project

  • experiment_name (str or None) – name of the particular experiment

  • local_model_result_folder_path (str or None) – root local path where project folder will be created

  • **kwargs – additional arguments passed to the wandb.init() call wrapped inside this callback
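A sketch of wiring this callback into a training run, based on the documented parameters above. Treat the TrainLoop import path and the fit(num_epochs=..., callbacks=[...]) pattern as assumptions about typical aitoolbox usage; model, the data loaders, optimizer and criterion are presumed to be defined elsewhere in the training script:

```python
from wandb.sdk.wandb_alerts import AlertLevel
from aitoolbox.torchtrain.train_loop import TrainLoop
from aitoolbox.torchtrain.callbacks.wandb import WandBTracking, AlertConfig

# Alert when validation loss (to be minimized) ends up above 0.5
alerts = [
    AlertConfig(metric_name='val_loss', threshold_value=0.5,
                objective='minimize', wandb_alert_level=AlertLevel.WARN)
]

wandb_cb = WandBTracking(
    metric_names=['loss', 'val_loss'],   # log only these history metrics
    batch_log_frequency=100,             # also log every 100 batches
    hyperparams={'lr': 1e-3, 'batch_size': 32},
    tags=['baseline'],
    alerts=alerts,
    project_name='my_project',
    experiment_name='baseline_run'
)

# Assumed TrainLoop usage pattern
TrainLoop(model, train_loader, val_loader, test_loader,
          optimizer, criterion).fit(num_epochs=10, callbacks=[wandb_cb])
```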

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns:

None

on_batch_end()[source]

Logic executed at the end of each batch, after the batch has been passed through the model

Returns:

None

log_mid_train_loss()[source]

Log the training loss at the batch iteration level

Logs current batch loss and the accumulated average loss.

Returns:

None

log_train_history_metrics(metric_names)[source]

Log the train history metrics at the end of the epoch

Parameters:

metric_names (list) – list of train history tracked metrics to be logged

Returns:

None

static send_configured_alerts(alerts, metrics_log)[source]

Send wandb alerts

Whether each alert is sent depends on the current metric values in the metrics_log satisfying the conditions specified in the alert configuration.

Parameters:
  • alerts (list[AlertConfig]) – list of alerts, where each alert configuration is specified as an AlertConfig dataclass. The user should provide the metric_name based on which the alert is triggered. The last calculated value of that metric is then compared with the provided threshold_value. The objective can be either “maximize” or “minimize”.

  • metrics_log (dict) – dict of metrics names and their corresponding current values.

Returns:

None
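The triggering condition can be sketched in plain Python. The exact comparison used inside send_configured_alerts is an assumption here: an alert is taken to fire when the metric's latest value falls on the wrong side of the threshold for the stated objective (below it for “maximize”, above it for “minimize”):

```python
def alert_should_fire(objective: str, metric_value: float,
                      threshold_value: float) -> bool:
    """Assumed alert condition: fire when the last metric value is on the
    wrong side of the threshold for the given optimization objective."""
    if objective == 'maximize':
        return metric_value < threshold_value
    return metric_value > threshold_value

# Example metrics_log with the latest value of each tracked metric
metrics_log = {'val_accuracy': 0.81, 'val_loss': 0.62}

# 'maximize' val_accuracy with threshold 0.9: 0.81 is below -> alert fires
fires_acc = alert_should_fire('maximize', metrics_log['val_accuracy'], 0.9)
# 'minimize' val_loss with threshold 0.7: 0.62 is below -> no alert
fires_loss = alert_should_fire('minimize', metrics_log['val_loss'], 0.7)
```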

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns:

None

try_infer_additional_logging_details()[source]

check_alerts()[source]