
How to Set Up Monitoring with Weights & Biases (Step by Step)

📖 9 min read · 1,769 words · Updated Mar 21, 2026


If you’re managing machine learning experiments and still logging everything to spreadsheets or scattered text files, you’re seriously missing out. Setting up monitoring with Weights & Biases will save you dozens of painful hours and sleepless nights wrestling with disorganized experiment data.

What You’ll Build and Why It Matters

We’re setting up Weights & Biases monitoring to track, visualize, and debug machine learning training runs efficiently—no more lost metrics or guessing which hyperparameters made your model spike.

Prerequisites

  • Python 3.8+ (Weights & Biases is fairly forgiving but stick to 3.8 or later for compatibility)
  • pip installed, preferably version 23.0+ (updates often fix bugs with package dependencies)
  • Create a free Weights & Biases account at https://wandb.ai
  • Basic familiarity with machine learning training scripts (either TensorFlow, PyTorch, or sklearn works)
  • Familiarity with command line tools (weights & biases CLI is a must)

Step-by-step Setup

Step 1: Install the Weights & Biases Python Package

pip install wandb

This is the only package you need for monitoring your ML runs. It hooks into your codebase and gives you live dashboards, metric tracking, and artifact management.

Why? Because wandb’s client automatically handles uploading and synchronizing data with the server. This saves you the headache of manual logging.

Common pitfalls: If you get a version conflict or failed dependency issue, upgrade pip and setuptools:

pip install --upgrade pip setuptools

Step 2: Log in to Your Wandb Account

wandb login

This command prompts for your API key. Head over to your account settings on https://wandb.ai/settings and copy your API key. Paste it when the CLI asks.

Why? You need to authenticate your CLI tool with your cloud project so your runs are linked with your user account and projects.

Gotcha: If you accidentally paste extra whitespace or hit Enter early, wandb will reject your key and show a cryptic authentication failure error.
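If you’d rather not paste the key interactively (for scripts or notebooks), wandb also reads the `WANDB_API_KEY` environment variable. Here’s a minimal sketch using an obvious placeholder key, not a real credential:

```python
import os

# wandb picks up WANDB_API_KEY automatically, so no interactive prompt
# is needed. The value below is a placeholder, not a real credential.
os.environ["WANDB_API_KEY"] = "0123456789abcdef0123456789abcdef01234567"

# Stripping whitespace before use avoids the authentication error
# described above when a key is pasted with a trailing newline.
api_key = os.environ["WANDB_API_KEY"].strip()
print(len(api_key))  # W&B API keys are typically 40 hexadecimal characters
```

Set the environment variable in your shell profile or secrets manager rather than hardcoding it as shown here.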

Step 3: Initialize Wandb in Your Training Script

import wandb

# Start a new run
wandb.init(project="my-ml-project", entity="your-username")

# Example: Log hyperparameters
config = wandb.config
config.learning_rate = 0.001
config.batch_size = 32

Inserting this snippet at the start of your main training loop lets wandb capture everything inside that run.

Why? The initialization step creates a new run object on the server side, letting you log metrics and artifacts in real-time. Without it, you won’t see any data in your workspace.

Common error: If you forget to call wandb.init(), your calls to wandb.log() won’t do anything, and it can silently fail. Always double-check this.
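To make that failure loud instead of silent, you can guard your logging calls on the run object that `wandb.init()` returns. A small illustrative helper (`safe_log` is my own name, not part of the wandb API):

```python
def safe_log(run, metrics):
    """Log metrics only when a run is active.

    `run` is whatever wandb.init() returned; if init was never called,
    pass None and the metrics are printed as a warning instead of vanishing.
    """
    if run is None:
        print(f"[warn] no active wandb run; dropping metrics: {metrics}")
        return False
    run.log(metrics)  # wandb Run objects expose a .log() method
    return True
```

With this in place, a missing `wandb.init()` produces a visible warning on every logging call instead of a silently empty dashboard.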

Step 4: Log Metrics and Hyperparameters During Training

for epoch in range(num_epochs):
    # Train your model here
    train_loss = compute_train_loss()
    val_loss = compute_val_loss()

    # Log metrics to wandb
    wandb.log({
        "epoch": epoch,
        "train_loss": train_loss,
        "val_loss": val_loss
    })

This snippet must be inside your main training loop. wandb.log() flushes data to the server asynchronously.

Why? Logging lets you track model performance step by step. You can spot overfitting or plateaus and adjust hyperparameters accordingly.

Typical mistake: Not sending logs frequently enough, which leads to incomplete run data if your job crashes. Make sure wandb.log() gets called after every meaningful update (usually every epoch or batch).
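One way to strike that balance is a tiny predicate that logs every N batches and always on the last batch of an epoch, so a crash mid-epoch still leaves recent data on the server. This is an illustrative sketch, not a wandb feature:

```python
def should_log(batch_idx, num_batches, every_n=50):
    """Log on batch 0, every N-th batch, and the last batch of the epoch."""
    return batch_idx % every_n == 0 or batch_idx == num_batches - 1
```

Inside the training loop this becomes `if should_log(i, len(loader)): wandb.log(...)`, keeping run data fresh without flooding the server on every batch.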

Step 5: Save Model Artifacts for Version Tracking

# Save your model checkpoint locally
torch.save(model.state_dict(), "model.pt")

# Upload the checkpoint as an artifact to wandb
artifact = wandb.Artifact('model', type='model')
artifact.add_file("model.pt")
wandb.log_artifact(artifact)

Wandb artifacts let you track versions of models, datasets, or other outputs. This keeps your training reproducible and debuggable.

Why? Artifacts enable collaborative workflows and integration with CI/CD. You can compare models and even roll back to previous checkpoints easily.

Gotcha: Forgetting to call wandb.log_artifact() means your saved files won’t show up in the project dashboard. Also, large artifacts might fail silently if storage quotas are hit — double check your project’s usage limits.
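Since oversized uploads can fail quietly, it helps to sanity-check file sizes before calling `wandb.log_artifact()`. A minimal sketch; the 100 MB threshold is an arbitrary example for illustration, not an actual wandb limit:

```python
import os

def check_artifact_size(path, limit_mb=100):
    """Return True if the file is under the (assumed) size threshold,
    printing a warning otherwise so quota problems surface early."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > limit_mb:
        print(f"[warn] {path} is {size_mb:.1f} MB; check your storage quota.")
        return False
    return True
```

Call this on `"model.pt"` before the upload step and you get an early warning instead of a mysteriously missing artifact.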

Step 6: Visualize Results in the Wandb Dashboard

Open https://wandb.ai and navigate to your project. You’ll see live graphs updating with your logged metrics, hyperparameters, and artifacts.

Why? Visualization is the killer feature that makes Weights & Biases monitoring truly worthwhile—seeing your metric trends helps you understand model behavior in real time.

Heads up: If your metrics don’t appear, double-check that your wandb.init() has the correct project name and that you’re logged into the correct account (entity). Also, verify that your networking allows cloud connections (sometimes corporate firewalls block this).

Step 7: Advanced Integration – Automate Wandb Runs with CI/CD

# Example GitHub Actions workflow snippet

name: Run Training and Log to W&B

on: [push]

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"
      - name: Install dependencies
        run: pip install wandb torch
      - name: Login to Wandb
        run: echo ${{ secrets.WANDB_API_KEY }} | wandb login
      - name: Run training script
        run: python train.py

This workflow shows GitHub Actions running your training automatically on every push and logging results to wandb.

Why? Production-level teams need monitoring that integrates into pipelines — manual runs are tedious and error-prone.

Potential issues: You must store your Wandb API key safely as a secret environment variable (never commit it). Forgetting this means CI jobs silently fail authentication.
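To turn that silent failure into an immediate, readable CI error, train.py can verify the secret before doing any work. A hedged sketch (`require_wandb_key` is my own helper name, not a wandb function):

```python
import os
import sys

def require_wandb_key():
    """Exit with a clear message if the CI secret was never injected,
    instead of letting wandb authentication fail silently later."""
    key = os.environ.get("WANDB_API_KEY", "").strip()
    if not key:
        sys.exit("WANDB_API_KEY is not set; add it to your CI secrets.")
    return key
```

Calling this at the top of train.py makes a missing secret show up as a one-line failure in the CI log rather than a hung or silently unauthenticated run.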

The Gotchas Nobody Tells You

  • Quotas and Limits: Your free tier on Wandb lets you log roughly a few thousand runs per month before hitting bandwidth or artifact storage limits. If you blast 10K+ epochs or large datasets, expect throttling. You won’t get explicit warnings immediately; just check your project quota.

    Solution: Clean up old runs regularly and archive large artifacts externally.
  • Latency between logging and dashboard refresh: Wandb uploads asynchronously, so sometimes your latest metrics show up a few seconds late, which is frustrating if you’re debugging at batch-level granularity.

    Solution: Call `wandb.log(…, commit=True)` (the default) at critical points so data is flushed immediately, and avoid batching with `commit=False` while debugging.
  • Environment inconsistencies: Wandb monitors your Python environment’s package versions. If your code runs in a Docker or remote environment without exact package lists (requirements.txt), your experiment might not be reproducible despite logged metrics.
    Solution: Always fix and log package versions.
  • Network issues in restricted environments: Corporate and academic servers often block Wandb’s telemetry uploads by default, causing silent failures or indefinite hangs.
    Solution: Use local offline mode (`wandb.init(mode="offline")`) and sync later, or whitelist domains on firewalls.
  • Logging too much data: Dumping every single batch metric can bloat your runs and slow down your UI. Use summary stats or sample at intervals.
    Solution: Log at epoch level or every N batches, not every batch.
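The version-pinning advice above can be automated with the standard library: snapshot the exact versions of your key packages at the start of each run and log the result (for example via `wandb.config.update`). A sketch using only `importlib.metadata`:

```python
from importlib import metadata

def snapshot_environment(packages):
    """Return requirements.txt-style pins for the given package names,
    so the exact environment can be reconstructed later."""
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            pins.append(f"# {name}: not installed")
    return pins
```

Logging `snapshot_environment(["torch", "wandb"])` alongside your metrics keeps a run reproducible even when it ran inside a throwaway container.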

Full Working Code Example

import wandb
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset
X = torch.randn(100, 10)
y = torch.randint(0, 2, (100,))

dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Simple binary classifier
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

model = SimpleModel()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Initialize wandb run
wandb.init(project="my-ml-project", entity="your-username")
wandb.config.update({
 "epochs": 5,
 "batch_size": 16,
 "learning_rate": 0.001,
})

for epoch in range(wandb.config.epochs):
    running_loss = 0.0
    for inputs, labels in loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    avg_loss = running_loss / len(loader)

    print(f"Epoch {epoch+1}, Loss: {avg_loss:.4f}")
    wandb.log({"epoch": epoch + 1, "loss": avg_loss})

# Save and upload model artifact
torch.save(model.state_dict(), "model.pt")
artifact = wandb.Artifact('simple-model', type='model')
artifact.add_file("model.pt")
wandb.log_artifact(artifact)

wandb.finish()

What’s Next

Once you’ve got Weights & Biases monitoring and logging working smoothly, I recommend adding wandb Sweeps to automate hyperparameter tuning. It’s way better than manually rerunning scripts for every parameter combo and lets you focus on results instead of fiddly experiments.

FAQ

Q: Can I use weights & biases with frameworks other than PyTorch?

A: Absolutely. Wandb supports TensorFlow, Keras, Hugging Face Transformers, sklearn—you name it. It basically wraps around your training loop and logs data. You just need to insert similar wandb.init() and wandb.log() calls in the right places.

Q: How do I keep sensitive credentials safe when using wandb in CI pipelines?

A: Store your Wandb API key in your CI provider’s secrets management system (e.g., GitHub Secrets, GitLab CI variables). Never hardcode keys in source. Then, use environment variables to supply keys during CI runtime, like echo $WANDB_API_KEY | wandb login.

Q: Is it possible to run wandb logging offline and sync later?

A: Yes. You can initialize wandb with wandb.init(mode="offline"), which caches logs locally. Later, run wandb sync to upload past run data when network access is restored. This is helpful for air-gapped environments.

Wandb Project and Run Data at a Glance

To understand why Weights & Biases monitoring outshines traditional spreadsheet logging, here’s a quick table comparing typical manual tracking vs wandb monitoring across key aspects:

| Aspect | Manual Logging (Excel/CSV) | Weights & Biases Monitoring |
| --- | --- | --- |
| Real-time visibility | No (must wait to open files) | Yes (live dashboards updating automatically) |
| Hyperparameter tracking | Often forgotten or inconsistent | Automated; always linked to runs |
| Model artifact versioning | Manual file saves, no metadata | Built-in artifact version control |
| Collaboration | Email files or share folders | Teams share live projects with role-based access |
| Integration with CI/CD | Manual steps | Automated via scripts and APIs |
| Storage limits | Local disk space | Cloud quotas and archiving options |

Tailored Recommendations for Different Developer Personas

If you’re a solo hobbyist tuning small models on your laptop, start by setting up wandb locally with offline mode enabled so you don’t worry about network problems. The UI and logging will give your experiments a sense of organization that spreadsheets can’t match.

If you’re a data scientist balancing multiple models and collaborators, invest time in integrating wandb runs with your team’s Git repos and cloud infrastructure. Automate your logging and artifacts—trust me, no one wants to ask for another CSV export.

For ML engineers building production pipelines, embed wandb sweeps and CI/CD automation early in your process. You’ll want consistent experiment state, rollback options, and monitoring integrated into release cycles to avoid debugging black holes weeks later.

Data as of March 21, 2026. Sources: Weights & Biases Official Site, Weights & Biases Documentation

✍️ Written by Jake Chen, AI technology writer and researcher.