To start using W&B, you first need to install the Python package (if it isn't already installed):
pip install wandb
Once it's installed, authenticate your user account by logging in through the CLI or SDK. You should have received an email to sign up to the platform, after which you can obtain your API token:
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
Once you are logged in, you are ready to track your workflows!
W&B Runs
At the core of W&B is a Run, which is a logged unit of execution of Python code. A Run captures the entire execution context of that unit: Python library versions, hardware info, system metrics, git state, etc. To create a run, call wandb.init(). There are a bunch of important arguments you can pass to wandb.init() to provide additional context for the run and enable you to organize your runs later:
import wandb

wandb.init(
    project="pytorch-lightning-e2e",
    entity="wandb",        # Team
    group="my_group",      # for organizing runs (e.g. distributed training)
    job_type="training",   # for organizing runs (e.g. preprocessing vs. training)
    config={               # Hyperparams and other config
        "hyperparam1": 24,
        "hyperparam2": "resnet",
    },
)
See the full documentation for wandb.init for other arguments to customize its behavior.
What Can I log and How do I log it?
Within a run context, you can log all sorts of useful info such as metrics, visualizations, charts, and interactive data tables explicitly with wandb.log. Here is a comprehensive guide to wandb.log, along with its API docs.
Scalar Metrics
Scalar metrics can be logged by passing them to wandb.log in a dictionary keyed by name.
wandb.log({"my_metric": some_scalar_value})
Each time wandb.log is called, W&B increments an internal counter called step. This is the x-axis you see in all the time-series charts. If you call wandb.log once per epoch, then the step represents the epoch count, but you may also be calling it in validation or testing loops, in which case the meaning of the step is less clear. To set the step manually, simply pass step=my_int_variable to wandb.log. This can be important for getting your charts at the resolution you want.
In PyTorch Lightning modules, you may want to set step to trainer.global_step, for example. It is recommended to pack as many metrics as you can into a single dictionary and log them in one go, rather than making separate wandb.log calls, each of which increments the step.
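As a minimal sketch (the metric names and values here are just placeholders), packing metrics into one dictionary and passing an explicit step looks like this:

import wandb

wandb.init(project="pytorch-lightning-e2e")

for epoch in range(5):
    train_loss, val_loss = 0.1 * (5 - epoch), 0.2 * (5 - epoch)  # placeholder values
    # One call per epoch: both metrics share the same step
    wandb.log({"train/loss": train_loss, "val/loss": val_loss}, step=epoch)

wandb.finish()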
You will notice that if you log a scalar metric multiple times in a run, it will appear as a line chart with the step as the x-axis, and it will also appear in the Runs Table. The entry in the Runs Table is the summary metric, which defaults to the last value logged during the course of the run. You can change this behavior by setting the summary metric in the run using run.summary["my_metric_name"] = some_value. This is useful if you want to compare runs according to different aggregations of a given metric (e.g. mean, max, min) as opposed to simply the last one.
wandb.init()
for i in range(5):
    wandb.log({"my_metric": i})
wandb.summary["my_metric"] = 2  # 2 instead of the default 4
wandb.finish()
Rich Media (e.g. images)
Logging rich media works roughly the same as scalars except you wrap your rich media in a wandb Data Type (e.g. wandb.Image).
The different Data Types are flexible in what formats of files or objects they accept. wandb.Image, for example, will accept image files, Pillow objects, or NumPy arrays. wandb.Image in particular has a whole host of arguments for specifying captions, segmentation masks, or bounding boxes.
When you log a rich media type, this creates a panel in the workspace which renders the rich media below the run name it belongs to. If you call wandb.log with a Data Type multiple times in a run, a slider appears below the panel, which lets you slide over the step variable mentioned above.
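For example, logging a batch of images with captions might look like the following sketch (the file path and data here are placeholders):

import numpy as np
import wandb

wandb.init(project="pytorch-lightning-e2e")

# From a file on disk, a NumPy array, or a PIL image -- wandb.Image accepts all three
examples = [
    wandb.Image("img_0.jpg", caption="an image file on disk"),
    wandb.Image(np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8), caption="a NumPy array"),
]
wandb.log({"examples": examples})
wandb.finish()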
There are a variety of ways to log charts to W&B, but they all boil down to two modes:
Static: Logging a pre-built chart (e.g. matplotlib, plotly)
Logging a pre-built chart works the same as logging any rich media type. You can take a chart generated with matplotlib or plotly, or serialize it to an image or HTML, and log it using wandb.log:
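A minimal sketch of the static approach, assuming a matplotlib figure with placeholder data:

import matplotlib.pyplot as plt
import wandb

wandb.init(project="pytorch-lightning-e2e")

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [10, 20, 15, 30])  # placeholder data

# Either log the figure object directly (rendered as an interactive chart when possible)
# or wrap it as a static image
wandb.log({"my_chart": fig})
wandb.log({"my_chart_as_image": wandb.Image(fig)})
wandb.finish()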
Dynamic: Logging a chart's raw data and dynamically rendering the chart in W&B
This requires logging the raw data backing a desired chart as a wandb.Table (see below) and then using Vega to render the data graphically in the W&B UI. Fortunately, W&B has some API abstractions under wandb.plot.<plot_type> which perform these two steps automatically for common charts, and all you have to do is use the following pattern:
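For instance, ROC and precision-recall curves can each be logged in a single call (the labels and predictions below are random placeholders, and these helpers rely on scikit-learn under the hood):

import numpy as np
import wandb

wandb.init(project="pytorch-lightning-e2e")

# Placeholder ground truth and predicted probabilities for a 3-class problem
y_true = np.random.randint(0, 3, size=100)
y_probas = np.random.dirichlet(np.ones(3), size=100)
class_names = ["bird", "plant", "mammal"]

wandb.log({
    "roc": wandb.plot.roc_curve(y_true, y_probas, labels=class_names),
    "pr": wandb.plot.pr_curve(y_true, y_probas, labels=class_names),
})
wandb.finish()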
The benefit of dynamic charts is they will overlay chart data from multiple runs, making it easier to compare runs against each other vs. across separate plots. For a full list of supported plots, check out this page. To create plots outside this list, you will need to log the raw data and use the Custom Chart Editor to edit/create a Vega spec to render the data how you like it:
Tables are a special wandb Data Type which allow you to log data, including other wandb Data Types, into an interactive dataframe in the workspace. This is especially useful for logging model predictions in order to filter them and inspect errors. To log a table, you can add data row by row, or pass a pandas dataframe or Python lists. The elements of the dataframe can be any wandb Data Type (e.g. wandb.Image, wandb.Html, wandb.Plotly) or simple scalar or text values:
# Add data as a list of lists or a pandas dataframe
my_data = [
    [0, wandb.Image("img_0.jpg"), 0, 0],
    [1, wandb.Image("img_1.jpg"), 8, 0],
    [2, wandb.Image("img_2.jpg"), 7, 1],
    [3, wandb.Image("img_3.jpg"), 1, 1],
]

# Create a wandb.Table() with corresponding columns and log it
columns = ["id", "image", "prediction", "label"]
my_table = wandb.Table(data=my_data, columns=columns)
wandb.log({"my_table": my_table})
Use tables to log validation data, sample predictions, or model errors, not entire training datasets. They can handle up to 200k rows, but UI performance will vary depending on how many rich media types you have embedded. Here is a comprehensive guide to logging tables.
Note on Tables: when logging tables, you will see wandb.summary["my_table_name"] in the workspace, like below. This is a weave expression used to query logged data in W&B and render it appropriately. Read more about weave here. The upshot for now is that by default W&B only renders the last version of a table logged in a run (the summary one). So if you are logging tables multiple times throughout a run, you will only see the last one by default.
You can interact with a table in a variety of ways. Here we create a new derived column based on errors in the predictions, then group by it so we can easily compare the distribution across runs. We can also project the data in a table to 2D and get image tooltips, which is very useful for logging embeddings and visualizing them.
Hyper-parameter Visualization and Optimization
Anything logged in wandb.config appears as a column in the runs table and is considered a hyperparameter in W&B. These hyperparameters can be viewed dynamically in a Parallel Coordinates Chart, which you can add and manipulate in a workspace. You can edit this chart to display different hyperparameters or different metrics. The lines in the chart are different runs which have "swept" through the hyperparameter space. You can also plot a parameter importance chart to get a sense of which hyperparameters are most important or correlated with the target metric. These importances are calculated using a random forest trained in your browser! Here are docs on the Parallel Coordinates Plot and the Parameter Importance Plot.
W&B provides a mechanism for automating hyper-parameter search through W&B Sweeps. Sweeps let you configure a large set of experiments across a pre-specified hyper-parameter space. To implement a sweep, you just need to (see the sketch after this list):
Add wandb.init() to your training script, ensuring that all hyper-parameters are passed to your training logic via wandb.config.
Write a YAML file (or Python dictionary) specifying your hyper-parameter search: the search method and the hyper-parameter distributions and values to search over.
Run the sweep controller, which runs in W&B, through wandb.sweep or through the UI. The controller will delegate new hyper-parameter values to the wandb.config of the various running agents.
Run agents with wandb.agent on however many machines you want to use for the experiments.
The agents will execute the training script, replacing wandb.config with the queued hyper-parameter values the controller is keeping track of.
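A minimal sketch, defining the sweep configuration as a Python dictionary rather than a YAML file (the hyper-parameter names, the metric, and the train function are placeholders):

import wandb

sweep_config = {
    "method": "bayes",                                   # grid, random, or bayes
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-2},
        "batch_size": {"values": [16, 32, 64]},
    },
}

def train():
    with wandb.init() as run:                  # the agent injects the chosen values into wandb.config
        lr = wandb.config.learning_rate
        bs = wandb.config.batch_size
        # ... your training loop goes here; log metrics with wandb.log ...
        wandb.log({"val_loss": 1.0 / (lr * bs)})  # placeholder metric

sweep_id = wandb.sweep(sweep_config, project="pytorch-lightning-e2e")
wandb.agent(sweep_id, function=train, count=10)   # run 10 trials on this machine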
If you prefer to use other hyper-parameter optimization frameworks, W&B has integrations with Ray Tune and Optuna, among others.
W&B Artifacts
Artifacts enable you to track and version any serialized data as the inputs and outputs of runs. This can be datasets (e.g. image files), evaluation results (e.g. heatmaps), or model checkpoints. W&B is agnostic to the formats or structure of the data you want to log as an artifact.
Logging Artifacts
To log an artifact, you first create an Artifact object with a name, type, and optionally a description and metadata dictionary. You can then add any of the following to the artifact object (sketched after this list):
local files
local directories
wandb Data Types (e.g. wandb.Plotly or wandb.Table), which will render alongside the artifact in the UI
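A minimal sketch of logging a model checkpoint as an artifact (the file paths, artifact name, and metadata are placeholders):

import wandb

with wandb.init(project="pytorch-lightning-e2e", job_type="training") as run:
    model_artifact = wandb.Artifact(
        name="my-model",                       # placeholder name
        type="model",
        description="ResNet checkpoint from the latest training run",
        metadata={"hyperparam1": 24, "hyperparam2": "resnet"},
    )
    model_artifact.add_file("model.ckpt")      # a local file
    model_artifact.add_dir("eval_plots/")      # a local directory
    run.log_artifact(model_artifact)           # checksummed; a new version is created only if contents changed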
Each time you log this artifact, W&B will checksum the file assets you add to it and compare that to previous versions of the artifact. If there is a difference, a new version will be created, indicated by the aliases v0, v1, v2, etc. Users can optionally add or remove additional aliases through the UI or API. Aliases are important because they uniquely identify an artifact version, so you can use them to pull down your best model, for example.
You may already have large datasets sitting in a cloud object store like S3 and just want to track which versions of those datasets your Runs are using, along with any other metadata associated with them. You can do so by logging these artifacts by reference, in which case W&B only tracks the checksums and metadata of an artifact and does not copy the entire data asset to W&B. Here are some more details on tracking artifacts by reference.
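A minimal sketch of tracking a dataset by reference (the bucket path and artifact name are placeholders):

import wandb

with wandb.init(project="pytorch-lightning-e2e", job_type="dataset-tracking") as run:
    dataset = wandb.Artifact(name="raw-images", type="dataset")       # placeholder name
    dataset.add_reference("s3://my-bucket/datasets/inat-2017/")       # only checksums and metadata are stored
    run.log_artifact(dataset)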
With artifacts you can now refer to arbitrary data assets through durable and simple names and aliases (similar to how you deal with Docker containers). This makes it really easy to hand off these assets between people and processes and see the lineage of all data, models, and results.
Example: Seeing different versions of a Table
Any wandb.Table that you log is automatically logged as an artifact of type Runs Table. You can see all the tables you've logged and their respective versions. For instance, it is common to log a table periodically throughout a run:
wandb.init()
for i in range(epochs):
    wandb.log({"my_table": wandb.Table(...)})
As discussed above, the default table view in the workspace will be the summary view, i.e. the last table logged in the run. If you want to see and compare other versions of the table, go to the artifacts tab of the project and look at the Runs Table artifacts. Find your table name and click on the version you care about. If you then go to Files and click on <my_table_name>.table.json, you will see the table render.
click on the file name to render the table
Model Checkpointing in PyTorch Lightning
If you are using the WandbLogger with the PyTorch Lightning Trainer, the ModelCheckpoint Callback will automatically log model checkpoints to W&B. See more details in the PyTorch Lightning integration docs.
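A minimal sketch of wiring this up (the LightningModule and DataModule are assumed to be defined elsewhere):

import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.loggers import WandbLogger

wandb_logger = WandbLogger(project="pytorch-lightning-e2e", log_model="all")  # log every checkpoint as an artifact
checkpoint_callback = ModelCheckpoint(monitor="val_loss", mode="min")

trainer = pl.Trainer(
    logger=wandb_logger,
    callbacks=[checkpoint_callback],
    max_epochs=5,
)
# trainer.fit(model, datamodule=dm)   # model and dm are defined elsewhere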
W&B has many other integrations with frameworks like Keras and Hugging Face, which offer similar functionality.
Model Registry
The model registry is a central place to house and organize all the model tasks and their associated artifacts being worked on across an org:
Document your models with rich model cards
Maintain a history of all the models being used/deployed
Facilitate clean hand-offs and stage management of models
Tag and organize various model tasks
Set up automatic Slack notifications when models progress
To use the model registry, you need to:
1. Log some artifacts of type model.
2. Create a Registered Model Task in the Model Registry.
3. In the artifacts page for a given project, click the Link to Registry button next to artifacts of type model. (You can also do this programmatically through run.link_artifact; see the sketch below.)
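A minimal sketch of linking programmatically (the checkpoint path, artifact name, and registered model name are placeholders, and the exact target path format may vary with your setup):

import wandb

with wandb.init(project="pytorch-lightning-e2e", job_type="training") as run:
    art = run.log_artifact("model.ckpt", name="my-model", type="model")   # placeholder checkpoint
    run.link_artifact(art, "model-registry/My Registered Model")          # "bookmark" this version in the registry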
When you link to the registry, this creates a new version of that Registered Model, which is just a pointer to the artifact version living in that project. That may sound confusing at first, but there's a reason W&B separates the versioning of artifacts in a project from the versioning of a Registered Model: linking a model artifact version is equivalent to "bookmarking" that artifact version under a Registered Model task.
Typically during R&D/experimentation, researchers generate hundreds, if not thousands, of model checkpoint artifacts, but only one or two of them actually "see the light of day." Linking those checkpoints to a separate, versioned registry helps delineate the model development side from the model deployment/consumption side of the workflow. The globally understood version/alias of a model should be kept clean of all the experimental versions generated in R&D, so the versioning of a Registered Model increments with each newly "bookmarked" model rather than with every checkpoint logged.
Model consumers, whether they are engineers, researchers, or CI/CD processes, can go to the model registry as the central hub for all models that should "see the light of day": those that need to go through testing or move to production.
An example Registered Model page: the model card documents a fine-tuned Keras InceptionV3 classifier, trained on a subset of the iNaturalist 2017 dataset to classify nature photos into 10 types of living creature (Amphibia, Animalia, Arachnida, Aves, Fungi, Insecta, Mammalia, Mollusca, Plantae, Reptilia), with expected inputs preprocessed by resizing to 299x299, rescaling, and horizontal flipping. The same page exposes automations, including Slack notifications that alert the team when the registered model or collection changes.
Reports
Reports are flexible documents you can build on top of your W&B projects. You can easily embed any asset (chart, artifact, table) logged in W&B into a report alongside markdown, LaTeX, code blocks, etc. You can create rich documentation from your logged assets without copy-pasting static figures into Word docs or managing Excel spreadsheets. Reports are live: as new experiments run, they update accordingly. This report you are viewing is a good example of what you can put into them.
Programmatic Reports
It may be useful to programmatically generate a report, such as for a standard model-comparison analysis you repeat whenever you retrain models, or after a large hyperparameter search. The W&B Python SDK provides a means of generating reports programmatically under wandb.apis.reports. Check out the docs and this quickstart notebook.
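A minimal sketch of what this can look like (the exact block classes and arguments may differ between wandb releases, and the title and text here are placeholders):

import wandb.apis.reports as wr

report = wr.Report(
    project="pytorch-lightning-e2e",
    title="Weekly retraining comparison",            # placeholder title
    description="Auto-generated after each sweep",   # placeholder description
)
report.blocks = [
    wr.H1(text="Results"),
    wr.P(text="Comparison of the latest runs."),
    wr.PanelGrid(runsets=[wr.Runset(project="pytorch-lightning-e2e")]),
]
report.save()
print(report.url)   # share the generated report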
Other Useful Resources
Import/Export API
All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This enables you to pull down run and artifact data, then filter and manipulate it however you please in Python.
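A minimal sketch, assuming the entity and project used earlier in this report (the artifact name is a placeholder):

import wandb

api = wandb.Api()

# Pull down finished runs from a project and inspect their config and summary metrics
runs = api.runs("wandb/pytorch-lightning-e2e", filters={"state": "finished"})
for run in runs:
    print(run.name, run.config.get("hyperparam1"), run.summary.get("my_metric"))

# Download a specific artifact version by name and alias
artifact = api.artifact("wandb/pytorch-lightning-e2e/my-model:v1")   # placeholder artifact name
artifact.download()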
Slack Alerts
You can set Slack alerts within a run to trigger when things happen in your training or evaluation scripts. For example, you may want to be notified when training is done or when a metric exceeds a certain value.
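A minimal sketch using wandb.alert (the metric and threshold are placeholders):

import wandb
from wandb import AlertLevel

wandb.init(project="pytorch-lightning-e2e")

accuracy = 0.93   # placeholder metric computed in your evaluation loop
if accuracy > 0.9:
    wandb.alert(
        title="High accuracy",
        text=f"Accuracy {accuracy:.2f} exceeded the 0.9 threshold",
        level=AlertLevel.INFO,
    )
wandb.finish()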
There is built-in alerting in the Model Registry as well. If you go to Settings, you will see the option to add a Slack alert when a new model is linked:
Automate actions based on changes in this registered model: Slack notifications notify the team when changes happen in the model registry.
Integrations: Going Beyond the Core W&B Primitives
wandb.log, wandb.Artifact, wandb.Table, and wandb.sweep can take you far in building your machine learning system of record, forming the core of the best practices we see top machine learning research teams employ in their everyday workflows. Beyond these primitives, our team continues to build out integrations with higher-level frameworks and tools, whereby simply adding a single W&B callback or function argument gets everything logged automatically under the hood. Check out our integrations page and double-check the docs of your favorite machine learning repo, as there might already be a W&B integration in place! Let us know if you'd like to see W&B integrated into a package or tool we aren't yet logging!