How to Calculate the Number of Model Parameters for PyTorch and TensorFlow Models

This article is a short tutorial, with examples, on calculating the number of parameters in TensorFlow and PyTorch deep learning models.
Created on March 23|Last edited on May 30
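We live in an era where large models are easily accessible. Anyone can create a Kaggle Kernel with a pretrained DeBERTa v3 model and fine-tune it on any dataset. What many people don't realize, however, is that they are using a model with 75 to 100M parameters that was pretrained on more than 100 GB of training data.
Increasing the parameter count can certainly improve performance, but it also increases inference time, along with storage size. For this reason, you may want to keep track of how many parameters a model contains.
Wouldn't you be interested in a model that performs about the same with 10x or 20x fewer parameters? Whether you are plotting model parameters against performance or simply benchmarking, the parameter count is something you need to know.
Let's look at a few examples of how to calculate the number of parameters in PyTorch and TensorFlow models.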
Table of Contents
Code
    PyTorch
    TensorFlow
Summary
Recommended Reading
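Code
PyTorch
PyTorch does not have a utility function for counting model parameters (at least for now), but every model (an nn.Module) has a method that gives access to its parameters.
Use the following snippet to count all of the model's parameters: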
# Sum the element counts of every parameter tensor in the model
total_params = sum(
    param.numel() for param in model.parameters()
)
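Let's quickly walk through this snippet:
model.parameters(): PyTorch modules have a method called parameters() that returns an iterator over all of the module's parameters, including those of its submodules.
param.numel(): For each parameter yielded by the model.parameters() iterator, .numel() returns the number of elements it contains.
sum(...): Adds up the element counts of all parameter tensors (a module may contain submodules as layers, and their parameters are included).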
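As a quick sanity check of the steps above, the following minimal sketch builds a small, hypothetical two-layer model (the layer sizes are arbitrary and not from the original article) and compares the snippet's result with a count worked out by hand. Each nn.Linear layer holds in_features × out_features weights plus out_features biases.

```python
import torch.nn as nn

# Hypothetical toy model, used only for illustration
model = nn.Sequential(
    nn.Linear(10, 5),  # 10 * 5 weights + 5 biases = 55
    nn.ReLU(),         # activation layers have no parameters
    nn.Linear(5, 2),   # 5 * 2 weights + 2 biases = 12
)

# Same expression as in the snippet above
total_params = sum(param.numel() for param in model.parameters())

# Per-tensor breakdown, keyed by the names PyTorch assigns to each parameter
per_tensor = {name: param.numel() for name, param in model.named_parameters()}

print(total_params)  # 67
print(per_tensor)    # {'0.weight': 50, '0.bias': 5, '2.weight': 10, '2.bias': 2}
```

The per-tensor breakdown from named_parameters() can help locate which layers dominate the total in a larger model.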
Note: This snippet counts all parameters in the module, both trainable and non-trainable. If you only need the trainable parameters, use the following snippet:
# Count only the parameters that receive gradients (requires_grad=True)
trainable_params = sum(
    p.numel() for p in model.parameters() if p.requires_grad
)
We use the Tensor's .requires_grad attribute to determine whether a parameter is trainable. If requires_grad is set to True on a Tensor, the autograd engine computes gradients for it, so an optimizer can update it (that is, it is "trainable").
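To illustrate the difference, here is a minimal sketch that freezes the first layer of a hypothetical toy model by setting requires_grad to False on its parameters, then counts the trainable and frozen parameters separately. The model, its layer sizes, and the name frozen_params are assumptions chosen for illustration.

```python
import torch.nn as nn

# Hypothetical toy model, used only for illustration
model = nn.Sequential(
    nn.Linear(10, 5),  # 55 parameters
    nn.ReLU(),
    nn.Linear(5, 2),   # 12 parameters
)

# Freeze the first layer so autograd no longer tracks gradients for it
for p in model[0].parameters():
    p.requires_grad = False

trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen_params = sum(p.numel() for p in model.parameters() if not p.requires_grad)

print(trainable_params, frozen_params)  # 12 55
```

This is the typical situation when fine-tuning a pretrained model with a frozen backbone: the total count stays the same, but only the unfrozen part is trained.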
TensorFlow
TensorFlow provides a utility function called count_params for counting parameters. It is available in the Keras utils (keras.utils.layer_utils).
Use the following snippet to count the trainable and non-trainable parameters of a TensorFlow model:
from keras.utils.layer_utils import count_params

model = ...

# count_params takes an iterable of weight variables and returns the total element count
trainable_params = count_params(model.trainable_weights)
non_trainable_params = count_params(model.non_trainable_weights)
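Because the keras.utils.layer_utils import path may not be available in every Keras release, here is an alternative sketch that avoids it. It multiplies out the shape of each weight variable and compares the result with the model's built-in count_params() method. The toy model and the helper name count_weights are hypothetical, chosen only for illustration.

```python
import math

import tensorflow as tf


def count_weights(weights):
    """Return the total number of scalar elements across a list of weight variables."""
    return sum(math.prod(int(dim) for dim in w.shape) for w in weights)


# Hypothetical toy model, used only for illustration
hidden = tf.keras.layers.Dense(5)  # 10 * 5 weights + 5 biases = 55
output = tf.keras.layers.Dense(2)  # 5 * 2 weights + 2 biases = 12
model = tf.keras.Sequential([tf.keras.Input(shape=(10,)), hidden, output])

# Freezing a layer moves its weights to model.non_trainable_weights
hidden.trainable = False

trainable_params = count_weights(model.trainable_weights)
non_trainable_params = count_weights(model.non_trainable_weights)

print(trainable_params, non_trainable_params)  # 12 55
print(model.count_params())                    # 67
```

If only the overall total is needed, model.count_params() alone is enough; the helper is useful when the trainable and non-trainable counts need to be reported separately.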
What can you do with this information? With Weights & Biases, you can log the parameter count to a W&B run, either as a wandb.config parameter or as a summary, so you can review and compare it later.
wandb.config.update({"Model Parameters": trainable_params})
######################           or          #####################
wandb.run.summary["Model Parameters"] = trainable_params
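Summary
In this article, we looked at how to calculate the number of parameters in TensorFlow and PyTorch models. To see the full range of W&B features, check out this 5-minute guide. If you're interested in more reports covering the math and "from scratch" code implementations, let us know in the comments below or on our forum ✨.
Check out the other reports on Fully Connected that cover fundamental development topics such as GPU utilization and saving models.
Recommended Reading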
Setting Up TensorFlow And PyTorch Using GPU On Docker
A short tutorial on setting up TensorFlow and PyTorch deep learning models on GPUs using Docker.
How to Compare Keras Optimizers in Tensorflow for Deep Learning
A short tutorial outlining how to compare Keras optimizers for your deep learning pipelines in Tensorflow, with a Colab to help you follow along.
Preventing The CUDA Out Of Memory Error In PyTorch
A short tutorial on how you can avoid the "RuntimeError: CUDA out of memory" error while using the PyTorch framework.
How to Initialize Weights in PyTorch
A short tutorial on how you can initialize weights in PyTorch with code and interactive visualizations.
Recurrent Neural Network Regularization With Keras
A short tutorial teaching how you can use regularization methods for Recurrent Neural Networks (RNNs) in Keras, with a Colab to help you follow along.
Tutorial: Regression and Classification on XGBoost
A short tutorial on how you can use XGBoost with code and interactive visualizations.