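"I'm using categorical_crossentropy for now, but is Label Smoothing worth trying?"
Many of you have probably wondered the same thing. In this post I use Google Colab and CIFAR-10 to compare four values of the Label Smoothing parameter (0.0 / 0.05 / 0.1 / 0.2) and check experimentally how it affects overfitting, accuracy, and val_loss.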
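📘 What you will learn in this post
- What Label Smoothing is and why it suppresses overfitting
- How accuracy and val_loss change with each smoothing value
- A rule of thumb for choosing the smoothing parameter in practice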
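What is Label Smoothing?
A standard loss function (e.g. categorical_crossentropy) treats the correct label as exactly 1 and every other class as exactly 0. To match these hard targets the model is pushed toward extreme confidence, which is one cause of overfitting.
Label Smoothing lowers the confidence of the correct label slightly and spreads a small amount of probability over the other classes, nudging the model toward a state where it does not have to be that sure of itself.
Target for the correct class: \( y_{\text{smooth}} = 1 - \varepsilon + \dfrac{\varepsilon}{K} \). Target for each incorrect class: \( \dfrac{\varepsilon}{K} \).
(\(\varepsilon\): smoothing value, \(K\): number of classes)
For example, with CIFAR-10 (K=10) and smoothing set to 0.1, the target for the correct class becomes 1 − 0.1 + 0.01 = 0.91, and each incorrect class is assigned 0.01.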
| smoothing value | Target for the correct class | Target for each incorrect class |
|---|---|---|
| 0.0 (none) | 1.000 | 0.000 |
| 0.05 | 0.955 | 0.005 |
| 0.1 | 0.910 | 0.010 |
| 0.2 | 0.820 | 0.020 |
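The target values in the table can be reproduced in a few lines of plain Python. This is a minimal sketch of the formula above, not Keras's internal code; the helper name `smooth_targets` is hypothetical and introduced only for illustration.

```python
def smooth_targets(label, num_classes, eps):
    """Return the smoothed target vector for one integer label (hypothetical helper)."""
    # Every class first receives eps / K ...
    targets = [eps / num_classes] * num_classes
    # ... and the correct class additionally receives 1 - eps.
    targets[label] += 1.0 - eps
    return targets


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2):
        t = smooth_targets(label=3, num_classes=10, eps=eps)
        print(f"eps={eps}: correct={t[3]:.3f}, incorrect={t[0]:.3f}, sum={sum(t):.3f}")
```

Each vector still sums to 1, so the smoothed targets remain a valid probability distribution.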
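Experiment code
The environment is Google Colab (GPU: T4) and the dataset is CIFAR-10. Every condition other than the smoothing value is kept identical, so that only the effect of Label Smoothing is isolated.
Environment setup (run once at the start)
# ── Environment setup (run once at the start) ─────────
!apt-get -y install fonts-ipafont-gothic
!rm -rf /root/.cache/matplotlib
!pip install -q japanize_matplotlib
print("Environment setup complete")
Execution output: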
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
fonts-ipafont-mincho
The following NEW packages will be installed:
fonts-ipafont-gothic fonts-ipafont-mincho
0 upgraded, 2 newly installed, 0 to remove and 51 not upgraded.
Need to get 8,237 kB of archives.
After this operation, 28.7 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-gothic all 00303-21ubuntu1 [3,513 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-mincho all 00303-21ubuntu1 [4,724 kB]
Fetched 8,237 kB in 0s (34.7 MB/s)
Selecting previously unselected package fonts-ipafont-gothic.
(Reading database ... 122363 files and directories currently installed.)
Preparing to unpack .../fonts-ipafont-gothic_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-gothic (00303-21ubuntu1) ...
Selecting previously unselected package fonts-ipafont-mincho.
Preparing to unpack .../fonts-ipafont-mincho_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-mincho (00303-21ubuntu1) ...
Setting up fonts-ipafont-mincho (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-mincho/ipam.ttf to provide /usr/share/fonts/truetype/fonts-japanese-mincho.ttf (fonts-japanese-mincho.ttf) in auto mode
Setting up fonts-ipafont-gothic (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf to provide /usr/share/fonts/truetype/fonts-japanese-gothic.ttf (fonts-japanese-gothic.ttf) in auto mode
Processing triggers for fontconfig (2.13.1-4.2ubuntu5) ...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 52.2 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Building wheel for japanize_matplotlib (setup.py) ... done
Environment setup complete
Imports, data preparation, and model-building function
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import japanize_matplotlib
import time

# ── Data preparation ──────────────────────────────────
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Label Smoothing is used, so the labels must be one-hot encoded
y_train_oh = keras.utils.to_categorical(y_train, 10)
y_test_oh = keras.utils.to_categorical(y_test, 10)

# ── Model construction (everything except smoothing is fixed) ──
def build_model(name):
    return keras.Sequential([
        keras.layers.Input(shape=(32, 32, 3)),
        keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.GlobalAveragePooling2D(),
        keras.layers.Dense(128, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10, activation='softmax'),
    ], name=name)

def compile_and_fit(model, smoothing):
    # The smoothing value is the only thing that differs between runs
    loss_fn = keras.losses.CategoricalCrossentropy(label_smoothing=smoothing)
    model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
    start = time.time()
    history = model.fit(
        x_train, y_train_oh,
        epochs=30, batch_size=64,
        validation_split=0.2,
        verbose=1
    )
    # Return the training history and the elapsed wall-clock time in seconds
    return history, time.time() - start
Execution output:
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz 170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 14s 0us/step
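⚠️ Pitfall: don't forget the one-hot conversion
CategoricalCrossentropy assumes the labels are in one-hot format. The usual sparse_categorical_crossentropy accepts integer labels as they are, but it does not offer a label_smoothing parameter, whereas CategoricalCrossentropy does. If y_train is left with shape (50000, 1), it does not match the model's (batch, 10) output and training fails with an error, so convert it to (50000, 10) with to_categorical() before passing it in.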
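What the conversion does to the label layout can be sketched without TensorFlow. `to_one_hot` below is a hypothetical plain-Python stand-in for `keras.utils.to_categorical`, written only to illustrate the shape change; the three labels are made-up examples, not taken from the dataset.

```python
def to_one_hot(labels, num_classes):
    """Convert integer labels laid out as (N, 1) or (N,) into rows of length num_classes."""
    rows = []
    for entry in labels:
        # CIFAR-10 labels arrive as single-element rows such as [6]; unwrap them.
        index = entry[0] if isinstance(entry, (list, tuple)) else entry
        row = [0.0] * num_classes
        row[index] = 1.0
        rows.append(row)
    return rows


if __name__ == "__main__":
    y = [[6], [9], [0]]        # same layout as y_train: shape (3, 1)
    y_oh = to_one_hot(y, 10)   # shape (3, 10)
    print(len(y_oh), len(y_oh[0]))
    print(y_oh[0])
```

In the actual experiment the built-in `keras.utils.to_categorical` call shown earlier does this job; the sketch only makes the (N, 1) to (N, 10) change visible.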
Training the four configurations
configs = [
    (0.0, 'A_smooth0.0'),
    (0.05, 'B_smooth0.05'),
    (0.1, 'C_smooth0.1'),
    (0.2, 'D_smooth0.2'),
]
histories, times, scores = {}, {}, {}
for smoothing, name in configs:
    print(f"\n=== {name} ===")
    model = build_model(name)
    h, t = compile_and_fit(model, smoothing)
    # evaluate() returns [loss, accuracy]
    s = model.evaluate(x_test, y_test_oh, verbose=0)
    label = name.split('_')[1]  # e.g. 'A_smooth0.0' -> 'smooth0.0'
    histories[label] = h
    times[label] = t
    scores[label] = s
    print(f"Training time: {t:.1f} s  test_accuracy: {s[1]:.4f}")
Execution output:
=== A_smooth0.0 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 11s 9ms/step - accuracy: 0.2631 - loss: 1.9282 - val_accuracy: 0.3480 - val_loss: 1.7288
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3744 - loss: 1.6725 - val_accuracy: 0.4198 - val_loss: 1.5726
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4333 - loss: 1.5388 - val_accuracy: 0.4430 - val_loss: 1.4940
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4647 - loss: 1.4569 - val_accuracy: 0.4822 - val_loss: 1.4174
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4880 - loss: 1.4007 - val_accuracy: 0.4772 - val_loss: 1.4241
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5089 - loss: 1.3464 - val_accuracy: 0.5335 - val_loss: 1.2942
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5232 - loss: 1.3072 - val_accuracy: 0.5226 - val_loss: 1.2939
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5370 - loss: 1.2702 - val_accuracy: 0.5443 - val_loss: 1.2255
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5499 - loss: 1.2372 - val_accuracy: 0.5599 - val_loss: 1.2050
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 7ms/step - accuracy: 0.5579 - loss: 1.2084 - val_accuracy: 0.5693 - val_loss: 1.1893
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5722 - loss: 1.1777 - val_accuracy: 0.5842 - val_loss: 1.1434
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5806 - loss: 1.1581 - val_accuracy: 0.5869 - val_loss: 1.1298
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5875 - loss: 1.1412 - val_accuracy: 0.5913 - val_loss: 1.1262
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5979 - loss: 1.1178 - val_accuracy: 0.6030 - val_loss: 1.0971
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6036 - loss: 1.0968 - val_accuracy: 0.5926 - val_loss: 1.1089
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6142 - loss: 1.0741 - val_accuracy: 0.6128 - val_loss: 1.0665
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6162 - loss: 1.0608 - val_accuracy: 0.6163 - val_loss: 1.0553
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6234 - loss: 1.0468 - val_accuracy: 0.6300 - val_loss: 1.0309
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6326 - loss: 1.0271 - val_accuracy: 0.6282 - val_loss: 1.0214
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6360 - loss: 1.0147 - val_accuracy: 0.6255 - val_loss: 1.0269
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6392 - loss: 1.0003 - val_accuracy: 0.6346 - val_loss: 0.9968
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6457 - loss: 0.9881 - val_accuracy: 0.6367 - val_loss: 0.9983
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6502 - loss: 0.9737 - val_accuracy: 0.6390 - val_loss: 0.9903
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6581 - loss: 0.9588 - val_accuracy: 0.6450 - val_loss: 0.9882
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6583 - loss: 0.9481 - val_accuracy: 0.6501 - val_loss: 0.9583
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6643 - loss: 0.9385 - val_accuracy: 0.6589 - val_loss: 0.9388
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6694 - loss: 0.9224 - val_accuracy: 0.6568 - val_loss: 0.9557
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6717 - loss: 0.9132 - val_accuracy: 0.6627 - val_loss: 0.9334
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6755 - loss: 0.8968 - val_accuracy: 0.6681 - val_loss: 0.9215
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6814 - loss: 0.8895 - val_accuracy: 0.6724 - val_loss: 0.9182
Training time: 127.1 s  test_accuracy: 0.6701

=== B_smooth0.05 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.2677 - loss: 1.9757 - val_accuracy: 0.3599 - val_loss: 1.7835
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3779 - loss: 1.7415 - val_accuracy: 0.4084 - val_loss: 1.6838
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4286 - loss: 1.6379 - val_accuracy: 0.4551 - val_loss: 1.5915
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4696 - loss: 1.5535 - val_accuracy: 0.4785 - val_loss: 1.5134
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4920 - loss: 1.5020 - val_accuracy: 0.5114 - val_loss: 1.4606
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5100 - loss: 1.4615 - val_accuracy: 0.5140 - val_loss: 1.4462
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5216 - loss: 1.4327 - val_accuracy: 0.5412 - val_loss: 1.3887
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5390 - loss: 1.3993 - val_accuracy: 0.5541 - val_loss: 1.3467
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5510 - loss: 1.3705 - val_accuracy: 0.5643 - val_loss: 1.3381
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5616 - loss: 1.3468 - val_accuracy: 0.5715 - val_loss: 1.3127
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5735 - loss: 1.3246 - val_accuracy: 0.5861 - val_loss: 1.2906
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5823 - loss: 1.3030 - val_accuracy: 0.5858 - val_loss: 1.2796
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5931 - loss: 1.2822 - val_accuracy: 0.5958 - val_loss: 1.2476
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5988 - loss: 1.2570 - val_accuracy: 0.5974 - val_loss: 1.2673
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6112 - loss: 1.2390 - val_accuracy: 0.6159 - val_loss: 1.2304
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6166 - loss: 1.2248 - val_accuracy: 0.6188 - val_loss: 1.2163
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6231 - loss: 1.2098 - val_accuracy: 0.6257 - val_loss: 1.1945
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6317 - loss: 1.1942 - val_accuracy: 0.6318 - val_loss: 1.1722
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6389 - loss: 1.1807 - val_accuracy: 0.6470 - val_loss: 1.1592
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6457 - loss: 1.1645 - val_accuracy: 0.6516 - val_loss: 1.1566
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6511 - loss: 1.1538 - val_accuracy: 0.6403 - val_loss: 1.1795
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6594 - loss: 1.1402 - val_accuracy: 0.6591 - val_loss: 1.1308
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6625 - loss: 1.1299 - val_accuracy: 0.6588 - val_loss: 1.1363
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6667 - loss: 1.1154 - val_accuracy: 0.6598 - val_loss: 1.1407
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6733 - loss: 1.1082 - val_accuracy: 0.6628 - val_loss: 1.1231
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6759 - loss: 1.0962 - val_accuracy: 0.6645 - val_loss: 1.1423
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6822 - loss: 1.0831 - val_accuracy: 0.6667 - val_loss: 1.1191
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6846 - loss: 1.0777 - val_accuracy: 0.6779 - val_loss: 1.0864
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6906 - loss: 1.0615 - val_accuracy: 0.6812 - val_loss: 1.0912
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6956 - loss: 1.0547 - val_accuracy: 0.6926 - val_loss: 1.0653
Training time: 126.3 s  test_accuracy: 0.6886

=== C_smooth0.1 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.2703 - loss: 2.0011 - val_accuracy: 0.2798 - val_loss: 2.0098
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.3773 - loss: 1.8133 - val_accuracy: 0.4188 - val_loss: 1.7460
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4336 - loss: 1.7144 - val_accuracy: 0.4718 - val_loss: 1.6476
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4650 - loss: 1.6467 - val_accuracy: 0.4918 - val_loss: 1.5879
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4959 - loss: 1.5917 - val_accuracy: 0.5022 - val_loss: 1.5691
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5142 - loss: 1.5587 - val_accuracy: 0.5138 - val_loss: 1.5417
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5303 - loss: 1.5271 - val_accuracy: 0.5366 - val_loss: 1.4982
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5387 - loss: 1.5008 - val_accuracy: 0.5310 - val_loss: 1.4966
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5531 - loss: 1.4818 - val_accuracy: 0.5495 - val_loss: 1.4775
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5610 - loss: 1.4617 - val_accuracy: 0.5651 - val_loss: 1.4384
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5739 - loss: 1.4384 - val_accuracy: 0.5835 - val_loss: 1.4115
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5808 - loss: 1.4219 - val_accuracy: 0.5854 - val_loss: 1.4042
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5924 - loss: 1.4016 - val_accuracy: 0.5981 - val_loss: 1.3801
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5984 - loss: 1.3897 - val_accuracy: 0.5955 - val_loss: 1.3947
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6064 - loss: 1.3759 - val_accuracy: 0.6101 - val_loss: 1.3546
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6136 - loss: 1.3589 - val_accuracy: 0.6234 - val_loss: 1.3289
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6203 - loss: 1.3469 - val_accuracy: 0.6280 - val_loss: 1.3182
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6267 - loss: 1.3350 - val_accuracy: 0.6393 - val_loss: 1.3026
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6352 - loss: 1.3215 - val_accuracy: 0.6295 - val_loss: 1.3119
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 6s 8ms/step - accuracy: 0.6364 - loss: 1.3172 - val_accuracy: 0.6467 - val_loss: 1.2934
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6440 - loss: 1.3041 - val_accuracy: 0.6381 - val_loss: 1.2998
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6454 - loss: 1.2961 - val_accuracy: 0.6406 - val_loss: 1.2901
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6556 - loss: 1.2822 - val_accuracy: 0.6570 - val_loss: 1.2720
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6604 - loss: 1.2749 - val_accuracy: 0.6531 - val_loss: 1.2678
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6637 - loss: 1.2643 - val_accuracy: 0.6566 - val_loss: 1.2640
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6668 - loss: 1.2590 - val_accuracy: 0.6556 - val_loss: 1.2717
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6738 - loss: 1.2474 - val_accuracy: 0.6687 - val_loss: 1.2465
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6736 - loss: 1.2429 - val_accuracy: 0.6708 - val_loss: 1.2448
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6794 - loss: 1.2339 - val_accuracy: 0.6604 - val_loss: 1.2675
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6861 - loss: 1.2243 - val_accuracy: 0.6847 - val_loss: 1.2194
Training time: 127.6 s  test_accuracy: 0.6817

=== D_smooth0.2 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 9ms/step - accuracy: 0.2760 - loss: 2.0660 - val_accuracy: 0.3781 - val_loss: 1.9218
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3957 - loss: 1.8981 - val_accuracy: 0.4331 - val_loss: 1.8440
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4514 - loss: 1.8233 - val_accuracy: 0.4672 - val_loss: 1.7951
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 7ms/step - accuracy: 0.4877 - loss: 1.7674 - val_accuracy: 0.5069 - val_loss: 1.7289
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5102 - loss: 1.7325 - val_accuracy: 0.5128 - val_loss: 1.7189
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5270 - loss: 1.7063 - val_accuracy: 0.5345 - val_loss: 1.6915
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5366 - loss: 1.6862 - val_accuracy: 0.5478 - val_loss: 1.6547
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5506 - loss: 1.6665 - val_accuracy: 0.5650 - val_loss: 1.6439
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5623 - loss: 1.6484 - val_accuracy: 0.5696 - val_loss: 1.6259
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5724 - loss: 1.6293 - val_accuracy: 0.5830 - val_loss: 1.6010
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5831 - loss: 1.6150 - val_accuracy: 0.6015 - val_loss: 1.5831
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5921 - loss: 1.6022 - val_accuracy: 0.5962 - val_loss: 1.5903
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.6047 - loss: 1.5843 - val_accuracy: 0.6046 - val_loss: 1.5793
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6118 - loss: 1.5751 - val_accuracy: 0.6114 - val_loss: 1.5710
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6206 - loss: 1.5592 - val_accuracy: 0.6238 - val_loss: 1.5485
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.6293 - loss: 1.5500 - val_accuracy: 0.6341 - val_loss: 1.5439
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6361 - loss: 1.5400 - val_accuracy: 0.6458 - val_loss: 1.5173
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6412 - loss: 1.5293 - val_accuracy: 0.6468 - val_loss: 1.5126
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6495 - loss: 1.5206 - val_accuracy: 0.6483 - val_loss: 1.5159
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6561 - loss: 1.5090 - val_accuracy: 0.6402 - val_loss: 1.5208
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6611 - loss: 1.5008 - val_accuracy: 0.6643 - val_loss: 1.4849
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6677 - loss: 1.4910 - val_accuracy: 0.6527 - val_loss: 1.5098
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6711 - loss: 1.4867 - val_accuracy: 0.6635 - val_loss: 1.4772
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6759 - loss: 1.4787 - val_accuracy: 0.6714 - val_loss: 1.4707
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6833 - loss: 1.4688 - val_accuracy: 0.6709 - val_loss: 1.4707
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6832 - loss: 1.4642 - val_accuracy: 0.6742 - val_loss: 1.4685
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6927 - loss: 1.4568 - val_accuracy: 0.6850 - val_loss: 1.4559
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6959 - loss: 1.4516 - val_accuracy: 0.6811 - val_loss: 1.4597
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6991 - loss: 1.4448 - val_accuracy: 0.6942 - val_loss: 1.4444
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.7035 - loss: 1.4362 - val_accuracy: 0.6913 - val_loss: 1.4425
Training time: 128.6 s  test_accuracy: 0.6885
Graphs and summary
# ── val_accuracy / val_loss comparison plots ──────────
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
for label, h in histories.items():
    axes[0].plot(h.history['val_accuracy'], label=label)
    axes[1].plot(h.history['val_loss'], label=label)
axes[0].set_title('val_accuracy comparison (all 30 epochs)')
axes[1].set_title('val_loss comparison (all 30 epochs)')
for ax in axes:
    ax.set_xlabel('Epoch')
    ax.legend()
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('label_smoothing_comparison.png', dpi=150)
plt.show()

# ── train_loss vs val_loss (overfitting gap) ──────────
fig2, axes2 = plt.subplots(2, 2, figsize=(14, 10))
axes2 = axes2.flatten()
for i, (label, h) in enumerate(histories.items()):
    axes2[i].plot(h.history['loss'], label='train_loss')
    axes2[i].plot(h.history['val_loss'], label='val_loss')
    axes2[i].set_title(f'{label}')
    axes2[i].set_xlabel('Epoch')
    axes2[i].legend()
    axes2[i].grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('label_smoothing_overfit.png', dpi=150)
plt.show()

# ── Final summary ─────────────────────────────────────
print("\n===== Final results summary =====")
print(f"{'Pattern':>12} | {'Val Acc':>8} | {'Test Acc':>9} | {'Time(s)':>8}")
print("-" * 48)
for label in ['smooth0.0', 'smooth0.05', 'smooth0.1', 'smooth0.2']:
    val_acc = histories[label].history['val_accuracy'][-1]  # last-epoch value
    test_acc = scores[label][1]
    t = times[label]
    print(f"{label:>12} | {val_acc:>8.4f} | {test_acc:>9.4f} | {t:>8.1f}")
print("-" * 48)
Final results summary
===== Final results summary =====
     Pattern |  Val Acc |  Test Acc |  Time(s)
------------------------------------------------
   smooth0.0 |   0.6724 |    0.6701 |    127.1
  smooth0.05 |   0.6926 |    0.6886 |    126.3
   smooth0.1 |   0.6847 |    0.6817 |    127.6
   smooth0.2 |   0.6913 |    0.6885 |    128.6
------------------------------------------------
Experiment results
Figures: accuracy graph and loss graph, with panels A: smooth=0.0, B: smooth=0.05, C: smooth=0.1, D: smooth=0.2.
| Pattern | Final val_accuracy | Final test_accuracy | Training time |
|---|---|---|---|
| A: smooth=0.0 (none) | 0.6724 | 0.6701 | 127.1 s |
| B: smooth=0.05 ⭐ best accuracy | 0.6926 | 0.6886 | 126.3 s |
| C: smooth=0.1 | 0.6847 | 0.6817 | 127.6 s |
| D: smooth=0.2 | 0.6913 | 0.6885 | 128.6 s |
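Discussion
smooth=0.05 came out best, and no smoothing (0.0) came out last
In this experiment smooth=0.05 (val: 0.6926 / test: 0.6886) recorded the highest accuracy, beating no smoothing (val: 0.6724 / test: 0.6701) by about 2 percentage points in val_accuracy. The run without Label Smoothing scored lowest, so the answer here is that it is worth trying. Gaining roughly 2 points by changing a single parameter of the loss function, without touching the learning rate or the architecture, makes it a cost-effective technique.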
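Surprisingly, smooth=0.2 also scored higher than smooth=0.1
In theory, making the smoothing too large should stop the model from learning the correct class, but here smooth=0.2 (test: 0.6885) beat smooth=0.1 (test: 0.6817). This suggests that, in a setup combined with Dropout=0.2, even fairly strong Label Smoothing did not over-regularize the model. That said, the gap between smooth=0.2 and smooth=0.05 is only 0.0001 (0.01 percentage points) in test accuracy, which is within the margin of error. The ranking cannot be settled without running the experiment several times and averaging.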
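Why smooth=0.1 scored lower than smooth=0.2
It is counterintuitive that smooth=0.1 (test: 0.6817) lost to smooth=0.2 (test: 0.6885), but a single run is easily swayed by randomness, so this reversal is likely to be within the range of stochastic variation. We cannot say that accuracy falls monotonically as smoothing increases; the appropriate reading is that 0.05 to 0.2 are roughly level. The important point is that 0.05 or more did better than 0.0.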
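One rough way to judge whether these gaps exceed sampling noise is the binomial standard error of an accuracy measured on the 10,000 CIFAR-10 test images. The sketch below is an approximation under assumptions that do not come from the experiment itself: it treats every test image as an independent trial and the two runs as independent, and it ignores the run-to-run variation caused by random initialization. The helper names are hypothetical.

```python
import math

N_TEST = 10_000  # number of images in the CIFAR-10 test set


def accuracy_std_error(acc, n=N_TEST):
    """Binomial standard error of an accuracy measured on n samples."""
    return math.sqrt(acc * (1.0 - acc) / n)


def gap_in_std_errors(acc_a, acc_b, n=N_TEST):
    """Express acc_a - acc_b in units of the standard error of the difference."""
    se_diff = math.sqrt(accuracy_std_error(acc_a, n) ** 2
                        + accuracy_std_error(acc_b, n) ** 2)
    return (acc_a - acc_b) / se_diff


if __name__ == "__main__":
    # Test accuracies taken from the results table above.
    pairs = {
        "0.05 vs 0.0": (0.6886, 0.6701),
        "0.2 vs 0.1": (0.6885, 0.6817),
        "0.05 vs 0.2": (0.6886, 0.6885),
    }
    for name, (a, b) in pairs.items():
        print(f"{name}: gap={a - b:+.4f}, "
              f"about {gap_in_std_errors(a, b):.1f} standard errors")
```

Under these assumptions the 0.05-versus-0.0 gap comes out at roughly 2.8 standard errors, the 0.2-versus-0.1 gap at about 1, and the 0.05-versus-0.2 gap close to 0, which is in line with reading 0.05 to 0.2 as roughly level. Repeating the training with several seeds would still be needed to measure the variation between runs.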
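Be careful when comparing absolute loss values
Different smoothing values change the scale of the loss, so loss values must not be compared directly across patterns. Because the smoothed targets are never exactly 0 or 1, the lowest reachable loss itself rises as the smoothing value grows. What should be compared is val_accuracy and test_accuracy, plus the overfitting gap within each run (the difference between train_loss and val_loss). Do not judge smoothing values by how low val_loss is.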
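The shift in loss scale can be made concrete by computing the lowest loss any model could reach. With smoothed targets, cross-entropy is minimized when the prediction equals the target distribution, and its value at that point is the entropy of the targets, which is above zero whenever ε > 0. The sketch below evaluates that floor for K = 10; `loss_floor` is a hypothetical helper written for illustration, not a Keras function.

```python
import math


def loss_floor(eps, num_classes=10):
    """Entropy of the smoothed target vector: the minimum reachable cross-entropy."""
    # Build the smoothed targets: eps / K everywhere, plus 1 - eps on one class.
    targets = [eps / num_classes] * num_classes
    targets[0] += 1.0 - eps
    # Classes with zero target probability contribute nothing to the sum.
    return sum(-p * math.log(p) for p in targets if p > 0.0)


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2):
        print(f"smoothing={eps}: loss floor = {loss_floor(eps):.4f}")
```

For ε = 0.1 the floor is about 0.50, while for ε = 0.0 it is 0, so a val_loss of 1.2 under smoothing 0.1 and a val_loss of 0.9 without smoothing are measured against different baselines.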
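Practical use: start with 0.05
Based on these results, smooth=0.05 is the safest starting point for a CNN of this size on CIFAR-10. It takes a one-line change, gave about 2 percentage points of accuracy in this experiment, and added essentially no training time. Label Smoothing changes only the loss targets, so it can be layered on top of an existing regularization setup that already uses Dropout or Weight Decay.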
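Related posts:
- Comparing Dropout rates → How does overfitting change with the Dropout rate (0.0 vs 0.2 vs 0.5)? [Keras × CIFAR-10 experiment]
- BatchNorm vs Dropout → How does accuracy change when BatchNormalization and Dropout are combined? [Keras experiment]
- Effect of CutOut / Random Erasing → Testing the effect of CutOut / Random Erasing [Keras × CIFAR-10 experiment]
- Testing the effect of MixUp → Testing the effect of MixUp (with vs without) [Keras × CIFAR-10 experiment]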
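✅ Summary
- Label Smoothing is an extension of the loss function that lowers the confidence of the correct label to suppress overfitting
- It can be introduced with the single line CategoricalCrossentropy(label_smoothing=ε), with no change to the model architecture
- In this experiment smooth=0.05 gave the highest accuracy (test: 0.6886), about 1.85 percentage points above no smoothing (0.0) at 0.6701
- smooth=0.1 and 0.2 were roughly level and stable. The conclusion is that using some smoothing value beats 0.0
- A single run is easily swayed by randomness, so the ranking of 0.1 and 0.2 is not definitive. Starting from 0.05 is the safest choice
- It can be combined with existing Dropout and Weight Decay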
English Summary
Label Smoothing is a regularization technique that keeps a model from becoming overconfident during training. Instead of hard targets (0 or 1), it softens the labels slightly: with 10 classes and a smoothing value of 0.1, the correct class gets 0.91 and each other class gets 0.01.
When a model is trained with hard labels, it tends to push the output probability of the correct class toward 1.0, which can hurt generalization. Label Smoothing acts as a soft penalty, similar in effect to Dropout or L2 regularization.
Experiment overview (CIFAR-10 + Keras)
- Without Label Smoothing: standard categorical_crossentropy (label_smoothing=0.0)
- With Label Smoothing: CategoricalCrossentropy(label_smoothing=ε) for ε = 0.05, 0.1, and 0.2
Results
- Final test accuracy improved with smoothing applied: 0.6701 without smoothing versus 0.6817 to 0.6886 with it.
- The three smoothed runs finished close together, and a single run per setting is not enough to rank them.
- The effect may be more noticeable in deeper models or longer training runs, which this experiment did not cover.
loss = keras.losses.CategoricalCrossentropy(label_smoothing=0.1)
A value of 0.05 to 0.1 is a typical starting point. Values above 0.2 were not tested here and may hurt convergence.






