[Keras] What Is Label Smoothing? Using a Loss Function That Curbs Overfitting and Improves Accuracy [CIFAR-10 Experiment]

Posted: Tuesday, May 26, 2026


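"I'm using categorical_crossentropy by default, but is Label Smoothing worth trying?"

Many readers probably wonder about this. In this article I use Google Colab and CIFAR-10 to compare four values of the label_smoothing parameter (0.0 / 0.05 / 0.1 / 0.2), and check experimentally how it affects overfitting, accuracy, and val_loss.

📘 What you'll learn

  • What Label Smoothing is and why it curbs overfitting
  • How accuracy and val_loss change with each smoothing value
  • A rule of thumb for choosing the smoothing value in practice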

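What is Label Smoothing?

The usual loss function (e.g. categorical_crossentropy) treats the correct label as exactly 1 and every other class as exactly 0. A model trained on such hard targets tends to become overconfident, which is one cause of overfitting.

Label Smoothing lowers the confidence of the correct label slightly and spreads a small amount of probability over the other classes, nudging the model toward a state where it "does not need to be that confident".

Target for the correct class: \( y_{\text{smooth}} = 1 - \varepsilon + \dfrac{\varepsilon}{K} \)  Target for each incorrect class: \( \dfrac{\varepsilon}{K} \)
(\(\varepsilon\): smoothing value, \(K\): number of classes)

For example, with CIFAR-10 (K=10) and smoothing set to 0.1, the target for the correct class becomes 1−0.1+0.01 = 0.91, and each incorrect class is assigned 0.01.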

smoothing value | Target for correct class | Target for each incorrect class
0.0 (none)      | 1.000                    | 0.000
0.05            | 0.955                    | 0.005
0.1             | 0.910                    | 0.010
0.2             | 0.820                    | 0.020
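
The table values follow directly from the formula above. As a minimal sketch in plain Python (the helper name `smooth_one_hot` is made up for this illustration and is not a Keras API), the targets can be recomputed like this:

```python
def smooth_one_hot(true_class, num_classes, eps):
    """Return the smoothed target vector y * (1 - eps) + eps / K."""
    uniform = eps / num_classes
    return [
        (1.0 - eps) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2):
        target = smooth_one_hot(true_class=3, num_classes=10, eps=eps)
        # Show the correct-class target, one incorrect-class target, and the total.
        print(f"eps={eps:<4}  correct={target[3]:.3f}  "
              f"incorrect={target[0]:.3f}  sum={sum(target):.3f}")
```

Every smoothed vector still sums to 1, so it remains a valid probability distribution for cross-entropy.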

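Experiment code

The environment is Google Colab (GPU: T4) and the dataset is CIFAR-10. Every condition other than the smoothing value is kept identical, so that only the effect of Label Smoothing is isolated.

Environment setup (run once at the start)

# ── Environment setup (run once at the start) ──────────
!apt-get -y install fonts-ipafont-gothic
!rm -rf /root/.cache/matplotlib
!pip install -q japanize_matplotlib
print("Environment setup complete")
Output: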
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  fonts-ipafont-mincho
The following NEW packages will be installed:
  fonts-ipafont-gothic fonts-ipafont-mincho
0 upgraded, 2 newly installed, 0 to remove and 51 not upgraded.
Need to get 8,237 kB of archives.
After this operation, 28.7 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-gothic all 00303-21ubuntu1 [3,513 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-mincho all 00303-21ubuntu1 [4,724 kB]
Fetched 8,237 kB in 0s (34.7 MB/s)
Selecting previously unselected package fonts-ipafont-gothic.
(Reading database ... 122363 files and directories currently installed.)
Preparing to unpack .../fonts-ipafont-gothic_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-gothic (00303-21ubuntu1) ...
Selecting previously unselected package fonts-ipafont-mincho.
Preparing to unpack .../fonts-ipafont-mincho_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-mincho (00303-21ubuntu1) ...
Setting up fonts-ipafont-mincho (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-mincho/ipam.ttf to provide /usr/share/fonts/truetype/fonts-japanese-mincho.ttf (fonts-japanese-mincho.ttf) in auto mode
Setting up fonts-ipafont-gothic (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf to provide /usr/share/fonts/truetype/fonts-japanese-gothic.ttf (fonts-japanese-gothic.ttf) in auto mode
Processing triggers for fontconfig (2.13.1-4.2ubuntu5) ...
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 52.2 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
  Building wheel for japanize_matplotlib (setup.py) ... done
Environment setup complete

Imports, data preparation, and model-building function

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import japanize_matplotlib  # registers a Japanese font with matplotlib
import time

# ── Data preparation ──────────────────────────────────
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
# Scale pixel values from 0..255 to 0..1
x_train = x_train.astype('float32') / 255.0
x_test  = x_test.astype('float32')  / 255.0

# Label Smoothing needs one-hot labels, so convert them
y_train_oh = keras.utils.to_categorical(y_train, 10)
y_test_oh  = keras.utils.to_categorical(y_test,  10)

# ── Model construction (everything except smoothing is fixed) ──
def build_model(name):
    return keras.Sequential([
        keras.layers.Input(shape=(32, 32, 3)),
        keras.layers.Conv2D(64,  (3, 3), activation='relu', padding='same'),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.GlobalAveragePooling2D(),
        keras.layers.Dense(128, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10, activation='softmax'),
    ], name=name)

def compile_and_fit(model, smoothing):
    # The smoothing value is the only thing that differs between patterns
    loss_fn = keras.losses.CategoricalCrossentropy(label_smoothing=smoothing)
    model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
    start = time.time()
    history = model.fit(
        x_train, y_train_oh,
        epochs=30, batch_size=64,
        validation_split=0.2,
        verbose=1
    )
    # Return the training history and the elapsed time in seconds
    return history, time.time() - start
Output:
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 14s 0us/step

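⚠️ Gotcha: don't forget the one-hot conversion

CategoricalCrossentropy expects labels in one-hot form. With the usual sparse_categorical_crossentropy you can pass integer labels as they are, but SparseCategoricalCrossentropy has no label_smoothing parameter, so Label Smoothing requires CategoricalCrossentropy. If y_train is left with shape (50000, 1) you will get an error, so convert it to (50000, 10) with to_categorical() before passing it in.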
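
To make the shape issue concrete, here is a rough plain-Python sketch of what the one-hot conversion does to labels stored as a column of integers. `to_one_hot` is a hypothetical stand-in written for this example; in the article's code the real work is done by keras.utils.to_categorical.

```python
def to_one_hot(labels, num_classes):
    """Convert labels shaped (N, 1) or (N,) into N one-hot rows of length num_classes."""
    rows = []
    for item in labels:
        # CIFAR-10 labels arrive as a column, e.g. [[6], [9], [4]]; unwrap if needed.
        idx = item[0] if isinstance(item, (list, tuple)) else item
        if not 0 <= idx < num_classes:
            raise ValueError(f"label {idx} is outside 0..{num_classes - 1}")
        rows.append([1.0 if k == idx else 0.0 for k in range(num_classes)])
    return rows


if __name__ == "__main__":
    y = [[6], [9], [4]]        # shape (3, 1): integer labels
    y_oh = to_one_hot(y, 10)   # shape (3, 10): one row per sample
    print(len(y_oh), len(y_oh[0]))
    print(y_oh[0])
```

Only after this conversion is there a full row of K values per sample for the smoothing formula to act on.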

Running the four training patterns

configs = [
    (0.0,  'A_smooth0.0'),
    (0.05, 'B_smooth0.05'),
    (0.1,  'C_smooth0.1'),
    (0.2,  'D_smooth0.2'),
]
histories, times, scores = {}, {}, {}

for smoothing, name in configs:
    print(f"\n=== {name} ===")
    model = build_model(name)
    h, t = compile_and_fit(model, smoothing)
    s = model.evaluate(x_test, y_test_oh, verbose=0)
    label = name.split('_')[1]
    histories[label] = h
    times[label]     = t
    scores[label]    = s
    print(f"Training time: {t:.1f}s  test_accuracy: {s[1]:.4f}")
Output:
=== A_smooth0.0 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 11s 9ms/step - accuracy: 0.2631 - loss: 1.9282 - val_accuracy: 0.3480 - val_loss: 1.7288
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3744 - loss: 1.6725 - val_accuracy: 0.4198 - val_loss: 1.5726
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4333 - loss: 1.5388 - val_accuracy: 0.4430 - val_loss: 1.4940
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4647 - loss: 1.4569 - val_accuracy: 0.4822 - val_loss: 1.4174
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4880 - loss: 1.4007 - val_accuracy: 0.4772 - val_loss: 1.4241
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5089 - loss: 1.3464 - val_accuracy: 0.5335 - val_loss: 1.2942
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5232 - loss: 1.3072 - val_accuracy: 0.5226 - val_loss: 1.2939
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5370 - loss: 1.2702 - val_accuracy: 0.5443 - val_loss: 1.2255
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5499 - loss: 1.2372 - val_accuracy: 0.5599 - val_loss: 1.2050
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 7ms/step - accuracy: 0.5579 - loss: 1.2084 - val_accuracy: 0.5693 - val_loss: 1.1893
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5722 - loss: 1.1777 - val_accuracy: 0.5842 - val_loss: 1.1434
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5806 - loss: 1.1581 - val_accuracy: 0.5869 - val_loss: 1.1298
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5875 - loss: 1.1412 - val_accuracy: 0.5913 - val_loss: 1.1262
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5979 - loss: 1.1178 - val_accuracy: 0.6030 - val_loss: 1.0971
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6036 - loss: 1.0968 - val_accuracy: 0.5926 - val_loss: 1.1089
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6142 - loss: 1.0741 - val_accuracy: 0.6128 - val_loss: 1.0665
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6162 - loss: 1.0608 - val_accuracy: 0.6163 - val_loss: 1.0553
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6234 - loss: 1.0468 - val_accuracy: 0.6300 - val_loss: 1.0309
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6326 - loss: 1.0271 - val_accuracy: 0.6282 - val_loss: 1.0214
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6360 - loss: 1.0147 - val_accuracy: 0.6255 - val_loss: 1.0269
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6392 - loss: 1.0003 - val_accuracy: 0.6346 - val_loss: 0.9968
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6457 - loss: 0.9881 - val_accuracy: 0.6367 - val_loss: 0.9983
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6502 - loss: 0.9737 - val_accuracy: 0.6390 - val_loss: 0.9903
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6581 - loss: 0.9588 - val_accuracy: 0.6450 - val_loss: 0.9882
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6583 - loss: 0.9481 - val_accuracy: 0.6501 - val_loss: 0.9583
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6643 - loss: 0.9385 - val_accuracy: 0.6589 - val_loss: 0.9388
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6694 - loss: 0.9224 - val_accuracy: 0.6568 - val_loss: 0.9557
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6717 - loss: 0.9132 - val_accuracy: 0.6627 - val_loss: 0.9334
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6755 - loss: 0.8968 - val_accuracy: 0.6681 - val_loss: 0.9215
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6814 - loss: 0.8895 - val_accuracy: 0.6724 - val_loss: 0.9182
Training time: 127.1s  test_accuracy: 0.6701

=== B_smooth0.05 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.2677 - loss: 1.9757 - val_accuracy: 0.3599 - val_loss: 1.7835
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3779 - loss: 1.7415 - val_accuracy: 0.4084 - val_loss: 1.6838
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4286 - loss: 1.6379 - val_accuracy: 0.4551 - val_loss: 1.5915
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4696 - loss: 1.5535 - val_accuracy: 0.4785 - val_loss: 1.5134
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4920 - loss: 1.5020 - val_accuracy: 0.5114 - val_loss: 1.4606
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5100 - loss: 1.4615 - val_accuracy: 0.5140 - val_loss: 1.4462
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5216 - loss: 1.4327 - val_accuracy: 0.5412 - val_loss: 1.3887
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5390 - loss: 1.3993 - val_accuracy: 0.5541 - val_loss: 1.3467
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5510 - loss: 1.3705 - val_accuracy: 0.5643 - val_loss: 1.3381
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5616 - loss: 1.3468 - val_accuracy: 0.5715 - val_loss: 1.3127
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5735 - loss: 1.3246 - val_accuracy: 0.5861 - val_loss: 1.2906
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5823 - loss: 1.3030 - val_accuracy: 0.5858 - val_loss: 1.2796
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5931 - loss: 1.2822 - val_accuracy: 0.5958 - val_loss: 1.2476
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5988 - loss: 1.2570 - val_accuracy: 0.5974 - val_loss: 1.2673
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6112 - loss: 1.2390 - val_accuracy: 0.6159 - val_loss: 1.2304
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6166 - loss: 1.2248 - val_accuracy: 0.6188 - val_loss: 1.2163
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6231 - loss: 1.2098 - val_accuracy: 0.6257 - val_loss: 1.1945
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6317 - loss: 1.1942 - val_accuracy: 0.6318 - val_loss: 1.1722
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6389 - loss: 1.1807 - val_accuracy: 0.6470 - val_loss: 1.1592
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6457 - loss: 1.1645 - val_accuracy: 0.6516 - val_loss: 1.1566
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6511 - loss: 1.1538 - val_accuracy: 0.6403 - val_loss: 1.1795
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6594 - loss: 1.1402 - val_accuracy: 0.6591 - val_loss: 1.1308
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6625 - loss: 1.1299 - val_accuracy: 0.6588 - val_loss: 1.1363
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6667 - loss: 1.1154 - val_accuracy: 0.6598 - val_loss: 1.1407
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6733 - loss: 1.1082 - val_accuracy: 0.6628 - val_loss: 1.1231
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6759 - loss: 1.0962 - val_accuracy: 0.6645 - val_loss: 1.1423
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6822 - loss: 1.0831 - val_accuracy: 0.6667 - val_loss: 1.1191
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6846 - loss: 1.0777 - val_accuracy: 0.6779 - val_loss: 1.0864
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6906 - loss: 1.0615 - val_accuracy: 0.6812 - val_loss: 1.0912
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6956 - loss: 1.0547 - val_accuracy: 0.6926 - val_loss: 1.0653
Training time: 126.3s  test_accuracy: 0.6886

=== C_smooth0.1 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.2703 - loss: 2.0011 - val_accuracy: 0.2798 - val_loss: 2.0098
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.3773 - loss: 1.8133 - val_accuracy: 0.4188 - val_loss: 1.7460
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4336 - loss: 1.7144 - val_accuracy: 0.4718 - val_loss: 1.6476
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4650 - loss: 1.6467 - val_accuracy: 0.4918 - val_loss: 1.5879
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.4959 - loss: 1.5917 - val_accuracy: 0.5022 - val_loss: 1.5691
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5142 - loss: 1.5587 - val_accuracy: 0.5138 - val_loss: 1.5417
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5303 - loss: 1.5271 - val_accuracy: 0.5366 - val_loss: 1.4982
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5387 - loss: 1.5008 - val_accuracy: 0.5310 - val_loss: 1.4966
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5531 - loss: 1.4818 - val_accuracy: 0.5495 - val_loss: 1.4775
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5610 - loss: 1.4617 - val_accuracy: 0.5651 - val_loss: 1.4384
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5739 - loss: 1.4384 - val_accuracy: 0.5835 - val_loss: 1.4115
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5808 - loss: 1.4219 - val_accuracy: 0.5854 - val_loss: 1.4042
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5924 - loss: 1.4016 - val_accuracy: 0.5981 - val_loss: 1.3801
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5984 - loss: 1.3897 - val_accuracy: 0.5955 - val_loss: 1.3947
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6064 - loss: 1.3759 - val_accuracy: 0.6101 - val_loss: 1.3546
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6136 - loss: 1.3589 - val_accuracy: 0.6234 - val_loss: 1.3289
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6203 - loss: 1.3469 - val_accuracy: 0.6280 - val_loss: 1.3182
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6267 - loss: 1.3350 - val_accuracy: 0.6393 - val_loss: 1.3026
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6352 - loss: 1.3215 - val_accuracy: 0.6295 - val_loss: 1.3119
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 6s 8ms/step - accuracy: 0.6364 - loss: 1.3172 - val_accuracy: 0.6467 - val_loss: 1.2934
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6440 - loss: 1.3041 - val_accuracy: 0.6381 - val_loss: 1.2998
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6454 - loss: 1.2961 - val_accuracy: 0.6406 - val_loss: 1.2901
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6556 - loss: 1.2822 - val_accuracy: 0.6570 - val_loss: 1.2720
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6604 - loss: 1.2749 - val_accuracy: 0.6531 - val_loss: 1.2678
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6637 - loss: 1.2643 - val_accuracy: 0.6566 - val_loss: 1.2640
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6668 - loss: 1.2590 - val_accuracy: 0.6556 - val_loss: 1.2717
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6738 - loss: 1.2474 - val_accuracy: 0.6687 - val_loss: 1.2465
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6736 - loss: 1.2429 - val_accuracy: 0.6708 - val_loss: 1.2448
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6794 - loss: 1.2339 - val_accuracy: 0.6604 - val_loss: 1.2675
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6861 - loss: 1.2243 - val_accuracy: 0.6847 - val_loss: 1.2194
Training time: 127.6s  test_accuracy: 0.6817

=== D_smooth0.2 ===
Epoch 1/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 9ms/step - accuracy: 0.2760 - loss: 2.0660 - val_accuracy: 0.3781 - val_loss: 1.9218
Epoch 2/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.3957 - loss: 1.8981 - val_accuracy: 0.4331 - val_loss: 1.8440
Epoch 3/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.4514 - loss: 1.8233 - val_accuracy: 0.4672 - val_loss: 1.7951
Epoch 4/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 7ms/step - accuracy: 0.4877 - loss: 1.7674 - val_accuracy: 0.5069 - val_loss: 1.7289
Epoch 5/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5102 - loss: 1.7325 - val_accuracy: 0.5128 - val_loss: 1.7189
Epoch 6/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5270 - loss: 1.7063 - val_accuracy: 0.5345 - val_loss: 1.6915
Epoch 7/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5366 - loss: 1.6862 - val_accuracy: 0.5478 - val_loss: 1.6547
Epoch 8/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5506 - loss: 1.6665 - val_accuracy: 0.5650 - val_loss: 1.6439
Epoch 9/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5623 - loss: 1.6484 - val_accuracy: 0.5696 - val_loss: 1.6259
Epoch 10/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.5724 - loss: 1.6293 - val_accuracy: 0.5830 - val_loss: 1.6010
Epoch 11/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5831 - loss: 1.6150 - val_accuracy: 0.6015 - val_loss: 1.5831
Epoch 12/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5921 - loss: 1.6022 - val_accuracy: 0.5962 - val_loss: 1.5903
Epoch 13/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.6047 - loss: 1.5843 - val_accuracy: 0.6046 - val_loss: 1.5793
Epoch 14/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6118 - loss: 1.5751 - val_accuracy: 0.6114 - val_loss: 1.5710
Epoch 15/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6206 - loss: 1.5592 - val_accuracy: 0.6238 - val_loss: 1.5485
Epoch 16/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.6293 - loss: 1.5500 - val_accuracy: 0.6341 - val_loss: 1.5439
Epoch 17/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6361 - loss: 1.5400 - val_accuracy: 0.6458 - val_loss: 1.5173
Epoch 18/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6412 - loss: 1.5293 - val_accuracy: 0.6468 - val_loss: 1.5126
Epoch 19/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6495 - loss: 1.5206 - val_accuracy: 0.6483 - val_loss: 1.5159
Epoch 20/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.6561 - loss: 1.5090 - val_accuracy: 0.6402 - val_loss: 1.5208
Epoch 21/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6611 - loss: 1.5008 - val_accuracy: 0.6643 - val_loss: 1.4849
Epoch 22/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6677 - loss: 1.4910 - val_accuracy: 0.6527 - val_loss: 1.5098
Epoch 23/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6711 - loss: 1.4867 - val_accuracy: 0.6635 - val_loss: 1.4772
Epoch 24/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6759 - loss: 1.4787 - val_accuracy: 0.6714 - val_loss: 1.4707
Epoch 25/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6833 - loss: 1.4688 - val_accuracy: 0.6709 - val_loss: 1.4707
Epoch 26/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6832 - loss: 1.4642 - val_accuracy: 0.6742 - val_loss: 1.4685
Epoch 27/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6927 - loss: 1.4568 - val_accuracy: 0.6850 - val_loss: 1.4559
Epoch 28/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.6959 - loss: 1.4516 - val_accuracy: 0.6811 - val_loss: 1.4597
Epoch 29/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.6991 - loss: 1.4448 - val_accuracy: 0.6942 - val_loss: 1.4444
Epoch 30/30
625/625 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.7035 - loss: 1.4362 - val_accuracy: 0.6913 - val_loss: 1.4425
Training time: 128.6s  test_accuracy: 0.6885

Graphs and summary

# ── val_accuracy / val_loss comparison graphs ─────────
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
for label, h in histories.items():
    axes[0].plot(h.history['val_accuracy'], label=label)
    axes[1].plot(h.history['val_loss'],     label=label)
axes[0].set_title('val_accuracy comparison (all 30 epochs)')
axes[1].set_title('val_loss comparison (all 30 epochs)')
for ax in axes:
    ax.set_xlabel('Epoch'); ax.legend(); ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('label_smoothing_comparison.png', dpi=150)
plt.show()

# ── train_loss vs val_loss (overfitting gap) ──────────
fig2, axes2 = plt.subplots(2, 2, figsize=(14, 10))
axes2 = axes2.flatten()
for i, (label, h) in enumerate(histories.items()):
    axes2[i].plot(h.history['loss'],     label='train_loss')
    axes2[i].plot(h.history['val_loss'], label='val_loss')
    axes2[i].set_title(f'{label}')
    axes2[i].set_xlabel('Epoch')
    axes2[i].legend()
    axes2[i].grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('label_smoothing_overfit.png', dpi=150)
plt.show()

# ── Final summary ─────────────────────────────────────
print("\n===== Final results summary =====")
print(f"{'Pattern':>12} | {'Val Acc':>8} | {'Test Acc':>9} | {'Time(s)':>8}")
print("-" * 48)
for label in ['smooth0.0', 'smooth0.05', 'smooth0.1', 'smooth0.2']:
    val_acc  = histories[label].history['val_accuracy'][-1]
    test_acc = scores[label][1]
    t        = times[label]
    print(f"{label:>12} | {val_acc:>8.4f} | {test_acc:>9.4f} | {t:>8.1f}")
print("-" * 48)

Final results summary

===== Final results summary =====
     Pattern |  Val Acc |  Test Acc |  Time(s)
------------------------------------------------
   smooth0.0 |   0.6724 |    0.6701 |    127.1
  smooth0.05 |   0.6926 |    0.6886 |    126.3
   smooth0.1 |   0.6847 |    0.6817 |    127.6
   smooth0.2 |   0.6913 |    0.6885 |    128.6
------------------------------------------------

Results

[Figure: accuracy graph]

[Figure: loss graph]

[Figure: A: smooth=0.0]

[Figure: B: smooth=0.05]

[Figure: C: smooth=0.1]

[Figure: D: smooth=0.2]

Pattern                         | Final val_accuracy | Final test_accuracy | Training time
A: smooth=0.0 (none)            | 0.6724             | 0.6701              | 127.1 s
B: smooth=0.05 ⭐ best accuracy | 0.6926             | 0.6886              | 126.3 s
C: smooth=0.1                   | 0.6847             | 0.6817              | 127.6 s
D: smooth=0.2                   | 0.6913             | 0.6885              | 128.6 s

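Discussion

smooth=0.05 was best and no smoothing (0.0) was worst: a clear result

In this experiment smooth=0.05 (val: 0.6926 / test: 0.6886) recorded the highest accuracy, beating no smoothing (val: 0.6724 / test: 0.6701) by about 2 points in val_accuracy and about 1.85 points in test_accuracy. The run without Label Smoothing scored lowest, so the answer here is "worth trying". Gaining roughly 2 percentage points by changing a single loss-function parameter, with no change to the learning rate or the architecture, makes this a cost-effective technique.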
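
The improvement figures quoted above are simple differences against the baseline. A small sketch that recomputes them from the summary table (the numbers are copied from this article's results; the function and variable names are just for illustration):

```python
# Final accuracies per smoothing value, copied from the summary table.
test_acc = {0.0: 0.6701, 0.05: 0.6886, 0.1: 0.6817, 0.2: 0.6885}
val_acc = {0.0: 0.6724, 0.05: 0.6926, 0.1: 0.6847, 0.2: 0.6913}


def gain_in_points(acc, baseline_key=0.0):
    """Return the accuracy gain over the baseline, in percentage points."""
    base = acc[baseline_key]
    return {
        k: round((v - base) * 100, 2)
        for k, v in acc.items()
        if k != baseline_key
    }


if __name__ == "__main__":
    print("test gain (points):", gain_in_points(test_acc))
    print("val gain (points): ", gain_in_points(val_acc))
```

Expressing the gain in percentage points avoids ambiguity between "2 points" and "2% relative improvement".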

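Surprisingly, smooth=0.2 also beat smooth=0.1

In theory, making the smoothing value too large should make it harder to learn the correct class, yet here smooth=0.2 (test: 0.6885) outperformed smooth=0.1 (test: 0.6817). One possible reading is that, in combination with Dropout=0.2, even fairly strong Label Smoothing did not over-regularize this model. That said, the gap between smooth=0.2 and smooth=0.05 is only 0.0001 (0.01 percentage points), which is within noise. A stable ranking cannot be established without running the experiment several times and averaging.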

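Why smooth=0.1 scored lower than smooth=0.2

It is counterintuitive that smooth=0.1 (test: 0.6817) lost to smooth=0.2 (test: 0.6885), but a single run is easily swayed by random seeds, so this reversal is quite likely within the range of random variation. The results do not support the idea that accuracy falls monotonically as smoothing increases; it is more appropriate to read the 0.05 to 0.2 range as roughly level. The important point is that 0.05 or higher did better than 0.0.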
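
One way to check whether such a reversal is real is to repeat each configuration with several seeds and compare the mean and standard deviation. The sketch below only shows the bookkeeping. `train_once` is a hypothetical callable you would write yourself (for example, one that calls keras.utils.set_random_seed(seed), then the article's build_model and compile_and_fit, and returns the test accuracy). The stub in the usage part returns made-up values purely so the code runs; it is not an experimental result.

```python
import statistics


def repeat_runs(train_once, smoothing_values, seeds):
    """Run train_once(smoothing, seed) for every combination and summarize."""
    summary = {}
    for smoothing in smoothing_values:
        accs = [train_once(smoothing, seed) for seed in seeds]
        summary[smoothing] = {
            "mean": statistics.mean(accs),
            # Sample standard deviation needs at least two runs.
            "stdev": statistics.stdev(accs) if len(accs) > 1 else 0.0,
            "runs": accs,
        }
    return summary


if __name__ == "__main__":
    # Placeholder standing in for real training; the values are NOT measurements.
    def fake_train_once(smoothing, seed):
        return 0.5 + 0.01 * seed

    result = repeat_runs(fake_train_once, [0.1, 0.2], seeds=[0, 1, 2])
    for smoothing, stats in result.items():
        print(smoothing, f"mean={stats['mean']:.4f}", f"stdev={stats['stdev']:.4f}")
```

If the gap between two means is smaller than their standard deviations, the ordering is best treated as undecided.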

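Be careful when comparing absolute loss values

Different smoothing values change the scale of the loss, so loss values must not be compared directly across patterns. What should be compared is val_accuracy and test_accuracy, plus the overfitting gap within each run (the difference between train_loss and val_loss). Do not judge smoothing values by how low val_loss is.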
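
Why the loss scale shifts can be seen from the smallest loss a model could possibly reach. Cross-entropy against a smoothed target is minimized when the prediction equals the target, and that minimum is the entropy of the target, which grows with the smoothing value. A minimal sketch in plain Python, assuming K = 10 classes as in CIFAR-10 (the function name is made up for this example):

```python
import math


def min_cross_entropy(eps, num_classes=10):
    """Lowest reachable cross-entropy for a target smoothed with eps.

    It equals the entropy of the smoothed target distribution.
    """
    correct = 1.0 - eps + eps / num_classes
    other = eps / num_classes
    total = 0.0
    for p, count in ((correct, 1), (other, num_classes - 1)):
        if p > 0.0:  # treat 0 * log(0) as 0
            total -= count * p * math.log(p)
    return total


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2):
        print(f"eps={eps:<4}  loss floor={min_cross_entropy(eps):.4f}")
```

With eps = 0.2 the floor is already around 0.87, so the final val_loss of 1.4425 in pattern D and 0.9182 in pattern A are not on the same scale.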

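Practical use: start with 0.05

Based on these results, smooth=0.05 is the safest starting point for a CNN of this size on CIFAR-10. It takes a one-line change, improved accuracy by about 2 points in this experiment, and adds essentially no training time. Label Smoothing acts independently of Dropout and Weight Decay, so it can be layered on top of an existing regularization setup.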



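Summary

  • Label Smoothing is an extension of the loss function that lowers the confidence of the correct label to curb overfitting
  • It can be introduced with the single line CategoricalCrossentropy(label_smoothing=ε); no change to the model architecture is needed
  • In this experiment smooth=0.05 had the highest accuracy (test: 0.6886), about 1.85 points above no smoothing (0.0) at 0.6701
  • smooth=0.1 and 0.2 were roughly level with it. The conclusion is that using some smoothing value is better than 0.0
  • A single run is easily affected by random seeds, so the ordering of 0.1 and 0.2 is not definitive. Starting with 0.05 is the safest choice
  • It can be combined independently with existing Dropout and Weight Decay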
English Summary
What is Label Smoothing?

Label Smoothing is a regularization technique that prevents a model from becoming overconfident during training. Instead of using hard targets (0 or 1), it softens the labels slightly. For example, with a smoothing value of 0.1 and 10 classes, the target for the correct class becomes 0.91 and each other class gets 0.01.

Why it helps with overfitting

When a model is trained with hard labels, it tends to push the output probability of the correct class toward 1.0, which can hurt generalization. Label Smoothing acts as a soft penalty, similar in effect to Dropout or L2 regularization.

Experiment overview (CIFAR-10 + Keras)
  • Without Label Smoothing: CategoricalCrossentropy(label_smoothing=0.0), i.e. standard categorical_crossentropy
  • With Label Smoothing: CategoricalCrossentropy(label_smoothing=ε) for ε = 0.05, 0.1, and 0.2
Key findings
  • All three smoothed runs finished with higher test accuracy than the baseline (0.6817 to 0.6886 vs. 0.6701).
  • Differences among the smoothed runs were small and within single-run noise.
  • The effect may be more noticeable in deeper models or longer training runs; that was not tested here.
Keras implementation

loss = keras.losses.CategoricalCrossentropy(label_smoothing=0.1)

A value of 0.05 to 0.1 is a typical starting point. Avoid values above 0.2 as they can hurt convergence.