Do a tutorial - MNIST, tf 2, Sequential API

Ulf Hamster 5 min.
python Sequential API tensorflow2 mnist tutorial

Load Packages

In Colab, activate “GPU” under “Menu / Runtime / Change runtime type / Hardware accelerator”.

%%capture
!pip install tensorflow-gpu==2.0.0-beta1
import tensorflow as tf
print(tf.__version__)
print("GPU" if tf.test.is_gpu_available() else "CPU")
2.0.0-beta1
GPU

Load Dataset

data = tf.keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = data.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
# Pre-processing: scale pixel values from [0, 255] to [0, 1]
X_train = X_train / 255.
X_test = X_test / 255.
type(X_train), X_train.shape
(numpy.ndarray, (60000, 28, 28))
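
The division by `255.` can be sanity-checked on a small fake batch; a numpy-only sketch (no TensorFlow needed, `fake_images` stands in for `X_train`):

```python
import numpy as np

# MNIST images arrive as uint8 arrays with values in [0, 255].
# Dividing by 255. converts them to floats in [0, 1], which keeps
# the inputs in a range that gradient descent handles well.
rng = np.random.default_rng(42)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

scaled = fake_images / 255.

print(scaled.dtype)                         # float64
print(scaled.min() >= 0, scaled.max() <= 1)
```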

Baseline Model

Copied from the official TensorFlow 2 beginner tutorial.

#?tf.keras.models.Sequential
#?tf.keras.layers.Flatten
#?tf.keras.layers.Dense
#?tf.keras.layers.Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout

print("specify neural net")
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(units=128, activation='relu'),
    Dropout(rate=0.2, seed=42),
    Dense(units=10, activation='softmax')
])

print("specify optimization")
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
specify neural net
specify optimization
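
Before training it is worth checking the model size by hand (the same numbers `model.summary()` would report, though the summary is not shown in this notebook). A quick sketch of the parameter arithmetic:

```python
# Flatten turns each 28x28 image into a 784-dimensional vector (no parameters).
flat = 28 * 28                   # 784 inputs

# A Dense layer has (inputs * units) weights plus one bias per unit;
# Dropout only masks activations and adds no parameters.
hidden = flat * 128 + 128        # 100,480 parameters
output = 128 * 10 + 10           # 1,290 parameters

total = hidden + output
print(total)                     # 101770 trainable parameters
```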
from tensorflow.keras.utils import model_to_dot
from IPython.display import SVG
SVG(model_to_dot(model, dpi=64).create(prog='dot', format='svg'))

(SVG rendering of the model graph: Flatten → Dense → Dropout → Dense)
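
The `'adam'` string passed to `compile` selects the Adam optimizer with Keras defaults. A minimal numpy sketch of a single Adam update step (the names `m`, `v` for the moment estimates are mine, not from the notebook):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-7):
    """One Adam update with Keras' default hyperparameters."""
    m = b1 * m + (1 - b1) * g        # running mean of gradients
    v = b2 * v + (1 - b2) * g ** 2   # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)        # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
g = np.array([0.1, -0.2, 0.3])       # a made-up gradient
w, m, v = adam_step(w, g, m, v, t=1)
# on the very first step the update is close to -lr * sign(g)
print(w)
```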

print("-"*79 + "\n" + "run optimization")
%time model.fit(X_train, Y_train, epochs=10, verbose=1)  

print("-"*79 + "\n" + "out-of-sample evaluation")
%time model.evaluate(X_test, Y_test)
-------------------------------------------------------------------------------
run optimization
Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 6s 104us/sample - loss: 0.2975 - accuracy: 0.9127
Epoch 2/10
60000/60000 [==============================] - 5s 91us/sample - loss: 0.1479 - accuracy: 0.9567
Epoch 3/10
60000/60000 [==============================] - 5s 90us/sample - loss: 0.1109 - accuracy: 0.9663
Epoch 4/10
60000/60000 [==============================] - 5s 89us/sample - loss: 0.0893 - accuracy: 0.9718
Epoch 5/10
60000/60000 [==============================] - 5s 88us/sample - loss: 0.0761 - accuracy: 0.9760
Epoch 6/10
60000/60000 [==============================] - 5s 89us/sample - loss: 0.0659 - accuracy: 0.9791
Epoch 7/10
60000/60000 [==============================] - 5s 88us/sample - loss: 0.0607 - accuracy: 0.9807
Epoch 8/10
60000/60000 [==============================] - 5s 89us/sample - loss: 0.0541 - accuracy: 0.9823
Epoch 9/10
60000/60000 [==============================] - 5s 88us/sample - loss: 0.0494 - accuracy: 0.9841
Epoch 10/10
60000/60000 [==============================] - 5s 89us/sample - loss: 0.0446 - accuracy: 0.9853
CPU times: user 1min 10s, sys: 7.4 s, total: 1min 18s
Wall time: 54.8 s
-------------------------------------------------------------------------------
out-of-sample evaluation
10000/10000 [==============================] - 1s 56us/sample - loss: 0.0719 - accuracy: 0.9793
CPU times: user 692 ms, sys: 73.8 ms, total: 766 ms
Wall time: 623 ms

[0.07190937550601666, 0.9793]
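
The reported loss is the mean sparse categorical cross-entropy: the negative log of the probability that the softmax assigned to the correct class. A numpy sketch of both formulas (the toy logits are made up):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))    # shifted for stability
    return e / e.sum(axis=-1, keepdims=True)

def sparse_cce(y_true, probs):
    # "sparse": y_true holds integer labels, no one-hot encoding needed
    return -np.log(probs[np.arange(len(y_true)), y_true])

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 3.0, 0.3]])
y_true = np.array([0, 1])                            # correct classes

probs = softmax(logits)
loss = sparse_cce(y_true, probs).mean()
print(loss)   # small, because both samples are classified confidently
```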

Grad Student Descent, Trial 1 (one more layer)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout

print("-"*79)
print("specify neural net")
model = Sequential([
    Flatten(input_shape=(28, 28)),  
    Dense(units=128, activation='relu'),
    Dropout(rate=0.2, seed=42),
    Dense(units=128, activation='relu'),  # more layers!
    Dropout(rate=0.2, seed=42),  # more layers!
    Dense(units=10, activation='softmax')
])

print("-"*79)
print("specify optimization")
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print("-"*79)
print("run optimization")
%time model.fit(X_train, Y_train, epochs=10, verbose=1)  

print("-"*79)
print("out-of-sample evaluation")
%time model.evaluate(X_test, Y_test)
-------------------------------------------------------------------------------
specify neural net
-------------------------------------------------------------------------------
specify optimization
-------------------------------------------------------------------------------
run optimization
Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 7s 111us/sample - loss: 0.2978 - accuracy: 0.9101
Epoch 2/10
60000/60000 [==============================] - 6s 108us/sample - loss: 0.1436 - accuracy: 0.9565
Epoch 3/10
60000/60000 [==============================] - 6s 108us/sample - loss: 0.1117 - accuracy: 0.9657
Epoch 4/10
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0926 - accuracy: 0.9704
Epoch 5/10
60000/60000 [==============================] - 7s 111us/sample - loss: 0.0842 - accuracy: 0.9733
Epoch 6/10
60000/60000 [==============================] - 7s 113us/sample - loss: 0.0739 - accuracy: 0.9761
Epoch 7/10
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0673 - accuracy: 0.9786
Epoch 8/10
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0635 - accuracy: 0.9804
Epoch 9/10
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0582 - accuracy: 0.9810
Epoch 10/10
60000/60000 [==============================] - 6s 105us/sample - loss: 0.0558 - accuracy: 0.9817
CPU times: user 1min 27s, sys: 8.78 s, total: 1min 36s
Wall time: 1min 5s
-------------------------------------------------------------------------------
out-of-sample evaluation
10000/10000 [==============================] - 1s 57us/sample - loss: 0.0748 - accuracy: 0.9794
CPU times: user 704 ms, sys: 73.3 ms, total: 777 ms
Wall time: 637 ms

[0.07481259031542141, 0.9794]
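
The extra layer did not buy anything, but what `Dropout(rate=0.2)` does in both models is easy to sketch. Keras uses inverted dropout: at training time a random 20 % of activations are zeroed and the survivors are scaled by 1/(1-rate), so the expected activation is unchanged; at inference the layer passes values through untouched. A numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.ones((1000, 128))              # pretend activations
rate = 0.2

# training: drop ~20% of units, rescale survivors by 1/(1-rate)
mask = rng.random(x.shape) >= rate
x_train = x * mask / (1.0 - rate)

# inference: dropout is a no-op
x_infer = x

print(x_train.mean())                 # close to 1.0: expectation preserved
```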

Grad Student Descent, Trial 2 (batch normalization)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Dense, Flatten, Dropout, Activation, BatchNormalization)

print("-"*79)
print("specify neural net")
model = Sequential([
    Flatten(input_shape=(28, 28)),  # Input Layer
    Dense(units=128),  # 1st Hidden Layer
    BatchNormalization(),
    Activation('relu'),
    Dropout(rate=0.2, seed=42),
    Dense(units=128),  # 2nd Hidden Layer
    BatchNormalization(),
    Activation('relu'),
    Dropout(rate=0.2, seed=42),
    Dense(units=10),  # Output Layer
    Activation('softmax')
])

print("-"*79)
print("specify optimization")
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print("-"*79)
print("run optimization")
%time model.fit(X_train, Y_train, epochs=10, verbose=1)  

print("-"*79)
print("out-of-sample evaluation")
%time model.evaluate(X_test, Y_test)
-------------------------------------------------------------------------------
specify neural net
-------------------------------------------------------------------------------
specify optimization
-------------------------------------------------------------------------------
run optimization
Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 9s 150us/sample - loss: 0.3257 - accuracy: 0.9020
Epoch 2/10
60000/60000 [==============================] - 9s 146us/sample - loss: 0.1803 - accuracy: 0.9446
Epoch 3/10
60000/60000 [==============================] - 9s 146us/sample - loss: 0.1460 - accuracy: 0.9548
Epoch 4/10
60000/60000 [==============================] - 9s 150us/sample - loss: 0.1248 - accuracy: 0.9599
Epoch 5/10
60000/60000 [==============================] - 9s 147us/sample - loss: 0.1132 - accuracy: 0.9654
Epoch 6/10
60000/60000 [==============================] - 9s 148us/sample - loss: 0.1070 - accuracy: 0.9659
Epoch 7/10
60000/60000 [==============================] - 9s 144us/sample - loss: 0.0953 - accuracy: 0.9696
Epoch 8/10
60000/60000 [==============================] - 9s 145us/sample - loss: 0.0904 - accuracy: 0.9707
Epoch 9/10
60000/60000 [==============================] - 9s 146us/sample - loss: 0.0847 - accuracy: 0.9727
Epoch 10/10
60000/60000 [==============================] - 9s 145us/sample - loss: 0.0799 - accuracy: 0.9747
CPU times: user 1min 59s, sys: 11.4 s, total: 2min 11s
Wall time: 1min 29s
-------------------------------------------------------------------------------
out-of-sample evaluation
10000/10000 [==============================] - 1s 78us/sample - loss: 0.0622 - accuracy: 0.9810
CPU times: user 1.07 s, sys: 68.3 ms, total: 1.14 s
Wall time: 895 ms

[0.062209673688164914, 0.981]
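
Trial 2's small gain comes from BatchNormalization, which standardizes each feature over the current mini-batch before the activation. A numpy sketch of the training-time computation (it leaves out the learned per-feature scale γ and shift β, and the moving averages Keras uses at inference; the batch itself is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 128))  # a batch of pre-activations

eps = 1e-3                       # Keras' default epsilon, avoids division by zero
mean = x.mean(axis=0)            # per-feature batch mean
var = x.var(axis=0)              # per-feature batch variance
x_hat = (x - mean) / np.sqrt(var + eps)
# the layer then returns gamma * x_hat + beta (learned, initialized to 1 and 0)

print(x_hat.mean(), x_hat.std())   # close to 0 and 1
```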

Readings