In this notebook, we train a multilayer perceptron (MLP) to classify images from the MNIST database of handwritten digits.
1. Load MNIST Database
```python
from keras.datasets import mnist

# load the MNIST training and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("The MNIST database has a training set of %d examples." % len(X_train))
print("The MNIST database has a test set of %d examples." % len(X_test))
```
The MNIST database has a training set of 60000 examples.
The MNIST database has a test set of 10000 examples.
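Each MNIST example is a 28×28 grayscale image. A quick shape check (a small addition, not in the original notebook; it only uses the arrays loaded above):

```python
# each image is a 28x28 array of pixel intensities in [0, 255]
print(X_train.shape)  # (60000, 28, 28)
print(y_train.shape)  # (60000,)
```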
2. Visualize the First Six Training Images

```python
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib.cm as cm
import numpy as np

# plot the first six training images with their labels as titles
fig = plt.figure(figsize=(20, 20))
for i in range(6):
    ax = fig.add_subplot(1, 6, i+1, xticks=[], yticks=[])
    ax.imshow(X_train[i], cmap='gray')
    ax.set_title(str(y_train[i]))
```
3. View an Image in More Detail

```python
def visualize_input(img, ax):
    ax.imshow(img, cmap='gray')
    width, height = img.shape
    thresh = img.max() / 2.5
    # annotate each pixel with its intensity value
    for x in range(width):
        for y in range(height):
            ax.annotate(str(round(img[x][y], 2)), xy=(y, x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y] < thresh else 'black')

fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111)
visualize_input(X_train[0], ax)
```
4. Rescale the Images by Dividing Every Pixel in Every Image by 255

```python
# rescale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255
```
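Rescaling maps every pixel from [0, 255] into [0, 1]; keeping inputs in a small, consistent range tends to make gradient-based training better behaved.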
5. Encode Categorical Integer Labels Using a One-Hot Scheme

```python
from keras.utils import np_utils

# print the first ten (integer-valued) training labels
print('Integer-valued labels:')
print(y_train[:10])

# one-hot encode the labels
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

# print the first ten (one-hot) training labels
print('One-hot labels:')
print(y_train[:10])
```
Integer-valued labels:
[5 0 4 1 9 2 1 3 1 4]
One-hot labels:
[[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
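For intuition, `to_categorical` applied to 1-D integer labels is equivalent to indexing rows of an identity matrix. A minimal NumPy sketch of the same transformation (the helper name `one_hot` is ours, not part of Keras):

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype='float32')[labels]

print(one_hot(np.array([5, 0, 4])))  # rows with a 1. at indices 5, 0, 4
```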
6. Define the Model Architecture

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten

# define the model
model = Sequential()
model.add(Flatten(input_shape=X_train.shape[1:]))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))

# summarize the model
model.summary()
```
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_5 (Flatten) (None, 784) 0
_________________________________________________________________
dense_13 (Dense) (None, 512) 401920
_________________________________________________________________
dropout_9 (Dropout) (None, 512) 0
_________________________________________________________________
dense_14 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_10 (Dropout) (None, 512) 0
_________________________________________________________________
dense_15 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
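The parameter counts above are just weights plus biases for each Dense layer; Flatten and Dropout contribute none. A quick check of the arithmetic:

```python
# Dense params = inputs * units + units (one bias per unit)
assert 28 * 28 * 512 + 512 == 401920     # flatten_5 -> dense_13
assert 512 * 512 + 512 == 262656         # dense_13 -> dense_14
assert 512 * 10 + 10 == 5130             # dense_14 -> dense_15
assert 401920 + 262656 + 5130 == 669706  # total params
```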
7. Compile the Model

```python
# compile the model
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
```
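For reference, categorical crossentropy with a one-hot label reduces to the negative log-probability that the softmax assigns to the true class. A NumPy sketch of the loss for a single example (ours, not the Keras implementation):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred):
    # y_true: one-hot label; y_pred: softmax probabilities
    # only the true class's log-probability survives the sum
    return -np.sum(y_true * np.log(y_pred))
```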
8. Calculate the Classification Accuracy on the Test Set (Before Training)

```python
# evaluate test accuracy; model.evaluate returns [loss, accuracy]
score = model.evaluate(X_test, y_test, verbose=0)
accuracy = 100 * score[1]

# print test accuracy
print('Test accuracy: %.4f%%' % accuracy)
```
Test accuracy: 7.8400%
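With randomly initialized weights the model performs at roughly chance level: with 10 balanced classes we expect about 10% accuracy, consistent with the 7.84% above.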
9. Train the Model

```python
from keras.callbacks import ModelCheckpoint

# save the weights whenever the validation loss improves
checkpointer = ModelCheckpoint(filepath='mnist.model.best.hdf5',
                               verbose=1, save_best_only=True)

hist = model.fit(X_train, y_train, batch_size=128, epochs=20,
                 validation_split=0.2, callbacks=[checkpointer],
                 verbose=1, shuffle=True)
```
Train on 48000 samples, validate on 12000 samples
Epoch 1/20
48000/48000 [==============================] - 6s 130us/step - loss: 0.2958 - acc: 0.9087 - val_loss: 0.1198 - val_acc: 0.9655
Epoch 00001: val_loss improved from inf to 0.11977, saving model to mnist.model.best.hdf5
Epoch 2/20
48000/48000 [==============================] - 5s 114us/step - loss: 0.1234 - acc: 0.9628 - val_loss: 0.1111 - val_acc: 0.9690
Epoch 00002: val_loss improved from 0.11977 to 0.11112, saving model to mnist.model.best.hdf5
Epoch 3/20
48000/48000 [==============================] - 6s 116us/step - loss: 0.0874 - acc: 0.9734 - val_loss: 0.0892 - val_acc: 0.9766
Epoch 00003: val_loss improved from 0.11112 to 0.08923, saving model to mnist.model.best.hdf5
Epoch 4/20
48000/48000 [==============================] - 6s 118us/step - loss: 0.0697 - acc: 0.9784 - val_loss: 0.0876 - val_acc: 0.9772
Epoch 00004: val_loss improved from 0.08923 to 0.08760, saving model to mnist.model.best.hdf5
Epoch 5/20
48000/48000 [==============================] - 5s 114us/step - loss: 0.0580 - acc: 0.9827 - val_loss: 0.0985 - val_acc: 0.9732
Epoch 00005: val_loss did not improve from 0.08760
Epoch 6/20
48000/48000 [==============================] - 5s 114us/step - loss: 0.0484 - acc: 0.9853 - val_loss: 0.0957 - val_acc: 0.9769
Epoch 00006: val_loss did not improve from 0.08760
Epoch 7/20
48000/48000 [==============================] - 5s 113us/step - loss: 0.0417 - acc: 0.9872 - val_loss: 0.1031 - val_acc: 0.9784
Epoch 00007: val_loss did not improve from 0.08760
Epoch 8/20
48000/48000 [==============================] - 5s 112us/step - loss: 0.0388 - acc: 0.9884 - val_loss: 0.0966 - val_acc: 0.9795
Epoch 00008: val_loss did not improve from 0.08760
Epoch 9/20
48000/48000 [==============================] - 5s 115us/step - loss: 0.0351 - acc: 0.9897 - val_loss: 0.1009 - val_acc: 0.9788
Epoch 00009: val_loss did not improve from 0.08760
Epoch 10/20
48000/48000 [==============================] - 5s 115us/step - loss: 0.0312 - acc: 0.9906 - val_loss: 0.1055 - val_acc: 0.9786
Epoch 00010: val_loss did not improve from 0.08760
Epoch 11/20
48000/48000 [==============================] - 5s 111us/step - loss: 0.0316 - acc: 0.9907 - val_loss: 0.1150 - val_acc: 0.9787
Epoch 00011: val_loss did not improve from 0.08760
Epoch 12/20
48000/48000 [==============================] - 5s 112us/step - loss: 0.0275 - acc: 0.9919 - val_loss: 0.1186 - val_acc: 0.9786
Epoch 00012: val_loss did not improve from 0.08760
Epoch 13/20
48000/48000 [==============================] - 5s 111us/step - loss: 0.0265 - acc: 0.9923 - val_loss: 0.1151 - val_acc: 0.9801
Epoch 00013: val_loss did not improve from 0.08760
Epoch 14/20
48000/48000 [==============================] - 5s 111us/step - loss: 0.0250 - acc: 0.9928 - val_loss: 0.1140 - val_acc: 0.9796
Epoch 00014: val_loss did not improve from 0.08760
Epoch 15/20
48000/48000 [==============================] - 5s 111us/step - loss: 0.0246 - acc: 0.9930 - val_loss: 0.1263 - val_acc: 0.9780
Epoch 00015: val_loss did not improve from 0.08760
Epoch 16/20
48000/48000 [==============================] - 6s 115us/step - loss: 0.0215 - acc: 0.9938 - val_loss: 0.1170 - val_acc: 0.9804
Epoch 00016: val_loss did not improve from 0.08760
Epoch 17/20
48000/48000 [==============================] - 5s 114us/step - loss: 0.0232 - acc: 0.9937 - val_loss: 0.1234 - val_acc: 0.9795
Epoch 00017: val_loss did not improve from 0.08760
Epoch 18/20
48000/48000 [==============================] - 6s 116us/step - loss: 0.0192 - acc: 0.9944 - val_loss: 0.1197 - val_acc: 0.9811
Epoch 00018: val_loss did not improve from 0.08760
Epoch 19/20
48000/48000 [==============================] - 6s 115us/step - loss: 0.0207 - acc: 0.9945 - val_loss: 0.1258 - val_acc: 0.9800
Epoch 00019: val_loss did not improve from 0.08760
Epoch 20/20
48000/48000 [==============================] - 6s 116us/step - loss: 0.0203 - acc: 0.9944 - val_loss: 0.1354 - val_acc: 0.9797
Epoch 00020: val_loss did not improve from 0.08760
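Note that the validation loss bottoms out at epoch 4 (0.08760) and never improves again, even though training accuracy keeps climbing: the model is overfitting. Thanks to save_best_only, the checkpoint file holds the epoch-4 weights.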
10. Load the Model with the Best Validation Loss

```python
# load the weights that yielded the best validation loss
model.load_weights('mnist.model.best.hdf5')
```
11. Calculate the Classification Accuracy on the Test Set

```python
# evaluate test accuracy with the reloaded weights
score = model.evaluate(X_test, y_test, verbose=0)
accuracy = 100 * score[1]

# print test accuracy
print('Test accuracy: %.4f%%' % accuracy)
```
Test accuracy: 97.9200%
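As a final sanity check (a sketch, not part of the original notebook), we can predict the digit for a single test image with the reloaded model:

```python
import numpy as np

# class probabilities for the first test image; output shape is (1, 10)
probs = model.predict(X_test[:1])
predicted = np.argmax(probs, axis=1)[0]
actual = np.argmax(y_test[0])  # y_test was one-hot encoded above
print('Predicted: %d, actual: %d' % (predicted, actual))
```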