Home > deep learning, programming > Deep learning training speed with 1080 Ti and M1200

Deep learning training speed with 1080 Ti and M1200

June 19th, 2018

I compared the speed of Nvidia’s 1080 Ti on a desktop (Intel i5-3470 CPU, 3.2G Hz, 32G memory) and NVIDIA Quadro M1200 w/4GB GDDR5, 640 CUDA cores on a laptop (CPU: Intel Core i7-7920HQ (Quad Core 3.10GHz, 4.10GHz Turbo, 8MB 45W, Memory: 64G).

The code I used is Keras’ own example (mnist_cnn.py) to classiy MNIST dataset:

MNIST dataset

MNIST dataset

'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))


model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

The result is that 1080 Ti is 3 times faster than M1200:

M1200 (1 epoch) 1080 Ti (1 epoch)
18s 6s
Author: Xu Cui Categories: deep learning, programming Tags:
Try Stork, a research tool we developed

Stork is a publication alert app developed by us at Stanford. As a researcher we often forget to follow up important publications - and it's practically impossible to search many keywords or researchers' names everyday. Stork can help us to search everyday and notifies us when there are new publications/grants. How Stork helped me?

About the author:

Xu Cui is a human brain research scientist in Stanford University. He lives in the Bay Area in the United States. He is also the founder of Stork (smart publication alert app), PaperBox and BizGenius.


He was born in He'nan province, China. He received education in Beijing University(BS), University of Tennessee (Knoxville) (MS), Baylor College of Medicine (PhD) and Stanford University (PostDoc). Read more ...
  1. No comments yet.