from mnist_loader import load_data_wrapper
import numpy as np
training_data, validation_data, test_data = load_data_wrapper()
In the lecture, we discussed vectorization. Suppose we have $m$ images:
$$\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\ldots,(x^{(m)},y^{(m)})\}$$
where $x^{(i)}\in\mathbb{R}^{784}$ are the images and $y^{(i)}\in\mathbb{R}^{10}$ are their one-hot encoded labels.
We form $X$ by stacking the vectors $x^{(i)}$ horizontally and $Y$ by stacking the vectors $y^{(i)}$ horizontally, so $X$ has shape $(784, m)$ and $Y$ has shape $(10, m)$.
If `X` is a list of $n$ 2D numpy arrays of shape `(m,1)`, then `np.hstack(X)` produces a numpy array of shape `(m,n)`.
import numpy as np
# create two small numpy (3,1) arrays.
x1 = np.array([[1],[2],[3]])
x2 = np.array([[4],[5],[6]])
print(x1)
# stack x1 and x2 horizontally in X
X = [x1, x2]
X = np.hstack(X)
print(X)
The function `vectorize_mini_batch` below accepts a minibatch of `(image, label)` tuples and calls `np.hstack` to return a tuple `X, Y`, where `X` contains all of the images and `Y` contains all of the labels, stacked horizontally.

For example, `X,Y = vectorize_mini_batch(training_data[0:20])` should return `X` of shape `(784,20)` and `Y` of shape `(10,20)`.

Hint: You can create two empty lists and use `append()` to insert the images and labels, or you can use list comprehensions.
def vectorize_mini_batch(mini_batch):
    """Given a minibatch of (image, label) tuples,
    return the tuple X, Y where X contains all of the images and Y contains
    all of the labels, stacked horizontally."""
    mini_batch_x = []
    mini_batch_y = []
    # Collect the image and label column vectors separately.
    for image, label in mini_batch:
        mini_batch_x.append(image)
        mini_batch_y.append(label)
    # Each column of X is one image; each column of Y is its label.
    X = np.hstack(mini_batch_x)
    Y = np.hstack(mini_batch_y)
    return X, Y
X,Y = vectorize_mini_batch(training_data[0:20])
print(X.shape)
print(Y.shape)
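The hint above also mentions list comprehensions. A minimal sketch of that variant follows; the name `vectorize_mini_batch_lc` and the random stand-in batch are hypothetical and only used so the snippet runs without the MNIST files.

```python
import numpy as np

def vectorize_mini_batch_lc(mini_batch):
    """List-comprehension variant (hypothetical name) of vectorize_mini_batch."""
    X = np.hstack([image for image, _ in mini_batch])
    Y = np.hstack([label for _, label in mini_batch])
    return X, Y

# Stand-in data (an assumption, not MNIST): 20 random (784,1) "images"
# paired with (10,1) one-hot labels.
rng = np.random.default_rng(0)
fake_batch = [(rng.random((784, 1)), np.eye(10)[:, [k % 10]]) for k in range(20)]

X_lc, Y_lc = vectorize_mini_batch_lc(fake_batch)
print(X_lc.shape)  # (784, 20)
print(Y_lc.shape)  # (10, 20)
```

Both versions build the same two lists before calling `np.hstack`; the comprehension form is simply more compact.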
def sigmoid(x):
    """Returns the output of the sigmoid or logistic function."""
    return 1/(1+np.exp(-x))
`X` is the matrix of images stacked horizontally:
$$X=\left[ \begin{array}{cccc} x^{(1)} & x^{(2)} & \ldots & x^{(m)} \end{array} \right]$$
The vectorized model applies $f$ to each column of $X$:
$$f(X)=\left[ \begin{array}{cccc} f(x^{(1)}) & f(x^{(2)}) & \ldots & f(x^{(m)}) \end{array} \right]$$
def f(X, W1, W2, B1, B2):
    """Vectorized version.
    Return the output of the network when the input ``X`` is a
    collection of images stacked horizontally, one image per column."""
    Z1 = np.dot(W1,X) + B1 # broadcasting! see slides/video for an example.
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2,A1) + B2 # broadcasting! see slides/video for an example.
    A2 = sigmoid(Z2)
    return A2
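To make the broadcasting comments concrete, here is a small sketch that checks the identity $f(X)=[f(x^{(1)})\ \ldots\ f(x^{(m)})]$ numerically. The layer sizes and random parameters are hypothetical stand-ins, not the trained network, and `sigmoid` and `f` are repeated only so the snippet runs on its own.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def f(X, W1, W2, B1, B2):
    A1 = sigmoid(np.dot(W1, X) + B1)
    return sigmoid(np.dot(W2, A1) + B2)

# Hypothetical small sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, m = 5, 4, 3, 6
W1 = rng.standard_normal((n_hidden, n_in))
B1 = rng.standard_normal((n_hidden, 1))
W2 = rng.standard_normal((n_out, n_hidden))
B2 = rng.standard_normal((n_out, 1))
X = rng.standard_normal((n_in, m))

# Broadcasting: the (n_hidden, 1) bias is added to each of the m columns
# of the (n_hidden, m) product.
Z1 = np.dot(W1, X) + B1
print(Z1.shape)  # (4, 6)

# The vectorized output should match the outputs computed one column at a time.
A = f(X, W1, W2, B1, B2)
columns = [f(X[:, i:i+1], W1, W2, B1, B2) for i in range(m)]
same = np.allclose(A, np.hstack(columns))
print(A.shape, same)
```

Slicing with `X[:, i:i+1]` keeps each column as a 2D `(n_in, 1)` array, which is the shape the single-image case expects.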
def predict(images, W1, W2, B1, B2):
    """Vectorized version.
    The parameter images is a list of (image, label) tuples.
    Call vectorize_mini_batch and the vectorized version of the model f
    to predict the labels of the images.
    Hint: Use np.argmax with the axis argument.
    """
    X, _ = vectorize_mini_batch(images)  # the labels are not needed here
    A = f(X,W1,W2,B1,B2)
    # Each column of A holds the 10 outputs for one image;
    # the row index of the largest output is the predicted digit.
    predictions = np.argmax(A, axis = 0)
    predictions = list(predictions)
    return predictions
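The `axis` argument in the hint decides which direction `np.argmax` searches. The sketch below uses a made-up 3-class output matrix and made-up one-hot labels (both hypothetical) to show that `axis=0` gives one class index per column, and how predictions could be compared against labels.

```python
import numpy as np

# Hypothetical network outputs: 3 classes (rows) for 4 images (columns).
A = np.array([[0.1, 0.7, 0.2, 0.3],
              [0.8, 0.2, 0.1, 0.3],
              [0.1, 0.1, 0.7, 0.4]])
# Hypothetical one-hot labels with the same layout.
Y = np.array([[0, 1, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 0]])

predicted = np.argmax(A, axis=0)  # row index of the largest entry in each column
expected = np.argmax(Y, axis=0)   # decode the one-hot labels the same way
accuracy = np.mean(predicted == expected)

print(predicted)  # [1 0 2 2]
print(expected)   # [1 0 2 0]
print(accuracy)   # 0.75
```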
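# The four parameter arrays have different shapes, so the file is assumed to
# store them as an object array, which recent NumPy versions only load
# when allow_pickle=True.
with open("parameters.npy", mode="rb") as r:
    parameters = np.load(r, allow_pickle=True)
W1, B1, W2, B2 = parameters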
Predict the labels of the first `20` images from the training set.
predict(training_data[0:20],W1,W2,B1,B2)