Lecture 3: Formalizing the problem

by Long Nguyen

This notebook is supplemental to Lecture 3 of the series "Image Recognition with Neural Networks".

The video lecture can be accessed here.

In [6]:
from mnist_loader import load_data_wrapper
import numpy as np
In [7]:
training_data, validation_data, test_data = load_data_wrapper()

Vectorization

In the lecture, we discussed vectorization. Suppose we have $m$ images:

$$\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\ldots,(x^{(m)},y^{(m)})\}$$

where $x^{(i)}\in\mathbb{R}^{784}$ are the images and $y^{(i)}\in\mathbb{R}^{10}$ are their one-hot encoded labels.

We form $X$ by stacking the vectors $x^{(i)}$ horizontally and form $Y$ by stacking the vectors $y^{(i)}$ horizontally.

If X is a list of $n$ (m,1) 2D numpy arrays, then np.hstack(X) will produce an (m,n) numpy array.
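For example, assuming the loader above, each element of training_data is an (image, label) tuple, and we can check the shapes directly:

x0, y0 = training_data[0]  # first (image, label) pair from the training set
print(x0.shape)            # expected: (784, 1)
print(y0.shape)            # expected: (10, 1), the one-hot encoded label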

In [8]:
import numpy as np
In [10]:
# create two small numpy (3,1) arrays.
x1 = np.array([[1],[2],[3]])
x2 = np.array([[4],[5],[6]])
print(x1)
[[1]
 [2]
 [3]]
In [12]:
# stack x1 and x2 horizontally in X
X = [x1, x2]
X = np.hstack(X)
print(X)
[[1 4]
 [2 5]
 [3 6]]

Write the function vectorize_mini_batch below, which accepts a minibatch of (image, label) tuples and calls np.hstack to return a tuple X,Y, where X contains all of the images and Y contains all of the labels stacked horizontally.

For example, X,Y = vectorize_mini_batch(training_data[0:20]) should return X of shape (784,20) and Y of shape (10,20).

Hint: You can create two empty lists and use append() to collect the images and labels, or you can use list comprehensions.

In [13]:
def vectorize_mini_batch(mini_batch):
    """Given a minibatch of (image, label) tuples,
    return the tuple X,Y where X contains all of the images and Y contains
    all of the labels stacked horizontally."""
    mini_batch_x = []
    mini_batch_y = []
    for image, label in mini_batch:
        mini_batch_x.append(image)
        mini_batch_y.append(label)
    X = np.hstack(mini_batch_x) # shape (784, len(mini_batch))
    Y = np.hstack(mini_batch_y) # shape (10, len(mini_batch))
    return X, Y
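Equivalently, following the second part of the hint, the same stacking can be written with list comprehensions. The name vectorize_mini_batch_lc below is just an illustrative alternative, not part of the exercise:

def vectorize_mini_batch_lc(mini_batch):
    """List-comprehension version of vectorize_mini_batch (same behavior)."""
    X = np.hstack([image for image, label in mini_batch])
    Y = np.hstack([label for image, label in mini_batch])
    return X, Y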
In [15]:
X,Y = vectorize_mini_batch(training_data[0:20])
print(X.shape)
print(Y.shape)
(784, 20)
(10, 20)

The sigmoid function is defined as $$\sigma(x)=\frac{1}{1+e^{-x}}.$$ Implement the sigmoid function. Hint: Use np.exp() for the exponential function.

In [16]:
def sigmoid(x):
    """Returns the output of the sigmoid or logistic function."""
    return 1/(1+np.exp(-x))
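Since np.exp is applied elementwise, this sigmoid also works on numpy arrays of any shape, which is exactly what the vectorized model below relies on. For example:

z = np.array([[-1.0, 0.0, 1.0]])
print(sigmoid(z))  # approximately [[0.269 0.5 0.731]]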

Write the vectorized version of the score function or model f below. Here, X is the matrix of images stacked horizontally.

Thus, if

$$X=\left[ \begin{array}{cccc} x^{(1)} & x^{(2)} & \ldots & x^{(m)} \end{array} \right]$$

is a (784,m) array, then

$$f(X)=\left[ \begin{array}{cccc} f(x^{(1)}) & f(x^{(2)}) & \ldots & f(x^{(m)}) \end{array} \right]$$

is a (10,m) array.

In [2]:
def f(X, W1, W2, B1, B2):
    """Vectorized version. 
    Return the output of the network when ``X`` is the input matrix
    whose columns are the images. """
    Z1 = np.dot(W1,X) + B1 # B1 is broadcast across the m columns; see slides/video for an example.
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2,A1) + B2 # B2 is broadcast across the m columns; see slides/video for an example.
    A2 = sigmoid(Z2)
    return A2
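As a quick sanity check of the shapes and the broadcasting, we can run f on random inputs. The sizes below are only illustrative (a hidden layer of 30 neurons is assumed here, not taken from the lecture):

rng = np.random.default_rng(0)
W1_rand = rng.standard_normal((30, 784))  # assumed hidden layer of 30 neurons
B1_rand = rng.standard_normal((30, 1))
W2_rand = rng.standard_normal((10, 30))
B2_rand = rng.standard_normal((10, 1))
X_check = rng.standard_normal((784, 5))   # 5 random "images"
print(f(X_check, W1_rand, W2_rand, B1_rand, B2_rand).shape)  # expected: (10, 5)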

Write the vectorized version of the predict function.

In [18]:
def predict(images, W1, W2, B1, B2):
    """Vectorized version. 
    The parameter images is a list of (image, label) tuples. 
    Call vectorize_mini_batch and the vectorized version of the model f
    to predict the labels of the images. 
    Hint: Use np.argmax using an axis. 
    """
    X,Y = vectorize_mini_batch(images)
    A = f(X,W1,W2,B1,B2)
    predictions = np.argmax(A, axis = 0) # index of the largest output in each column
    predictions = list(predictions)
    return predictions
    
    
In [19]:
with open("parameters.npy", mode="rb") as r:
    # allow_pickle=True is needed on recent NumPy versions because the file
    # stores four arrays of different shapes (an object array).
    parameters = np.load(r, allow_pickle=True)
    W1, B1, W2, B2 = parameters

Use the predict function above to predict the first 20 images from the training set.

In [20]:
predict(training_data[0:20],W1,W2,B1,B2)
Out[20]:
[5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5, 3, 6, 1, 7, 2, 8, 6, 9]
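As a rough check, the predictions above can be compared with the true labels, which we can recover by applying np.argmax to the one-hot columns of Y (a small sketch, reusing the functions defined earlier):

X, Y = vectorize_mini_batch(training_data[0:20])
true_labels = list(np.argmax(Y, axis=0))  # decode the one-hot columns
predictions = predict(training_data[0:20], W1, W2, B1, B2)
accuracy = sum(int(p == t) for p, t in zip(predictions, true_labels)) / len(true_labels)
print(true_labels)
print(accuracy)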