Lecture 3: Formalizing the problem

by Long Nguyen

This notebook is supplemental to Lecture 3 of the series "Image Recognition with Neural Networks".

The video lecture can be accessed here.

In [6]:
from mnist_loader import load_data_wrapper
import numpy as np
In [7]:
training_data, validation_data, test_data = load_data_wrapper()

Vectorization

In the lecture, we discussed vectorization. Suppose we have $m$ images:

$$\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\ldots,(x^{(m)},y^{(m)})\}$$

where $x^{(i)}\in\mathbb{R}^{784}$ are the images and $y^{(i)}\in\mathbb{R}^{10}$ are their one-hot encoded labels.

We form $X$ by stacking the vectors $x^{(i)}$ horizontally and form $Y$ by stacking the vectors $y^{(i)}$ horizontally.

If X is a list of $n$ (m,1) 2D numpy arrays, then np.hstack(X) will produce an (m,n) numpy array.
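For example, assuming the loader above, each element of training_data is an (image, label) tuple, and we can check the shapes directly:

x0, y0 = training_data[0]  # first (image, label) pair from the training set
print(x0.shape)            # expected: (784, 1)
print(y0.shape)            # expected: (10, 1), the one-hot encoded label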

In [8]:
import numpy as np
In [10]:
# create two small numpy (3,1) arrays.
x1 = np.array([[1],[2],[3]])
x2 = np.array([[4],[5],[6]])
print(x1)
[[1]
 [2]
 [3]]
In [12]:
# stack x1 and x2 horizontally in X
X = [x1, x2]
X = np.hstack(X)
print(X)
[[1 4]
 [2 5]
 [3 6]]

Write the function vectorize_mini_batch below, which accepts a minibatch of (image, label) tuples and calls np.hstack to return a tuple X,Y, where X contains all of the images and Y contains all of the labels stacked horizontally.

For example, X,Y = vectorize_mini_batch(training_data[0:20]) should return X of shape (784,20) and Y of shape (10,20).

Hint: You can create two empty lists and use append() to collect the images and labels, or you can use list comprehensions.

In [13]:
def vectorize_mini_batch(mini_batch):
    """Given a minibatch of (image, label) tuples,
    return the tuple X,Y where X contains all of the images and Y contains
    all of the labels stacked horizontally."""
    mini_batch_x = []
    mini_batch_y = []
    for image, label in mini_batch:
        mini_batch_x.append(image)
        mini_batch_y.append(label)
    X = np.hstack(mini_batch_x) # shape (784, len(mini_batch))
    Y = np.hstack(mini_batch_y) # shape (10, len(mini_batch))
    return X, Y
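Equivalently, following the second part of the hint, the same stacking can be written with list comprehensions. The name vectorize_mini_batch_lc below is just an illustrative alternative, not part of the exercise:

def vectorize_mini_batch_lc(mini_batch):
    """List-comprehension version of vectorize_mini_batch (same behavior)."""
    X = np.hstack([image for image, label in mini_batch])
    Y = np.hstack([label for image, label in mini_batch])
    return X, Y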
In [15]:
X,Y = vectorize_mini_batch(training_data[0:20])
print(X.shape)
print(Y.shape)
(784, 20)
(10, 20)

The sigmoid function is defined as $$\sigma(x)=\frac{1}{1+e^{-x}}.$$ Implement the sigmoid function. Hint: Use np.exp() for the exponential function.

In [16]:
def sigmoid(x):
    """Returns the output of the sigmoid or logistic function."""
    return 1/(1+np.exp(-x))
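Since np.exp is applied elementwise, this sigmoid also works on numpy arrays of any shape, which is exactly what the vectorized model below relies on. For example:

z = np.array([[-1.0, 0.0, 1.0]])
print(sigmoid(z))  # approximately [[0.269 0.5 0.731]]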

Write the vectorized version of the score function or model f below. Here, X is the matrix of images stacked horizontally.

Thus, if

$$X=\left[ \begin{array}{cccc} x^{(1)} & x^{(2)} & \ldots & x^{(m)} \end{array} \right]$$

is a (784,m) array, then

$$f(X)=\left[ \begin{array}{cccc} f(x^{(1)}) & f(x^{(2)}) & \ldots & f(x^{(m)}) \end{array} \right]$$

is a (10,m) array.

In [2]:
def f(X, W1, W2, B1, B2):
    """Vectorized version. 
    Return the output of the network when ``X`` is the input matrix
    whose columns are the images. """
    Z1 = np.dot(W1,X) + B1 # B1 is broadcast across the m columns; see slides/video for an example.
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2,A1) + B2 # B2 is broadcast across the m columns; see slides/video for an example.
    A2 = sigmoid(Z2)
    return A2
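As a quick sanity check of the shapes and the broadcasting, we can run f on random inputs. The sizes below are only illustrative (a hidden layer of 30 neurons is assumed here, not taken from the lecture):

rng = np.random.default_rng(0)
W1_rand = rng.standard_normal((30, 784))  # assumed hidden layer of 30 neurons
B1_rand = rng.standard_normal((30, 1))
W2_rand = rng.standard_normal((10, 30))
B2_rand = rng.standard_normal((10, 1))
X_check = rng.standard_normal((784, 5))   # 5 random "images"
print(f(X_check, W1_rand, W2_rand, B1_rand, B2_rand).shape)  # expected: (10, 5)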

Write the vectorized version of the predict function.

In [18]:
def predict(images, W1, W2, B1, B2):
    """Vectorized version. 
    The parameter images is a list of (image, label) tuples. 
    Call vectorize_mini_batch and the vectorized version of the model f
    to predict the labels of the images. 
    Hint: Use np.argmax using an axis. 
    """
    X,Y = vectorize_mini_batch(images)
    A = f(X,W1,W2,B1,B2)
    predictions = np.argmax(A, axis = 0) # index of the largest output in each column
    predictions = list(predictions)
    return predictions
    
    
In [19]:
with open("parameters.npy", mode="rb") as r:
    # allow_pickle=True is needed on recent NumPy versions because the file
    # stores four arrays of different shapes (an object array).
    parameters = np.load(r, allow_pickle=True)
    W1, B1, W2, B2 = parameters

Use the predict function above to predict the first 20 images from the training set.

In [20]:
predict(training_data[0:20],W1,W2,B1,B2)
Out[20]:
[5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5, 3, 6, 1, 7, 2, 8, 6, 9]
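As a rough check, the predictions above can be compared with the true labels, which we can recover by applying np.argmax to the one-hot columns of Y (a small sketch, reusing the functions defined earlier):

X, Y = vectorize_mini_batch(training_data[0:20])
true_labels = list(np.argmax(Y, axis=0))  # decode the one-hot columns
predictions = predict(training_data[0:20], W1, W2, B1, B2)
accuracy = sum(int(p == t) for p, t in zip(predictions, true_labels)) / len(true_labels)
print(true_labels)
print(accuracy)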