Friday, March 18, 2016

Perceptron Learning Algorithm

I haven't given up on the TTT project, but I've decided to take some time to play with a few other things and come back to it later.

I've been reading through "Python Machine Learning" by Sebastian Raschka. The book provides lots of code examples, but to make sure I'm actually understanding things, I'm rewriting pieces of the code on my own. The rewritten code will almost certainly be less efficient, but that's not really the point. I want to understand this stuff.

Chapter 2 shows a programming example of the perceptron learning algorithm. I've rewritten that algorithm using lists instead of NumPy arrays. This is partly that same desire to make sure I understand (lists let me perform the calculations sequentially, breaking things down step by step), and partly because I'm just a little bit lazy and not immediately interested in learning NumPy's array syntax. I'll have to get there eventually, but right now I'm more interested in the learning algorithms. Besides, if I ever need this for a real application, I'd probably just use pre-existing code.

The perceptron algorithm is a classification algorithm; it works on data that is linearly separable. Imagine a line in the plane where one side has only Xs and the other side has only Os. The perceptron algorithm seeks to generate that line (or rather, some line that serves as a boundary between the two kinds of points). The algorithm generates a vector w whose length is one greater than the dimension of the space. To predict whether a point is an X or an O, take the dot product of w with a vector consisting of a 1 followed by the coordinates of the point; the prediction is based on whether that dot product is positive or negative.
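
To make that concrete with some made-up numbers: in two dimensions, suppose w = [-1.0, 2.0, 3.0] and the point is (1.0, 0.5). The dot product is (-1.0)(1) + (2.0)(1.0) + (3.0)(0.5) = 2.5, which is positive, so the point would be classified as an X.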

The algorithm is written in two pieces. One function, called predict, takes the weight vector and a location, and it outputs either a 1 or a -1 depending on the sign of the dot product. The other piece is the actual learning algorithm. If I did this correctly, I've actually found an error in the book. The book claims that the algorithm should update the weights simultaneously, but based on what I programmed, it doesn't do that: in my version, the cur_w snapshot keeps every prediction in a pass based on the same weights, and switching it to plain w reproduces the book's sample-by-sample behavior. There may be some technicality in how it runs as the author wrote it, but I'm not interested in pursuing that too deeply.

The next step for me is to rewrite the program that graphically displays this information; the visualization piece is important for me to learn (there's a rough sketch of one possibility after the code below). As a side note, I did not program this on the Raspberry Pi. It should run just fine there, though, because it only uses lists. I'm less sure about the visualization piece, but we'll see when I get there.

def PerceptronFit(eta, n_iter, X, y):
    '''
    eta is the learning rate (a small positive float)
    n_iter is the number of passes to make over the training data
    X is a list where each list element is a list containing the values of a particular sample
    -- This is the input data
    y is a list of 1s and -1s
    -- This list represents the correct classifications
    '''
 
    # Make a list of zeros with one more slot than the number of features
    # This list holds the weights (w[0] goes with the constant 1 term)
    w = [0.0] * (len(X[0]) + 1)
 
    # Make an empty list that will track the number of errors in each pass
    errors = []
 
    # Make an empty list that will retain the weights for each iteration
    w_iter = []
 
    # Begin learning
    for iteration in range(n_iter):
        n_errors = 0
        # Snapshot the weights at the start of this pass; every prediction
        # in the pass uses this snapshot, so all of the updates are computed
        # from the same weights. Changing cur_w to w in the predict call
        # below makes the code match the book's sample-by-sample output.
        cur_w = w[:]
        for xi, target in zip(X, y):
            # update is zero when the prediction is correct,
            # and +/- 2*eta when it is wrong
            update = eta * (target - predict(cur_w, xi))
            w[0] += update                # the bias-like term paired with the constant 1
            for n in range(1, len(w)):
                w[n] += update * xi[n-1]  # nudge each weight toward the target
            if update != 0:
                n_errors += 1
        errors.append(n_errors)
        w_iter.append(w[:])
    return w_iter, errors
 
# Function to calculate the prediction {1, -1} based on the weights and the input
def predict(w, xi):
    # The net input is the dot product of w with (1, followed by the coordinates of xi)
    net_input = w[0]
    for n in range(1, len(w)):
        net_input += w[n] * xi[n-1]
    if net_input >= 0.0:
        return 1
    else:
        return -1
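
For anyone who wants to try this out, here's a quick test using a made-up, linearly separable data set (the numbers are purely for illustration):

# A quick test with a made-up, linearly separable data set
X = [[1.0, 1.0], [2.0, 1.5], [2.5, 2.0],        # the Xs (class 1)
     [-1.0, -0.5], [-2.0, -1.0], [-1.5, -2.0]]  # the Os (class -1)
y = [1, 1, 1, -1, -1, -1]

w_iter, errors = PerceptronFit(0.1, 10, X, y)
print(errors)        # misclassifications per pass; drops to 0 quickly here
print(w_iter[-1])    # the final weights
print(predict(w_iter[-1], [2.0, 2.0]))  # a new point; should print 1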
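
And since the visualization is my next step, here's a minimal sketch of the kind of plot I have in mind, assuming matplotlib is available (this is just a stand-in, not the book's plotting code):

import matplotlib.pyplot as plt

# Plot the number of misclassifications in each pass over the data
plt.plot(range(1, len(errors) + 1), errors, marker='o')
plt.xlabel('Pass')
plt.ylabel('Number of errors')
plt.show()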