Sunday, November 20, 2016

Updated Algorithm Finder

The conversion took a little bit less time than I thought it would. However, I did discover a bug that's related to how Python stores list information. When you make a list of lists, depending on how those lists were constructed it's possible to inadvertently change a variable without realizing. Here's a question that was posted to StackOverflow: Python list by value not by reference. This was basically my issue.

The general situation was that I had to use two different variables to hold cube information for the search. One was supposed to be fixed (the cube configuration being solved) and one was supposed to be part of the search (perform some moves and then compare the result to the database of known solved positions). However, every time I tried to change the position on the second cube, the first one would change.

My solution to this problem was to rewrite the code for making the copy of the cube. I didn't use the deepcopy program from the link. Instead, I just created a function that would return the values of the cube and used that to copy. But even then, I had some problems. Here's the original attempt:
def copyCube( master ):
    return master

newCube = copyCube( oldCube )
This didn't work. When I made changes to newCube, it still impacted the old one. So I had to create an intermediate variable that held just the values and did not actually create a reference to the original.
def copyCube( master ):
    tempCube = resetCube()
    for i in range(6):
        for ii in range(9):
            tempCube[i][ii] = master[i][ii]
    return tempCube

newCube = copyCube( oldCube )
I'll have to think about fine-tuning this to see how efficient I can get it. Right now, the program is completely functional. It even has a save feature for cube positions so that they can be loaded instead of needing to type them in. There's definitely space to grow in that direction for flexibility. I also added a randomizer.

On the Raspberry Pi, I get about 15,000 checked positions per second. There's a chance of improving that by multithreading, but that's another task for another day. I first want to make the base code as efficient as possible. I made a couple other changes that might support that, which includes organizing the library in order of move length so that I can check from the middle after I've already verified that the position wasn't already closer to the beginning. I haven't implemented this, but that will give me a few percent improvement. I also need to do some timing checks to see if I can't make other parts of the algorithm more efficient. But at least for now, I've got the minimum I've needed to get a program that can search for solution algorithms. It's slow, but it works.
import pickle
import time
import random

'''
Terminology:
A "command" is one basic twist of the cube.
A "move" is some move around a single axis of the cube.
A "sequence" is an ordered collection of unflattened moves.
'''

# Global Variables
'''
sqColor[face, position] -- The current state of the cube

 Faces             Positions
  0                  0 1 2
1 2 3 4              3 4 5
  5                  6 7 8

0 = Blue, 1 = Red, 2 = Yellow, 3 = Orange, 4 = White, 5 = Green, 6 = Gray

Solved position:

       0 0 0
       0 0 0
       0 0 0

1 1 1  2 2 2  3 3 3  4 4 4
1 1 1  2 2 2  3 3 3  4 4 4
1 1 1  2 2 2  3 3 3  4 4 4

       5 5 5
       5 5 5
       5 5 5
'''
sqColor = [ [ i for ii in range(9) ] for i in range(6) ]

'''
Axis moves -- xAxis = FB / yAxis = LR / zAxis = UD
'''
xAxis = [["F"], ["F2"], ["F'"], ["B"], ["B2"], ["B'"],
         ["F", "B"], ["F", "B2"], ["F", "B'"],
         ["F2", "B"], ["F2", "B2"], ["F2", "B'"],                               
         ["F'", "B"], ["F'", "B2"], ["F'", "B'"] ]
yAxis = [["L"], ["L2"], ["L'"], ["R"], ["R2"], ["R'"],
         ["L", "R"], ["L", "R2"], ["L", "R'"],
         ["L2", "R"], ["L2", "R2"], ["L2", "R'"],                               
         ["L'", "R"], ["L'", "R2"], ["L'", "R'"] ]
zAxis = [["U"], ["U2"], ["U'"], ["D"], ["D2"], ["D'"],
         ["U", "D"], ["U", "D2"], ["U", "D'"],
         ["U2", "D"], ["U2", "D2"], ["U2", "D'"],                               
         ["U'", "D"], ["U'", "D2"], ["U'", "D'"] ]

def resetCube():
    return [ [ i for ii in range(9) ] for i in range(6) ]

def copyCube( master ):
    tempCube = resetCube()
    for i in range(6):
        for ii in range(9):
            tempCube[i][ii] = master[i][ii]
    return tempCube


'''
premute -- perform some move on the cube
'''

def permute(comm):
    if comm == "U":
        swap4([0,0], [0,2], [0,8], [0,6])
        swap4([0,1], [0,5], [0,7], [0,3])
        swap4([4,0], [3,0], [2,0], [1,0])
        swap4([4,1], [3,1], [2,1], [1,1])
        swap4([4,2], [3,2], [2,2], [1,2])
    elif comm == "U2":
        swap2([0,0], [0,8])
        swap2([0,2], [0,6])
        swap2([0,1], [0,7])
        swap2([0,3], [0,5])
        swap2([1,0], [3,0])
        swap2([1,1], [3,1])
        swap2([1,2], [3,2])
        swap2([2,0], [4,0])
        swap2([2,1], [4,1])
        swap2([2,2], [4,2])
    elif comm == "U'":
        swap4([0,0], [0,6], [0,8], [0,2])
        swap4([0,1], [0,3], [0,7], [0,5])
        swap4([1,0], [2,0], [3,0], [4,0])
        swap4([1,1], [2,1], [3,1], [4,1])
        swap4([1,2], [2,2], [3,2], [4,2])
    elif comm == "L":
        swap4([1,0], [1,2], [1,8], [1,6])
        swap4([1,1], [1,5], [1,7], [1,3])
        swap4([0,0], [2,0], [5,0], [4,8])
        swap4([0,3], [2,3], [5,3], [4,5])
        swap4([0,6], [2,6], [5,6], [4,2])
    elif comm == "L2":
        swap2([1,0], [1,8])
        swap2([1,2], [1,6])
        swap2([1,1], [1,7])
        swap2([1,3], [1,5])
        swap2([2,0], [4,8])
        swap2([2,3], [4,5])
        swap2([2,6], [4,2])
        swap2([0,0], [5,0])
        swap2([0,3], [5,3])
        swap2([0,6], [5,6])
    elif comm == "L'":
        swap4([1,0], [1,6], [1,8], [1,2])
        swap4([1,1], [1,3], [1,7], [1,5])
        swap4([5,0], [2,0], [0,0], [4,8])
        swap4([5,3], [2,3], [0,3], [4,5])
        swap4([5,6], [2,6], [0,6], [4,2])
    elif comm == "F":
        swap4([2,0], [2,2], [2,8], [2,6])
        swap4([2,1], [2,5], [2,7], [2,3])
        swap4([0,6], [3,0], [5,2], [1,8])
        swap4([0,7], [3,3], [5,1], [1,5])
        swap4([0,8], [3,6], [5,0], [1,2])
    elif comm == "F2":
        swap2([2,0], [2,8])
        swap2([2,2], [2,6])
        swap2([2,1], [2,7])
        swap2([2,3], [2,5])
        swap2([0,6], [5,2])
        swap2([0,7], [5,1])
        swap2([0,8], [5,0])
        swap2([1,8], [3,0])
        swap2([1,5], [3,3])
        swap2([1,2], [3,6])
    elif comm == "F'":
        swap4([2,0], [2,6], [2,8], [2,2])
        swap4([2,1], [2,3], [2,7], [2,5])
        swap4([0,6], [1,8], [5,2], [3,0])
        swap4([0,7], [1,5], [5,1], [3,3])
        swap4([0,8], [1,2], [5,0], [3,6])
    elif comm == "R":
        swap4([3,0], [3,2], [3,8], [3,6])
        swap4([3,1], [3,5], [3,7], [3,3])
        swap4([0,8], [4,0], [5,8], [2,8])
        swap4([0,5], [4,3], [5,5], [2,5])
        swap4([0,2], [4,6], [5,2], [2,2])
    elif comm == "R2":
        swap2([3,0], [3,8])
        swap2([3,2], [3,6])
        swap2([3,1], [3,7])
        swap2([3,3], [3,5])
        swap2([2,2], [4,6])
        swap2([2,5], [4,3])
        swap2([2,8], [4,0])
        swap2([0,8], [5,8])
        swap2([0,5], [5,5])
        swap2([0,2], [5,2])
    elif comm == "R'":
        swap4([3,0], [3,6], [3,8], [3,2])
        swap4([3,1], [3,3], [3,7], [3,5])
        swap4([0,8], [2,8], [5,8], [4,0])
        swap4([0,5], [2,5], [5,5], [4,3])
        swap4([0,2], [2,2], [5,2], [4,6])
    elif comm == "B":
        swap4([4,0], [4,2], [4,8], [4,6])
        swap4([4,1], [4,5], [4,7], [4,3])
        swap4([0,2], [1,0], [5,6], [3,8])
        swap4([0,1], [1,3], [5,7], [3,5])
        swap4([0,0], [1,6], [5,8], [3,2])
    elif comm == "B2":
        swap2([4,0], [4,8])
        swap2([4,2], [4,6])
        swap2([4,1], [4,7])
        swap2([4,3], [4,5])
        swap2([0,2], [5,6])
        swap2([0,1], [5,7])
        swap2([0,0], [5,8])
        swap2([1,0], [3,8])
        swap2([1,3], [3,5])
        swap2([1,6], [3,2])
    elif comm == "B'":
        swap4([4,0], [4,6], [4,8], [4,2])
        swap4([4,1], [4,3], [4,7], [4,5])
        swap4([0,2], [3,8], [5,6], [1,0])
        swap4([0,1], [3,5], [5,7], [1,3])
        swap4([0,0], [3,2], [5,8], [1,6])
    elif comm == "D":
        swap4([5,0], [5,2], [5,8], [5,6])
        swap4([5,1], [5,5], [5,7], [5,3])
        swap4([1,6], [2,6], [3,6], [4,6])
        swap4([1,7], [2,7], [3,7], [4,7])
        swap4([1,8], [2,8], [3,8], [4,8])
    elif comm == "D2":
        swap2([5,0], [5,8])
        swap2([5,2], [5,6])
        swap2([5,1], [5,7])
        swap2([5,3], [5,5])
        swap2([1,6], [3,6])
        swap2([1,7], [3,7])
        swap2([1,8], [3,8])
        swap2([2,6], [4,6])
        swap2([2,7], [4,7])
        swap2([2,8], [4,8])
    elif comm == "D'":
        swap4([5,0], [5,6], [5,8], [5,2])
        swap4([5,1], [5,3], [5,7], [5,5])
        swap4([1,6], [4,6], [3,6], [2,6])
        swap4([1,7], [4,7], [3,7], [2,7])
        swap4([1,8], [4,8], [3,8], [2,8])

def swap4(p0, p1, p2, p3):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p3[0]][p3[1]]
    sqColor[p3[0]][p3[1]] = sqColor[p2[0]][p2[1]]
    sqColor[p2[0]][p2[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp

def swap2(p0, p1):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp

'''
Apply a sequence - The sequence is not flattened
'''

def applySeq(seq):
    commList = flatten(seq)
    for comm in commList:
        permute(comm)

def flatten(seq):
    return [ item for sublist in seq for item in sublist ]

'''
Compare cube: test vs target
returns 1 if matching, 0 if not
ignores color 6 (gray)
'''

def compCube( test, target ):
    for i in range(6):
        for ii in [0, 1, 2, 3, 5, 6, 7, 8]:
            if test[i][ii] < 6:
                if test[i][ii] != target[i][ii]:
                    return 0
    return 1

def dbCompCube( test, cubePosList ):
    for i in range( len(cubePosList) ):
        if i % 10000 == 0:
            print(i, 'of', len(cubePosList))
        if compCube( test, cubePosList[i] ) == 1:
            return i
    return -1


'''
Find the next sequence
Generates the "next" sequence when listed pseudo-lexicographically

A
AA
AAA
AAB
AAC
AB
ABA
ABB
ABC

Except that the letters are moves taken from the lists xAxis, yAxis, and zAxis
The search will also not pick two moves from the same axis back-to-back
Need maxLen > 1
'''

def nextSeq(seq, seqLen):
    if len(seq) == 0:
        seq = [ xAxis[0] ]
    else:
        done = 0
        goBack = 0
        while done == 0:
            lastMove = seq[-1]
            if len(seq) < seqLen and goBack == 0:
                if lastMove in xAxis:
                    seq.append( yAxis[0] )
                else:
                    seq.append( xAxis[0] )
                done = 1
            elif lastMove in xAxis:
                if xAxis.index( lastMove) < len( xAxis ) - 1:
                    seq[-1] = xAxis[ xAxis.index( lastMove) + 1]
                else:
                    if len( seq ) > 1:
                        if seq[-2] in yAxis:
                            seq[-1] = zAxis[0]
                        else:
                            seq[-1] = yAxis[0]
                    else:
                        seq = [ yAxis[0] ]
                done = 1
            elif lastMove in yAxis:
                if yAxis.index( lastMove) < len( yAxis ) - 1:
                    seq[-1] = yAxis[ yAxis.index( lastMove) + 1]
                    done = 1
                else:
                    if len( seq ) > 1:
                        if seq[-2] in xAxis:
                            seq[-1] = zAxis[0]
                            done = 1
                        else:
                            seq = seq[:-1]
                            goBack = 1
                    else:
                        seq = [ zAxis[0] ]
                        done = 1
            elif lastMove in zAxis:
                if zAxis.index( lastMove) < len( zAxis ) - 1:
                    seq[-1] = zAxis[ zAxis.index( lastMove) + 1]
                    done = 1
                else:
                    seq = seq[:-1]
                    goBack = 1
            if seq == [ ] and goBack == 1:
                done = 1
    return seq

'''
Invert move list -- this is for building solution sequences for positions
'''
def invertMoves( flatSeq ):
    inverted = []
    for i in range(len(flatSeq)):
        j = i+1
        if flatSeq[-j] == "F":
            inverted.append("F'")
        elif flatSeq[-j] == "F'":
            inverted.append("F")
        elif flatSeq[-j] == "B":
            inverted.append("B'")
        elif flatSeq[-j] == "B'":
            inverted.append("B")
        elif flatSeq[-j] == "R":
            inverted.append("R'")
        elif flatSeq[-j] == "R'":
            inverted.append("R")
        elif flatSeq[-j] == "L":
            inverted.append("L'")
        elif flatSeq[-j] == "L'":
            inverted.append("L")
        elif flatSeq[-j] == "U":
            inverted.append("U'")
        elif flatSeq[-j] == "U'":
            inverted.append("U")
        elif flatSeq[-j] == "D":
            inverted.append("D'")
        elif flatSeq[-j] == "D'":
            inverted.append("D")
        else:
            inverted.append( flatSeq[-j] )
    return inverted

'''
Display cube
'''
def dispCube(cube):
    print()
    print("       ", cube[0][0], cube[0][1], cube[0][2])
    print("       ", cube[0][3], cube[0][4], cube[0][5])
    print("       ", cube[0][6], cube[0][7], cube[0][8])
    print()
    print(cube[1][0], cube[1][1], cube[1][2], " ",
          cube[2][0], cube[2][1], cube[2][2], " ",
          cube[3][0], cube[3][1], cube[3][2], " ",
          cube[4][0], cube[4][1], cube[4][2] )
    print(cube[1][3], cube[1][4], cube[1][5], " ",
          cube[2][3], cube[2][4], cube[2][5], " ",
          cube[3][3], cube[3][4], cube[3][5], " ",
          cube[4][3], cube[4][4], cube[4][5] )
    print(cube[1][6], cube[1][7], cube[1][8], " ",
          cube[2][6], cube[2][7], cube[2][8], " ",
          cube[3][6], cube[3][7], cube[3][8], " ",
          cube[4][6], cube[4][7], cube[4][8] )
    print()
    print("       ", cube[5][0], cube[5][1], cube[5][2])
    print("       ", cube[5][3], cube[5][4], cube[5][5])
    print("       ", cube[5][6], cube[5][7], cube[5][8])
    print()

def dispGuide():
    print(''' Faces       Positions     Colors: 0 = Blue, 1 = Red, 2 = Yellow
  0            0 1 2               3 = Orange, 4 = White, 5 = Green
1 2 3 4        3 4 5               6 = Gray (ignore)
  5            6 7 8
''')

def go():
    exec(open("./SpeedCubeV2.py").read())

# BEGIN MAIN PROGRAM!

menu = " "
sqColor = resetCube()
testCube = resetCube()
while menu != '9':
    print('''
    (1) Create library pickle
    (2) Load library pickle
    (3) Set Cube
    (4) Solve Cube
    (7) Load Cube
    (8) Save Cube
    (9) Quit
    ''')
    menu = input('Select Menu Option: ')
    if menu == '1':
        maxLen = input('Select Maximum Length: ')
        if maxLen.isdigit():
            curSeq = []
            seqList = [ curSeq ]
            cubePosList = [ sqColor ]
            bookmark = [ 0 ]
            maxLen = int(maxLen)
            for i in range(maxLen):
                bookmark.append( len( seqList) )
                seqLen = i+1
                done = 0
                while done == 0:
                    if len(seqList) % 1000 == 0:
                        print(len(seqList), curSeq)
                    curSeq = nextSeq( curSeq, seqLen )
                    sqColor = resetCube()
                    if len( curSeq ) == seqLen:
                        applySeq( curSeq )
                        cubePosList.append( sqColor )
                        seqList.append( invertMoves( flatten( curSeq ) ) )
                    elif curSeq == []:
                        done = 1
            pickle_out = open("seqList.pickle",'wb')
            pickle.dump(seqList, pickle_out)
            pickle_out.close()
            pickle_out = open("cubePosList.pickle",'wb')
            pickle.dump(cubePosList, pickle_out)
            pickle_out.close()
            pickle_out = open("bookmark.pickle",'wb')
            pickle.dump(bookmark, pickle_out)
            pickle_out.close()

    elif menu == '2':
        pickle_in = open("seqList.pickle",'rb')
        seqList = pickle.load(pickle_in)
        pickle_in.close()
        pickle_in = open("cubePosList.pickle",'rb')
        cubePosList = pickle.load(pickle_in)
        pickle_in.close()
        pickle_in = open("bookmark.pickle",'rb')
        bookmark = pickle.load(pickle_in)
        pickle_in.close()

    elif menu == '3':
        done = 0
        while done == 0:
            dispCube(sqColor)
            dispGuide()
            face = input("Select face (0-5), F for full entry, R to randomize, S to save, Q to quit: ")
            if face == 'Q':
                done = 1
            elif face == 'S':
                testCube = copyCube(sqColor)
                done = 1
            elif face == 'F':
                sqColor = [ ['X' for i in range(9)] for ii in range(6)]
                for i in range(6):
                    for ii in range(9):
                        complete = 0
                        while complete ==  0:
                            dispCube(sqColor)
                            dispGuide()
                            color = input("Location "
                                          + str(i) + ", " + str(ii) + ": ")
                            if color in ['0', '1', '2', '3', '4', '5', '6']:
                                sqColor[i][ii] = int(color)
                                complete = 1
            elif face == 'R':
                randSeq = []
                for i in range(20):
                    randAxis = random.randint(0,2)
                    if randAxis == 0:
                        randMove = random.randint(0,len(xAxis)-1)
                        randSeq.append( xAxis[randMove] )
                    elif randAxis == 1:
                        randMove = random.randint(0,len(yAxis)-1)
                        randSeq.append( yAxis[randMove] )
                    else:
                        randMove = random.randint(0,len(zAxis)-1)
                        randSeq.append( zAxis[randMove] )
                applySeq( randSeq )
            elif face in ['0', '1', '2', '3', '4', '5']:
                face = int(face)
                position = input("Select position (0-8): ")
                if position in ['0', '1', '2', '3', '4', '5', '6', '7', '8']:
                    position = int(position)
                    color = input("Select color (0-6, 6 = ignore): ")
                    if color in ['0', '1', '2', '3', '4', '5', '6']:
                        sqColor[face][position] = int(color)

    elif menu == '4':
        print("Begin search in database")
        dispCube( testCube )
        solIndex = dbCompCube( testCube, cubePosList )
        if solIndex > -1:
            print("Solution index:", solIndex)
            print("Solution:", seqList[solIndex])
        else:
            print("Position not found in database.")
            for searchDepth in range(1,15):
                print("Searching at depth", searchDepth)
                testSeq = [ ]
                done = 0
                count = 1
                while done == 0:
                    testSeq = nextSeq(testSeq, searchDepth)
                    if len( testSeq ) > 0:
                        print("Count:", count)
                        print("Trying", testSeq)
                        sqColor = copyCube( testCube )
                        applySeq( testSeq )
                        dispCube( sqColor )
                        count = count + 1
                        solIndex = dbCompCube( sqColor, cubePosList )
                        if solIndex > -1:
                            print("Search sequence:", testSeq)
                            print("Solution index:", solIndex)
                            print("Solution:", seqList[solIndex])
                            done = 1
                    else:
                        done = 1
                if solIndex > -1:
                    break
    elif menu == '7':
        pickle_in = open("testCube.pickle",'rb')
        testCube = pickle.load(pickle_in)
        pickle_in.close()
    elif menu == '8':
        pickle_out = open("testCube.pickle",'wb')
        pickle.dump(testCube, pickle_out)
        pickle_out.close()

Saturday, November 19, 2016

Creating the move list and searchable database

I created the algorithm to generate the searchable list of positions with "known" solutions. It took a little bumbling around to get it right, but it wasn't conceptually too difficult. It's just thinking through how I would manually create the list step by step. When do I add a new item, how do I cycle to the next item in the list, and then how do I backtrack after I've exhausted the list.

But as I'm thinking about this, I'm somewhat worried that I'm going to need to rewrite the program a bit. I think some of the pieces have become too spread out and disorganized for me to go to the next step (the solving step). Also, while I have definitely liked the graphical representation as I've been double checking the various permutations corresponding to the moves, Pygame just takes so much time to load and running the program in IDLE instead of in a terminal just feels so much slower.

So I'm about to embark on a rewriting of the program. I've built all of the bits and pieces, but I need to pull it apart and reorganize it a bit so that I can call the things I want to call when I want to call them. I've got the ability to run long processes and then dump the results into a file (pickling), which means that I don't need to rerun those processes again and again.

So the way I see it, I need to create the following functions:
  • Reset the cube
  • Perform a sequence of moves to the cube
  • Compare the current state of the cube with one from a list, including the ability to ignore gray pieces
  • Find the "next" sequence to check given a maximum move depth and the current sequence.
It may take a while for me to get there, so here's the code in its current form:
import pygame
import pickle
import time

# Global Variables
sGap = 5
bGap = 10
boxSize = 30

def posCalc(bS, sG, bG):
    return bS*boxSize + sG*sGap + bG*bGap

# Define drawing data

sqPos = [ [ [posCalc(3,3,1), posCalc(0,1,0)],
            [posCalc(4,4,1), posCalc(0,1,0)],
            [posCalc(5,5,1), posCalc(0,1,0)],
            [posCalc(3,3,1), posCalc(1,2,0)],
            [posCalc(4,4,1), posCalc(1,2,0)],
            [posCalc(5,5,1), posCalc(1,2,0)],
            [posCalc(3,3,1), posCalc(2,3,0)],
            [posCalc(4,4,1), posCalc(2,3,0)],
            [posCalc(5,5,1), posCalc(2,3,0)] ],
          [ [posCalc(0,1,0), posCalc(3,3,1)],
            [posCalc(1,2,0), posCalc(3,3,1)],
            [posCalc(2,3,0), posCalc(3,3,1)],
            [posCalc(0,1,0), posCalc(4,4,1)],
            [posCalc(1,2,0), posCalc(4,4,1)],
            [posCalc(2,3,0), posCalc(4,4,1)],
            [posCalc(0,1,0), posCalc(5,5,1)],
            [posCalc(1,2,0), posCalc(5,5,1)],
            [posCalc(2,3,0), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(3,3,1)],
            [posCalc(4,4,1), posCalc(3,3,1)],
            [posCalc(5,5,1), posCalc(3,3,1)],
            [posCalc(3,3,1), posCalc(4,4,1)],
            [posCalc(4,4,1), posCalc(4,4,1)],
            [posCalc(5,5,1), posCalc(4,4,1)],
            [posCalc(3,3,1), posCalc(5,5,1)],
            [posCalc(4,4,1), posCalc(5,5,1)],
            [posCalc(5,5,1), posCalc(5,5,1)] ],
          [ [posCalc(6,5,2), posCalc(3,3,1)],
            [posCalc(7,6,2), posCalc(3,3,1)],
            [posCalc(8,7,2), posCalc(3,3,1)],
            [posCalc(6,5,2), posCalc(4,4,1)],
            [posCalc(7,6,2), posCalc(4,4,1)],
            [posCalc(8,7,2), posCalc(4,4,1)],
            [posCalc(6,5,2), posCalc(5,5,1)],
            [posCalc(7,6,2), posCalc(5,5,1)],
            [posCalc(8,7,2), posCalc(5,5,1)] ],
          [ [posCalc(9,7,3), posCalc(3,3,1)],
            [posCalc(10,8,3), posCalc(3,3,1)],
            [posCalc(11,9,3), posCalc(3,3,1)],
            [posCalc(9,7,3), posCalc(4,4,1)],
            [posCalc(10,8,3), posCalc(4,4,1)],
            [posCalc(11,9,3), posCalc(4,4,1)],
            [posCalc(9,7,3), posCalc(5,5,1)],
            [posCalc(10,8,3), posCalc(5,5,1)],
            [posCalc(11,9,3), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(6,5,2)],
            [posCalc(4,4,1), posCalc(6,5,2)],
            [posCalc(5,5,1), posCalc(6,5,2)],
            [posCalc(3,3,1), posCalc(7,6,2)],
            [posCalc(4,4,1), posCalc(7,6,2)],
            [posCalc(5,5,1), posCalc(7,6,2)],
            [posCalc(3,3,1), posCalc(8,7,2)],
            [posCalc(4,4,1), posCalc(8,7,2)],
            [posCalc(5,5,1), posCalc(8,7,2)] ]
        ]

sqColor = [ [ i for ii in range(9) ] for i in range(6) ]

drawColor = [ pygame.Color(0,0,255),
              pygame.Color(255,0,0),
              pygame.Color(255,255,0),
              pygame.Color(255,100,0),
              pygame.Color(255,255,255),
              pygame.Color(0,255,0),
              pygame.Color(100,100,100) ]

# Initialize
pygame.init()

window1Width = posCalc(12,10,3)
window1Height = posCalc(9,8,2)
window1 = pygame.display.set_mode((window1Width,window1Height))

# Functions

def menu():
    menuSelect = 0
    while menuSelect == 0:
        print('''
Main Menu:
(1) Free Play
(2) Set up Cube
(6) Generate Position Lookup
(7) Generate Solution Lookup Move List
(8) Reset Cube
(9) Quit
''')
        comm = input('Select Menu Option: ')
        if comm in ['1', '2', '6', '7', '8', '9']:
            menuSelect = 1
    return int(comm)

def drawIt1(window, sqPos, sqColor, drawColor):
    for i in range(6):
        for ii in range(9):
            pygame.draw.rect(window,
                             drawColor[sqColor[i][ii]],
                             ( sqPos[i][ii][0], sqPos[i][ii][1], boxSize, boxSize ),
                             0)
    pygame.display.update()
    return

def playCube(sqColor):
    commands = ["U", "U2", "U'",
                "L", "L2", "L'",
                "F", "F2", "F'",
                "R", "R2", "R'",
                "B", "B2", "B'",
                "D", "D2", "D'"]
    commList = "U, U2, U', L, L2, L', F, F2, F', R, R2, R', B, B2, B', D, D2, D'"
    comm = ""
    while comm != "Q":
        print("Execute which command?")
        print(commList)
        comm = input('Enter a squence of moves separated by spaces or Q to quit: ' )
        validComm = 1
        commSeq = comm.split()
        for move in commSeq:
            if move not in commList:
                validComm = 0
        if validComm == 1:
            for move in commSeq:
                sqColor = permute(move, sqColor)
        elif validComm == 0 and comm != "Q":
            print("Invalid move sequence.")
        drawIt1(window1, sqPos, sqColor, drawColor)
    return

def permute(comm, sqColor):
    if comm == "U":
        swap4(sqColor, [0,0], [0,2], [0,8], [0,6])
        swap4(sqColor, [0,1], [0,5], [0,7], [0,3])
        swap4(sqColor, [4,0], [3,0], [2,0], [1,0])
        swap4(sqColor, [4,1], [3,1], [2,1], [1,1])
        swap4(sqColor, [4,2], [3,2], [2,2], [1,2])
    elif comm == "U2":
        swap2(sqColor, [0,0], [0,8])
        swap2(sqColor, [0,2], [0,6])
        swap2(sqColor, [0,1], [0,7])
        swap2(sqColor, [0,3], [0,5])
        swap2(sqColor, [1,0], [3,0])
        swap2(sqColor, [1,1], [3,1])
        swap2(sqColor, [1,2], [3,2])
        swap2(sqColor, [2,0], [4,0])
        swap2(sqColor, [2,1], [4,1])
        swap2(sqColor, [2,2], [4,2])
    elif comm == "U'":
        swap4(sqColor, [0,0], [0,6], [0,8], [0,2])
        swap4(sqColor, [0,1], [0,3], [0,7], [0,5])
        swap4(sqColor, [1,0], [2,0], [3,0], [4,0])
        swap4(sqColor, [1,1], [2,1], [3,1], [4,1])
        swap4(sqColor, [1,2], [2,2], [3,2], [4,2])
    elif comm == "L":
        swap4(sqColor, [1,0], [1,2], [1,8], [1,6])
        swap4(sqColor, [1,1], [1,5], [1,7], [1,3])
        swap4(sqColor, [0,0], [2,0], [5,0], [4,8])
        swap4(sqColor, [0,3], [2,3], [5,3], [4,5])
        swap4(sqColor, [0,6], [2,6], [5,6], [4,2])
    elif comm == "L2":
        swap2(sqColor, [1,0], [1,8])
        swap2(sqColor, [1,2], [1,6])
        swap2(sqColor, [1,1], [1,7])
        swap2(sqColor, [1,3], [1,5])
        swap2(sqColor, [2,0], [4,8])
        swap2(sqColor, [2,3], [4,5])
        swap2(sqColor, [2,6], [4,2])
        swap2(sqColor, [0,0], [5,0])
        swap2(sqColor, [0,3], [5,3])
        swap2(sqColor, [0,6], [5,6])
    elif comm == "L'":
        swap4(sqColor, [1,0], [1,6], [1,8], [1,2])
        swap4(sqColor, [1,1], [1,3], [1,7], [1,5])
        swap4(sqColor, [5,0], [2,0], [0,0], [4,8])
        swap4(sqColor, [5,3], [2,3], [0,3], [4,5])
        swap4(sqColor, [5,6], [2,6], [0,6], [4,2])
    elif comm == "F":
        swap4(sqColor, [2,0], [2,2], [2,8], [2,6])
        swap4(sqColor, [2,1], [2,5], [2,7], [2,3])
        swap4(sqColor, [0,6], [3,0], [5,2], [1,8])
        swap4(sqColor, [0,7], [3,3], [5,1], [1,5])
        swap4(sqColor, [0,8], [3,6], [5,0], [1,2])
    elif comm == "F2":
        swap2(sqColor, [2,0], [2,8])
        swap2(sqColor, [2,2], [2,6])
        swap2(sqColor, [2,1], [2,7])
        swap2(sqColor, [2,3], [2,5])
        swap2(sqColor, [0,6], [5,2])
        swap2(sqColor, [0,7], [5,1])
        swap2(sqColor, [0,8], [5,0])
        swap2(sqColor, [1,8], [3,0])
        swap2(sqColor, [1,5], [3,3])
        swap2(sqColor, [1,2], [3,6])
    elif comm == "F'":
        swap4(sqColor, [2,0], [2,6], [2,8], [2,2])
        swap4(sqColor, [2,1], [2,3], [2,7], [2,5])
        swap4(sqColor, [0,6], [1,8], [5,2], [3,0])
        swap4(sqColor, [0,7], [1,5], [5,1], [3,3])
        swap4(sqColor, [0,8], [1,2], [5,0], [3,6])
    elif comm == "R":
        swap4(sqColor, [3,0], [3,2], [3,8], [3,6])
        swap4(sqColor, [3,1], [3,5], [3,7], [3,3])
        swap4(sqColor, [0,8], [4,0], [5,8], [2,8])
        swap4(sqColor, [0,5], [4,3], [5,5], [2,5])
        swap4(sqColor, [0,2], [4,6], [5,2], [2,2])
    elif comm == "R2":
        swap2(sqColor, [3,0], [3,8])
        swap2(sqColor, [3,2], [3,6])
        swap2(sqColor, [3,1], [3,7])
        swap2(sqColor, [3,3], [3,5])
        swap2(sqColor, [2,2], [4,6])
        swap2(sqColor, [2,5], [4,3])
        swap2(sqColor, [2,8], [4,0])
        swap2(sqColor, [0,8], [5,8])
        swap2(sqColor, [0,5], [5,5])
        swap2(sqColor, [0,2], [5,2])
    elif comm == "R'":
        swap4(sqColor, [3,0], [3,6], [3,8], [3,2])
        swap4(sqColor, [3,1], [3,3], [3,7], [3,5])
        swap4(sqColor, [0,8], [2,8], [5,8], [4,0])
        swap4(sqColor, [0,5], [2,5], [5,5], [4,3])
        swap4(sqColor, [0,2], [2,2], [5,2], [4,6])
    elif comm == "B":
        swap4(sqColor, [4,0], [4,2], [4,8], [4,6])
        swap4(sqColor, [4,1], [4,5], [4,7], [4,3])
        swap4(sqColor, [0,2], [1,0], [5,6], [3,8])
        swap4(sqColor, [0,1], [1,3], [5,7], [3,5])
        swap4(sqColor, [0,0], [1,6], [5,8], [3,2])
    elif comm == "B2":
        swap2(sqColor, [4,0], [4,8])
        swap2(sqColor, [4,2], [4,6])
        swap2(sqColor, [4,1], [4,7])
        swap2(sqColor, [4,3], [4,5])
        swap2(sqColor, [0,2], [5,6])
        swap2(sqColor, [0,1], [5,7])
        swap2(sqColor, [0,0], [5,8])
        swap2(sqColor, [1,0], [3,8])
        swap2(sqColor, [1,3], [3,5])
        swap2(sqColor, [1,6], [3,2])
    elif comm == "B'":
        swap4(sqColor, [4,0], [4,6], [4,8], [4,2])
        swap4(sqColor, [4,1], [4,3], [4,7], [4,5])
        swap4(sqColor, [0,2], [3,8], [5,6], [1,0])
        swap4(sqColor, [0,1], [3,5], [5,7], [1,3])
        swap4(sqColor, [0,0], [3,2], [5,8], [1,6])
    elif comm == "D":
        swap4(sqColor, [5,0], [5,2], [5,8], [5,6])
        swap4(sqColor, [5,1], [5,5], [5,7], [5,3])
        swap4(sqColor, [1,6], [2,6], [3,6], [4,6])
        swap4(sqColor, [1,7], [2,7], [3,7], [4,7])
        swap4(sqColor, [1,8], [2,8], [3,8], [4,8])
    elif comm == "D2":
        swap2(sqColor, [5,0], [5,8])
        swap2(sqColor, [5,2], [5,6])
        swap2(sqColor, [5,1], [5,7])
        swap2(sqColor, [5,3], [5,5])
        swap2(sqColor, [1,6], [3,6])
        swap2(sqColor, [1,7], [3,7])
        swap2(sqColor, [1,8], [3,8])
        swap2(sqColor, [2,6], [4,6])
        swap2(sqColor, [2,7], [4,7])
        swap2(sqColor, [2,8], [4,8])
    elif comm == "D'":
        swap4(sqColor, [5,0], [5,6], [5,8], [5,2])
        swap4(sqColor, [5,1], [5,3], [5,7], [5,5])
        swap4(sqColor, [1,6], [4,6], [3,6], [2,6])
        swap4(sqColor, [1,7], [4,7], [3,7], [2,7])
        swap4(sqColor, [1,8], [4,8], [3,8], [2,8])

    return sqColor

def swap4(sqColor, p0, p1, p2, p3):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p3[0]][p3[1]]
    sqColor[p3[0]][p3[1]] = sqColor[p2[0]][p2[1]]
    sqColor[p2[0]][p2[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp
    return sqColor

def swap2(sqColor, p0, p1):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp
    return sqColor

def setCube(sqColor):
    done = 0
    print('''
 Faces             Positions
  0                  0 1 2
1 2 3 4              3 4 5
  5                  6 7 8

0 = Blue, 1 = Red, 2 = Yellow, 3 = Orange, 4 = White, 5 = Green, 6 = Gray

Use Face = -1 to quit
''')
    face = input('Enter a face to change: ')
    if face in ['0', '1', '2', '3', '4', '5']:
        pos = input('Enter a position to chance: ')
        if pos in ['0', '1', '2', '3', '4', '5', '6', '7', '8']:
            color = input('Enter the color: ')
            if color in ['0', '1', '2', '3', '4', '5', '6']:
                face = int(face)
                pos = int(pos)
                color = int(color)
                sqColor[face][pos] = color
    elif face == '-1':
        done = 1
    return done, sqColor

def buildLib():
    xAxis = [["F"], ["F2"], ["F'"], ["B"], ["B2"], ["B'"],
             ["F", "B"], ["F", "B2"], ["F", "B'"],
             ["F2", "B"], ["F2", "B2"], ["F2", "B'"],                           
             ["F'", "B"], ["F'", "B2"], ["F'", "B'"] ]
    yAxis = [["L"], ["L2"], ["L'"], ["R"], ["R2"], ["R'"],
             ["L", "R"], ["L", "R2"], ["L", "R'"],
             ["L2", "R"], ["L2", "R2"], ["L2", "R'"],                           
             ["L'", "R"], ["L'", "R2"], ["L'", "R'"] ]
    zAxis = [["U"], ["U2"], ["U'"], ["D"], ["D2"], ["D'"],
             ["U", "D"], ["U", "D2"], ["U", "D'"],
             ["U2", "D"], ["U2", "D2"], ["U2", "D'"],                           
             ["U'", "D"], ["U'", "D2"], ["U'", "D'"] ]

    maxDepth = 3

    moveList = [ [move] for move in xAxis+yAxis+zAxis]

    for firstMove in xAxis+yAxis+zAxis:
        thisComm = [ firstMove ]
        done = 0
        goBack = 0
        while done == 0:
            if len( moveList ) % 1000 == 0:
                print(len(moveList))
            if len( thisComm ) < maxDepth and goBack == 0:
                if thisComm[-1] in xAxis:
                    thisComm.append( yAxis[0] )
                else:
                    thisComm.append( xAxis[0] )
            else:
                goBack = 0
                lastMove = thisComm[-1]
                if lastMove in xAxis:
                    if xAxis.index( lastMove ) < len( xAxis ) - 1:
                        thisComm[-1] = xAxis[ xAxis.index( lastMove ) + 1]
                    else:
                        if thisComm[-2] in yAxis:
                            thisComm[-1] = zAxis[0]
                        else:
                            thisComm[-1] = yAxis[0]
                elif lastMove in yAxis:
                    if yAxis.index( lastMove ) < len( yAxis ) - 1:
                        thisComm[-1] = yAxis[ yAxis.index( lastMove ) + 1]
                    else:
                        if thisComm[-1] in xAxis:
                            thisComm[-1] = zAxis[0]
                        else:
                            if thisComm[-2] in xAxis:
                                thisComm[-1] = zAxis[0]
                            else:
                                thisComm = thisComm[:-1]
                                goBack = 1
                elif lastMove in zAxis:
                    if zAxis.index( lastMove ) < len( xAxis) - 1:
                        thisComm[-1] = zAxis[ zAxis.index( lastMove ) + 1]
                    else:
                        thisComm = thisComm[:-1]
                        goBack = 1
            if len( thisComm ) == 1:
                done = 1
            if goBack == 0 and done == 0:
                moveList.append( thisComm[:] )
    flatList = []
    for move in moveList:
        flatList.append( [ item for sublist in move for item in sublist ] )
    pickle_out = open("flatList.pickle", 'wb')
    pickle.dump(flatList, pickle_out)
    pickle_out.close()
    return

def buildPosLib():
    pickle_in = open("flatList.pickle",'rb')
    flatList = pickle.load(pickle_in)
    pickle_in.close()

    posList = [ [ [ i for ii in range(9) ] for i in range(6) ] ]
    solList = [ [] ]

    for moveSeq in flatList:
        if len(solList) % 1000 == 0:
            print(len(solList))
        sqColor = [ [ i for ii in range(9) ] for i in range(6) ]
        for move in moveSeq:
            sqColor = permute(move, sqColor)
        posList.append( [ sqColor ] )
        solList.append( invertMoves( moveSeq ) )

    pickle_out = open("posList.pickle",'wb')
    pickle.dump(posList, pickle_out)
    pickle_out.close()
    pickle_out = open("solList.pickle",'wb')
    pickle.dump(solList, pickle_out)
    pickle_out.close()


def invertMoves( seq ):
    inverted = []
    for i in range(len(seq)):
        j = i+1
        if seq[-j] == "F":
            inverted.append("F'")
        elif seq[-j] == "F'":
            inverted.append("F")
        elif seq[-j] == "B":
            inverted.append("B'")
        elif seq[-j] == "B'":
            inverted.append("B")
        elif seq[-j] == "R":
            inverted.append("R'")
        elif seq[-j] == "R'":
            inverted.append("R")
        elif seq[-j] == "L":
            inverted.append("L'")
        elif seq[-j] == "L'":
            inverted.append("L")
        elif seq[-j] == "U":
            inverted.append("U'")
        elif seq[-j] == "U'":
            inverted.append("U")
        elif seq[-j] == "D":
            inverted.append("D'")
        elif seq[-j] == "D'":
            inverted.append("D")
        else:
            inverted.append( seq[-j] )
    return inverted


### Main program

comm = 0
while comm != 9:
    comm = menu()
    if comm == 1:
        drawIt1(window1, sqPos, sqColor, drawColor)
        playCube(sqColor)
    elif comm == 2:
        finished = 0
        while finished == 0:
            drawIt1(window1, sqPos, sqColor, drawColor)
            finished, sqColor = setCube(sqColor)
        drawIt1(window1, sqPos, sqColor, drawColor)
    elif comm == 6:
        buildPosLib()
    elif comm == 7:
        buildLib()
    elif comm == 8:
        sqColor = [ [ i for ii in range(9) ] for i in range(6) ]
    elif comm == 9:
        quit()

Wednesday, November 16, 2016

"Playable" Cube

This is just a minor update on the Rubik's Cube program. I've programmed all of the moves as well as the ability to define the cube colors to be whatever you want them to be. This process includes a "gray" color which will be used as a "I don't care" color in the search algorithm. Speaking of the search algorithm, I have decided how I'm going to set it up.

There are 18 moves that are hard-coded into the program (3 moves per face * 6 faces). The most straight-forward search method (test each one, then test all two consecutive moves, then test all three consecutive moves) grows exponentially. To search a 4 move depth would require 18^4, which is just over 100,000 attempts. But many of these moves would be duplicates. For example, the pattern UU would have been searched at depth 1 by U2. There are also a lot of "dumb" moves like UU', which accomplishes nothing.

So my first idea is that I'm not going to search using the 18 standard moves, but instead I'm going to think of each move as being rotation around a particular pole. If you imagine the z-axis going up the U and D faces and consider all of the distinct moves that use only those faces, there are a total of 15 different things that can be done: U, U2, U', DU, DU2, DU', D2U, D2U2, D2U', D'U, D'U2, D'U', D, D2, D'. If the next move is along one of the other two axes, it will be impossible for me to have a move that's undone. I also know that if I do something along the z-axis twice in a row, I'm going to get something redundant. So my search tree will have 45*30*30*... total algorithms to check. But I'm guaranteed not to have anything too silly in the search.

One other issue that I'm concerned about is the time it will take to search. The way this is set up, I'm searching for exactly one arrangement (the solved position). It is possible for me to decrease the search time by expanding the space of winning arrangements. Instead of looking for the solved position, I could search for any position that is within one move of being solved. In other words, teach the program to recognize the almost-solved positions as well as the solved positions. Then if it's only one move away, it doesn't need to finish searching the entire tree at that depth before finding the solution. If I do this for all 3-deep positions from being solved, I go from one winning position to 45*30*30 = 40,500 winning positions.

In order to do this, I need to basically start from a solved cube and then apply all 3-deep moves to it. Since I'll be searching cases where certain pieces might not matter, I'll probably need to include the 1-deep and 2-deep moves in the list. But I'll work on that some other day. Here's the code as it stands.
import pygame

# Global Variables
sGap = 5
bGap = 10
boxSize = 30

def posCalc(bS, sG, bG):
    return bS*boxSize + sG*sGap + bG*bGap

# Define drawing data

sqPos = [ [ [posCalc(3,3,1), posCalc(0,1,0)],
            [posCalc(4,4,1), posCalc(0,1,0)],
            [posCalc(5,5,1), posCalc(0,1,0)],
            [posCalc(3,3,1), posCalc(1,2,0)],
            [posCalc(4,4,1), posCalc(1,2,0)],
            [posCalc(5,5,1), posCalc(1,2,0)],
            [posCalc(3,3,1), posCalc(2,3,0)],
            [posCalc(4,4,1), posCalc(2,3,0)],
            [posCalc(5,5,1), posCalc(2,3,0)] ],
          [ [posCalc(0,1,0), posCalc(3,3,1)],
            [posCalc(1,2,0), posCalc(3,3,1)],
            [posCalc(2,3,0), posCalc(3,3,1)],
            [posCalc(0,1,0), posCalc(4,4,1)],
            [posCalc(1,2,0), posCalc(4,4,1)],
            [posCalc(2,3,0), posCalc(4,4,1)],
            [posCalc(0,1,0), posCalc(5,5,1)],
            [posCalc(1,2,0), posCalc(5,5,1)],
            [posCalc(2,3,0), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(3,3,1)],
            [posCalc(4,4,1), posCalc(3,3,1)],
            [posCalc(5,5,1), posCalc(3,3,1)],
            [posCalc(3,3,1), posCalc(4,4,1)],
            [posCalc(4,4,1), posCalc(4,4,1)],
            [posCalc(5,5,1), posCalc(4,4,1)],
            [posCalc(3,3,1), posCalc(5,5,1)],
            [posCalc(4,4,1), posCalc(5,5,1)],
            [posCalc(5,5,1), posCalc(5,5,1)] ],
          [ [posCalc(6,5,2), posCalc(3,3,1)],
            [posCalc(7,6,2), posCalc(3,3,1)],
            [posCalc(8,7,2), posCalc(3,3,1)],
            [posCalc(6,5,2), posCalc(4,4,1)],
            [posCalc(7,6,2), posCalc(4,4,1)],
            [posCalc(8,7,2), posCalc(4,4,1)],
            [posCalc(6,5,2), posCalc(5,5,1)],
            [posCalc(7,6,2), posCalc(5,5,1)],
            [posCalc(8,7,2), posCalc(5,5,1)] ],
          [ [posCalc(9,7,3), posCalc(3,3,1)],
            [posCalc(10,8,3), posCalc(3,3,1)],
            [posCalc(11,9,3), posCalc(3,3,1)],
            [posCalc(9,7,3), posCalc(4,4,1)],
            [posCalc(10,8,3), posCalc(4,4,1)],
            [posCalc(11,9,3), posCalc(4,4,1)],
            [posCalc(9,7,3), posCalc(5,5,1)],
            [posCalc(10,8,3), posCalc(5,5,1)],
            [posCalc(11,9,3), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(6,5,2)],
            [posCalc(4,4,1), posCalc(6,5,2)],
            [posCalc(5,5,1), posCalc(6,5,2)],
            [posCalc(3,3,1), posCalc(7,6,2)],
            [posCalc(4,4,1), posCalc(7,6,2)],
            [posCalc(5,5,1), posCalc(7,6,2)],
            [posCalc(3,3,1), posCalc(8,7,2)],
            [posCalc(4,4,1), posCalc(8,7,2)],
            [posCalc(5,5,1), posCalc(8,7,2)] ]
        ]

sqColor = [ [ i for ii in range(9) ] for i in range(6) ]

drawColor = [ pygame.Color(0,0,255),
              pygame.Color(255,0,0),
              pygame.Color(255,255,0),
              pygame.Color(255,100,0),
              pygame.Color(255,255,255),
              pygame.Color(0,255,0),
              pygame.Color(100,100,100) ]

# Initialize
pygame.init()

window1Width = posCalc(12,10,3)
window1Height = posCalc(9,8,2)
window1 = pygame.display.set_mode((window1Width,window1Height))

# Functions

def menu():
    menuSelect = 0
    while menuSelect == 0:
        print('''
Main Menu:
(1) Free Play
(2) Set up Cube
(8) Reset Cube
(9) Quit
''')
        comm = input('Select Menu Option: ')
        if comm in ['1', '2', '8', '9']:
            menuSelect = 1
    return int(comm)

def drawIt1(window, sqPos, sqColor, drawColor):
    for i in range(6):
        for ii in range(9):
            pygame.draw.rect(window,
                             drawColor[sqColor[i][ii]],
                             ( sqPos[i][ii][0], sqPos[i][ii][1], boxSize, boxSize ),
                             0)
    pygame.display.update()
    return

def playCube(sqColor):
    commands = ["U", "U2", "U'",
                "L", "L2", "L'",
                "F", "F2", "F'",
                "R", "R2", "R'",
                "B", "B2", "B'",
                "D", "D2", "D'"]
    commList = "U, U2, U', L, L2, L', F, F2, F', R, R2, R', B, B2, B', D, D2, D'"
    comm = ""
    while comm != "Q":
        print("Execute which command?")
        print(commList)
        comm = input('Select a move or Q to quit: ' )
        if comm in commands:
            sqColor = permute(comm, sqColor)
        drawIt1(window1, sqPos, sqColor, drawColor)

def permute(comm, sqColor):
    if comm == "U":
        swap4(sqColor, [0,0], [0,2], [0,8], [0,6])
        swap4(sqColor, [0,1], [0,5], [0,7], [0,3])
        swap4(sqColor, [4,0], [3,0], [2,0], [1,0])
        swap4(sqColor, [4,1], [3,1], [2,1], [1,1])
        swap4(sqColor, [4,2], [3,2], [2,2], [1,2])
    elif comm == "U2":
        swap2(sqColor, [0,0], [0,8])
        swap2(sqColor, [0,2], [0,6])
        swap2(sqColor, [0,1], [0,7])
        swap2(sqColor, [0,3], [0,5])
        swap2(sqColor, [1,0], [3,0])
        swap2(sqColor, [1,1], [3,1])
        swap2(sqColor, [1,2], [3,2])
        swap2(sqColor, [2,0], [4,0])
        swap2(sqColor, [2,1], [4,1])
        swap2(sqColor, [2,2], [4,2])
    elif comm == "U'":
        swap4(sqColor, [0,0], [0,6], [0,8], [0,2])
        swap4(sqColor, [0,1], [0,3], [0,7], [0,5])
        swap4(sqColor, [1,0], [2,0], [3,0], [4,0])
        swap4(sqColor, [1,1], [2,1], [3,1], [4,1])
        swap4(sqColor, [1,2], [2,2], [3,2], [4,2])
    elif comm == "L":
        swap4(sqColor, [1,0], [1,2], [1,8], [1,6])
        swap4(sqColor, [1,1], [1,5], [1,7], [1,3])
        swap4(sqColor, [0,0], [2,0], [5,0], [4,8])
        swap4(sqColor, [0,3], [2,3], [5,3], [4,5])
        swap4(sqColor, [0,6], [2,6], [5,6], [4,2])
    elif comm == "L2":
        swap2(sqColor, [1,0], [1,8])
        swap2(sqColor, [1,2], [1,6])
        swap2(sqColor, [1,1], [1,7])
        swap2(sqColor, [1,3], [1,5])
        swap2(sqColor, [2,0], [4,8])
        swap2(sqColor, [2,3], [4,5])
        swap2(sqColor, [2,6], [4,2])
        swap2(sqColor, [0,0], [5,0])
        swap2(sqColor, [0,3], [5,3])
        swap2(sqColor, [0,6], [5,6])
    elif comm == "L'":
        swap4(sqColor, [1,0], [1,6], [1,8], [1,2])
        swap4(sqColor, [1,1], [1,3], [1,7], [1,5])
        swap4(sqColor, [5,0], [2,0], [0,0], [4,8])
        swap4(sqColor, [5,3], [2,3], [0,3], [4,5])
        swap4(sqColor, [5,6], [2,6], [0,6], [4,2])
    elif comm == "F":
        swap4(sqColor, [2,0], [2,2], [2,8], [2,6])
        swap4(sqColor, [2,1], [2,5], [2,7], [2,3])
        swap4(sqColor, [0,6], [3,0], [5,2], [1,8])
        swap4(sqColor, [0,7], [3,3], [5,1], [1,5])
        swap4(sqColor, [0,8], [3,6], [5,0], [1,2])
    elif comm == "F2":
        swap2(sqColor, [2,0], [2,8])
        swap2(sqColor, [2,2], [2,6])
        swap2(sqColor, [2,1], [2,7])
        swap2(sqColor, [2,3], [2,5])
        swap2(sqColor, [0,6], [5,2])
        swap2(sqColor, [0,7], [5,1])
        swap2(sqColor, [0,8], [5,0])
        swap2(sqColor, [1,8], [3,0])
        swap2(sqColor, [1,5], [3,3])
        swap2(sqColor, [1,2], [3,6])
    elif comm == "F'":
        swap4(sqColor, [2,0], [2,6], [2,8], [2,2])
        swap4(sqColor, [2,1], [2,3], [2,7], [2,5])
        swap4(sqColor, [0,6], [1,8], [5,2], [3,0])
        swap4(sqColor, [0,7], [1,5], [5,1], [3,3])
        swap4(sqColor, [0,8], [1,2], [5,0], [3,6])
    elif comm == "R":
        swap4(sqColor, [3,0], [3,2], [3,8], [3,6])
        swap4(sqColor, [3,1], [3,5], [3,7], [3,3])
        swap4(sqColor, [0,8], [4,0], [5,8], [2,8])
        swap4(sqColor, [0,5], [4,3], [5,5], [2,5])
        swap4(sqColor, [0,2], [4,6], [5,2], [2,2])
    elif comm == "R2":
        swap2(sqColor, [3,0], [3,8])
        swap2(sqColor, [3,2], [3,6])
        swap2(sqColor, [3,1], [3,7])
        swap2(sqColor, [3,3], [3,5])
        swap2(sqColor, [2,2], [4,6])
        swap2(sqColor, [2,5], [4,3])
        swap2(sqColor, [2,8], [4,0])
        swap2(sqColor, [0,8], [5,8])
        swap2(sqColor, [0,5], [5,5])
        swap2(sqColor, [0,2], [5,2])
    elif comm == "R'":
        swap4(sqColor, [3,0], [3,6], [3,8], [3,2])
        swap4(sqColor, [3,1], [3,3], [3,7], [3,5])
        swap4(sqColor, [0,8], [2,8], [5,8], [4,0])
        swap4(sqColor, [0,5], [2,5], [5,5], [4,3])
        swap4(sqColor, [0,2], [2,2], [5,2], [4,6])
    elif comm == "B":
        swap4(sqColor, [4,0], [4,2], [4,8], [4,6])
        swap4(sqColor, [4,1], [4,5], [4,7], [4,3])
        swap4(sqColor, [0,2], [1,0], [5,6], [3,8])
        swap4(sqColor, [0,1], [1,3], [5,7], [3,5])
        swap4(sqColor, [0,0], [1,6], [5,8], [3,2])
    elif comm == "B2":
        swap2(sqColor, [4,0], [4,8])
        swap2(sqColor, [4,2], [4,6])
        swap2(sqColor, [4,1], [4,7])
        swap2(sqColor, [4,3], [4,5])
        swap2(sqColor, [0,2], [5,6])
        swap2(sqColor, [0,1], [5,7])
        swap2(sqColor, [0,0], [5,8])
        swap2(sqColor, [1,0], [3,8])
        swap2(sqColor, [1,3], [3,5])
        swap2(sqColor, [1,6], [3,2])
    elif comm == "B'":
        swap4(sqColor, [4,0], [4,6], [4,8], [4,2])
        swap4(sqColor, [4,1], [4,3], [4,7], [4,5])
        swap4(sqColor, [0,2], [3,8], [5,6], [1,0])
        swap4(sqColor, [0,1], [3,5], [5,7], [1,3])
        swap4(sqColor, [0,0], [3,2], [5,8], [1,6])
    elif comm == "D":
        swap4(sqColor, [5,0], [5,2], [5,8], [5,6])
        swap4(sqColor, [5,1], [5,5], [5,7], [5,3])
        swap4(sqColor, [1,6], [2,6], [3,6], [4,6])
        swap4(sqColor, [1,7], [2,7], [3,7], [4,7])
        swap4(sqColor, [1,8], [2,8], [3,8], [4,8])
    elif comm == "D2":
        swap2(sqColor, [5,0], [5,8])
        swap2(sqColor, [5,2], [5,6])
        swap2(sqColor, [5,1], [5,7])
        swap2(sqColor, [5,3], [5,5])
        swap2(sqColor, [1,6], [3,6])
        swap2(sqColor, [1,7], [3,7])
        swap2(sqColor, [1,8], [3,8])
        swap2(sqColor, [2,6], [4,6])
        swap2(sqColor, [2,7], [4,7])
        swap2(sqColor, [2,8], [4,8])
    elif comm == "D'":
        swap4(sqColor, [5,0], [5,6], [5,8], [5,2])
        swap4(sqColor, [5,1], [5,3], [5,7], [5,5])
        swap4(sqColor, [1,6], [4,6], [3,6], [2,6])
        swap4(sqColor, [1,7], [4,7], [3,7], [2,7])
        swap4(sqColor, [1,8], [4,8], [3,8], [2,8])

    return sqColor

def swap4(sqColor, p0, p1, p2, p3):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p3[0]][p3[1]]
    sqColor[p3[0]][p3[1]] = sqColor[p2[0]][p2[1]]
    sqColor[p2[0]][p2[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp
    return sqColor

def swap2(sqColor, p0, p1):
    temp = sqColor[p0[0]][p0[1]]
    sqColor[p0[0]][p0[1]] = sqColor[p1[0]][p1[1]]
    sqColor[p1[0]][p1[1]] = temp
    return sqColor

def setCube(sqColor):
    done = 0
    print('''
 Faces             Positions
  0                  0 1 2
1 2 3 4              3 4 5
  5                  6 7 8

0 = Blue, 1 = Red, 2 = Yellow, 3 = Orange, 4 = White, 5 = Green, 6 = Gray

Use Face = -1 to quit
''')
    face = input('Enter a face to change: ')
    if face in ['0', '1', '2', '3', '4', '5']:
        pos = input('Enter a position to chance: ')
        if pos in ['0', '1', '2', '3', '4', '5', '6', '7', '8']:
            color = input('Enter the color: ')
            if color in ['0', '1', '2', '3', '4', '5', '6']:
                face = int(face)
                pos = int(pos)
                color = int(color)
                sqColor[face][pos] = color
    elif face == '-1':
        done = 1
    return done, sqColor

### Main program

comm = 0
while comm != 9:
    comm = menu()
    print(comm)
    if comm == 1:
        print('Free Play!!!')
        drawIt1(window1, sqPos, sqColor, drawColor)
        playCube(sqColor)
    elif comm == 2:
        finished = 0
        while finished == 0:
            drawIt1(window1, sqPos, sqColor, drawColor)
            finished, sqColor = setCube(sqColor)
    elif comm == 8:
        sqColor = [ [ i for ii in range(9) ] for i in range(6) ]
    elif comm == 9:
        quit()

Monday, November 7, 2016

Displaying the Unfolded Cube

There isn't that much to this first step. All I wanted to do was to be able to display an unfolded cube. But because I didn't want to have to calculate the positions of everything by hand, I wrote a function that let me compute the locations of the squares based on three constants: the small gap size (between squares on the same face), the large gap size (between the faces), and the box size (the actual boxes). After having done all of this, I think I probably could have written a more general position calculator, but I'm not going to go back and rewrite this in order to do that. Also, I think this is the first time I've used global variables. My understanding is that if you define a variable before you define functions, you get a global variable. That seems to be what happens, anyway.

I've defined the colors of the faces to match the physical cube that I have (with blue on top... for now). The locations are given by a two dimensional array. The first dimension gives the face (0 = U, 1 = L, 2 = F, 3 = R, 4 = B, 5 = D) and the second dimension gives the position (0-8 starting from the top left and working across the rows and down the columns). I also added a 7th color (gray) for when I'm searching for moves that permute certain pieces but that do not care what happens to certain other pieces.

The next task will be to program in the permutations obtained from the 18 basic movements. I thought about just programming in the 6 clockwise rotations and then having the others generated by repeating these, but I suspect that I'll be checking millions of positions, so that would just make things inefficient. When it comes time to writing the search algorithm, I may need to also write in a "don't be dumb" check that doesn't let me do two moves on the same face in a row. But I'll get to that one probably in the next few days.

I've set myself up to have two windows. One to show the unfolded cube and the other to show a 3D cube. I actually want to use the second window to display move sequences with a visual aid to help me get it right. It's not necessary, but it just seemed like a fun thing to try. That will be done towards the end of the project.

Here's the existing code:
import pygame

# Global Variables
sGap = 5
bGap = 10
boxSize = 30

# Functions

def posCalc(bS, sG, bG):
    return bS*boxSize + sG*sGap + bG*bGap

def drawIt1(window, sqPos, sqColor, drawColor):
    for i in range(6):
        for ii in range(9):
            pygame.draw.rect(window,
                             drawColor[sqColor[i][ii]],
                             ( sqPos[i][ii][0], sqPos[i][ii][1], boxSize, boxSize ),
                             0)
    pygame.display.update()
    return


# Define drawing data

sqPos = [ [ [posCalc(3,3,1), posCalc(0,1,0)],
            [posCalc(4,4,1), posCalc(0,1,0)],
            [posCalc(5,5,1), posCalc(0,1,0)],
            [posCalc(3,3,1), posCalc(1,2,0)],
            [posCalc(4,4,1), posCalc(1,2,0)],
            [posCalc(5,5,1), posCalc(1,2,0)],
            [posCalc(3,3,1), posCalc(2,3,0)],
            [posCalc(4,4,1), posCalc(2,3,0)],
            [posCalc(5,5,1), posCalc(2,3,0)] ],
          [ [posCalc(0,1,0), posCalc(3,3,1)],
            [posCalc(1,2,0), posCalc(3,3,1)],
            [posCalc(2,3,0), posCalc(3,3,1)],
            [posCalc(0,1,0), posCalc(4,4,1)],
            [posCalc(1,2,0), posCalc(4,4,1)],
            [posCalc(2,3,0), posCalc(4,4,1)],
            [posCalc(0,1,0), posCalc(5,5,1)],
            [posCalc(1,2,0), posCalc(5,5,1)],
            [posCalc(2,3,0), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(3,3,1)],
            [posCalc(4,4,1), posCalc(3,3,1)],
            [posCalc(5,5,1), posCalc(3,3,1)],
            [posCalc(3,3,1), posCalc(4,4,1)],
            [posCalc(4,4,1), posCalc(4,4,1)],
            [posCalc(5,5,1), posCalc(4,4,1)],
            [posCalc(3,3,1), posCalc(5,5,1)],
            [posCalc(4,4,1), posCalc(5,5,1)],
            [posCalc(5,5,1), posCalc(5,5,1)] ],
          [ [posCalc(6,5,2), posCalc(3,3,1)],
            [posCalc(7,6,2), posCalc(3,3,1)],
            [posCalc(8,7,2), posCalc(3,3,1)],
            [posCalc(6,5,2), posCalc(4,4,1)],
            [posCalc(7,6,2), posCalc(4,4,1)],
            [posCalc(8,7,2), posCalc(4,4,1)],
            [posCalc(6,5,2), posCalc(5,5,1)],
            [posCalc(7,6,2), posCalc(5,5,1)],
            [posCalc(8,7,2), posCalc(5,5,1)] ],
          [ [posCalc(9,7,3), posCalc(3,3,1)],
            [posCalc(10,8,3), posCalc(3,3,1)],
            [posCalc(11,9,3), posCalc(3,3,1)],
            [posCalc(9,7,3), posCalc(4,4,1)],
            [posCalc(10,8,3), posCalc(4,4,1)],
            [posCalc(11,9,3), posCalc(4,4,1)],
            [posCalc(9,7,3), posCalc(5,5,1)],
            [posCalc(10,8,3), posCalc(5,5,1)],
            [posCalc(11,9,3), posCalc(5,5,1)] ],
          [ [posCalc(3,3,1), posCalc(6,5,2)],
            [posCalc(4,4,1), posCalc(6,5,2)],
            [posCalc(5,5,1), posCalc(6,5,2)],
            [posCalc(3,3,1), posCalc(7,6,2)],
            [posCalc(4,4,1), posCalc(7,6,2)],
            [posCalc(5,5,1), posCalc(7,6,2)],
            [posCalc(3,3,1), posCalc(8,7,2)],
            [posCalc(4,4,1), posCalc(8,7,2)],
            [posCalc(5,5,1), posCalc(8,7,2)] ]
        ]

sqColor = [ [ i for ii in range(9) ] for i in range(6) ]

drawColor = [ pygame.Color(0,0,255),
              pygame.Color(255,0,0),
              pygame.Color(255,255,0),
              pygame.Color(255,140,0),
              pygame.Color(255,255,255),
              pygame.Color(0,255,0),
              pygame.Color(100,100,100) ]

### Main program
# Initialize
pygame.init()

window1Width = posCalc(12,10,3)
window1Height = posCalc(9,8,2)
window1 = pygame.display.set_mode((window1Width,window1Height))
drawIt1(window1, sqPos, sqColor, drawColor)

Sunday, November 6, 2016

Speed Cubing Project

It's been a long time since I've last pulled out the Raspberry Pi to play with it. I've been watching a lot of videos about Python from PythonProgramming.net, which has been extremely enlightening (though I haven't actually done anything with it). For a short period of time I worked on some programming stuff to try to get some horse racing data, but I just don't have the free time to put into these projects as I would like.

However, I think I found a project that's at an appropriate level for the time that I have. Over the summer, I discovered that my friend's son got interested in speed cubing. I played around with the Rubik's cube a whole lot when I was in high school, and so this discovery piqued my interest a bit. Then this semester I discovered that I had a student that was interested in speed cubing. And I think that put things over the top. So I bought myself a proper speed cube, which arrived in the mail today.

There are basically two aspects of speed cubing. One is just the raw memorization of algorithms to move the pieces to certain places, and the other is the physical technique of manipulating the cube. So my idea is to create a program that's both an algorithm search program as well as an algorithm optimizer. It's true that there are lots of algorithms published on the internet to do all sorts of things. I suspect that those are fairly optimal with regards to the number of moves it takes to execute them. But they may not be optimal with regards to the actual physical manipulation of the cube. For example, RUR' is a very fluid motion, but BUB' is not. I actually think it would be faster to physically rotate the cube and then do RUR' even though the latter is four movements.

So I envision building out this program in steps. I'm going to first build the algorithm search engine, and then I'm going modify that engine so that it takes into account some data about the speed of movements. Or perhaps in the process of finding algorithms, I may discover that there are some moves that I simply need to practice. For example, repeating RUR'U' is a common way to just fiddle around with the cube. And by doing this over and over again, there's a bit of muscle memory that's created to make patterns such as RUR' become more fluid and more automatic. So I might have to just create patterns like that to play with.

This project is somewhat reminiscent of a project I did back in high school. I was frustrated with a permutation game called Triple Cross, and so I programmed a computer to execute a search for an algorithm to solve it. For that game, there were really only four basic motions, so it was a simple tree which made writing the search algorithm easy. The Rubik's Cube has 18 motions, so there will be a much slower search process. However, since computers are significantly faster than they were back in 1998, it's possible that this won't really be an issue. As I'm thinking about potential pitfalls along the way, I'm wondering whether I'll need to make a library of permutations to speed up the search process.

I'm also thinking about developing a display module to show the moves just to get myself used to thinking about the moves using the standard language and notation for speed cubing. If I'm holding a cube in front of me, it's far from natural for me to manipulate it by just reading a string of letters. That display module will probably just come from making drawings using PyGame. It shouldn't be that hard.

Friday, March 18, 2016

Perceptron Learning Algorithm

I haven't given up on the TTT project. But I've decided to take some time to play with a few other things and come back to the TTT later.

I've been reading through "Python Machine Learning" by Sebastian Raschka. There are lots of code examples that are provided, but in order for me to ensure that I'm understanding things, I'm trying to rewrite pieces of code on my own. The rewritten code will almost certainly be less efficient, but that's not really the point. I want to understand this stuff.

Chapter 2 shows a programming example of the perceptron learning algorithm. I've rewritten that algorithm using lists instead of arrays. This is partly that desire to make sure I understand (lists give me a natural way to perform calculations in a sequential manner, allowing me to break it down step-by-step), and partly because I'm just a little bit lazy and am not immediately interested in learning the syntax of arrays in numpy. I'll have to get there eventually, but right now I'm more interested in the learning algorithms. Besides, if I'm ever in search of an application, I'd probably just use pre-existing code.

The perceptron algorithm is a classification algorithm. It works with data that is linearly separable. Imagine a line on the plane, and one side has only Xs and the other has only Os. The perceptron algorithm seeks to generate that line (or rather, a line that can also serve as the boundary between the data points). The algorithm generates a vector w whose length is one longer than the dimension of the space. To predict whether the outcome is an X or an O, the dot product of w and a vector made of 1 followed by the coordinates of the point is calculated. The prediction is based on whether the result of that dot product is positive or negative.

The algorithm is written in two pieces. One command is called "predict" and this takes the weight vector and a location, and it outputs either a 1 or a -1 depending on the value of the dot product. The other piece is the actual learning algorithm. If I did this correctly, I've actually found an error in the book. The book claims that the algorithm should update the weights simultaneously, but based on what I programmed, it doesn't do that. There may be some technicality in how it runs as the author wrote it, but I'm not interested in pursuing that too deeply.

The next step for me is to rewrite the program that graphically displays this information. The visualization piece is important for me to learn. As a side note, I did not program this on the Raspberry Pi. However, it should run just fine because it's only using lists. I'm not so sure what will happen when I get to the visualization piece, but we'll see what happens when I get there.

def PerceptronFit(eta, n_iter, X, y):
    '''
    X is a list where each list element is a list containing the values of a particular sample
    -- This is the input data
    y is a list
    -- This list represents the correct classifications
    '''
 
    # Make an empty list that has one more slot than the number of features
    # This list holds the weights
    w = [0.0] * (len(X[0]) + 1)
 
    # Make an empty list that will track the number of errors
    errors = []
 
    # Make an empty list that will retain the weights for each iteration
    w_iter = []
 
    # Begin learning
    for iteration in range(n_iter):
        n_errors = 0
        cur_w = w[:]
        # changing cur_w to just using w will change the code to match the book's output
        for xi, target in zip(X, y):
            update = eta * (target - predict(cur_w, xi))
            w[0] += update
            for n in range(1,len(w)):
                w[n] += update * xi[n-1]
            if update != 0:
                n_errors += 1
        errors.append(n_errors)
        w_iter.append(w[:])
    return w_iter, errors
 
# Function to calculate the prediction {0,1} based on the weight and the input
def predict(w, xi):
    net_input = w[0]
    for n in range(1, len(w)):
        net_input += w[n] * xi[n-1]
    if net_input >= 0.0:
        return 1
    else:
        return -1

Monday, February 15, 2016

Synchronous Value Iteration is a success for Tic-Tac-Toe

It took a while, but I finally got the programming bugs worked out to make VI a success. I made a few mistakes in the setup, from not properly understanding what happens in Python when you set lists equal to each other, to making a conceptual error in the evaluation of the values of various positions. However, none of that matters now because those problems have all been worked out.

Here are some of the highlights of the results:
  • It works! Or at least, it mostly works. It successfully figured out to block every play that's one away from winning. I've played out some sample games below, but I've also spot-checked a few other positions to see if the computer would make the "obvious" decision.
  • The assumptions that it makes is that the opposing player will pick randomly. This means that it did not learn any opponent-specific information. There are some interesting consequences to this, such as the second player making a mistake if you open with an edge move.
  • The "best" decision is not what's worst for the opposing player. The machine learning algorithm wasn't constructed with a two player game in mind. There are situations that the first player analyzes in a certain way because of the assumption that the opponent will play randomly. It does not make a decision based on what move puts the opponent in the worst possible position.
  • The program resolves both player's actions, not just the first player. This was accomplished by having the second player minimize the value and the first player maximize the value.
  • I also set draws to have a value of zero. I had originally wanted to set draws to be -1 to punish the first player for not winning. But when I found myself solving both players at the same time, it seemed more sensible to do this.
  • The values converged at the end of the 4th epoch. Just as a side note, the 0th epoch is where V(s) = 0 for every state s.
Here are a few games played using the calculated policy.
  • X opens on the corner, O plays the center, X picks an adjacent edge, and the game plays itself to a draw. (This is how every game would play if the computer just plays itself.)
  • X opens on an edge, O plays an opposite corner (this is a mistake -- playing the center forces a draw), X plays the corner on same side as both the first X and the O, which can be played to a win for X.
  • X opens in the center, O takes a corner, and X picks up the edge next to the O. After O blocks, X takes the other edge next to the first O, and this ends in a draw.
Playing to do maximal harm has X opening in the center (and playing to a draw). If X opens on the corner, O's maximal harm move is the center (and all moves at the next level are equally harmful).

Lastly, here's the code. It's built off of values that were created by the position enumeration program. Once again, it's not thoroughly commented.
### Load States and Rewards
def LoadSR():
    S = ['000000000']
    R = [0]

    inputFile = open("TTTValidGames.txt","r")

    # For each line of input, convert it to a state string and extract
    # a reward
    for line in inputFile:
        game = line.split()
        S.append(game[0]+game[1]+game[2] \
                 +game[3]+game[4]+game[5]\
                 +game[6]+game[7]+game[8])
        if game[9] == "1":
            R.append(1.0)
        elif game[9] == "2":
            R.append(-1.0)
        elif game[9] == "-1":
            R.append(0.0)
        else:
            R.append(0.0)
    inputFile.close()
    return S, R

### Determine whose turn it is
def WhoseTurn(state):
    CountMoves = 0
    for i in range(0,len(state)):
        if state[i] != '0':
            CountMoves += 1
    if CountMoves % 2 == 0:
        return '1'
    else:
        return '2'


### Calculate OutStates and Transitions
def GetOutStatesAndMoves(S):
    OutStates = []
    OutMoves = []
    for state in S:
        game = StateToInt(state)
        XWin = CheckXWin(game)
        OWin = CheckOWin(game)
        if XWin == 1 or OWin == 2 or XWin == -1:
            OutStates.append([])
            OutMoves.append([])
        else:
            MovesFromHere = []
            PositionsFromHere = []
            player = WhoseTurn(state)

            for i in range(0, len(state)):
                if state[i] == '0':
                    tempstate = ""
                    for j in range(0,i):
                        tempstate += state[j]
                    tempstate += player
                    for j in range(i+1, len(state)):
                        tempstate += state[j]
                    if ValidPosition(tempstate) == 1:
                        MovesFromHere.append(i)
                        PositionsFromHere.append(tempstate)
            OutStates.append(PositionsFromHere)
            OutMoves.append(MovesFromHere)
    return OutStates, OutMoves

### Calculate transition probabilities
def GetTransitionProbs(Moves):
    Probs = []
    for MoveList in Moves:
        if len(MoveList) > 0:
            ThisProb = [1.0/(len(MoveList))] * len(MoveList)
            Probs.append(ThisProb)
        else:
            Probs.append([])
    return Probs

### Convert state string to numerical array
def StateToInt(state):
    game = []
    for i in range(0, len(state)):
        game.append(int(state[i]))
    return game

### Check if a position is valid
def ValidPosition(InState):
    game = StateToInt(InState)
    XWin = CheckXWin(game)
    OWin = CheckOWin(game)
    if XWin == 1 and OWin == 2:
        return 0
    return 1

# Check if X wins
def CheckXWin(game):
    winner = 0
    if game[0]*game[1]*game[2] == 1 \
      or game[3]*game[4]*game[5] == 1 \
      or game[6]*game[7]*game[8] == 1 \
      or game[0]*game[3]*game[6] == 1 \
      or game[1]*game[4]*game[7] == 1 \
      or game[2]*game[5]*game[8] == 1 \
      or game[0]*game[4]*game[8] == 1 \
      or game[2]*game[4]*game[6] == 1:
        winner = 1
    elif game[0]*game[1]*game[2]*game[3]*game[4]*game[5]*game[6]*game[7]*game[8] > 0:
        winner = -1
    return winner

# Check if O wins
def CheckOWin(game):
    winner = 0
    if game[0]*game[1]*game[2] == 8 \
      or game[3]*game[4]*game[5] == 8 \
      or game[6]*game[7]*game[8] == 8 \
      or game[0]*game[3]*game[6] == 8 \
      or game[1]*game[4]*game[7] == 8 \
      or game[2]*game[5]*game[8] == 8 \
      or game[0]*game[4]*game[8] == 8 \
      or game[2]*game[4]*game[6] == 8:
        winner = 2
    elif game[0]*game[1]*game[2]*game[3]*game[4]*game[5]*game[6]*game[7]*game[8] > 0:
        winner = -1
    return winner

### Main Program
max_epochs = 20
gamma = 0.9

S, R = LoadSR()
OutStateList, MoveList = GetOutStatesAndMoves(S)
TProb = GetTransitionProbs(MoveList)

# Begin Learning
VOld = [0.0] * len(S)
VNew = [0.0] * len(S)
P = [-2] * len(S)

for epoch in range(0,max_epochs):
    for state in S:
        stateIndex = S.index(state)
        if len(OutStateList[stateIndex]) == 0:
            VNew[stateIndex] = R[stateIndex]
        else:
            comp = []
            if WhoseTurn(S[stateIndex]) == '1':
                for secondState in OutStateList[stateIndex]:
                    secondIndex = S.index(secondState)
                    if len(OutStateList[secondIndex]) == 0:
                        comp.append(R[secondIndex])
                    else:
                        tempValue = 0.0
                        for i in range(0, len(OutStateList[secondIndex])):
                            tempIndex = S.index(OutStateList[secondIndex][i])
                            tempValue += TProb[secondIndex][i] * VOld[tempIndex]
                        comp.append(tempValue)
                VNew[stateIndex] = R[stateIndex] + gamma * max(comp)
                P[stateIndex] = MoveList[stateIndex][comp.index(max(comp))]
            else:
                for secondState in OutStateList[stateIndex]:
                    secondIndex = S.index(secondState)
                    if len(OutStateList[secondIndex]) == 0:
                        comp.append(R[secondIndex])
                    else:
                        tempValue = 0.0
                        for i in range(0, len(OutStateList[secondIndex])):
                            tempIndex = S.index(OutStateList[secondIndex][i])
                            tempValue += TProb[secondIndex][i] * VOld[tempIndex]
                        comp.append(tempValue)
                VNew[stateIndex] = R[stateIndex] + gamma * min(comp)
                P[stateIndex] = MoveList[stateIndex][comp.index(min(comp))]
#        print epoch, stateIndex, state, WhoseTurn(S[stateIndex]), VOld[stateIndex], VNew[stateIndex], P[stateIndex]
    VOld = VNew[:]
    print epoch

    OutFile = open("SynchVI"+str(epoch)+".csv", "w")
    OutFile.write("state, P, V\n")
    for i in range(0, len(S)):
        OutFile.write(S[i])
        OutFile.write(", ")
        OutFile.write(str(P[i]))
        OutFile.write(", ")
        OutFile.write(str(VNew[i]))
        OutFile.write("\n")
    OutFile.close()