Temporal Difference Learning
Replication of experiments from Richard S. Sutton’s 1988 paper “Learning to Predict by the Methods of Temporal Differences”.
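For reference, a minimal sketch of the paper's TD(λ) update, Δw_t = α(P_{t+1} − P_t) Σ_{k≤t} λ^(t−k) ∇_w P_k, specialized to linear predictions over one-hot state vectors as in the bounded random walk; the function name and the default alpha/lam values are illustrative, not taken from the replication.

```python
import numpy as np

def td_lambda_episode(w, states, z, alpha=0.1, lam=0.3):
    """One episode of Sutton's TD(lambda) for linear predictions P_t = w . x_t.

    states: list of one-hot vectors for the nonterminal states visited.
    z: the episode's outcome (e.g. 0 or 1 in the bounded random walk).
    Weight changes are accumulated over the episode and applied at the end."""
    e = np.zeros_like(w)   # eligibility trace: sum_k lam^(t-k) * grad_w P_k
    dw = np.zeros_like(w)
    for t, x in enumerate(states):
        e = lam * e + x                       # decay trace, add current gradient
        p_t = w @ x
        p_next = z if t == len(states) - 1 else w @ states[t + 1]
        dw += alpha * (p_next - p_t) * e      # TD error times trace
    return w + dw
```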
Two Markov Decision Processes of varying complexity were used to analyze value iteration, policy iteration and Q-learning.
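As a sketch of the first of those algorithms, value iteration on a tabular MDP; the array shapes, discount factor, and tolerance below are assumptions for illustration, not the report's settings.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Value iteration on a finite MDP.

    P: transition probabilities, shape (n_actions, n_states, n_states).
    R: expected reward for action a in state s, shape (n_actions, n_states).
    Returns the optimal state values and the greedy policy."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)        # Bellman optimality backup per (a, s)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new
```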
Two clustering algorithms (k-means, expectation maximization) and four dimensionality reduction algorithms (PCA, ICA, Randomized Projections, Information Gain) were implemented and explored.
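One way such a pipeline could be wired together with scikit-learn, using PCA as the example reducer and a Gaussian mixture for expectation maximization; the component counts, cluster counts, and function name are placeholders rather than the report's choices.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def cluster_after_pca(X, n_components=2, n_clusters=3, seed=0):
    """Project X onto its first principal components, then cluster the
    projection with k-means and with EM (a Gaussian mixture model)."""
    X_red = PCA(n_components=n_components).fit_transform(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    em = GaussianMixture(n_components=n_clusters, random_state=seed)
    return km.fit_predict(X_red), em.fit_predict(X_red)
```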
The report takes a closer look at randomized optimization (randomized hill climbing, genetic algorithms, simulated annealing) on the previously used datasets as well as on some classical computer science problems.
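As an illustration of one of those optimizers, a minimal simulated-annealing loop; the geometric cooling rate, starting temperature, and step count are assumed defaults rather than the report's settings.

```python
import math
import random

def simulated_annealing(f, neighbor, x0, t0=1.0, cooling=0.995, steps=10_000):
    """Minimize f: always accept improvements, and accept a worse neighbor
    with probability exp(-delta / T), where T decays geometrically."""
    x, fx = x0, f(x0)
    best, best_f, t = x, fx, t0
    for _ in range(steps):
        cand = neighbor(x)
        fc = f(cand)
        delta = fc - fx
        if delta < 0 or random.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < best_f:
                best, best_f = x, fx
        t *= cooling   # geometric cooling schedule
    return best, best_f
```

Randomized hill climbing is the special case where only improving neighbors are ever accepted.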
Five classification algorithms (decision trees, boosting, k-nearest neighbors, support vector machines and neural networks) were compared and contrasted on two different datasets.
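A sketch of how such a comparison might be run with scikit-learn cross-validation; the breast-cancer dataset is a stand-in for the report's two datasets, and all hyperparameters are library defaults.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
models = {
    "decision tree": DecisionTreeClassifier(),
    "boosting": AdaBoostClassifier(),
    "k-nearest neighbors": KNeighborsClassifier(),
    # SVMs and neural networks benefit from standardized features
    "support vector machine": make_pipeline(StandardScaler(), SVC()),
    "neural network": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```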