Abstract:
Go is a fascinating research domain for exploring new approaches in Artificial Intelligence (AI). In spite of its very simple rules, computer Go is much more difficult to implement than computer chess, because Go is an extremely complex strategic board game with a vast search space.
The traditional AI approach to a board game is to search the space of all possible moves for the best move sequence leading to an advantageous position, typically with heuristic search methods. Although Go is a complete-knowledge, deterministic, strategic game, its extreme complexity makes these approaches infeasible in computer Go. Computer Go must therefore be tackled with Go knowledge-intensive approaches rather than just the heuristic search methods applied to computer chess.
A great deal of Go knowledge has been accumulated over the past several centuries by human master players, most of it implicit in the form of expert games. The patterns recognised by human players are more than just stones and empty spaces: players intuitively perceive the life-and-death status of surrounded groups, and select an effective next move in a given board position using a priori knowledge of Go concepts such as influence, the safety of groups, and connectedness between groups. This visual nature of Go is easy for human perception but hard to model with computers.
In this thesis, we classify Go knowledge into implicit Go knowledge and explicit Go knowledge. Because implicit Go knowledge is embedded in prototypical professional Go games as pattern knowledge, we use pattern recognition to infer it from professional Go games. Explicit Go knowledge, meanwhile, is represented as Go propositions (including Go proverbs, Go maxims and Go terms), and we therefore apply Go term knowledge in a rule-based fuzzy reasoning model to solve problems in the opening game.
This thesis discusses and implements a learning model, a reasoning model and a heuristic searching model. The learning model is a neural network for Temporal Difference (TD) learning with pattern recognition capability. The reasoning model includes a neuro-fuzzy controller for fuzzy reasoning using Go term knowledge based on pattern knowledge. Both models are applied to the full-sized (19x19) opening game of Go instead of a simplified version (e.g. 9x9), which is often used to study AI methods in Go. Additionally, the heuristic searching model is applied to solving life-and-death problems in a local domain.
Firstly, we analyse the feasibility of applying a TD(λ) learning model to the opening game of Go. We implement TD(λ) learning with a neural network to predict the next stone moves, to evaluate different opening styles and to find the most favourable opening game for black, and then evaluate the performance of TD(λ). The empirical results for predicting the next stone moves are promising, but there is no guarantee that TD(λ) will always pick the same opening game, i.e. one of the most favourable opening games for black, independent of the λ value. The competition between two TD(λ) players shows that, with only a few games played between them, it is difficult to say clearly which is better.
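As a rough illustration of the learning rule involved (not the thesis implementation, which uses a neural network rather than a table), a tabular TD(λ) update with eligibility traces can be sketched as follows; the parameter names `alpha`, `gamma` and `lam` and the episode format are illustrative assumptions:

```python
# Hedged sketch of tabular TD(lambda) with accumulating eligibility traces.
# Episode format and hyperparameters are illustrative, not the thesis's.

def td_lambda(episodes, alpha=0.1, gamma=0.9, lam=0.7):
    """Learn a state-value table V from episodes of (state, reward, next_state)."""
    V = {}
    for episode in episodes:
        traces = {}                                # eligibility trace per state
        for (s, r, s_next) in episode:
            V.setdefault(s, 0.0)
            V.setdefault(s_next, 0.0)
            delta = r + gamma * V[s_next] - V[s]   # TD error
            traces[s] = traces.get(s, 0.0) + 1.0   # accumulating trace
            for state, e in list(traces.items()):
                V[state] += alpha * delta * e      # credit recently visited states
                traces[state] = gamma * lam * e    # decay the trace
    return V
```

The λ-weighted trace is what lets an outcome late in the game (e.g. a favourable opening evaluation) propagate credit back to earlier moves.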
Secondly, we discuss the implementation of a novel fuzzy reasoning model, which includes three components: (1) a neural network with supervised learning to generate candidate moves, (2) a heuristic evaluator of the candidate moves, and (3) an adapted neuro-fuzzy controller to decide the best next move. We also let the fuzzy reasoning model play against TD(λ) learning to test its performance. The experimental results reveal that even this simple fuzzy reasoning model can compete against TD(λ) learning, and it shows great potential for application to the real game of Go.
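To make the fuzzy reasoning step concrete, a minimal Mamdani-style inference over a single Go feature can be sketched as below. The membership functions, rule base and the "influence" input are invented for illustration and are not those of the thesis's neuro-fuzzy controller:

```python
# Minimal Mamdani-style fuzzy inference sketch (illustrative only).

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def move_desirability(influence):
    """Map a normalised influence value in [0, 1] to a move score.

    Rule base (invented): IF influence is low  THEN score is low
                          IF influence is high THEN score is high
    """
    low = tri(influence, -0.5, 0.0, 0.6)
    high = tri(influence, 0.4, 1.0, 1.5)
    # Centroid defuzzification over a coarse grid of output scores.
    num = den = 0.0
    for i in range(11):
        y = i / 10
        mu = max(min(low, tri(y, -0.5, 0.0, 0.6)),
                 min(high, tri(y, 0.4, 1.0, 1.5)))
        num += mu * y
        den += mu
    return num / den if den else 0.0
```

A neuro-fuzzy controller of the kind the thesis describes would additionally tune such membership functions from training data rather than fixing them by hand.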
Finally, we construct a heuristic searching model that reduces the branching factor of the game tree when solving life-and-death problems. The set of first moves in the game tree comprises candidate moves generated by pattern clustering and possible moves generated by eye shape analysis. α-β searching is then conducted to find the best sequence of moves. The empirical results show that when there is a complete boundary between black and white, the generated sequence of moves is almost always correct, although the model does not handle ko fighting well. However, α-β searching does not generate a correct sequence of moves for problems with an incomplete boundary between black and white.
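The search step can be sketched generically as below. The move generator `moves` stands in for the reduced candidate set (pattern clustering plus eye shape analysis); it, `evaluate` and `apply_move` are placeholder callbacks, not the thesis's implementation:

```python
# Minimal alpha-beta search over an abstract game tree (illustrative sketch).

def alpha_beta(node, depth, alpha, beta, maximizing, moves, evaluate, apply_move):
    """Return the best achievable score from `node`.

    `moves(node)` is assumed to return a reduced candidate set, which is
    what keeps the branching factor manageable in life-and-death search.
    """
    candidates = moves(node)
    if depth == 0 or not candidates:
        return evaluate(node)
    if maximizing:
        best = float("-inf")
        for m in candidates:
            best = max(best, alpha_beta(apply_move(node, m), depth - 1,
                                        alpha, beta, False,
                                        moves, evaluate, apply_move))
            alpha = max(alpha, best)
            if alpha >= beta:
                break          # beta cut-off: opponent will avoid this line
        return best
    best = float("inf")
    for m in candidates:
        best = min(best, alpha_beta(apply_move(node, m), depth - 1,
                                    alpha, beta, True,
                                    moves, evaluate, apply_move))
        beta = min(beta, best)
        if alpha >= beta:
            break              # alpha cut-off
    return best
```

Pruning only discards branches that cannot affect the result, so the returned score equals that of a full minimax search over the same candidate set.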
Keywords: α-β Searching, Complexity, Computer Go, Distance Transform, Eye Shape Analysis, Explicit Go Knowledge, Go Knowledge, Implicit Go Knowledge, Influence, Learning, Mamdani Fuzzy Model, Neural Network, Neuro-Fuzzy Reasoning, Pattern Clustering, Pattern Recognition, Reasoning, Reinforcement Learning, Searching, Sugeno Fuzzy Model, Supervised Learning, Temporal Difference (TD) Learning, Tsukamoto Fuzzy Model, Unsupervised Learning.