Solving Complex Problems in Human Genetics Using Nature-Inspired Algorithms Requires Strategies which Exploit Domain-Specific Knowledge

Author(s): Casey S. Greene (Dartmouth College, USA) and Jason H. Moore (Dartmouth College, USA)
Copyright: 2012
Pages: 15
Source title: Computer Engineering: Concepts, Methodologies, Tools and Applications
Source Author(s)/Editor(s): Information Resources Management Association (USA)
DOI: 10.4018/978-1-61350-456-7.ch804



In human genetics, chip-based technology makes it possible to measure thousands of DNA sequence variations from across the human genome. The informatics challenge is to identify combinations of interacting DNA sequence variations that predict common diseases. The authors review three nature-inspired methods that have been developed and evaluated in this domain. The chapter focuses in detail on two of them: genetic programming (GP) and a complex-systems-inspired, GP-like computational evolution system (CES). The authors also discuss a third nature-inspired approach, ant colony optimization (ACO). The GP and ACO techniques are designed to select relevant attributes, while the CES addresses both the selection of relevant attributes and the modeling of disease risk. Specifically, the authors examine these methods in the context of epistasis, or gene-gene interaction, and focus solely on the situation where there is an epistatic effect but no detectable main effect. In this domain, early studies show that nature-inspired algorithms perform no better than a simple random search when classification accuracy is used as the fitness function. The challenge in applying these search algorithms is therefore that classification accuracy provides no building blocks: with no main effects, no single variation improves accuracy on its own, so partial solutions are not rewarded. The goal, then, is to use outside knowledge or pre-processing of the dataset to provide these building blocks in a manner that enables the population, in a nature-inspired framework, to discover an optimal model. The authors examine one pre-processing strategy for revealing building blocks in this domain and three different methods that exploit these building blocks as part of a knowledge-aware nature-inspired strategy. They also discuss potential sources of building blocks and modifications to the described methods that may improve the ability to solve complex problems in human genetics.
The authors argue that both the methods that use expert knowledge and the sources of expert knowledge they draw upon will be critical to improving the ability to detect and characterize epistatic interactions in these large-scale biomedical studies.
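
The "no building blocks" problem described in the abstract can be illustrated with a small toy example. The sketch below is not taken from the chapter: it assumes binary genotypes (real SNPs have three genotype classes), a pure XOR interaction between two hypothetical variations (indices 0 and 1), and two irrelevant noise variations. The function names `make_xor_dataset` and `majority_accuracy` are illustrative only. Under these assumptions, every single variation, and every pair other than the interacting one, classifies at chance, so an accuracy-based fitness function gives a search algorithm nothing to build on.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def make_xor_dataset(n_noise=2):
    """Toy dataset (an assumption, not the chapter's data).

    Enumerates every combination of binary genotypes once.
    Disease status is genotype[0] XOR genotype[1]; the remaining
    n_noise attributes are irrelevant to status.
    """
    rows = []
    for genotypes in product((0, 1), repeat=2 + n_noise):
        status = genotypes[0] ^ genotypes[1]
        rows.append((genotypes, status))
    return rows


def majority_accuracy(rows, attrs):
    """Accuracy of predicting the majority class in each genotype cell.

    `attrs` is a tuple of attribute indices; rows are grouped by their
    genotype combination at those indices.
    """
    cells = defaultdict(Counter)
    for genotypes, status in rows:
        key = tuple(genotypes[i] for i in attrs)
        cells[key][status] += 1
    correct = sum(max(counts.values()) for counts in cells.values())
    return correct / len(rows)


rows = make_xor_dataset()
n_attrs = len(rows[0][0])

# Accuracy of every single attribute and every pair of attributes.
single = {(i,): majority_accuracy(rows, (i,)) for i in range(n_attrs)}
pairs = {p: majority_accuracy(rows, p) for p in combinations(range(n_attrs), 2)}

for attrs, acc in {**single, **pairs}.items():
    print(attrs, acc)
```

In this toy setting only the pair `(0, 1)` scores above chance, which is why the abstract argues for pre-processing or outside knowledge that can flag promising attributes before the complete model is found.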

