On the Informativeness of the DNA Promoter Sequences Domain Theory

The DNA promoter sequences domain theory and database have become popular for testing systems that integrate empirical and analytical learning. This note reports a simple change and reinterpretation of the domain theory in terms of M-of-N concepts, involving no learning, that results in an accuracy of 93.4% on the 106 items of the database. Moreover, an exhaustive search of the space of M-of-N domain theory interpretations indicates that the expected accuracy of a randomly chosen interpretation is 76.5%, and that a maximum accuracy of 97.2% is achieved in 12 cases. This demonstrates the informativeness of the domain theory, without the complications of understanding the interactions between various learning algorithms and the theory. In addition, our results help characterize the difficulty of learning using the DNA promoters theory.
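
To make the idea of an M-of-N reinterpretation concrete, the following is a minimal sketch and not the procedure used in this note. It assumes a rule can be represented as a list of (position, nucleotide) antecedents; that representation, the function names, and the toy rule are hypothetical and are not taken from the promoter domain theory. Under an M-of-N reading, a rule with N antecedents fires when at least M of them hold, so M = N recovers the strict conjunction, and one interpretation of a theory is one choice of M for each rule.

```python
from itertools import product


def m_of_n(antecedents, m):
    """Return True when at least m of the boolean antecedents hold."""
    return sum(bool(a) for a in antecedents) >= m


def matches(sequence, rule, m):
    """Test a sequence against a rule under an M-of-N reading.

    `rule` is a list of (position, nucleotide) pairs. This is a
    hypothetical toy format, not the encoding of the actual theory.
    """
    hits = [sequence[pos] == base for pos, base in rule]
    return m_of_n(hits, m)


def interpretations(rule_sizes):
    """Enumerate every threshold assignment: one M in 1..N per rule."""
    return list(product(*(range(1, n + 1) for n in rule_sizes)))


# Toy rule with four antecedents (illustrative only).
toy_rule = [(0, "t"), (1, "a"), (2, "t"), (3, "a")]

print(matches("taca", toy_rule, 4))   # strict conjunction (4-of-4)
print(matches("taca", toy_rule, 3))   # relaxed reading (3-of-4)
print(len(interpretations([4, 2])))   # threshold choices for two toy rules
```

In this toy case the sequence satisfies three of the four antecedents, so it is rejected by the conjunctive reading and accepted by the 3-of-4 reading. An exhaustive search over interpretations, in the sense described above, would score each tuple returned by `interpretations` against the labeled database.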

