Hardly a week goes by without a new genetics study that contains a diagram of specimen genetic similarities and clades. For these diagrams, biologists have long relied on university-based and/or commercial computational packages, which are not only prone to pilot error but also contain “analysis” methods that should never be used for genetic distance or clustering. Not all of the software is poor; each package appears to contain a mixture of good and bad. The troublesome methods, however, have been accepted for so long that serious errors are published frequently. What follows is a list of concerns that will hopefully be useful to authors and reviewers alike. The report concludes with a graph-theoretical alternative to the status quo in genomics.
Bayesian clustering, Graph partitioning, Missing values, Pair joining, Pseudo-metrics.