
DNA Barcodes classification with supervised machine learning techniques

The DNA Barcode sequence classification problem can be approached as a supervised machine learning problem as follows: given a reference library of DNA Barcode specimen sequences from known species and a collection of unknown DNA Barcode sequences (the query set), assign each query sequence to one of the species present in the library. This problem can be solved with a dedicated software procedure, explained in the tutorial and in the paper "Supervised DNA Barcodes Species Classification: Analysis, Comparisons and Results", BioData Mining (under revision).
Download the FASTA to Weka converter from the list below.
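
To make the conversion step concrete, here is a minimal Python sketch of how aligned FASTA records could be turned into a Weka ARFF file, with one nominal attribute per alignment position and the species as the class attribute. It is not the Fasta2Weka code and may differ from it in every detail. The header layout "specimen_id|species", the function names, and the reduction of every non-ACGT symbol (gaps included) to N are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

# Nominal values allowed at each alignment position (illustrative choice).
SYMBOLS = ["A", "C", "G", "T", "N"]


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records: List[Tuple[str, str]] = []
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def species_of(header: str) -> str:
    """ASSUMPTION: headers look like 'specimen_id|species'.

    Headers without '|' are treated as query sequences and get '?',
    the ARFF marker for a missing value.
    """
    if "|" not in header:
        return "?"
    return header.split("|", 1)[1].strip().replace(" ", "_")


def to_arff(records: List[Tuple[str, str]],
            class_labels: Optional[List[str]] = None,
            relation: str = "barcodes") -> str:
    """Build ARFF text from aligned sequences of equal length."""
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")
    if class_labels is None:
        # Collect the species seen in the records, ignoring query rows.
        class_labels = sorted({species_of(h) for h, _ in records} - {"?"})
    lines = [f"@relation {relation}", ""]
    for i in range(1, length + 1):
        lines.append(f"@attribute pos{i} {{{','.join(SYMBOLS)}}}")
    lines.append(f"@attribute species {{{','.join(class_labels)}}}")
    lines += ["", "@data"]
    for header, seq in records:
        # Anything outside A, C, G, T (gaps, ambiguity codes) becomes N.
        values = [c if c in SYMBOLS else "N" for c in seq]
        lines.append(",".join(values + [species_of(header)]))
    return "\n".join(lines) + "\n"
```

In this sketch a reference library would be converted with its species labels, while a query set would be converted with the same class_labels list so that both files declare an identical class attribute.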

FASTA to Weka converter, tutorial, and presentation

Fasta2Weka.zip
SupervisedMLBarcodes.ppt
Tutorial.pdf

Sample datasets

Cypraeidae.zip
Drosophila.zip
Inga.zip
Simulated.zip
aQuickTest.zip
algae.zip
bats.zip
birds.zip
fishes.zip
fungi.zip