FAQ


What is the DMB System?

    DMB stands for Data Mining Big. The DMB System is a collection of data analysis tools. It is an implementation of a set of algorithms for automatic classification.

Why should I use it?

    Because our applications are particularly efficient and accurate at classification.

What is the DMBCSV format?

    The DMBCSV format follows the standard CSV (Comma-Separated Values) style.
    The first row of the file contains the names of the samples (they must all be different). The second row contains the name of the class to which each sample belongs.
    The first column contains the feature name (the variable of the experiment). The second column describes the variable type: NUM if it takes numerical values, or ORD if it takes values from a finite set of elements.

    Here is an example:

                    Exp 1   Exp 2   Exp 3   Exp 4
    class           A       B       A       B
    Gene 1   NUM    1.50    0.42    0.70    1.05
    Gene 2   NUM    1.00    1.40    0.70    0.65
    Pos 1    ORD    A       T       G       T
    Pos 2    ORD    A       A       C       A

    You can download an input example file here.
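
    To check a file before uploading it, the layout described above can be read with a few lines of Python. This is only a minimal sketch and not part of the DMB System: the function name parse_dmbcsv is hypothetical, and it assumes a comma delimiter, that the first two cells of the sample-name row and the second cell of the class row are empty, and that the class row is labelled "class" as in the example.

```python
import csv
import io


def parse_dmbcsv(text):
    """Parse DMBCSV text into sample names, class labels, and features.

    Hypothetical helper: assumes comma-separated cells and that the first
    two columns hold the feature name and its type (NUM or ORD).
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) < 2:
        raise ValueError("expected a sample-name row and a class row")

    # Row 1: sample names, after the two leading (assumed empty) cells.
    samples = [cell.strip() for cell in rows[0][2:]]
    # Row 2: the class of each sample, in the same column order.
    classes = [cell.strip() for cell in rows[1][2:]]

    if len(set(samples)) != len(samples):
        raise ValueError("sample names must all be different")
    if len(classes) != len(samples):
        raise ValueError("the class row needs one entry per sample")

    features = {}
    for row in rows[2:]:
        name, kind = row[0].strip(), row[1].strip()
        values = [cell.strip() for cell in row[2:]]
        if len(values) != len(samples):
            raise ValueError(f"feature {name!r} needs one value per sample")
        if kind == "NUM":
            values = [float(v) for v in values]  # numerical feature
        elif kind != "ORD":
            raise ValueError(f"feature {name!r} has unknown type {kind!r}")
        features[name] = (kind, values)
    return samples, classes, features


# The example table from this FAQ, written as comma-separated text.
EXAMPLE = """\
,,Exp 1,Exp 2,Exp 3,Exp 4
class,,A,B,A,B
Gene 1,NUM,1.50,0.42,0.70,1.05
Gene 2,NUM,1.00,1.40,0.70,0.65
Pos 1,ORD,A,T,G,T
Pos 2,ORD,A,A,C,A
"""

samples, classes, features = parse_dmbcsv(EXAMPLE)
```

    NUM values are converted to floats, while ORD values are kept as strings, since they come from a finite set of symbols.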

What is the input format of BLOG?

    The input format of BLOG is a standard FASTA file containing the barcode sequences.
    The second field of the description line (fields are separated by "|") should contain the specimen class. For instance:

    >EM1232|squalus edmundsi|..|...
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
    >EM1237|squalus mitsukuri|..|...
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT

    You can download an input example file here.
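
    The description-line convention above can be checked with a short Python script before submitting a file. This is a rough sketch under stated assumptions, not BLOG's own reader: the names parse_blog_fasta and _make_record are hypothetical, fields are assumed to be separated by "|", and the first field is treated as a specimen identifier only because the example suggests it.

```python
def _make_record(header, chunks):
    """Build (identifier, specimen class, sequence) from one FASTA entry."""
    fields = header.split("|")
    # The second field must hold the specimen class.
    if len(fields) < 2 or not fields[1].strip():
        raise ValueError(f"no specimen class in description line: {header!r}")
    return fields[0].strip(), fields[1].strip(), "".join(chunks)


def parse_blog_fasta(text):
    """Parse FASTA text into a list of (identifier, class, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous entry, if any.
            if header is not None:
                records.append(_make_record(header, chunks))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data before the first description line")
        else:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append(_make_record(header, chunks))
    return records


# Shortened version of the example in this FAQ.
EXAMPLE = """\
>EM1232|squalus edmundsi|..|...
ACGTACGT
ACGT
>EM1237|squalus mitsukuri|..|...
ACGTACG
"""

records = parse_blog_fasta(EXAMPLE)
```

    Sequence lines belonging to the same entry are concatenated, so each specimen ends up with a single barcode string.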

What is the output of BLOG?

    Output files are:

    testfile.bdp1.txt            # bdp1 (see the BMC article)
    testfile.bdp2.txt            # bdp2 (see the BMC article)
    testfile.bdp3.txt            # bdp3 (see the BMC article)
    testfile.bdp4.html           # bdp4 (a summary of the outputs)
    testfile.stats.html          # the test set classification statistics
    trainfile.stats.html         # the training set classification statistics
    testfile.confmatrix.html     # the test set confusion matrix
    trainfile.confmatrix.html    # the training set confusion matrix
    trainfile.formulas.csv       # the logic formulas in CSV format

What is the input format of MALA?

    The input format of MALA is the DMBCSV format.

    You can download an input example file here.

What is the output of MALA?

    Output files are:
    The logic classification formulas
    The classification statistics

How big can my datasets be?

    The upload limit is set to 6 MB. For larger files please contact us.

Can I customize the DMB methods in more detail?

    Sure, feel free to contact us for customized data analysis at DMB mail.

Why does it take so long to obtain results?

    Our servers may be busy. Please be patient. Have a coffee; the results will arrive soon.

I don't understand the algorithms!

    Read the system description carefully. Check your input! Otherwise contact us at DMB mail.

Can I download the source code?