HELP AND DOCUMENTATION

MycoRegDB
  1. The FASTA comment line of each database entry indicates:  
    - Gene number, annotation and CDS positions as specified in the GenBank file.
    - The promoter designation, giving the organism, gene number/name and TSP/Motif name (if any).
    - The TSP-CDS or Motif-CDS distance in base pairs. The TSP-CDS distance is given as an approximation if the promoter was determined computationally.
    - PubMed ID(s) of relevant reference(s).
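
    The comment line can be split into its fields programmatically. The following is a minimal Python sketch, assuming the pipe-separated layout seen in the example entries further down this page (gene number | gene name | CDS | promoter | distance | PMID). The function name parse_header and the dictionary keys "gene_number", "gene_name" and "promoter" are hypothetical and not part of MycoRegDB.

```python
def parse_header(line):
    """Split a MycoRegDB-style FASTA comment line into named fields.

    Assumes six pipe-separated fields in the order shown in the
    example entries; this layout is inferred, not a published spec.
    """
    fields = [f.strip() for f in line.lstrip(">").split("|")]
    record = {
        "gene_number": fields[0],
        "gene_name": fields[1],
        "promoter": fields[3],
    }
    # The remaining fields are "Key: value" pairs (CDS, distance, PMID).
    for field in (fields[2], *fields[4:]):
        key, _, value = field.partition(":")
        record[key.strip()] = value.strip()
    return record


header = ">ML1795 | hsp18 | CDS: 2174297..2174743 | Mlp_18kDa | TSP-CDS-Dist: 66 | PMID: 7551043"
record = parse_header(header)
print(record["promoter"], record["TSP-CDS-Dist"])
```

    All values are kept as strings, since the CDS field may hold forms such as complement(...) and the distance may be approximate.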

MoPP
  1. Input sequence
    The DNA sequences can be pasted into the text area or a file containing the sequences can be uploaded. The sequences should be in FASTA format.
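
    Before submitting, it can help to check that a file really is in FASTA format. The following is a minimal Python sketch of a FASTA reader; the function name read_fasta is hypothetical and this is not the parser used by the MoPP server.

```python
def read_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            # A new comment line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = ">seq1\nacgt\nttaa\n\n>seq2\nggcc\n"
for name, seq in read_fasta(example):
    print(name, len(seq))
```

    Sequence lines that appear before the first ">" line are silently skipped in this sketch.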

  2. Motif length
    The minimum and maximum motif lengths allowed by the MoPP server are 2 bp and 15 bp, respectively. By default, the server searches only for motifs of 6-8 bp. To search for motifs longer than 15 bp, download the program.

  3. Stringency
    By default, the server tries three stringency levels in turn and stops at the first level that detects a motif:
    - High stringency: motifs that are >80% identical and present in >70% of the sequences.
    - Medium stringency: motifs that are >70% identical and present in >60% of the sequences.
    - Low stringency: motifs that are >60% identical and present in >50% of the sequences.

    Using the advanced options, the user can also specify custom cut-offs for the % identity and for the % of sequences that must contain the motif.
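
    The default fallback from high to low stringency can be sketched in Python as below. Only the cut-off values come from this page; the names STRINGENCY_LEVELS and search_with_fallback are hypothetical, and find_motifs stands in for the actual motif search, which is not described here.

```python
# (label, % identity cut-off, % of sequences cut-off), tried in this order.
# Both cut-offs are exclusive lower bounds (">80%", not ">=80%").
STRINGENCY_LEVELS = [
    ("High", 80, 70),
    ("Medium", 70, 60),
    ("Low", 60, 50),
]


def search_with_fallback(sequences, find_motifs):
    """Try each stringency level in turn; stop at the first that finds motifs.

    find_motifs is a hypothetical callable taking
    (sequences, identity_cutoff, coverage_cutoff) and returning a list.
    """
    for label, identity_cutoff, coverage_cutoff in STRINGENCY_LEVELS:
        motifs = find_motifs(sequences, identity_cutoff, coverage_cutoff)
        if motifs:
            return label, motifs
    return None, []  # nothing detected even at low stringency


# Usage with a stand-in search that only succeeds at 70% identity or lower.
def demo_search(sequences, identity_cutoff, coverage_cutoff):
    return ["tatata"] if identity_cutoff <= 70 else []


print(search_with_fallback(["acgt"], demo_search))
```

    User-specified cut-offs would bypass this cascade and call the search once with the chosen values.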

  4. Multiple motifs
    The user may choose whether a motif is allowed to appear more than once in a single sequence (default: "yes").

  5. Organism
    The user must select the organism from which the input DNA sequences are derived.

  6. MoPP output
    The degenerate IUPAC nucleotide symbols used in MoPP output (Motif searched or Consensus) are:

       R - a/g;         B - c/g/t
       Y - c/t;         D - a/g/t
       M - a/c;         H - a/c/t
       K - g/t;         V - a/c/g
       S - c/g;         N - a/c/t/g
       W - a/t;
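
    To locate a reported consensus in other sequences, the symbols above can be expanded into a regular expression. The following is a minimal Python sketch; the names IUPAC and consensus_to_regex and the consensus "TATRNT" are hypothetical illustrations. It performs exact matching only and does not reproduce MoPP's % identity scoring.

```python
import re

# Bases covered by each symbol, as listed in the table above.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "m": "ac", "k": "gt", "s": "cg", "w": "at",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a case-insensitive regex."""
    parts = []
    for symbol in consensus.lower():
        bases = IUPAC[symbol]  # raises KeyError on an unknown symbol
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts), re.IGNORECASE)


pattern = consensus_to_regex("TATRNT")  # hypothetical consensus
match = pattern.search("ccgtatagtcc")
print(pattern.pattern, match.start() if match else None)
```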

  7. Example for MoPP

    Input DNA sequences (from M. leprae):

    >ML1795 | hsp18 | CDS: 2174297..2174743 | Mlp_18kDa | TSP-CDS-Dist: 66 | PMID: 7551043
    ttgtctatcacaacttgcatcaatatatcgaccagtgctatatcaaatctAtgtagtcag
    gaacagctatatagttatagtttgtcacaacagattggagtgcgaggtgaccacac

    >ML2041 | oxyR | CDS: complement(2431984..2432919) | Mlp_oxyR_P1 | TSP-CDS-Dist: 0 | PMID: 9079928
    cgtgagtttggtctgaagtaaaggtgatatatcacactatacttatcggt

    >ML2041 | oxyR | CDS: complement(2431984..2432919) | Mlp_oxyR_P2 | TSP-CDS-Dist: 50 | PMID: 9079928
    ttgtttggattgattcaaaatcagatttttacactatccttcaaggttgcCgtgagtttg
    gtctgaagtaaaggtgatatatcacactatacttatcggt

    >ML2042 | ahpC | CDS: 2433029..2433616 | Mlp_ahpC_P1 | TSP-CDS-Dist: 46 | PMID: 9079928
    gtgtgatatatcacctttacttcagaccaaactcacggcaaccttgaaggAtagtgtaaa
    aatctgattttgaatcaatccaaacaaggatcggat

    >ML2042 | ahpC | CDS: 2433029..2433616 | Mlp_ahpC_P2 | TSP-CDS-Dist: 87 | PMID: 9079928
    tggtgggctgataactcttatcactcataccgataagtatagtgtgatatAtcaccttta
    cttcagaccaaactcacggcaaccttgaaggatagtgtaaaaatctgattttgaatcaat
    ccaaacaaggatcggat

    >MLr01 | rrs | CDS: 1341144..1342692 | Mlp_rrs(rrnA) | TSP-CDS-Dist: 207 | PMID: 1707388
    tagtcaacccgggacttgactcctctgctggatctgtattaatctggctgGgttgccccg
    aagcgggggaagtaagcttgaagtgttgtttgagaactcaatagtgtgtttggttttgtt
    gttgttgattttttgactacatctagcattcctcgtgtgtgtaggtgtagtttattatgt
    tatttatagatgccagttttggtgtcttgtcaggtatctctagaaattgaaaatttcgtc
    tagttattgatggagtt

    MoPP output [With default settings and "Mycobacterium (any)" as organism]:

    RESULT

    On the detailed result page of each motif (where the matches to the detected motif are highlighted), a % identity value of '1' denotes 100% identity.