Bioinformatics Centre
National Institute of Immunology

MyPatternFinder

Jawaharlal Nehru University
Institute of Microbial Technology


HELP AND DOCUMENTATION
  1. Options
    Option A detects only substitution mutations, while Option B additionally detects insertions and deletions (indels). Because Option B is based on a ClustalW alignment, it cannot accept "fuzzy" nucleotides.

  2. Input Sequence
    The DNA sequence can be pasted into the text area, or a file containing the nucleotide sequence can be uploaded.

  3. Format of Input Sequence
    If the sequence is in one of the standard formats (EMBL, FASTA, GENBANK, etc.), set 'Sequence Format' to 'FASTA', 'GENBANK', or 'OTHER (READSEQ Formats)' as appropriate. The SRF server uses the READSEQ program, developed by D.G. Gilbert (Indiana University), to convert your sequence to FASTA format. If the input sequence is plain text, set the 'Format Type' to Plain Text (Single Letter Code). By default, the server accepts only the single-letter codes of the nucleotide bases. The server can also ignore non-standard characters such as *%!,@$%.
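
    The plain-text handling described above can be sketched as a simple filter. This is only an illustration, not the server's code: the function name clean_plain_sequence and the allowed alphabet (A/C/G/T plus the IUPAC symbols listed in section 4) are assumptions.

```python
def clean_plain_sequence(raw: str) -> str:
    """Keep only single-letter nucleotide codes and drop everything else."""
    # Assumption: the accepted alphabet is A/C/G/T plus the IUPAC symbols;
    # the server's actual character set is not documented here.
    allowed = set("ACGTRYMKSWBDHVN")
    return "".join(ch for ch in raw.upper() if ch in allowed)


# Digits, spaces and punctuation are discarded; bases are upper-cased.
cleaned = clean_plain_sequence("1 gatc*a%ac!acc 60")
```

    Note that a filter of this kind also strips the position numbers found in GenBank-style listings.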

  4. Query Pattern
    The query pattern is the 'motif' to be searched for in the DNA sequence. The user can also select a motif from a list of consensus patterns. The query pattern may include the following IUPAC symbols to allow 'wobble' at a given position:

       R - a/g;         B - c/g/t
       Y - c/t;         D - a/g/t
       M - a/c;         H - a/c/t
       K - g/t;         V - a/c/g
       S - c/g;         N - a/c/t/g
       W - a/t;
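
    A minimal sketch of how the symbols in this table can be applied when comparing a query pattern with a sequence. The names IUPAC, base_matches and find_exact are hypothetical, and the sketch covers only substitution-free matching, not the server's full search.

```python
# Each pattern symbol maps to the set of bases it admits (from the table above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def base_matches(symbol: str, base: str) -> bool:
    """True if the pattern symbol admits the given sequence base."""
    return base.upper() in IUPAC[symbol.upper()]


def find_exact(pattern: str, sequence: str) -> list:
    """Return 1-based start positions where every pattern symbol admits the base."""
    m = len(pattern)
    return [
        i + 1
        for i in range(len(sequence) - m + 1)
        if all(base_matches(p, b) for p, b in zip(pattern, sequence[i:i + m]))
    ]


# WGATAR admits both "agataa" and "tgatag".
print(find_exact("WGATAR", "ccagataagtgatag"))  # prints [3, 10]
```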

  5. Percent Score (% Identity)
    The percent score expresses the percent identity as a fraction, where a value of 1 denotes 100% identity. It is calculated as the number of perfect matches between the individual bases of a candidate and the query pattern (each perfect match adds 1 to the tally), divided by the length of the pattern.
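
    The calculation can be illustrated as follows. This is a sketch under simplifying assumptions: the function name percent_score is hypothetical, the candidate has the same length as the pattern, and the pattern contains only A/C/G/T (no IUPAC symbols).

```python
def percent_score(pattern: str, candidate: str) -> float:
    """Fraction of positions at which candidate and pattern carry the same base."""
    if len(pattern) != len(candidate):
        raise ValueError("pattern and candidate must have the same length")
    if not pattern:
        raise ValueError("pattern must not be empty")
    # Each perfect match adds 1 to the tally.
    matches = sum(1 for p, c in zip(pattern.upper(), candidate.upper()) if p == c)
    return matches / len(pattern)


# One mismatch in an 11-base pattern gives 10/11, roughly 0.909.
score = percent_score("TGACTTTGGGG", "tgactttaggg")
```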

  6. Total Score
    The total score for each pattern is calculated by adding 1 for every exact match with the query pattern, while each substitution and each insertion/deletion incurs a penalty of 0.5 and 0.25, respectively.
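
    A small sketch of this scoring applied to a gapped pairwise alignment. The function name total_score, the use of '-' for a gap, and the column-by-column treatment are assumptions made for illustration; how the server itself counts indels is not specified here.

```python
def total_score(aligned_pattern: str, aligned_hit: str,
                match: float = 1.0,
                sub_penalty: float = 0.5,
                indel_penalty: float = 0.25) -> float:
    """Score two equal-length aligned strings in which '-' marks a gap."""
    if len(aligned_pattern) != len(aligned_hit):
        raise ValueError("aligned strings must have the same length")
    score = 0.0
    for p, h in zip(aligned_pattern.upper(), aligned_hit.upper()):
        if p == "-" or h == "-":
            score -= indel_penalty   # insertion or deletion
        elif p == h:
            score += match           # exact match
        else:
            score -= sub_penalty     # substitution
    return score


# 11 matches and one inserted base: 11 - 0.25 = 10.75
print(total_score("TGAC-TTTGGGG", "TGACCTTTGGGG"))  # prints 10.75
# 10 matches and one substitution: 10 - 0.5 = 9.5
print(total_score("TGACTTTGGGG", "TGACTTTAGGG"))    # prints 9.5
```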

  7. Region
    By default, the complete sequence is searched for the pattern. The user can also select a subsequence to search.
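
    Region selection can be pictured as slicing the sequence before the search. The sketch below assumes 1-based, inclusive coordinates (as in the numbered sequence listing in section 8); the function name select_region is hypothetical.

```python
from typing import Optional


def select_region(sequence: str, start: Optional[int] = None,
                  end: Optional[int] = None) -> str:
    """Return the subsequence from start to end (1-based, inclusive).

    With no coordinates given, the complete sequence is returned.
    """
    first = 1 if start is None else start
    last = len(sequence) if end is None else end
    if first < 1 or last > len(sequence) or first > last:
        raise ValueError("region lies outside the sequence")
    return sequence[first - 1:last]


region = select_region("gatcaacacc", 3, 6)  # bases 3 to 6 of the sequence
```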

  8. Example of MyPatternFinder Output

    Input Sequence (Accession Number M65145):
            1 gatcaacacc actgcactcc agcctgggca acagagcgag actccatctt aaaaaaaaaa 60
           61 aaaaaaaaag aagaagaaag aaagaaatga ttgaggtgat tctgtggggt aaccttgagg 120
          121 atgagtggga tgacctttgg ggggtcctct gtggggtgac cttggggggt gtgtgggatt 180
          181 atgagagtgt ctgagaattg agttttaggt taagtagggg tgatttggga caactctaag 240
          241 ggtgtgactg ggggatgact ttaggggtcc ctggggtgac cttgtgagtg agattggagt 300
          301 ttctttgggg tttctaggag gtgactttgg ggaggtgagt aggggtgatt ttgaggatgt 360
          361 gccagagctt tggggttgtc tgttgggctg cttgtgggta tcagcagaga tgactttgaa 420
          421 gggtgaccaa gaggggtatc tgggggtatc tggggggtga ctctggggcc cctgtcccct 480
          481 ctccccagtt ctcatccata tgagcaaggg cttcatgtta ctggccgtgg ccgtgtgctg 540
          541 agtcctcctc ccgcccatgc cctctgcctc ctggccagtg tcctccgcct gaacaatgtc 600
          601 ttcagtttcc tctgcctcgg cattggtagg tgttggcttg gggggtgggg gccctcccta 660
          661 accccagccc tgctctctgg ctttctggag gtgtatctag acttccgtgc tgctcccctc 720
          721 ggggcctcgt cttagcatgt ctctggggag ctctgactca gttctactct atctgtggga 780
          781 tgtctctctc ccccactctc tgtttccctt tctttttttg tcccctctct ccaggtatct 840
          841 ttctgtctgt tgtgagggcg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtctgct 900
          901 tgtcacggag ggtgggagag gatgtctcag tccctctttt cttactgggg gcatagggct 960
          961 cactgtccag ccttctgctg ctctgcccac ccatgcaggc tccgattctc ttcagcctga 1020
         1021 gccctctttc agtaacgcaa ggcgtgcacc caagccgtgt gcacggactc ac
    

    Output (With Default Settings):

         Pattern searched: TGACTTTGGGG

         RESULT
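
    Independently of the server's result listing, an exact occurrence of the searched pattern can be checked directly against the input sequence. The sketch below uses only positions 301-360 of the sequence shown above; the names FRAGMENT, FRAGMENT_START and exact_hits are hypothetical, and only exact (mutation-free) hits are reported.

```python
# Positions 301-360 of the input sequence above, with the spacing removed.
FRAGMENT = ("ttctttgggg" "tttctaggag" "gtgactttgg"
            "ggaggtgagt" "aggggtgatt" "ttgaggatgt")
FRAGMENT_START = 301  # 1-based coordinate of the first base in FRAGMENT


def exact_hits(pattern: str, fragment: str, offset: int) -> list:
    """Return the sequence coordinates at which pattern occurs exactly."""
    pattern = pattern.lower()
    hits = []
    i = fragment.find(pattern)
    while i != -1:
        hits.append(offset + i)
        i = fragment.find(pattern, i + 1)  # continue after this hit
    return hits


print(exact_hits("TGACTTTGGGG", FRAGMENT, FRAGMENT_START))  # prints [322]
```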


  9. Consensus Patterns
    The user can also select a query pattern from the following consensus patterns:

           PROKARYOTIC PROMOTERS
            sigma70
              TATAAT  (-10 consensus)
              TTGACA  (-35 consensus)

            sigma32
              CCCCATTTA  (-10 consensus)
              TNTCNCCCTTGAA  (-35 consensus)

            sigma54
              TTGCA  (-12 consensus)
              CTGGNA  (-24 consensus)

            sigma28
              GCCGATAA  (-10 consensus)
              CTAAA  (-35 consensus)

            MYCOBACTERIAL PROMOTERS
               M. tuberculosis
                 TAYGAT  (-10 consensus)
                 TAKRAT  (-10 consensus)
                 TTGACA  (-35 consensus)

               M. smegmatis
                 TATAAT  (-10 consensus)
                 TTGACA  (-35 consensus)

               M. paratuberculosis
                 CGGCCS  (-10 consensus)
                 TGMCGT  (-35 consensus)

           EUKARYOTIC PROMOTER ELEMENTS
             YYAWYY  (Initiator)
             TATAAAA  (-25 consensus)
             GGNCAATCT  (CAAT BOX)
             GGGCGG  (GC BOX)

           TRANSCRIPTION FACTORS
             ATTTGCAT  (Oct-1/Oct-2)
             TGAGTCA  (AP-1)
             CCCMNSSS  (AP-2)
             GGGACTTTCC  (NF-kB)
             TGACTCAG  (NF-E2)
             GGAGAR  (NFAT)
             GGAGAR  (ATF)
             WGATAR  (GATA-1)
             RGRCATGYCY  (p53)

           RESPONSE ELEMENTS
             TGACGTCA  (CRE)
             AGGTCANNNTGACCT  (ERE)
             AGAACANNNTGTTCT  (GRE)
             CNNGAANNTCCNNG  (HSE)
             CCATATTAGG  (SRE)
             TGACTCA  (TRE)