BCB @ ISU MuSeqBox Download Help Tutorial References Contact
 

This tutorial demonstrates the use of MuSeqBox online by parsing the BLAST output file my_blastx_output. MuSeqBox can parse output from all BLAST programs. In this example, my_blastx_output was created by a BLASTX search of the entries in the file query against mypept, a database of 28 protein sequences. Users who follow the BLAST usage instructions below can produce their own BLAST output, save it as a local file, and use it in place of the provided my_blastx_output.

BLAST USAGE:

1. Formatting and searching the mypept database using BLASTX:

    formatdb -i mypept -p T -o T
    blastall -p blastx -d mypept -i query -o my_blastx_output -I

    Notes: The formatdb command formats mypept as a protein database ("-p T") and indexes its sequence identifiers ("-o T"). The blastall command executes a BLASTX search of the entries in query against mypept; the results are saved in my_blastx_output. The "-I" option displays the NCBI identifiers (GIs) of the matching peptide sequences.

2. Searching the NCBI non-redundant nucleic acid database using the net-client BLASTN:

    blastcl3 -p blastn -d nr -i query -e 1e-10 -o my_blastx_output -I
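
For users who script their searches, the commands in step 1 can be assembled as argument lists. The following is a minimal sketch in Python: the function name blastx_commands is hypothetical, the flags are copied from step 1 exactly as written, and actually running the commands assumes the legacy NCBI BLAST tools (formatdb, blastall) are installed and on the PATH.

```python
import shlex


def blastx_commands(database, query, output):
    """Return the formatdb and blastall argument lists from step 1.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    formatdb = ["formatdb", "-i", database, "-p", "T", "-o", "T"]
    blastall = ["blastall", "-p", "blastx", "-d", database,
                "-i", query, "-o", output, "-I"]
    return formatdb, blastall


if __name__ == "__main__":
    # Print the commands for inspection before running them.
    for cmd in blastx_commands("mypept", "query", "my_blastx_output"):
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the printed commands look right.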

MUSEQBOX ONLINE USAGE:

MuSeqBox online offers several choices of input file. Users may use the BLAST outputs provided on our server (e.g., maize EST BLASTX output, maize contig BLASTX output, and soybean contig BLASTX output) or supply their own BLAST output to search for queries of interest. If a BLAST search produces a large output file, we strongly suggest downloading the stand-alone version of MuSeqBox to post-process the BLAST results. The following examples highlight various uses of MuSeqBox online.

1. Creating default tabulated output in your browser from a BLAST input file:

  • Click the checkbox left of "Supply your own BLAST output file"
  • Click the Browse button to locate your local BLAST output, e.g., my_blastx_output
  • Click the submit button (see output derived from my_blastx_output)

    Notes: By default, MuSeqBox online uses the options "-n 3" and "-p 4" to create the output (for more detailed information, see the manual in the distributed package). These options correspond to the parameters Display hits and Print format, respectively. Users can change these parameters using their respective pull-down menus. For example, to retain only the top two BLAST hits for each query in condensed print format (pstyle=3), follow the first two steps above and then:

  • Click the Display hits pull-down menu and select "Top 2 if any" option
  • Click the Print format pull-down menu and select "Condensed" option
  • Click the submit button

    Note: Users may request that the MuSeqBox output be sent by email. To do so, check the checkbox left of "Send the (text format) output to this email address:" and fill in the blank with the intended recipient's email address.
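
The effect of the Display hits setting can be pictured as keeping at most N hits per query. The sketch below illustrates that idea only and is not MuSeqBox code: the function name top_hits and the (query, subject) tuples are hypothetical, and it assumes the hits for each query arrive in BLAST report order, best hit first.

```python
from collections import defaultdict


def top_hits(hits, n):
    """Keep at most n hits per query, preserving the input order.

    hits: iterable of tuples whose first element is the query identifier
    (hypothetical record layout, assumed sorted best-first per query).
    """
    kept = []
    seen = defaultdict(int)  # number of hits already kept per query
    for hit in hits:
        query = hit[0]
        if seen[query] < n:
            kept.append(hit)
            seen[query] += 1
    return kept


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    example = [("q1", "s1"), ("q1", "s2"), ("q1", "s3"), ("q2", "s4")]
    print(top_hits(example, 2))
```

With n=2 this corresponds to the "Top 2 if any" choice: a query with fewer than two hits keeps whatever it has.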

2. Selecting queries that satisfy user-specified criteria and sending the output to your browser:

  • Follow the steps in 1 to provide basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file for MuSeqBox
  • Click the checkbox left of "Select queries based on the following criteria:"
  • Click the checkboxes left of the desired criteria and then fill in the blanks with numerical values. For example, to select the queries with query sequence length larger than 600 and with expectation value less than 1e-10, check both checkboxes left of QLen and Eval, and then fill in the blanks with 600 and 1e-10, respectively
  • Click the submit button
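
The criteria in this example combine as a logical AND. A minimal sketch of that filter follows, assuming each hit is a dictionary with the keys "QLen" and "Eval" (a hypothetical record layout; the names mirror the form labels, not a MuSeqBox data structure) and that both comparisons are strict, as the wording "larger than" and "less than" suggests.

```python
def select_queries(hits, min_qlen=None, max_eval=None):
    """Return the hits that pass every criterion that was supplied.

    A criterion left as None is treated as an unchecked checkbox.
    """
    selected = []
    for hit in hits:
        if min_qlen is not None and not hit["QLen"] > min_qlen:
            continue  # query too short
        if max_eval is not None and not hit["Eval"] < max_eval:
            continue  # expectation value too large
        selected.append(hit)
    return selected


if __name__ == "__main__":
    # Made-up records, for illustration only.
    hits = [
        {"query": "q1", "QLen": 750, "Eval": 1e-30},
        {"query": "q2", "QLen": 500, "Eval": 1e-40},
        {"query": "q3", "QLen": 900, "Eval": 1e-5},
    ]
    for hit in select_queries(hits, min_qlen=600, max_eval=1e-10):
        print(hit["query"])
```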

3. Identification of potential retained introns in EST queries on the basis of matching peptide BLAST hits:

  • Follow the steps in 1 to provide basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file (e.g., my_blastx_output) for MuSeqBox
  • Click the checkbox left of "Select queries that represent potential alternatively spliced transcripts:"
  • Fill in the blank for the indel parameter with the minimum insertion segment size, for example, indel >= 40 nt
  • Click the type pull-down menu and choose "Insertion in query relative to subject"
  • Click the submit button (see the output)

4. Identification of potential skipped exons in the EST queries on the basis of matching peptide BLAST hits:

  • Follow the first two steps in 3
  • Fill in the blank for the indel parameter with the minimum deletion segment size, for example, indel >= 90 nt
  • Click the type pull-down menu and choose "Deletion in query relative to subject"
  • Click the submit button
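
One way to picture the indel criterion in examples 3 and 4 is to compare the gap between consecutive HSPs on the query (in nucleotides) with the gap on the peptide (in amino acids, times three). The sketch below rests on that assumption and is not taken from the MuSeqBox source: the HSP class and find_indels function are hypothetical, coordinates are 1-based and inclusive, and only forward-frame BLASTX hits are handled.

```python
from dataclasses import dataclass


@dataclass
class HSP:
    q_start: int  # query start, nucleotides
    q_end: int    # query end, nucleotides
    s_start: int  # subject (peptide) start, amino acids
    s_end: int    # subject (peptide) end, amino acids


def find_indels(hsps, min_size):
    """Report gaps between consecutive HSPs of at least min_size nt.

    Assumed rule: the query gap minus three times the subject gap is the
    size of an insertion in the query (positive) or a deletion (negative).
    """
    events = []
    ordered = sorted(hsps, key=lambda h: h.s_start)
    for left, right in zip(ordered, ordered[1:]):
        q_gap = right.q_start - left.q_end - 1
        s_gap_nt = 3 * (right.s_start - left.s_end - 1)
        excess = q_gap - s_gap_nt
        if excess >= min_size:
            events.append(("insertion", excess))
        elif -excess >= min_size:
            events.append(("deletion", -excess))
    return events


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    retained = [HSP(1, 300, 1, 100), HSP(361, 600, 101, 180)]
    skipped = [HSP(1, 300, 1, 100), HSP(301, 540, 131, 210)]
    print(find_indels(retained, 40))
    print(find_indels(skipped, 90))
```

In the first made-up case the query carries 60 nt with no counterpart in the peptide, which is the pattern expected of a retained intron; in the second, 30 amino acids of the peptide have no counterpart in the query, the pattern expected of a skipped exon.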

5. Identification of potential (near) full-length transcripts among EST queries on the basis of matching peptide BLAST hits:

  • Follow the steps in 1 to provide basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file for MuSeqBox
  • Click the checkbox left of "Select queries that potentially encode full-length coding sequences:"
  • Fill in the parameter requirement blanks (click help to see detailed definitions of the parameters). For example, option "-F 10 10 0 0 95.0 40.0" corresponds to v5s <= 10, v3s <= 10, v5q <= 0, v3q <= 0, scv >= 95.0%, and qsc >= 40.0%, respectively.
  • Click the submit button (see output)

    Note: This example selects hits for which the HSPs cover at least 95% of the matching peptide sequence, with the first HSP starting within the first 10 amino acids of the peptide and the last HSP extending into the last 10 amino acids. In addition, query sequence coverage of at least 40% is required. In the example output, it is clear that the maize EST AW065755 encodes the entire homolog of the Arabidopsis 60S ribosomal protein L18A.
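
The conditions described in the note can be written as a single check. The sketch below is an illustration under stated assumptions, not the MuSeqBox implementation: the function names are hypothetical, each HSP is a tuple (q_start, q_end, s_start, s_end) with 1-based inclusive coordinates, the boundary conventions for v5s and v3s follow the wording of the note, and v5q and v3q are left out because this tutorial does not define them.

```python
def covered_length(intervals):
    """Total number of positions covered by 1-based inclusive intervals."""
    total = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end + 1)  # skip positions already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total


def is_full_length(hsps, subject_len, query_len,
                   v5s=10, v3s=10, scv=95.0, qsc=40.0):
    """Apply the coverage and terminal-position tests from the note."""
    if not hsps:
        return False
    q_ivals = [(min(h[0], h[1]), max(h[0], h[1])) for h in hsps]
    s_ivals = [(min(h[2], h[3]), max(h[2], h[3])) for h in hsps]
    first_start = min(start for start, _ in s_ivals)
    last_end = max(end for _, end in s_ivals)
    subject_cov = 100.0 * covered_length(s_ivals) / subject_len
    query_cov = 100.0 * covered_length(q_ivals) / query_len
    return (first_start <= v5s                  # starts within first v5s residues
            and last_end > subject_len - v3s    # reaches into last v3s residues
            and subject_cov >= scv
            and query_cov >= qsc)


if __name__ == "__main__":
    # Made-up hit: two HSPs against a 180-residue peptide, 700-nt query.
    hsps = [(50, 340, 3, 99), (341, 580, 100, 179)]
    print(is_full_length(hsps, subject_len=180, query_len=700))
```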
 