|
This tutorial will demonstrate the uses of MuSeqBox online by parsing
the BLAST output file my_blastx_output.
The MuSeqBox program can parse output from all BLAST programs. In this example,
my_blastx_output was created
by executing a BLASTX search of the entries in
query against the mypept database of 28
protein sequences.
By following the BLAST usage instructions
below, users may produce their own BLAST output, save it as a local
file on their machine, and use it in place of the provided
my_blastx_output.
BLAST USAGE:
1. Formatting and searching the mypept database
using BLASTX:
formatdb -i mypept -p T -o T
blastall -p blastx -d mypept -i query -o my_blastx_output -I
Notes:
The blastall command executes a BLASTX search of the entries in
query against mypept;
the results are saved in my_blastx_output.
The "-I" option displays the matching peptide sequences
with their NCBI identifiers (GIs).
2. Searching the NCBI non-redundant nucleic acid database using net-client BLASTN:
blastcl3 -p blastn -d nr -i query -e 1e-10 -o my_blastx_output -I
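For users who prefer to script the first BLAST step, the following is a minimal Python sketch that assembles the formatdb and blastall command lines shown above. The function names are hypothetical, and the sketch only builds and prints the argument lists; running them requires a local installation of the legacy BLAST programs.

```python
import shlex


def formatdb_args(db="mypept"):
    """Argument list for the formatdb command shown in the tutorial."""
    return ["formatdb", "-i", db, "-p", "T", "-o", "T"]


def blastall_args(query="query", db="mypept", out="my_blastx_output"):
    """Argument list for the BLASTX search shown in the tutorial."""
    return ["blastall", "-p", "blastx", "-d", db, "-i", query, "-o", out, "-I"]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run(args, check=True)
    # to execute them (assumes formatdb and blastall are on the PATH).
    for args in (formatdb_args(), blastall_args()):
        print(shlex.join(args))
```

Building the commands as lists avoids shell quoting problems when file names contain spaces.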
MUSEQBOX ONLINE USAGE:
MuSeqBox online offers users several input file choices. Users may use
the application outputs provided on our server (e.g., maize EST BLASTX output,
maize contig BLASTX output, and soybean contig BLASTX output) or supply their own
BLAST outputs to search for queries of interest. If a BLAST search produces a large
output file, we strongly suggest that users download the MuSeqBox stand-alone
version to post-process the BLAST results. The following examples highlight various
uses of MuSeqBox online.
1. Creating default tabulated output in your browser from a BLAST input file:
- Click the checkbox left of "Supply your own BLAST output file"
- Click the Browse button to locate your local BLAST output, e.g.,
my_blastx_output
- Click the submit button (see output derived
from my_blastx_output)
Notes: By default, MuSeqBox online uses the options "-n 3" and "-p 4" to create
the output (for more detailed information, see the manual in the
distributed package). These options correspond to the parameters
Display hits and Print format, respectively. Users can change
these parameters using their respective pull-down menus. For example, to retain
only the top two BLAST hits for each query in
condensed print format (pstyle=3), follow the first two steps above and then:
- Click the Display hits pull-down menu and select the "Top 2 if any" option
- Click the Print format pull-down menu and select the "Condensed" option
- Click the submit button
Note: Users may request that the MuSeqBox output be sent to an email address. To
do so, check the checkbox left of "Send the (text format)
output to this email address:" and fill in the blank with the intended recipient's email address.
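The effect of the Display hits setting can be pictured with a short Python sketch. It is only an illustration of "keep the top N hits for each query", not MuSeqBox code; the function name and the (query, subject) pair representation are assumptions, and hits are assumed to be listed best-first within each query, as they are in BLAST output.

```python
from collections import defaultdict


def top_hits(hits, n=3):
    """Keep at most n hits per query, preserving the input order.

    hits: iterable of (query_id, subject_id) pairs, assumed to be
    ordered best-first within each query.
    """
    kept = []
    seen = defaultdict(int)  # number of hits already kept per query
    for query_id, subject_id in hits:
        if seen[query_id] < n:
            kept.append((query_id, subject_id))
            seen[query_id] += 1
    return kept
```

With n=2 this mirrors the "Top 2 if any" menu choice: a query with a single hit keeps that hit, and a query with more than two hits keeps only the first two.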
2. Selecting queries that satisfy user-specified complex criteria, with output
to your browser:
- Follow the steps in 1 to provide basic online settings (i.e., Display hits
and Print format) and to select a local BLAST output
file for MuSeqBox
- Click the checkbox left of "Select queries based on the following
criteria:"
- Click the checkboxes left of the desired criteria and then fill in the
blanks with numerical values. For example, to select the queries with query sequence
length larger than 600 and with expectation value less than 1e-10, check
both checkboxes left of QLen and Eval, and then fill
in the blanks with 600 and 1e-10, respectively
- Click the submit button
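The selection in this example combines two conditions that must both hold. As a minimal Python sketch of that logic (the Hit record and the select function are hypothetical and do not reflect MuSeqBox internals):

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query_id: str
    qlen: int      # query sequence length (QLen)
    evalue: float  # expectation value (Eval)


def select(hits, min_qlen=600, max_evalue=1e-10):
    """Keep hits with QLen larger than min_qlen AND Eval less than max_evalue."""
    return [h for h in hits if h.qlen > min_qlen and h.evalue < max_evalue]
```

A query of length 750 with an expectation value of 1e-30 would pass both tests, whereas a query of length 500 would be rejected regardless of its expectation value.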
3. Identification of potential retained introns in EST queries on the basis
of matching peptide BLAST hits:
- Follow the steps in 1 to provide basic online settings (i.e., Display hits
and Print format) and to select a local BLAST output
file (e.g., my_blastx_output) for MuSeqBox
- Click the checkbox left of "Select queries that represent potential
alternatively spliced transcripts:"
- Fill in the blank corresponding to the indel parameter with the
minimal insertion segment size value, for example, indel >=40 nt
- Click the type pull-down menu and choose "Insertion in query
relative to subject"
- Click the submit button (see the output)
4. Identification of potential skipped exons in the EST queries on the basis
of matching peptide BLAST hits:
- Follow the first two steps in 3
- Fill in the blank with the minimal deletion segment size value, for
example, indel=90 nt
- Click the type pull-down menu and choose "Deletion in query
relative to subject"
- Click the submit button
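To give an intuition for examples 3 and 4, the sketch below shows one plausible way to measure an insertion or deletion in the query relative to the subject from consecutive HSPs. This is an assumption made for illustration and is not taken from the MuSeqBox documentation: the HSP record, the function names, and the restriction to forward-strand, 1-based inclusive coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HSP:
    q_start: int  # query coordinates in nucleotides (1-based, inclusive)
    q_end: int
    s_start: int  # subject coordinates in amino acids (1-based, inclusive)
    s_end: int


def indel_sizes(hsps):
    """Yield, for each pair of consecutive HSPs, the query gap minus the
    subject gap, both in nucleotides. Positive values mean extra query
    sequence (insertion in query); negative values mean missing query
    sequence (deletion in query)."""
    ordered = sorted(hsps, key=lambda h: h.s_start)
    for a, b in zip(ordered, ordered[1:]):
        q_gap = b.q_start - a.q_end - 1
        s_gap = 3 * (b.s_start - a.s_end - 1)  # amino acids to nucleotides
        yield q_gap - s_gap


def has_insertion(hsps, min_nt=40):
    """True if some gap has at least min_nt extra query nucleotides."""
    return any(d >= min_nt for d in indel_sizes(hsps))


def has_deletion(hsps, min_nt=90):
    """True if some gap lacks at least min_nt query nucleotides."""
    return any(-d >= min_nt for d in indel_sizes(hsps))
```

Under these assumptions, two HSPs that are adjacent on the peptide but separated by 60 nt on the EST would be reported as a potential retained intron at the 40 nt threshold.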
5. Identification of potential (near) full-length transcripts among EST
queries on the basis of matching peptide BLAST hits:
- Follow the steps in 1 to provide basic online settings (i.e., Display hits
and Print format) and to select a local BLAST output
file for MuSeqBox
- Click the checkbox left of "Select queries that potentially encode
full-length coding sequences:"
- Fill in the parameter requirement blanks (click
help to see detailed definitions of the parameters). For
example, option "-F 10 10 0 0 95.0 40.0" corresponds to v5s <=10, v3s <=10,
v5q <= 0, v3q <= 0, scv >=95.0%, and qsc >=40.0%, respectively.
- Click the submit button (see output)
Note: This example selects hits for which the
matching peptide sequence has HSPs covering at least 95% of the peptide sequence,
with the terminal HSPs starting within the first 10 amino acids of the peptide
and extending into the last 10 amino acids, respectively. Moreover, query
sequence coverage of at least 40% is required. In the example, it is clear
that the maize EST AW065755 encodes the entire
homolog of the Arabidopsis 60S ribosomal protein L18A.
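The six values of the "-F" option act as thresholds that a hit must satisfy simultaneously. The Python sketch below spells out the comparisons stated above (upper limits for v5s, v3s, v5q, and v3q; lower limits for scv and qsc). The Coverage record and both function names are hypothetical, and the per-hit values are assumed to have been computed elsewhere; see the online help for how MuSeqBox defines each parameter.

```python
from dataclasses import dataclass


@dataclass
class Coverage:
    v5s: int
    v3s: int
    v5q: int
    v3q: int
    scv: float  # percent
    qsc: float  # percent


def parse_f_option(option):
    """Parse a string such as '-F 10 10 0 0 95.0 40.0' into thresholds."""
    fields = option.split()
    if fields and fields[0] == "-F":
        fields = fields[1:]
    v5s, v3s, v5q, v3q, scv, qsc = fields  # exactly six values expected
    return Coverage(int(v5s), int(v3s), int(v5q), int(v3q), float(scv), float(qsc))


def is_full_length(hit, limits):
    """Apply the six comparisons: four upper limits, two lower limits."""
    return (hit.v5s <= limits.v5s and hit.v3s <= limits.v3s
            and hit.v5q <= limits.v5q and hit.v3q <= limits.v3q
            and hit.scv >= limits.scv and hit.qsc >= limits.qsc)
```

Because all six conditions are joined with "and", a hit with 98% peptide coverage is still rejected if, for instance, its query sequence coverage falls below 40%.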
|