4. View the PostScript figures.
Apart from the text output, RNAfold produces a PostScript structure drawing, suitable for
inclusion in publications as well as for printing on any PostScript-capable printer (Fig.
12.2.1). For on-screen viewing, a PostScript viewer such as GhostScript (or one of its front
ends, e.g., gv or gsview; http://www.cs.wisc.edu/~ghost/) is needed. If the input defined a
sequence name (say seq1), it will be used to name the PostScript file (e.g., seq1_ss.ps);
otherwise the default filename rna.ps will be used.
Pair probabilities will be written in the form of a PostScript “dot plot.” The dot plot shows
an n × n matrix of squares, such that the area of the square at row i and column j in the
upper right half is proportional to the probability of the pair (i, j), while the lower left half
shows all pairs belonging to the mfe structure. The name of the dot plot file will again be
derived from the sequence name (e.g., seq1_dp.ps); otherwise the default filename dot.ps
will be used.
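The naming rule for both PostScript files can be summarized in a few lines of Python. This is only a sketch: the helper name postscript_names is hypothetical, and it assumes that the sequence name is the first word of a FASTA-style header line beginning with >, which the text does not state explicitly. The underscore in the file names follows the seq1_dp.ps form used in step 5.

```python
def postscript_names(first_line):
    """Return (structure_plot, dot_plot) file names for one input record.

    Assumption: a sequence name is given as a FASTA-style header
    (">seq1 optional description") and only its first word is used.
    """
    name = ""
    if first_line.startswith(">"):
        words = first_line[1:].split()
        if words:
            name = words[0]
    if name:
        return f"{name}_ss.ps", f"{name}_dp.ps"
    # No name defined: fall back to the default file names.
    return "rna.ps", "dot.ps"


print(postscript_names(">seq1"))
print(postscript_names("GGGAAACCC"))
```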
Dot plots are an excellent way to visualize structural alternatives. For an RNA with a
well-defined mfe structure, the upper right half should only contain a few small additional
dots compared to the lower left. The PostScript dot plot is constructed such that the actual
pair probabilities can be easily read from the file itself (see, e.g., step 5).
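As a rough illustration of reading the probabilities back out of the file, the sketch below scans dot plot text for entries of the form "i j value ubox". It assumes, as in dot plots written by the Vienna RNA package, that value is the square root of the pair probability (the side length of the square, so that its area is the probability) and that lbox entries mark the mfe pairs. Check the comments in the header of your own file before relying on this. The function name and the sample lines are made up for the example.

```python
import re

# One data line of the dot plot: "<i> <j> <value> ubox" or "... lbox".
BOX_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d*\.?\d+(?:[eE][+-]?\d+)?)\s+(ubox|lbox)\s*$"
)


def read_dot_plot(lines):
    """Return (pair_probs, mfe_pairs) parsed from dot plot text lines.

    Assumption: the stored value is sqrt(p), so it is squared here.
    """
    pair_probs = {}
    mfe_pairs = set()
    for line in lines:
        match = BOX_LINE.match(line)
        if not match:
            continue  # PostScript code, comments, the sequence, ...
        i, j = int(match.group(1)), int(match.group(2))
        if match.group(4) == "ubox":
            pair_probs[(i, j)] = float(match.group(3)) ** 2
        else:
            mfe_pairs.add((i, j))
    return pair_probs, mfe_pairs


# Made-up excerpt, not output from a real run.
sample = """\
%data starts here
1 9 0.9 ubox
2 8 0.5 ubox
1 9 0.95 lbox
showpage
""".splitlines()

probs, mfe = read_dot_plot(sample)
for (i, j), p in sorted(probs.items()):
    print(i, j, round(p, 4), "mfe" if (i, j) in mfe else "")
```

For a real file, the same function can be given the open file object, e.g., read_dot_plot(open("seq1_dp.ps")).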
5. Produce a mountain plot.
Secondary structure graphs and dot plots both become cumbersome for long sequences.
A mountain plot is a structure representation that works well even for long sequences, and
which is well suited for comparing structures. A mountain plot is an x-y graph that plots
the number of base pairs enclosing a sequence position, or, for pair probabilities, the
average number of enclosing pairs. The Perl script mountain.pl can be used to produce
the coordinates for a mountain plot from a dot plot PostScript file. The result can then be
plotted with any x-y plotting program. Using, e.g., the xmgrace plotting program, the
following command is typed:
mountain.pl seq1_dp.ps | xmgrace -pipe
If a "mountain.pl: Command not found" error is encountered, use the full path in
the command (e.g., /usr/local/share/ViennaRNA/bin/mountain.pl).
The resulting plot shows three curves: two mountain plots, derived from the mfe structure and
from the pair probabilities, and a positional entropy derived from the pair probabilities:
$$S_i = -\sum_j p_{ij} \log p_{ij} - p_i^u \log p_i^u$$
where $p_{ij}$ is the probability of the pair (i, j) and $p_i^u$ is the probability of i being
unpaired. Well-defined regions are marked by low entropy.
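To make the three curves concrete, here is a small Python sketch (not the mountain.pl code itself). It takes a structure in dot-bracket notation and a table of pair probabilities, both invented for the example, and computes the mfe mountain heights, the probability-weighted heights, and the positional entropy, taken here as S_i = -sum_j p_ij log p_ij - p_i^u log p_i^u with p_i^u = 1 - sum_j p_ij. Two choices are assumptions of this sketch: a pair (i, j) is counted as enclosing only the positions strictly between i and j, and the natural logarithm is used, which only sets the scale of the entropy.

```python
import math


def mountain_heights(structure):
    """Number of base pairs strictly enclosing each position."""
    heights, depth = [], 0
    for symbol in structure:
        if symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure")
        heights.append(depth)
        if symbol == "(":
            depth += 1
    if depth != 0:
        raise ValueError("unbalanced structure")
    return heights


def probability_mountain(pair_probs, n):
    """Average number of enclosing pairs for positions 1..n."""
    change = [0.0] * (n + 2)
    for (i, j), p in pair_probs.items():
        change[i + 1] += p  # pair (i, j) starts to enclose at i + 1
        change[j] -= p      # and stops enclosing at j
    heights, total = [], 0.0
    for k in range(1, n + 1):
        total += change[k]
        heights.append(total)
    return heights


def positional_entropy(pair_probs, n):
    """Entropy of the pairing state of each position 1..n."""
    partner_probs = [[] for _ in range(n + 1)]
    for (i, j), p in pair_probs.items():
        partner_probs[i].append(p)
        partner_probs[j].append(p)
    entropy = []
    for i in range(1, n + 1):
        unpaired = max(0.0, 1.0 - sum(partner_probs[i]))
        terms = partner_probs[i] + [unpaired]
        # Terms with p = 0 contribute nothing (0 log 0 is taken as 0).
        entropy.append(sum(-p * math.log(p) for p in terms if p > 0.0))
    return entropy


structure = "((..))"
pair_probs = {(1, 6): 0.9, (2, 5): 0.8}  # invented values
print(mountain_heights(structure))
print(probability_mountain(pair_probs, 6))
print(positional_entropy(pair_probs, 6))
```

A position that is paired with one partner with certainty, or unpaired with certainty, gets entropy 0 in this sketch, which is the sense in which low entropy marks a well-defined region.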
6. Include experimental constraints.
Secondary structure prediction is of course error-prone, and no prediction should be
trusted blindly without experimental support. If any experimental results (such as chemical
probing data) are available, it is possible to test whether the prediction is compatible with
the experimental data. Furthermore, constraints can be used to ensure that RNAfold will
only consider structures compatible with the constraints.
To do constrained folding, open the sequence file in a text editor and add another line after
the sequence consisting of the symbols x, |, ., and matching parentheses, (). A pair of
matching parentheses signifies that the corresponding positions must form a base pair. A
vertical line (|) marks a position that must pair, and an x marks a position that must not
pair. The dot (.) marks positions without constraint. Refold the sequences with constraints
using the -C option:
RNAfold -p -C < file_c.seq > file_c.fold
One can now compare the constrained and unconstrained foldings. Ideally, the constraints
should only lead to a small change in energy.
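Typing the constraint line by hand is error-prone for longer sequences, so a small helper can build it from lists of positions. The sketch below is an illustration only: the function name, the 1-based position arguments, and the example sequence are assumptions, and only the symbols described above (x, |, ., and parentheses) are used.

```python
def constraint_line(length, pairs=(), must_pair=(), must_not_pair=()):
    """Build a folding-constraint string of the given length.

    pairs         -- (i, j) positions that must pair with each other
    must_pair     -- positions that must pair with some partner ("|")
    must_not_pair -- positions that must stay unpaired ("x")
    Positions are 1-based, matching the usual sequence numbering.
    """
    symbols = ["."] * length  # "." means no constraint

    def put(position, symbol):
        if not 1 <= position <= length:
            raise ValueError(f"position {position} is outside the sequence")
        if symbols[position - 1] != ".":
            raise ValueError(f"position {position} is constrained twice")
        symbols[position - 1] = symbol

    for i, j in pairs:
        if i >= j:
            raise ValueError("a pair must be given as (i, j) with i < j")
        put(i, "(")
        put(j, ")")
    for position in must_pair:
        put(position, "|")
    for position in must_not_pair:
        put(position, "x")
    return "".join(symbols)


sequence = "GGGAAAUCCC"  # invented example sequence
constraint = constraint_line(len(sequence), pairs=[(1, 10)],
                             must_pair=[3], must_not_pair=[5])
# Sequence line followed by the constraint line, as in file_c.seq.
print(sequence)
print(constraint)
```

The helper only guarantees that the line has the same length as the sequence and that no position is constrained twice. It does not check that the requested pairs are nested or that the bases involved can actually pair.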
Current Protocols in Bioinformatics Supplement 4
12.2.3
Analyzing RNA
Sequence and
Structure