Identifying zipcodes
Although several RNA transport mechanisms are known to exist in eukaryotic organisms, our knowledge of the specific motifs that define the final destination of a transported message (“zipcodes”) remains limited. One primary reason is that very few mRNA substrates have been identified for any given transport complex, which limits the sample size that can be analyzed. A second hindrance has been the difficulty of designing algorithms that can accurately predict secondary and tertiary RNA structures in the context of a cell.
Previous studies had identified at least 24 different transcripts that are transported by the She2/She3/Myo4 protein complex to the buds of growing yeast. In order to narrow down those specific features that might be acting as localization motifs, our collaborators in the DeRisi lab utilized an unbiased strategy for zipcode identification by creating a library of small, randomized stretches of RNA derived from known transport substrates and then screening for those elements that bound to the She2/She3 complex in a three-hybrid reporter assay (Figure 1).
Figure 1. A library of short, randomly arranged RNA stretches derived from known She2/She3/Myo4 substrates was created by nonhomologous recombination and screened for interaction with the She proteins in a three-hybrid assay.
Each of the identified sequences was subsequently evaluated for its ability to direct a GFP-tagged reporter RNA to the tips of growing buds. These analyses led to the identification of 10 novel zipcode sequences as well as the rediscovery and further refinement of several previously identified motifs, thereby validating this approach as an efficient and practical means of zipcode identification.
Identification of a Core Zipcode Motif
MEME analysis revealed that most of the identified zipcodes contained a short, degenerate motif defined by the sequence RCGAADA (in IUPAC notation, R = A or G and D = A, G, or U), which was generally found in regions that were predicted by the M-fold2 program to be single-stranded. Furthermore, this core was consistently located in a region that was predicted to be proximal to a double-stranded stem loop. Finally, the majority of the identified zipcodes displayed a conserved adenosine at a position of –6 from the consensus start (Figure 2).
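As an illustration of what the consensus means in practice, the following minimal Python sketch scans an RNA sequence for RCGAADA and reports whether an adenosine sits six nucleotides upstream of each match. It is not the procedure used in the study (the motif was found with MEME); the function names, the reading of "–6" as six positions before the first motif nucleotide, and the test sequence are all assumptions made for this example.

```python
import re

# IUPAC codes needed for the consensus, written for the RNA alphabet.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "D": "[AGU]"}


def motif_regex(consensus):
    """Translate an IUPAC consensus such as RCGAADA into a regex.

    The lookahead wrapper lets overlapping matches be reported.
    """
    body = "".join(IUPAC[code] for code in consensus)
    return re.compile("(?=(" + body + "))")


def find_core_motifs(rna, consensus="RCGAADA", upstream_offset=6):
    """Return (start, matched_sequence, has_upstream_A) for every match.

    ASSUMPTION: "-6 from the consensus start" is taken to mean the
    nucleotide six positions before the first base of the motif.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for match in motif_regex(consensus).finditer(rna):
        start = match.start()
        upstream = start - upstream_offset
        has_upstream_a = upstream >= 0 and rna[upstream] == "A"
        hits.append((start, match.group(1), has_upstream_a))
    return hits


if __name__ == "__main__":
    # Made-up sequence, not a real zipcode.
    for hit in find_core_motifs("AUUUUUGCGAAUAGG"):
        print(hit)
```

A primary-sequence match of this kind is only a first filter; as the text notes, predicted secondary structure around the motif also matters.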
Extensive mutational analysis was used to evaluate the contribution of both primary sequence and secondary structure to zipcode activity, with the three-hybrid and RNA localization assays serving as measures of activity. Interestingly, the contribution of certain nucleotides varied substantially depending on their context. For example, the conserved adenosine at –6 was dispensable in certain zipcodes but affected activity substantially in others. In addition, the exact identity of the nucleotides immediately following the CG dinucleotide core exerted varying effects depending on which of the identified zipcodes was being examined. Further analysis suggested that the core consensus motif needed to be single-stranded for normal zipcode activity to occur, whereas the exact identity of nucleotides in the predicted double-stranded regions was less important.
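The single-strandedness requirement can be expressed as a simple filter over a predicted secondary structure. The sketch below assumes the structure is supplied in dot-bracket notation (one character per nucleotide, "." for unpaired and "(" or ")" for paired), which is an assumption of this example and not necessarily the output format used in the study. The function names, the rule that all seven motif positions must be unpaired, and the test sequence are likewise illustrative.

```python
import re

# RCGAADA written out as a regex (R = A/G, D = A/G/U); the lookahead
# allows overlapping matches to be found.
CORE = re.compile(r"(?=([AG]CGAA[AGU]A))")
CORE_LENGTH = 7


def unpaired_fraction(structure, start, length):
    """Fraction of positions in the window predicted to be unpaired."""
    window = structure[start:start + length]
    return window.count(".") / length


def single_stranded_hits(rna, structure, min_unpaired=1.0):
    """Return (start, sequence) for core motifs lying in unpaired regions.

    ASSUMPTION: `structure` is a dot-bracket string of the same length
    as `rna`; min_unpaired=1.0 demands that every motif base is unpaired.
    """
    if len(rna) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    rna = rna.upper().replace("T", "U")
    return [
        (m.start(), m.group(1))
        for m in CORE.finditer(rna)
        if unpaired_fraction(structure, m.start(), CORE_LENGTH) >= min_unpaired
    ]


if __name__ == "__main__":
    # Made-up hairpin followed by an unpaired tail carrying the motif.
    seq = "GGGCAAAAGCCCACGAAUAUU"
    fold = "((((....))))" + "." * 9
    print(single_stranded_hits(seq, fold))
```

Lowering `min_unpaired` relaxes the filter to tolerate partially paired motifs; the appropriate threshold is not specified by the source and would need to be chosen empirically.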
Conclusions and Future Directions
Despite the identification and characterization of this novel zipcode motif, there are still many mysteries surrounding the binding and recognition of RNA substrates by the She protein machinery. Although there are at least 24 known localized transcripts, fewer than half of them contain the RCGAADA motif in regions that are predicted to be single-stranded. Furthermore, additional sequences that seemed to fulfill all of the known criteria for the consensus were identified from the yeast genome but failed to interact with the She complex in an RNA localization assay. Most interestingly, both the nucleotide and structural contexts of each zipcode could affect activity, sometimes subtly and sometimes strikingly. These collective observations indicate that the currently available structural prediction programs are inadequate to consistently and accurately identify zipcodes in silico. This is likely because structures other than stem loops or, more likely, tertiary or quaternary RNA structures are essential for recognition of RNA substrates by the She complex. Fortunately, the combination of screening and mutational analysis has allowed zipcode regions as small as 25 nucleotides in length to be identified. These newly identified zipcode elements should prove useful for biochemical experimentation and are much more amenable to computational analysis than the longer, more complex stretches that were known previously.
updated 4/9/07