RNA to protein X95757


Statement
 


Recall that the primary structure of a protein can be represented as a sequence over the alphabet of amino acids A (alanine, Ala), R (arginine, Arg), N (asparagine, Asn), D (aspartate, Asp), C (cysteine, Cys), E (glutamate, Glu), Q (glutamine, Gln), G (glycine, Gly), H (histidine, His), I (isoleucine, Ile), L (leucine, Leu), K (lysine, Lys), M (methionine, Met), F (phenylalanine, Phe), P (proline, Pro), S (serine, Ser), T (threonine, Thr), W (tryptophan, Trp), Y (tyrosine, Tyr), and V (valine, Val).

A codon of three nucleotides is translated into a single amino acid within a protein, with translation beginning with a start codon (AUG) and ending with a stop codon (UAA, UAG, or UGA). The $4^3 = 64$ different nucleotide triplets code for 20 amino acids, one translation start signal (methionine, one of these amino acids) and three translation stop signals, with some redundancies. The genetic code defines a mapping between codons and amino acids, and despite variations in the genetic code across species, there is a standard genetic code common to most species.
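As a quick check of the counting argument above, the following minimal sketch enumerates every triplet over the RNA alphabet. The names `NUCLEOTIDES`, `START_CODON` and `STOP_CODONS` are illustrative choices, not part of the problem statement.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # the four-letter RNA alphabet

# Every ordered triplet of nucleotides, i.e. every possible codon.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Codons that encode an amino acid (everything except the stop signals).
coding = [c for c in codons if c not in STOP_CODONS]

print(len(codons))  # 64
print(len(coding))  # 61
```

Since 61 coding triplets map onto only 20 amino acids, several codons must share the same amino acid, which is the redundancy mentioned above.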

AAA K AAC N AAG K AAU N ACA T ACC T ACG T ACU T
AGA R AGC S AGG R AGU S AUA I AUC I AUG M AUU I
CAA Q CAC H CAG Q CAU H CCA P CCC P CCG P CCU P
CGA R CGC R CGG R CGU R CUA L CUC L CUG L CUU L
GAA E GAC D GAG E GAU D GCA A GCC A GCG A GCU A
GGA G GGC G GGG G GGU G GUA V GUC V GUG V GUU V
UAA - UAC Y UAG - UAU Y UCA S UCC S UCG S UCU S
UGA - UGC C UGG W UGU C UUA L UUC F UUG L UUU F
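One possible way to make the table above usable from code is to parse it into a dictionary. This is only a sketch: the names `TABLE_TEXT`, `CODON_TABLE` and `degeneracy` are assumptions made for illustration, and `-` is kept as the marker for a stop codon, as in the table.

```python
from collections import Counter

# The standard genetic code, copied verbatim from the table above.
TABLE_TEXT = """
AAA K AAC N AAG K AAU N ACA T ACC T ACG T ACU T
AGA R AGC S AGG R AGU S AUA I AUC I AUG M AUU I
CAA Q CAC H CAG Q CAU H CCA P CCC P CCG P CCU P
CGA R CGC R CGG R CGU R CUA L CUC L CUG L CUU L
GAA E GAC D GAG E GAU D GCA A GCC A GCG A GCU A
GGA G GGC G GGG G GGU G GUA V GUC V GUG V GUU V
UAA - UAC Y UAG - UAU Y UCA S UCC S UCG S UCU S
UGA - UGC C UGG W UGU C UUA L UUC F UUG L UUU F
"""

# Tokens alternate codon, amino acid, codon, amino acid, ...
tokens = TABLE_TEXT.split()
CODON_TABLE = dict(zip(tokens[0::2], tokens[1::2]))

# How many codons map to each symbol (amino acid or stop marker).
degeneracy = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))     # 64
print(CODON_TABLE["AUG"])   # M
print(degeneracy["-"])      # 3
print(degeneracy["L"])      # 6
```

A dictionary gives constant-time lookup per codon, and keeping the table as text makes it easy to compare against the statement.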

Write code for the protein translation problem. The program must implement and use the RNA-TO-PROTEIN function in the pseudocode discussed in class, which is iterative and is not allowed to perform input/output operations. Make one submission with Python code and another submission with C++ code.

Input

The input is a string $s$ over the alphabet $\{A,C,G,U\}$.

Output

The output is the translation of a minimal substring of $s$ that runs from a start codon to a stop codon, written as a string (protein sequence) over the alphabet $\{A,R,N,D,C,E,Q,G,H,I,L,K,M,F,P,S,T,W,Y,V\}$.
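The sketch below shows one iterative reading of this specification. It is not the RNA-TO-PROTEIN pseudocode from class, which is not reproduced in this statement. It assumes that "minimal" means the shortest substring that starts at an AUG and ends at the first stop codon in the same reading frame, that ties go to the leftmost such substring, and that the empty string is returned when no such substring exists. The function name `rna_to_protein` and the table-parsing helper are illustrative. The function performs no input or output.

```python
# Standard genetic code from the statement; "-" marks a stop codon.
TABLE_TEXT = """
AAA K AAC N AAG K AAU N ACA T ACC T ACG T ACU T
AGA R AGC S AGG R AGU S AUA I AUC I AUG M AUU I
CAA Q CAC H CAG Q CAU H CCA P CCC P CCG P CCU P
CGA R CGC R CGG R CGU R CUA L CUC L CUG L CUU L
GAA E GAC D GAG E GAU D GCA A GCC A GCG A GCU A
GGA G GGC G GGG G GGU G GUA V GUC V GUG V GUU V
UAA - UAC Y UAG - UAU Y UCA S UCC S UCG S UCU S
UGA - UGC C UGG W UGU C UUA L UUC F UUG L UUU F
"""
_tokens = TABLE_TEXT.split()
CODON_TABLE = dict(zip(_tokens[0::2], _tokens[1::2]))


def rna_to_protein(s):
    """Translate the shortest in-frame start-to-stop substring of s.

    Returns "" if s has no start codon followed by an in-frame stop codon
    (an assumption; the statement does not cover this case).
    """
    best = None
    for start in range(len(s) - 2):
        if s[start:start + 3] != "AUG":
            continue
        protein = []
        # Read codons in the frame fixed by this start codon.
        for i in range(start, len(s) - 2, 3):
            amino = CODON_TABLE[s[i:i + 3]]
            if amino == "-":
                candidate = "".join(protein)
                # Strict comparison keeps the leftmost of equally short results.
                if best is None or len(candidate) < len(best):
                    best = candidate
                break
            protein.append(amino)
    return best if best is not None else ""


# Reading and printing stay outside the function.
print(rna_to_protein("GUCGCCAUGAUGGUGGUUAUUAUACCGUCAAGGACUGUGUGACUA"))  # MVVIIPSRTV
```

On the public test case the string contains two consecutive AUG codons. Starting at the first one would give MMVVIIPSRTV, so the expected output MVVIIPSRTV corresponds to the shorter substring that starts at the second AUG. This sketch takes quadratic time in the worst case, so a single pass may be preferable for long inputs.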

Public test cases
  • Input

    GUCGCCAUGAUGGUGGUUAUUAUACCGUCAAGGACUGUGUGACUA
    

    Output

    MVVIIPSRTV
    
  • Information
    Author: Gabriel Valiente
    Language: English
    Official solutions: C++, Python
    User solutions: C++, Python