Gene Coding Regions (1) X65747


Statement
 


In a DNA sequence, start and stop codons delimit the coding region of a gene, also known as a coding DNA sequence (CDS), which is the part of the gene that is translated into a protein.

Any CDS starts with the universal start codon ATG and ends with the first occurrence of a stop codon (either TAA, TAG, or TGA).
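
The delimiting rule above can be sketched as a small two-state scan over the codons. This is a minimal illustration, not the official solution: the names `extract_cds`, `START_CODON` and `STOP_CODONS` are hypothetical, and the sketch assumes the codons are already available as a list of strings. Following the public test cases, the start and stop codons themselves are not kept in the result.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_cds(codons):
    """Return a list of CDS, each one a list of the codons strictly
    between a start codon and the first stop codon that follows it."""
    regions = []
    current = None  # None means "outside a CDS"
    for codon in codons:
        if current is None:
            # Outside a CDS: only a start codon matters here.
            if codon == START_CODON:
                current = []
        elif codon in STOP_CODONS:
            # First stop codon after the start closes the CDS.
            regions.append(current)
            current = None
        else:
            current.append(codon)
    # Assumption: well-formed input, so no CDS is left open here.
    return regions
```

For the first public test case, `extract_cds` would return two lists, one with `GCT CCG GCT AAG` and one with `AAT CTC AAT GAG AAT CCG`.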

Write a program that, given a DNA sequence, writes the gene coding regions it contains.

Input

A sequence of codons (triplets of A, T, C, G), which may contain zero or more CDS delimited by start/stop codons.

The input sequence may span several lines, with one or more whitespace characters (spaces or newlines) between one codon and the next.

Assume that every CDS in the sequence, if any, is well-formed. That is, once a start codon appears, a stop codon will appear later, before the sequence ends and before the start codon appears again. No stop codon will appear unless a start codon has appeared earlier.
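
Since codons may be separated by any mix of spaces and newlines, one simple way to read them in Python is to split the whole input on whitespace. The helper name `read_codons` below is hypothetical. In a real submission the text would typically come from `sys.stdin.read()`, whereas the sketch uses a literal string so that it runs on its own.

```python
def read_codons(text):
    """Split raw input text into codons.

    str.split() with no arguments splits on any run of whitespace
    (spaces, tabs, newlines) and drops leading/trailing blanks, so
    line breaks and repeated spaces need no special handling.
    """
    return text.split()


# Irregular spacing, as in the public test cases.
sample = "TGC ATG GCT CCG GCT AAG\n TAA TGC CGT ATG\nAAT CTC AAT   GAG AAT CCG TAG AAG\n"
codons = read_codons(sample)
```

Reading everything at once keeps the parsing independent of how the sequence is broken into lines.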

Output

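One line for each CDS found in the input DNA sequence, in order of appearance. Each line has the form "CDS i: " followed by the codons of the i-th CDS separated by single spaces, where i starts at 1. The delimiting start and stop codons are not written.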

If no CDS is found in the sequence, the output is "No CDS found".
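
The output format can be sketched as follows, assuming the CDS have already been collected as a list of codon lists. The function name `format_cds` is hypothetical, and the `CDS i:` numbering from 1 is taken from the public test cases.

```python
def format_cds(regions):
    """Turn a list of CDS (each a list of codon strings) into output lines."""
    if not regions:
        return ["No CDS found"]
    # Number the regions from 1 and join their codons with single spaces.
    return [
        f"CDS {i}: {' '.join(region)}"
        for i, region in enumerate(regions, start=1)
    ]


lines = format_cds([["GCT", "CCG", "GCT", "AAG"],
                    ["AAT", "CTC", "AAT", "GAG", "AAT", "CCG"]])
print("\n".join(lines))
```

Returning the lines instead of printing them inside the function keeps the formatting easy to check against the expected outputs.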

Public test cases
  • Input

    TGC ATG GCT CCG GCT AAG
     TAA TGC CGT ATG
    AAT CTC AAT   GAG AAT CCG TAG AAG

    Output

    CDS 1: GCT CCG GCT AAG
    CDS 2: AAT CTC AAT GAG AAT CCG
    
  • Input

    CGT TGC TAC TGC CGT TGC GCT CCG 
    GCT AAG AAT CTC AAT GAG GAT AAG

    Output

    No CDS found
    
  • Input

    CCA CGG ATA GAA ACA GGA CCC CGA AAG CTG CAC ACC GAC CAA
     TAC ATG GTG GGT AGA TCT GAG GCA CTT TTT TTA ACA TAT TAT
    GGC TCT TCA TGC CCT GGT CGA AGG CAG TAG CGT TTA CGT CAG 
    TGT CGT ACT TAT GGA ACG TCC  GAT TGG TTT TTG ATC ATG GTC 
    CAC CTC AAG TAT CAT AAC AGA TTT GTT GGT GCC CCA CTA CTA
     GTA GAT TCG TGG GGA TCG GTG AAA ACC CGG ACC TGA AAG CAG 
    GTC GTG CTC TAT CGG GAC CGG GAA AAG AAT TGG AGT CCC TTC CTC
     GAC CTG CTT CCT CCA CCT CTA GGC AGC AAC TCA ACA AAA ACC 
    AGA GTG ATA CCC GCT ATG TCA ACC CGA ACT AGG CCG GCA TCG 
    GCA TTT GCC GTT CAG AAC CGC AGC TTG CAT CGT ACA GGG ACC 
    GGT CCC AAA CTA AAA CTC TAA CTC ACC ACG CGA AAA CTA ATC TCA
     CAG CAT AGT ATT GCT CTA CAG ACT AAC TAC

    Output

    CDS 1: GTG GGT AGA TCT GAG GCA CTT TTT TTA ACA TAT TAT GGC TCT TCA TGC CCT GGT CGA AGG CAG
    CDS 2: GTC CAC CTC AAG TAT CAT AAC AGA TTT GTT GGT GCC CCA CTA CTA GTA GAT TCG TGG GGA TCG GTG AAA ACC CGG ACC
    CDS 3: TCA ACC CGA ACT AGG CCG GCA TCG GCA TTT GCC GTT CAG AAC CGC AGC TTG CAT CGT ACA GGG ACC GGT CCC AAA CTA AAA CTC
    
  • Information
    Author: Lluís Padró
    Language: English
    Official solutions: Python
    User solutions: Python