Subwords 1 X85229


Statement
 


A nucleic acid or amino acid sequence of length $n$ can be seen as composed of a number of possibly overlapping $k$-mers, or words of length $k$, for $1 \leq k \leq n$. An interesting problem is the generation of all the words of length $k$ contained in a genomic sequence with $n$ nucleotides, for all $k$ with $1 \leq k \leq n$. That is, the generation of all the subwords of a genomic sequence of length $n$.

Write code for the subwords problem. The program must implement and use the SUBWORDS function in the pseudocode discussed in class, which is iterative and is not allowed to perform input/output operations. Make one submission with Python code and another submission with C++ code.

Input

The input is a string $s$ over the alphabet $\Sigma=\{A,C,G,T\}$.

Output

The output is a sorted list of all the nonempty subwords of $s$, without repetitions.
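The specification above can be illustrated with a minimal Python sketch. The pseudocode discussed in class is not reproduced in this statement, so the function below is only one plausible iterative reading of SUBWORDS, not the official one: the name `subwords`, the use of a set to discard repetitions, and the final call to `sorted` are all assumptions made for illustration. The function performs no input or output.

```python
def subwords(s):
    """Return the sorted list of distinct nonempty subwords of s.

    Hypothetical sketch: iterative, and free of input/output operations,
    but not necessarily identical to the SUBWORDS pseudocode from class.
    """
    n = len(s)
    found = set()                      # a set drops repeated subwords
    for k in range(1, n + 1):          # word length k, 1 <= k <= n
        for i in range(n - k + 1):     # start position of each k-mer
            found.add(s[i:i + k])
    return sorted(found)               # lexicographic order
```

A driver would read $s$ from standard input, call `subwords(s)`, and print each returned word on its own line, so that all input/output stays outside the function.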

Public test cases
  • Input

    TATAAT
    

    Output

    A
    AA
    AAT
    AT
    ATA
    ATAA
    ATAAT
    T
    TA
    TAA
    TAAT
    TAT
    TATA
    TATAA
    TATAAT
    
  • Information
    Author: Gabriel Valiente
    Language: English
    Official solutions: C++, Python
    User solutions: C++, Python