Subwords 1 X85229


Statement
 


A nucleic acid or amino acid sequence of length n can be seen as composed of a number of possibly overlapping k-mers, or words of length k, for 1 ≤ k ≤ n. An interesting problem is the generation of all the words of length k contained in a genomic sequence with n nucleotides, for all k with 1 ≤ k ≤ n. That is, the generation of all the subwords of a genomic sequence of length n.
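As a small illustration of the definition, the k-mers of a sequence for one fixed k can be listed with a sliding window of length k. The sketch below is only an example; the function name `kmers` is hypothetical and does not come from the statement or from the class pseudocode.

```python
def kmers(s, k):
    """Return the k-mers of s in order of occurrence (repeats are kept)."""
    n = len(s)
    # A window of length k can start at positions 0 .. n-k, so there are
    # n-k+1 windows in total, and consecutive windows overlap.
    return [s[i:i + k] for i in range(n - k + 1)]


print(kmers("TATAAT", 2))  # ['TA', 'AT', 'TA', 'AA', 'AT']
```

Note that TA and AT each occur twice among the 2-mers of TATAAT, which is why the output of the problem asks for subwords without repetitions.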

Write code for the subwords problem. The program must implement and use the SUBWORDS function from the pseudocode discussed in class. The function is iterative and must not perform input/output operations itself. Make one submission with Python code and another submission with C++ code.

Input

The input is a string s over the alphabet Σ={A,C,G,T}.

Output

The output is the list of all the nonempty subwords of s, without repetitions, sorted lexicographically and written one per line.
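The class pseudocode for SUBWORDS is not reproduced in this statement, so the following Python sketch is only one possible reading of the requirements: an iterative function that performs no input/output, collects every k-mer for every length k, drops repetitions, and returns the result in sorted order. The name `subwords` and the use of a set are assumptions made for illustration.

```python
def subwords(s):
    """Return the sorted list of distinct nonempty subwords of s (no I/O)."""
    n = len(s)
    found = set()                    # a set discards repeated subwords
    for k in range(1, n + 1):        # every word length 1 <= k <= n
        for i in range(n - k + 1):   # every start position of a k-mer
            found.add(s[i:i + k])
    return sorted(found)             # lexicographic order


# Input/output stays outside the function; here the public test case
# is used as a literal instead of reading from standard input.
for word in subwords("TATAAT"):
    print(word)
```

An actual submission would read s from standard input rather than use a literal. A sequence of length n has n(n+1)/2 subword occurrences, so this approach does a quadratic number of slices, which is fine for short inputs.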

Public test cases
  • Input

    TATAAT
    

    Output

    A
    AA
    AAT
    AT
    ATA
    ATAA
    ATAAT
    T
    TA
    TAA
    TAAT
    TAT
    TATA
    TATAA
    TATAAT
    
  • Information
    Author
    Gabriel Valiente
    Language
    English
    Official solutions
    C++ Python