Read mapping 2 X38913


Statement
 

pdf   zip

html

Sequence reads are mapped to a reference genome using some indexing data structure, one of which is the suffix array. The suffix array of a string is a sorted array of the suffixes of the string, that is, an array of the (starting) positions of the sorted suffixes of the string.

For example, the genomic sequence fragment TATAAT has the suffixes TATAAT at position 1, ATAAT at position 2, TAAT at position 3, AAT at position 4, AT at position 5, and T at position 6. The sorted suffixes are AAT at position 4, AT at position 5, ATAAT at position 2, T at position 6, TAAT at position 3, and TATAAT at position 1. Therefore, the suffix array of TATAAT is (4, 5, 2, 6, 3, 1).

Write code for the read mapping problem. The program must implement and use the SUFFIX-ARRAY function in the pseudocode discussed in class, which is iterative and is not allowed to perform input/output operations. Make one submission with Python code and another submission with C++ code.

Input

The input is a string s over the alphabet Σ={A,C,G,T}.

Output

The output is the suffix array of s.

Hint

Implement and use the bubble-sort algorithm to sort suffixes.

Public test cases
  • Input

    TATAAT
    

    Output

    4
    5
    2
    6
    3
    1
    
  • Information
    Author
    Gabriel Valiente
    Language
    English
    Official solutions
    C++ Python
    User solutions
    C++ Python