Read mapping 2

Sequence reads are mapped to a reference genome using some indexing data
structure, one of which is the suffix array. The suffix array of a
string is a sorted array of the suffixes of the string, that is, an
array of the (starting) positions of the sorted suffixes of the string.

For example, the genomic sequence fragment TATAAT has the suffixes
TATAAT at position 1, ATAAT at position 2, TAAT at position 3, AAT at
position 4, AT at position 5, and T at position 6. The sorted suffixes
are AAT at position 4, AT at position 5, ATAAT at position 2, T at
position 6, TAAT at position 3, and TATAAT at position 1. Therefore, the
suffix array of TATAAT is (4, 5, 2, 6, 3, 1).

Write code for the read mapping problem. The program must implement and
use the SUFFIX-ARRAY function in the pseudocode discussed in class,
which is iterative and is not allowed to perform input/output
operations. Make one submission with Python code and another submission
with C++ code.

Input

The input is a string s over the alphabet Σ = {A, C, G, T}.

Output

The output is the suffix array of s.

Hint

Implement and use the bubble-sort algorithm to sort suffixes.

Problem information

Author: Gabriel Valiente

Generation: 2026-01-25T15:49:49.608Z

© Jutge.org, 2006–2026.
https://jutge.org
