Read abundance X28783


Statement
 

pdf   zip

The abundance (number of occurrences) of a read in a read set is an indicator value for read confidence in high-throughput sequencing studies.

Write pseudocode, Python code, and C++ code for the read abundance problem. Make two submissions, including the pseudocode as a comment to both the Python and the C++ code.

Input

The input is a collection of nn strings S={s1,s2,,sn}S=\{s_1,s_2,\ldots,s_n\} (genomic sequence reads, possibly reverse complemented) over the alphabet {A,C,G,T}\{A,C,G,T\}.

Output

The output is the sorted frequency distribution of SS.

Public test cases
  • Input

    TCATC
    TTGAT
    TCATC
    TGAAA
    GATGA
    TTTCA
    ATCAA
    TTGAT
    TTTCA
    

    Output

    ATCAA 3
    GATGA 3
    TGAAA 3
    
  • Information
    Author
    Gabriel Valiente
    Language
    English
    Official solutions
    C++ Python
    User solutions
    C++ Python