Read abundance X28783


Statement
 

pdf   zip

html

The abundance (number of occurrences) of a read in a read set is an indicator value for read confidence in high-throughput sequencing studies.

Write pseudocode, Python code, and C++ code for the read abundance problem. Make two submissions, including the pseudocode as a comment to both the Python and the C++ code.

Input

The input is a collection of n strings S={s1,s2,…,sn} (genomic sequence reads, possibly reverse complemented) over the alphabet {A,C,G,T}.

Output

The output is the sorted frequency distribution of S.

Public test cases
  • Input

    TCATC
    TTGAT
    TCATC
    TGAAA
    GATGA
    TTTCA
    ATCAA
    TTGAT
    TTTCA
    

    Output

    ATCAA 3
    GATGA 3
    TGAAA 3
    
  • Information
    Author
    Gabriel Valiente
    Language
    English
    Official solutions
    C++ Python
    User solutions
    C++ Python