Proteins contain different domains (structural or functional units responsible for a particular function).
We need a program that, given a list of the domains in a protein and their positions, determines which domains overlap in the sequence.
The input is a list of domains found in protein sequences. For each domain, the id of the protein, the name of the domain, and the span of positions it covers in the protein are given.
The input consists of an integer N (the number of protein domain records), followed by N lines, each consisting of two strings and two integers:
protein_id domain_name start_position end_position
where:
protein_id (string): The protein’s identifier.
domain_name (string): The name of the domain.
start_position, end_position (integers): The span of the domain within the protein sequence.
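The input format above could be read as in the following minimal sketch. The names `DomainRecord` and `parse_records` are hypothetical, and the sketch assumes that identifiers and domain names contain no whitespace, so the whole input can be split into tokens.

```python
from typing import NamedTuple


class DomainRecord(NamedTuple):
    """One input record (hypothetical helper type, not part of the statement)."""
    protein_id: str
    domain_name: str
    start: int
    end: int


def parse_records(text: str) -> list[DomainRecord]:
    """Parse N followed by N records of four whitespace-separated fields."""
    tokens = text.split()
    n = int(tokens[0])
    records = []
    for i in range(n):
        # Each record occupies four consecutive tokens after the count.
        pid, name, start, end = tokens[1 + 4 * i : 5 + 4 * i]
        records.append(DomainRecord(pid, name, int(start), int(end)))
    return records
```

Because parsing is token-based, the sketch accepts the records either one per line or all on a single line.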
List the proteins in alphabetical order of their ids. For each protein, list its domains in order of increasing start position. If a domain overlaps with the one listed immediately before it, mark it with "OVERLAP".
After the domains of each protein, print a summary list of its overlapping domain pairs. If the protein has no overlaps, print "No overlaps".
Follow the format of the examples.
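The rules above can be sketched as follows. This is one possible reading, not a reference solution: the function name `domain_report` is hypothetical, and the statement leaves three points open that are treated here as assumptions. Domains with equal start positions keep their input order, a domain whose start equals the previous domain's end counts as overlapping, and every output item goes on its own line.

```python
from collections import defaultdict


def domain_report(records):
    """Build report lines from (protein_id, domain_name, start, end) tuples."""
    by_protein = defaultdict(list)
    for protein_id, name, start, end in records:
        by_protein[protein_id].append((name, start, end))

    lines = []
    # sorted() on strings uses code-point order, which is alphabetical
    # for ids made of uppercase letters and digits.
    for protein_id in sorted(by_protein):
        # Assumption: ties on start keep input order (sorted() is stable).
        domains = sorted(by_protein[protein_id], key=lambda d: d[1])
        lines.append(f"{protein_id}:")
        pairs = []
        prev = None
        for name, start, end in domains:
            line = f"{name} ({start}-{end})"
            # Compare only with the immediately preceding domain.
            # Assumption: start == previous end counts as an overlap.
            if prev is not None and start <= prev[2]:
                line += " OVERLAP"
                pairs.append(f"{prev[0]}-{name}")
            lines.append(line)
            prev = (name, start, end)
        if pairs:
            lines.append(f"Overlapping domains in protein {protein_id}:")
            lines.extend(pairs)
        else:
            lines.append("No overlaps")
    return lines
```

Calling `print("\n".join(domain_report(records)))` would then write the report; if the judge expects a different line layout, only the lines appended to `lines` need to change.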
Input
6
P12X43 Kinase 5 50
T5678A Phosphatase 10 40
T5678A Transmembrane 50 90
P12X43 SH3 30 70
T5678A Immunoglobulin 35 60
P12X43 Pleckstrin 80 100
Output
P12X43:
Kinase (5-50)
SH3 (30-70) OVERLAP
Pleckstrin (80-100)
Overlapping domains in protein P12X43:
Kinase-SH3
T5678A:
Phosphatase (10-40)
Immunoglobulin (35-60) OVERLAP
Transmembrane (50-90) OVERLAP
Overlapping domains in protein T5678A:
Phosphatase-Immunoglobulin
Immunoglobulin-Transmembrane
Input
8
P12X31 RRM 5 20
R3012Y Collagen 35 45
H2F127 FN3 35 50
R3012Y EGF 5 15
R3012Y Cadherin 20 30
P12X31 Catalase 25 40
P12X31 Kinase 45 60
H2F127 RRM 10 30
Output
H2F127:
RRM (10-30)
FN3 (35-50)
No overlaps
P12X31:
RRM (5-20)
Catalase (25-40)
Kinase (45-60)
No overlaps
R3012Y:
EGF (5-15)
Cadherin (20-30)
Collagen (35-45)
No overlaps
Input
12
P448X1 SH3 85 110
P59S87 Catalase 5 25
P59S87 HTH 10 35
P59S87 Immunoglobulin 75 90
P448X1 Transmembrane 10 40
P448X1 Kinase 30 60
P59S87 7TM 40 50
P59S87 PDZ 45 70
M32101 Pleckstrin 5 15
P448X1 SH2 55 90
P59S87 WD40 5 25
M32101 Porin 20 30
Output
M32101:
Pleckstrin (5-15)
Porin (20-30)
No overlaps
P448X1:
Transmembrane (10-40)
Kinase (30-60) OVERLAP
SH2 (55-90) OVERLAP
SH3 (85-110) OVERLAP
Overlapping domains in protein P448X1:
Transmembrane-Kinase
Kinase-SH2
SH2-SH3
P59S87:
Catalase (5-25)
WD40 (5-25) OVERLAP
HTH (10-35) OVERLAP
7TM (40-50)
PDZ (45-70) OVERLAP
Immunoglobulin (75-90)
Overlapping domains in protein P59S87:
Catalase-WD40
WD40-HTH
7TM-PDZ