TY - JOUR
T1 - Somatic point mutations are enriched in non-coding RNAs with possible regulatory function in breast cancer
AU - Rezaie, Narges
AU - Bayati, Masroor
AU - Hamidi, Mehrab
AU - Tahaei, Maedeh Sadat
AU - Khorasani, Sadegh
AU - Lovell, Nigel H.
AU - Breen, James
AU - Rabiee, Hamid R.
AU - Alinejad-Rokny, Hamid
N1 - Copyright Crown 2022. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2022
Y1 - 2022
N2 - Non-coding RNAs (ncRNAs) form a large portion of the mammalian genome. However, their biological functions are poorly characterized in cancers. In this study, using a newly developed tool, SomaGene, we analyze de novo somatic point mutations from the International Cancer Genome Consortium (ICGC) whole-genome sequencing data of 1,855 breast cancer samples. We identify 1030 candidates of ncRNAs that are significantly and explicitly mutated in breast cancer samples. By integrating data from the ENCODE regulatory features and FANTOM5 expression atlas, we show that the candidate ncRNAs significantly enrich active chromatin histone marks (1.9 times), CTCF binding sites (2.45 times), DNase accessibility (1.76 times), HMM predicted enhancers (2.26 times) and eQTL polymorphisms (1.77 times). Importantly, we show that the 1030 ncRNAs contain a much higher level (3.64 times) of breast cancer-associated genome-wide association (GWAS) single nucleotide polymorphisms (SNPs) than genome-wide expectation. Such enrichment has not been seen with GWAS SNPs from other cancers. Using breast cell line related Hi-C data, we then show that 82% of our candidate ncRNAs (1.9 times) significantly interact with the promoter of protein-coding genes, including previously known cancer-associated genes, suggesting the critical role of candidate ncRNA genes in the activation of essential regulators of development and differentiation in breast cancer. We provide an extensive web-based resource (https://www.ihealthe.unsw.edu.au/research) to communicate our results with the research community. Our list of breast cancer-specific ncRNA genes has the potential to provide a better understanding of the underlying genetic causes of breast cancer. Lastly, the tool developed in this study can be used to analyze somatic mutations in all cancers.
AB - Non-coding RNAs (ncRNAs) form a large portion of the mammalian genome. However, their biological functions are poorly characterized in cancers. In this study, using a newly developed tool, SomaGene, we analyze de novo somatic point mutations from the International Cancer Genome Consortium (ICGC) whole-genome sequencing data of 1,855 breast cancer samples. We identify 1030 candidates of ncRNAs that are significantly and explicitly mutated in breast cancer samples. By integrating data from the ENCODE regulatory features and FANTOM5 expression atlas, we show that the candidate ncRNAs significantly enrich active chromatin histone marks (1.9 times), CTCF binding sites (2.45 times), DNase accessibility (1.76 times), HMM predicted enhancers (2.26 times) and eQTL polymorphisms (1.77 times). Importantly, we show that the 1030 ncRNAs contain a much higher level (3.64 times) of breast cancer-associated genome-wide association (GWAS) single nucleotide polymorphisms (SNPs) than genome-wide expectation. Such enrichment has not been seen with GWAS SNPs from other cancers. Using breast cell line related Hi-C data, we then show that 82% of our candidate ncRNAs (1.9 times) significantly interact with the promoter of protein-coding genes, including previously known cancer-associated genes, suggesting the critical role of candidate ncRNA genes in the activation of essential regulators of development and differentiation in breast cancer. We provide an extensive web-based resource (https://www.ihealthe.unsw.edu.au/research) to communicate our results with the research community. Our list of breast cancer-specific ncRNA genes has the potential to provide a better understanding of the underlying genetic causes of breast cancer. Lastly, the tool developed in this study can be used to analyze somatic mutations in all cancers.
UR - http://www.scopus.com/inward/record.url?scp=85131496532&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/arc/DE220101210
U2 - 10.1038/s42003-022-03528-0
DO - 10.1038/s42003-022-03528-0
M3 - Article
C2 - 35672401
AN - SCOPUS:85131496532
SN - 2399-3642
VL - 5
SP - 1
EP - 13
JO - Communications Biology
JF - Communications Biology
IS - 1
M1 - 556
ER -