Predicting lysine phosphoglycerylation sites using bidirectional encoder representations with transformers & protein feature extraction and selection

Songning Lai, Xifeng Hu, Jing Han, Chun Wang, Subhas Mukhopadhyay, Zhi Liu*, Lan Ye*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

Abstract

Phosphorylation, a post-translational modification of proteins, greatly affects protein structure and function and plays an important role in the pathogenesis of human diseases. Elucidating the molecular mechanism of phosphorylation is important for the development of therapeutic agents for some diseases. Identification of phosphorylation sites is currently a focus of many studies. However, identifying phosphorylation sites by conventional experimental methods alone is difficult and costly. In this work, we focused on developing a model to predict the phosphorylation sites of lysine. The model constructs its feature set using protein feature extraction, F_Score feature selection, KNN data cleaning, SMOTE synthesis of positive samples, and other algorithms. A transformer-based BERT classifier was then applied to this feature set. Two different methods of inputting the feature sequence to the BERT model were used. Their accuracies are 98.43% and 99.61%, and their MCCs are 96.5% and 99.1%, respectively, which are better than those of previous models for predicting phosphoglycerylation sites. These results suggest that the approach holds considerable promise for future work.
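The feature-set steps and the MCC metric named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that "F_Score" refers to the common two-class Fisher-style F-score used for feature ranking, it replaces SMOTE with a simplified interpolation between a minority sample and one of its nearest minority neighbours, and it omits the KNN cleaning step and the BERT classifier. The function names (`f_score`, `smote_like`, `mcc`) and the toy data are hypothetical.

```python
import math
import random


def f_score(pos, neg):
    """Per-feature F-score for two classes given as lists of feature vectors.

    Assumed definition: squared deviations of the class means from the
    overall mean, divided by the sum of the within-class sample variances.
    """
    scores = []
    for j in range(len(pos[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        mean_p = sum(p) / len(p)
        mean_n = sum(n) / len(n)
        mean_all = (sum(p) + sum(n)) / (len(p) + len(n))
        between = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
        var_p = sum((v - mean_p) ** 2 for v in p) / (len(p) - 1)
        var_n = sum((v - mean_n) ** 2 for v in n) / (len(n) - 1)
        within = var_p + var_n
        scores.append(between / within if within > 0 else 0.0)
    return scores


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (Euclidean distance).
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + u * (x - b) for b, x in zip(base, neighbour)])
    return synthetic


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
pos = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0]]
neg = [[3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]

scores = f_score(pos, neg)             # higher score = more discriminative
synthetic = smote_like(pos, n_new=4)   # extra positive samples
```

In this toy setup, ranking features by `scores` would keep feature 0 and drop feature 1. Note that `mcc` returns a value in [-1, 1], so the percentages quoted in the abstract would correspond to that value multiplied by 100.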

Original language: English
Title of host publication: Proceedings 2022 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics CISP-BMEI 2022
Editors: Xin Chen, Lin Cao, Qingli Li, Yan Wang, Lipo Wang
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 5
ISBN (Electronic): 9781665488877, 9781665488860
ISBN (Print): 9781665488884
DOIs
Publication status: Published - 2022
Externally published: Yes
Event: 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2022 - Beijing, China
Duration: 5 Nov 2022 to 7 Nov 2022

Conference

Conference: 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2022
Country/Territory: China
City: Beijing
Period: 5/11/22 to 7/11/22

Keywords

  • Phosphoglycerylation
  • Lysine
  • Characterization engineering
  • Transformer
