A generalized machine-learning aided method for targeted identification of industrial enzymes from metagenome: a xylanase temperature dependence case study

Mehdi Foroozandeh Shahraki, Kiana Farhadyar, Kaveh Kavousi, Mohammad H. Azarabad, Amin Boroomand, Shohreh Ariaeenejad, Ghasem Hosseini Salekdeh*

*Corresponding author for this work

Research output: Contribution to journal › Article


Growing industrial use of enzymes and the increasing availability of metagenomic data highlight the demand for effective methods for the targeted identification and verification of novel enzymes from environmental microbiota. Xylanases are a class of enzymes with numerous industrial applications that are involved in the degradation of xylan, a component of lignocellulose. The optimum temperature of an enzyme is an essential factor when choosing an appropriate biocatalyst for a particular purpose. In silico prediction of this attribute is therefore a significant cost- and time-saving step in the effort to characterize novel enzymes. The objective of this study was to develop a computational method to predict the thermal dependence of xylanases. This tool was then applied to targeted screening of putative xylanases with specific thermal dependencies from metagenomic data, which resulted in the identification of three novel xylanases from sheep and cow rumen microbiota. Here we present Thermal Activity Prediction for Xylanase, a new sequence-based machine learning method trained on a selected combination of protein features. This random forest classifier discriminates between non-thermophilic, thermophilic, and hyper-thermophilic xylanases. The model's performance was evaluated through multiple iterations of sixfold cross-validation as well as holdout tests, and it is freely accessible as a web service at arimees.com.
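The abstract does not list the protein features used or say how the sixfold partitions were drawn, so the following is only a minimal sketch under stated assumptions. It uses amino-acid composition as a stand-in sequence-based feature and a class-stratified sixfold split. The function names `amino_acid_composition`, `stratified_folds`, and `cross_validation_splits` are hypothetical and are not taken from the authors' tool.

```python
from collections import Counter, defaultdict
import random

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence.

    Assumption: composition is used here only as an illustrative feature;
    the study's actual feature set is not given in the abstract.
    """
    sequence = sequence.upper()
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def stratified_folds(labels, n_folds=6, seed=0):
    """Assign sample indices to n_folds folds, dealing each class out round-robin
    so that class proportions stay roughly balanced across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def cross_validation_splits(labels, n_folds=6, seed=0):
    """Yield (train_indices, test_indices) pairs, holding out one fold at a time."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test = sorted(folds[held_out])
        train = sorted(
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        )
        yield train, test
```

Under these assumptions, the feature vectors and index splits could be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`, with one model fitted on each training split and scored on the matching held-out fold. Repeating the procedure with different `seed` values would correspond to multiple iterations of sixfold cross-validation.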

Original language: English
Pages (from-to): 759-769
Number of pages: 11
Journal: Biotechnology and Bioengineering
Issue number: 2
Publication status: Published - Feb 2021


Keywords

  • machine learning
  • metagenomics
  • optimum temperature
  • targeted identification
  • xylanase
