MulDE: multi-teacher knowledge distillation for low-dimensional knowledge graph embeddings

Kai Wang, Yu Liu, Qian Ma, Quan Z. Sheng

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review



Link prediction based on knowledge graph embeddings (KGE) aims to predict new triples to automatically construct knowledge graphs (KGs). However, recent KGE models achieve performance improvements by excessively increasing the embedding dimension, which incurs enormous training costs and storage requirements. In this paper, instead of training high-dimensional models, we propose MulDE, a novel knowledge distillation framework that includes multiple low-dimensional hyperbolic KGE models as teachers and two student components, namely Junior and Senior. Under a novel iterative distillation strategy, the Junior component, a low-dimensional KGE model, actively queries the teachers based on its preliminary prediction results, and the Senior component adaptively integrates the teachers' knowledge to train the Junior component through two mechanisms: relation-specific scaling and contrast attention. Experimental results show that MulDE effectively improves both the performance and the training speed of low-dimensional KGE models. The distilled 32-dimensional model is competitive with state-of-the-art high-dimensional methods on several widely used datasets.
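
The workflow described in the abstract can be pictured concretely. Below is a minimal PyTorch sketch of one MulDE-style distillation step, not the authors' implementation: toy TransE-style scorers stand in for the paper's hyperbolic teachers, and every name here (TransEScorer, Senior, the loss wiring) is an illustrative assumption. The Junior scores all entities, "asks" the teachers only about its top-K candidates, and the Senior fuses the teacher scores via relation-specific scaling and a contrast-style attention into soft labels for the Junior.

```python
# Illustrative MulDE-style distillation step (assumptions, not the paper's code):
# TransE-style scorers replace the hyperbolic teachers; loss wiring is simplified.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransEScorer(nn.Module):
    """Toy KGE model: score(h, r, t) = -||e_h + e_r - e_t||."""
    def __init__(self, n_ent, n_rel, dim):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)

    def forward(self, h, r, cand):
        q = self.ent(h) + self.rel(r)                   # (B, d)
        c = self.ent(cand)                              # (N, d) or (B, K, d)
        if c.dim() == 2:                                # shared candidate set
            return -torch.cdist(q, c)                   # (B, N)
        return -(q.unsqueeze(1) - c).norm(dim=-1)       # (B, K)

class Senior(nn.Module):
    """Fuses teacher scores: relation-specific scaling + contrast-style attention."""
    def __init__(self, n_rel, n_teachers):
        super().__init__()
        self.rel_scale = nn.Embedding(n_rel, n_teachers)  # one scale per (rel, teacher)

    def forward(self, r, teacher_scores, junior_scores):
        # teacher_scores: (T, B, K) over the Junior's top-K candidates.
        scale = F.softplus(self.rel_scale(r)).T.unsqueeze(-1)                  # (T, B, 1)
        scaled = scale * teacher_scores
        # Attend over teachers by their contrast with the Junior's own scores.
        sim = F.cosine_similarity(scaled, junior_scores.unsqueeze(0), dim=-1)  # (T, B)
        w = torch.softmax(sim, dim=0).unsqueeze(-1)
        return (w * scaled).sum(0)                                             # (B, K)

n_ent, n_rel, B, K, T = 1000, 20, 8, 32, 3
junior = TransEScorer(n_ent, n_rel, dim=32)                 # low-dimensional student
teachers = [TransEScorer(n_ent, n_rel, dim=64) for _ in range(T)]
senior = Senior(n_rel, T)
opt = torch.optim.Adam(list(junior.parameters()) + list(senior.parameters()), lr=1e-3)

h, r = torch.randint(0, n_ent, (B,)), torch.randint(0, n_rel, (B,))
t = torch.randint(0, n_ent, (B,))                           # gold tail entities

full = junior(h, r, torch.arange(n_ent))                    # (B, n_ent)
topk = full.topk(K, dim=-1).indices                         # Junior "asks" about its top-K
junior_k = full.gather(-1, topk)                            # (B, K)
with torch.no_grad():                                       # teachers stay frozen
    t_scores = torch.stack([tch(h, r, topk) for tch in teachers])  # (T, B, K)
soft = senior(r, t_scores, junior_k.detach())               # fused soft labels

kd = F.kl_div(F.log_softmax(junior_k, -1),                  # distillation on top-K
              F.softmax(soft.detach(), -1), reduction="batchmean")
hard = F.cross_entropy(full, t)                             # standard link-prediction loss
senior_sup = F.binary_cross_entropy_with_logits(            # trains the Senior fusion
    soft, (topk == t.unsqueeze(1)).float())
loss = kd + hard + senior_sup
opt.zero_grad(); loss.backward(); opt.step()
```

Freezing the teachers and restricting distillation to the Junior's top-K candidates keeps each step cheap, which is the point of using several low-dimensional teachers instead of one high-dimensional model.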

Original language: English
Title of host publication: The Web Conference 2021
Subtitle of host publication: Proceedings of the World Wide Web Conference, WWW 2021
Place of Publication: New York, NY
Publisher: Association for Computing Machinery, Inc
Number of pages: 11
ISBN (Electronic): 9781450383127
Publication status: Published - 2021
Event: 2021 World Wide Web Conference, WWW 2021 - Ljubljana, Slovenia
Duration: 19 Apr 2021 - 23 Apr 2021


Conference: 2021 World Wide Web Conference, WWW 2021

Bibliographical note

Copyright the Publisher 2021. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.


Keywords

  • Knowledge distillation
  • Knowledge graph
  • Knowledge graph embeddings
  • Link prediction

