PDKE: an efficient distributed embedding framework for large knowledge graphs
conference contribution
Posted on 2020-09-01, 00:00. Authored by S. Dong, X. Wang, L. Chai, Jianxin Li, Y. Yang
Knowledge Representation Learning (KRL) methods produce unsupervised node features from knowledge graphs that can be used for a variety of machine learning tasks. However, two main issues in KRL embedding techniques have not yet been addressed. First, real-world knowledge graphs contain millions of nodes and billions of edges, which exceeds the capacity of existing KRL embedding systems; second, there is no unified framework that integrates current KRL models to facilitate embedding for various applications. To address these issues, we propose PDKE, a distributed KRL training framework that can incorporate different translation-based KRL models through a unified algorithm template. In PDKE, each knowledge embedding model implements a common set of functions, which together form a unified algorithm template for distributed KRL. PDKE supports training arbitrarily large embeddings in a distributed environment. The efficiency and scalability of our framework have been verified by extensive experiments on both synthetic and real-world knowledge graphs, showing that our approach outperforms existing ones by a large margin.
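To make the idea of a translation-based KRL model concrete, the following is a minimal sketch of a TransE-style scoring function, the simplest member of the model family the abstract refers to. All names here are illustrative assumptions for exposition; they are not part of PDKE's actual API or template.

```python
# Minimal TransE-style scoring sketch (illustrative, not PDKE's implementation).
# Translation-based models embed entities and relations as vectors and treat a
# relation r as a translation: for a true triple (h, r, t), h + r should be
# close to t, so a lower distance score means a more plausible triple.

def transe_score(h, r, t):
    """L1 distance between (h + r) and t for equal-length embedding vectors."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

# Toy 3-dimensional embeddings where the translation h + r ≈ t holds,
# so the score is near zero (up to floating-point rounding).
h = [0.1, 0.2, 0.3]
r = [0.4, 0.1, 0.0]
t = [0.5, 0.3, 0.3]
print(transe_score(h, r, t))
```

A distributed framework in PDKE's style would swap in different scoring and gradient functions behind one training loop, which is what the "unified algorithm template" in the abstract refers to.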