Scratch-DKG: A Framework for Constructing Scratch Domain Knowledge Graph
Author | Qi, Peng |
Author | Sun, Yan |
Author | Luo, Hong |
Author | Guizani, Mohsen |
Available date | 2022-10-23T10:25:28Z |
Publication Date | 2022-01-01 |
Publication Name | IEEE Transactions on Emerging Topics in Computing |
Identifier | http://dx.doi.org/10.1109/TETC.2020.2996710 |
Citation | Qi, P., Sun, Y., Luo, H., & Guizani, M. (2020). Scratch-DKG: A framework for constructing Scratch domain knowledge graph. IEEE Transactions on Emerging Topics in Computing. |
Abstract | With the rapid development of programming platforms, how to utilize the tremendous amount of data produced by platforms such as Scratch has become a major challenge for researchers. The growing data is not only huge but also heterogeneous and diverse, so existing tools cannot effectively extract valuable information from it. In this article, considering the particular features of Scratch data, we propose an effective framework for constructing a Scratch Domain Knowledge Graph (Scratch-DKG). Our framework includes four modules, designed to process semi-structured data, user profile data, project data, and programming knowledge points, respectively. For webpages, we design a template-based wrapper method to extract triples from the semi-structured data. For user profile data, we improve DeepDive, a useful information extraction tool that suffers from wrong labeling, by means of the proposed Secondary Labeling Algorithm and use it to extract knowledge triples. For project data, we propose an advanced keyword extraction method (S-TextRank) to extract keyword triples. For programming knowledge points, we develop a frequent contiguous block combination mining algorithm to extract the potential domain information of Scratch. Finally, extensive experiments are carried out to evaluate the performance of the proposed methods. The experimental results show that, compared with competing methods, our proposal extracts more correct and comprehensive Scratch triples. |
Language | en |
Publisher | IEEE Computer Society |
Subject | DeepDive; Knowledge Graph; programming knowledge points; S-TextRank; Scratch; Secondary Labeling Algorithm |
Type | Article |
Pagination | 170-185 |
Issue Number | 1 |
Volume Number | 10 |
This item appears in the following Collection(s)
- Computer Science & Engineering