Advisor: Elsayed, Tamer
Advisor: Hassan, Yasser
Author: GAD, RADWA ESSAM
Available date: 2025-07-17T05:00:04Z
Publication Date: 2025-06
URI: http://hdl.handle.net/10576/66442
Abstract: With the rapid advancement of perovskite solar cell (PSC) research, efficiently extracting structured data from the scientific literature has become essential for accelerating materials discovery and development. PSC studies often report multiple device configurations within a single paper, which makes traditional single-device extraction approaches insufficient. In this thesis, we propose the first automated information extraction pipeline that leverages large language models (LLMs) to extract structured attributes for all devices reported in PSC research papers. Our experiments use both open-source and closed-source LLMs, including GPT-4o-mini, LLaMA 3.1 70B, and Qwen 2.5 72B, to evaluate the pipeline across a range of model architectures. Additionally, we introduce the first multi-device evaluation framework, which uses an optimization-based matching algorithm. We also define a wide range of PSC-specific attributes, carefully selected to enhance the practical utility of the extracted dataset for researchers. Our experimental results demonstrate that the proposed pipeline outperforms existing approaches, achieving an F1 score of 90.06% for champion-device extraction, 78.70% for multi-device extraction, and 90.98% for the best device in multi-device extraction. These findings highlight the effectiveness of our approach in delivering a scalable, reproducible, and efficient solution for automating structured information extraction from the PSC literature.
Language: en
Subject: Perovskite Solar Cells; Information Extraction; Large Language Models; Multi-device Extraction; Natural Language Processing
Title: AUTOMATING INFORMATION EXTRACTION FROM PEROVSKITE SOLAR CELLS LITERATURE USING LARGE LANGUAGE MODELS
Type: Master Thesis
Department: Computer Science
dc.accessType: Full Text

