  •   Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Student Thesis & Dissertations
  • College of Engineering
  • Computing

    Enhancing Knowledge Distillation for Text Summarization

    View/Open
    Mohammad Kotit _ OGS Approved Thesis.pdf (2.803Mb)
    Date
    2024-01
    Author
    Kotit, Mohammad Basheer
    Abstract
    In natural language processing, recent progress has been shaped by large pretrained Seq2Seq Transformer models, including BART, PEGASUS, and T5. These models have improved the accuracy and fluency of text generation applications such as machine translation, text summarization, and chatbots. However, deploying them for text summarization is difficult in environments with limited computational resources. This research proposes compact student models designed to emulate their larger pretrained counterparts (teacher models) with reduced computational demands and faster processing, maintaining high performance at greater efficiency. Knowledge distillation, a popular model optimization technique, typically takes one of two forms: direct knowledge distillation and the use of pseudo-labels. Our research enhances direct knowledge distillation by introducing an effective behavior function. This function selectively emphasizes the teacher model's more certain predictions, thereby addressing the exposure bias that arises from differences between training and testing conditions. In addition, we propose a novel approach for selecting the teacher model's most reliable predictions. These high-confidence predictions are then used as pseudo-summaries to train the student model through the pseudo-label technique. Both approaches focus on the confidence of teacher predictions and aim to improve the student model's performance while maintaining computational efficiency. We evaluated our methods using BART on the CNN/DM dataset and PEGASUS on the XSUM dataset. The results showed that our approaches not only achieved the knowledge distillation objectives but also significantly surpassed the performance of the teacher models.
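
    The abstract does not give the form of the behavior function or the selection criterion, so the following is only a minimal sketch of the two ideas under stated assumptions, not the thesis's method. It assumes that teacher confidence at each token is the teacher's maximum probability, and that the most reliable candidate summary is the one with the highest mean token log-probability. The function names `confidence_weighted_kd_loss` and `select_pseudo_summary` are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def confidence_weighted_kd_loss(teacher_logits, student_logits):
    """Token-level distillation loss weighted by teacher confidence.

    Each argument is a list with one list of vocabulary logits per token.
    ASSUMPTION: confidence is the teacher's max probability at that token;
    the abstract does not specify the actual behavior function.
    """
    weighted_sum, weight_total = 0.0, 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        p = [math.exp(lp) for lp in log_softmax(t_logits)]  # teacher probs
        log_q = log_softmax(s_logits)                       # student log-probs
        cross_entropy = -sum(pi * lq for pi, lq in zip(p, log_q))
        weight = max(p)  # more certain teacher tokens count more
        weighted_sum += weight * cross_entropy
        weight_total += weight
    return weighted_sum / weight_total


def select_pseudo_summary(candidates):
    """Pick the candidate the teacher is most confident about.

    `candidates` is a list of (summary_text, token_log_probs) pairs.
    ASSUMPTION: reliability is the mean token log-probability.
    """
    def mean_log_prob(candidate):
        _, log_probs = candidate
        return sum(log_probs) / len(log_probs)

    return max(candidates, key=mean_log_prob)[0]


# Toy example: two tokens over a three-word vocabulary.
teacher = [[4.0, 0.0, 0.0], [0.5, 0.4, 0.3]]   # confident, then uncertain
student = [[1.0, 0.5, 0.0], [0.2, 0.1, 0.0]]
loss = confidence_weighted_kd_loss(teacher, student)

best = select_pseudo_summary([
    ("summary A", [-0.1, -0.2, -0.1]),
    ("summary B", [-1.2, -0.9, -2.0]),
])
```

    In this toy example the first token dominates the loss because the teacher is far more certain there, and `best` is the candidate with the higher mean log-probability. A real implementation would operate on batched tensors from the teacher and student models rather than Python lists.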
    DOI/handle
    http://hdl.handle.net/10576/51500
    Collections
    • Computing [103 items]

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department
