Bidirectional Language Modeling: A Systematic Literature Review

Shah Jahan, Muhammad; Khan, Habib Ullah; Akbar, Shahzad; Umar Farooq, Muhammad; Gul, Sarah; Amjad, Anam

Author	Shah Jahan, Muhammad
Author	Khan, Habib Ullah
Author	Akbar, Shahzad
Author	Umar Farooq, Muhammad
Author	Gul, Sarah
Author	Amjad, Anam
Available date	2022-12-27T10:57:17Z
Publication Date	2021-05-03
Publication Name	Scientific Programming
Identifier	http://dx.doi.org/10.1155/2021/6641832
Citation	Shah Jahan, M., Khan, H. U., Akbar, S., Umar Farooq, M., Gul, S., & Amjad, A. (2021). Bidirectional Language Modeling: A Systematic Literature Review. Scientific Programming, 2021.
ISSN	1058-9244
URI	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85106388570&origin=inward
URI	http://hdl.handle.net/10576/37684
Abstract	In transfer learning, two major activities, pretraining and fine-tuning, are carried out to perform downstream tasks. The advent of the transformer architecture and of bidirectional language models such as Bidirectional Encoder Representations from Transformers (BERT) enables transfer learning. BERT also overcomes the limitations of unidirectional language models by removing the dependency on the recurrent neural network (RNN), and it uses the attention mechanism to read input from either direction and better understand sentence context. The performance of downstream tasks in transfer learning is found to depend on factors such as dataset size, step size, and the number of selected parameters. Various state-of-the-art research studies have produced efficient results by contributing to the pretraining phase. However, a comprehensive investigation and analysis of these studies is not yet available. Therefore, this article presents a systematic literature review (SLR) investigating thirty-one (31) influential research studies published during 2018-2020. The following contributions are made in this paper: (1) thirty-one (31) models inspired by BERT are extracted; (2) every model is compared with RoBERTa (a replicated BERT model), which uses a large dataset and batch size but a small step size. It is concluded that seven (7) of the thirty-one (31) models in this SLR outperform RoBERTa; three of them were trained on a larger dataset, while the other four were trained on a smaller dataset. Moreover, six of these seven models share both the feedforward network (FFN) and attention across layers. The remaining twenty-four (24) models are also studied in this SLR with different parameter settings. Furthermore, it is concluded that a model pretrained with a large dataset, hidden layers, attention heads, and a small step size with parameter sharing produces better results. This SLR will help researchers pick a suitable model based on their requirements.
Language	en
Publisher	Hindawi
Subject	Recurrent neural network (RNN); Computational linguistics; Feedforward neural networks
Title	Bidirectional Language Modeling: A Systematic Literature Review
Type	Article
Volume Number	2021
ESSN	1875-919X
dc.accessType	Open Access

Files in this item

Name: 6641832.pdf
Size: 1.815Mb
Format: PDF

This item appears in the following Collection(s)

Accounting & Information Systems [527 items]
