
    ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models

    View/Open
    1-s2.0-S2590262824000200-main.pdf (7.298Mb)
    Date
    2024-03-02
    Author
    Bhattacharya, Manojit
    Pal, Soumen
    Chatterjee, Srijan
    Alshammari, Abdulrahman
    Albekairi, Thamer H.
    Jagga, Supriya
    Ige Ohimain, Elijah
    Zayed, Hatem
    Byrareddy, Siddappa N.
    Lee, Sang-Soo
    Wen, Zhi-Hong
    Agoramoorthy, Govindasamy
    Bhattacharya, Prosun
    Chakraborty, Chiranjib
    Abstract
    Recently, researchers have raised concerns about ChatGPT-derived answers. Here, we conducted a series of tests using ChatGPT, run by individual researchers at the multi-country level, to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth, using two questionnaires (the first set with 15 MCQs and the second with 15 KBQs). Among the 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), 1 to 10 were reproducible, and 11 to 15 were not. Among the 15 KBQs, the length of each question (in words) is about 294.5 ± 97.60 (the mean ranges from 138.7 to 438.09), and the mean similarity index (in words) is about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed using the analyzed parameters of the answers. The study shows a pattern of correct and incorrect ChatGPT-derived answers and calls for an error-free, next-generation LLM to avoid misleading users.
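The coefficient figures quoted in the abstract appear consistent with the coefficient of variation, that is, the standard deviation expressed as a percentage of the mean. The sketch below is only an illustrative check under that assumption; the function name `coefficient_of_variation` and the dictionary labels are hypothetical, and the numbers are the rounded summary values from the abstract, not the study's raw data. Small gaps between computed and reported values would be expected from rounding.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the standard deviation as a percentage of the mean.

    Assumption: the abstract's coefficient is SD / mean * 100.
    """
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return sd / abs(mean) * 100.0


# Rounded summary values quoted in the abstract: (mean, SD, reported coefficient).
reported = {
    "incorrect MCQ answers": (3.0, 0.77, 25.81),
    "KBQ similarity index": (29.53, 11.40, 38.62),
}

for label, (mean, sd, cv_reported) in reported.items():
    cv = coefficient_of_variation(mean, sd)
    print(f"{label}: computed {cv:.2f}, reported {cv_reported:.2f}")
```

The figure for correct MCQ answers (13 ± 70) is left out of the check because the printed values do not fit this relationship as they stand.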
    URI
    https://www.sciencedirect.com/science/article/pii/S2590262824000200
    DOI/handle
    http://dx.doi.org/10.1016/j.crbiot.2024.100194
    http://hdl.handle.net/10576/56120
    Collections
    • Biomedical Sciences [830 items]



    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department


     

     
