    ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models

    1-s2.0-S2590262824000200-main.pdf (7.298Mb)
    Date
    2024-03-02
    Author
    Bhattacharya, Manojit
    Pal, Soumen
    Chatterjee, Srijan
    Alshammari, Abdulrahman
    Albekairi, Thamer H.
    Jagga, Supriya
    Ohimain, Elijah Ige
    Zayed, Hatem
    Byrareddy, Siddappa N.
    Lee, Sang-Soo
    Wen, Zhi-Hong
    Agoramoorthy, Govindasamy
    Bhattacharya, Prosun
    Chakraborty, Chiranjib
    Abstract
    Recently, researchers have raised concerns about ChatGPT-derived answers. Here, individual researchers in multiple countries conducted a series of tests with ChatGPT to understand the pattern of its answers in terms of accuracy, reproducibility, length, plagiarism, and depth, using two questionnaires (the first with 15 MCQs and the second with 15 KBQs). Among the 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), 1 to 10 were reproducible, and 11 to 15 were not. Among the 15 KBQs, the length for each question (in words) was about 294.5 ± 97.60 (means ranging from 138.7 to 438.09), and the mean similarity index (in words) was about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed using the analyzed parameters of the answers. The study shows a pattern of correct and incorrect ChatGPT-derived answers and calls for an error-free, next-generation LLM to avoid misleading users.
    URI
    https://www.sciencedirect.com/science/article/pii/S2590262824000200
    DOI/handle
    http://dx.doi.org/10.1016/j.crbiot.2024.100194
    http://hdl.handle.net/10576/56120
    Collections
    • Biomedical Sciences [843 items]

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department
