
Author: Hameed, Saad
Author: Qolomany, Basheer
Author: Belhaouari, Samir Brahim
Author: Abdallah, Mohamed
Author: Qadir, Junaid
Author: Al-Fuqaha, Ala
Available date: 2025-07-08T03:58:10Z
Publication Date: 2025
Publication Name: IEEE Open Journal of the Computer Society
Resource: Scopus
Identifier: http://dx.doi.org/10.1109/OJCS.2025.3564493
ISSN: 2644-1268
URI: http://hdl.handle.net/10576/66083
Abstract: Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have been individually applied in optimization and deep learning, their combined use for enhancing convergence in numerical optimization tasks remains underexplored. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence for deep learning hyperparameter tuning. The proposed LLM-enhanced PSO method addresses the difficulties of efficiency and convergence by using LLMs (specifically ChatGPT-3.5 and Llama3) to improve PSO performance, allowing target objectives to be reached faster. Our method accelerates search-space exploration by substituting underperforming particle positions with the best candidate positions suggested by LLMs. Comprehensive experiments across three scenarios: (1) optimizing the Rastrigin function, (2) using Long Short-Term Memory (LSTM) networks for time series regression, and (3) using Convolutional Neural Networks (CNNs) for material classification, show that the method significantly improves convergence rates and lowers computational costs. Depending on the application, computational complexity is lowered by 20% to 60% compared to traditional PSO methods. Llama3 achieved a 20% to 40% reduction in model calls for regression tasks, whereas ChatGPT-3.5 reduced model calls by 60% for both regression and classification tasks, all while preserving accuracy and error rates. This methodology offers an efficient and effective solution for optimizing deep learning models, leading to substantial computational performance improvements across a wide range of applications.
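The abstract describes replacing underperforming PSO particles with positions suggested by an LLM. The sketch below illustrates that idea on the Rastrigin benchmark (scenario 1 in the abstract). It is a minimal illustration, not the paper's implementation: `llm_suggest` is a hypothetical stand-in for the actual ChatGPT-3.5/Llama3 prompting step, and all PSO coefficients are standard textbook values assumed here.

```python
import random
import math

def rastrigin(x):
    # Rastrigin benchmark: global minimum of 0 at x = (0, ..., 0).
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def llm_suggest(best_position):
    # Hypothetical stand-in for an LLM call (the paper uses ChatGPT-3.5 and
    # Llama3). Here we simply perturb the global best; a real implementation
    # would prompt the model with the swarm's best positions and parse a
    # suggested coordinate from its reply.
    return [xi + random.gauss(0, 0.1) for xi in best_position]

def pso(dim=2, n_particles=20, iters=100, seed=0):
    random.seed(seed)
    pos = [[random.uniform(-5.12, 5.12) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [rastrigin(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed standard inertia/acceleration values
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = rastrigin(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # LLM-enhanced step: replace the worst particle's position with the
        # LLM's suggestion if that suggestion evaluates better.
        worst = max(range(n_particles), key=lambda i: pbest_val[i])
        candidate = llm_suggest(gbest)
        cand_val = rastrigin(candidate)
        if cand_val < pbest_val[worst]:
            pos[worst] = candidate[:]
            pbest[worst], pbest_val[worst] = candidate[:], cand_val
    return gbest_val

print(pso())
```

Because the replacement step only fires when the suggested position improves on the worst particle's personal best, the sketch preserves PSO's monotone improvement while letting the (here simulated) LLM steer exploration toward promising regions, which is the mechanism the abstract credits for the reduced number of model evaluations.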
Language: en
Publisher: IEEE
Subject: Deep Learning Optimization; Hyper-parameter Optimization; LLM; Machine Learning; PSO
Title: Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
Type: Article
Access Type: Open Access

