Two reinforcement learning strategies based meta-heuristics for scheduling partial reentrant distributed flow-shops.

Jia, Yanan; Gao, Kaizhou; Ren, Yaxian; Suganthan, Ponnuthurai Nagaratnam; Sang, Hongyan

Date

2025-10

Author

Jia, Yanan
Gao, Kaizhou
Ren, Yaxian
Suganthan, Ponnuthurai Nagaratnam
Sang, Hongyan

Abstract

Reentrant and partially reentrant processing is common in practical manufacturing but is rarely considered in the literature. This work investigates a distributed flow-shop scheduling problem with a partial reentrant constraint (DFSP PR). The objective is to minimize the maximum completion time (makespan). First, a mathematical model for the DFSP PR is developed that integrates the characteristics of partial reentrance and distributed manufacturing. Second, three meta-heuristics are employed and enhanced to solve the problem. The Nawaz-Enscore-Ham (NEH) heuristic is used to initialize the population. Based on the nature of the DFSP PR, six local search strategies are designed to improve the convergence efficiency of the meta-heuristics. Third, two cutting-edge reinforcement learning algorithms, Q-learning and state-action-reward-state’-action’ (SARSA), are integrated into the meta-heuristics to select the most effective local search strategy during iterations. Finally, comprehensive experiments on 48 benchmark instances of varying scales demonstrate the effectiveness of the proposed approaches, with Q-learning and SARSA significantly improving the performance of the meta-heuristics.
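
The abstract does not give the state, action, or reward definitions used in the paper, so the following is only a minimal sketch of how Q-learning and SARSA might select among six local search strategies. It assumes, hypothetically, that the state is the index of the previously applied strategy, that the action is the index of the next strategy, and that the reward is supplied by the caller (for example, 1 when the makespan improves and 0 otherwise). The names `choose`, `q_learning_update`, and `sarsa_update` and the parameter values are illustrative and are not taken from the paper.

```python
import random

N_OPS = 6  # the abstract mentions six local search strategies


def choose(q, state, eps, rng):
    """Epsilon-greedy choice of a strategy index for the given state."""
    if rng.random() < eps:
        return rng.randrange(N_OPS)  # explore: pick a random strategy
    row = q[state]
    # Exploit: pick the strategy with the highest Q-value (first one on ties).
    return max(range(N_OPS), key=row.__getitem__)


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best Q-value in the next state."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    q[s][a] += alpha * (r + gamma * q[s_next][a_next] - q[s][a])


def new_q_table():
    """Q-table indexed as q[state][action], initialised to zero."""
    return [[0.0] * N_OPS for _ in range(N_OPS)]
```

Inside a meta-heuristic loop, one would call `choose` to pick a strategy, apply it to the current schedule, compute the reward from the change in makespan, and then call one of the two update functions. The only difference between the two is the bootstrap term: Q-learning uses the maximum Q-value of the next state, whereas SARSA uses the Q-value of the strategy that will actually be applied next.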