ENHANCING ADAPTABILITY AND SCALABILITY IN DISTRIBUTED STORAGE THROUGH REINFORCEMENT LEARNING-BASED DATA PLACEMENT STRATEGIES
Date
2026-01
Abstract
The exponential growth of data and increasing workload complexity pose major challenges for distributed and hierarchical storage systems. Enterprises and cloud providers must balance cost, latency, and availability while managing petabytes of data across geographically distributed infrastructures. Traditional data placement schemes such as Consistent Hashing and CRUSH are often static and cost-centric, limiting their adaptability in heterogeneous and dynamic environments. This thesis presents SimBench-HSS, a configurable simulation framework for evaluating hierarchical storage architectures and data placement strategies. Designed to be modular and realistic, SimBench-HSS models multi-tier systems, network effects, and file hotness dynamics, enabling both heuristic and learning-based policies to be evaluated under reproducible conditions. Leveraging this framework, we propose a reinforcement learning (RL) approach to adaptive data placement using Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) agents. The RL agents learn to balance multiple objectives (cost, response time, and availability) based on feedback from the simulated environment. Experimental results demonstrate that RL-based placement achieves up to 28% cost reduction, 15% lower latency, and 40% fewer unavailable accesses compared to heuristic baselines, with DDQN consistently delivering the most stable and efficient outcomes. Overall, this research establishes reinforcement learning as a viable and effective mechanism for optimizing hierarchical storage management. The results validate that learning-based approaches can dynamically adapt to workload shifts, offering improved efficiency and reliability for next-generation distributed storage systems.
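To illustrate the difference between the two agents named in the abstract, the following is a minimal sketch of how a scalarized multi-objective reward and the DQN and DDQN bootstrap targets are commonly computed. It is not taken from the thesis: the function names, the weights `w_cost`, `w_latency`, and `w_avail`, and the linear form of the reward are all assumptions made for illustration, and each action is assumed to correspond to placing a file on one storage tier.

```python
from typing import Sequence


def reward(cost: float, latency: float, unavailable: bool,
           w_cost: float = 1.0, w_latency: float = 1.0,
           w_avail: float = 1.0) -> float:
    """Hypothetical scalarized reward: higher is better.

    Penalizes storage cost, response time, and unavailable accesses
    with a weighted sum (the weights are illustrative assumptions).
    """
    return -(w_cost * cost + w_latency * latency + w_avail * float(unavailable))


def dqn_target(r: float, q_target_next: Sequence[float],
               gamma: float = 0.99, done: bool = False) -> float:
    """Standard DQN target: the target network both selects and
    evaluates the best next action (max over its own Q-values)."""
    if done:
        return r
    return r + gamma * max(q_target_next)


def ddqn_target(r: float, q_online_next: Sequence[float],
                q_target_next: Sequence[float],
                gamma: float = 0.99, done: bool = False) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates it, which reduces the
    overestimation bias of the plain max operator."""
    if done:
        return r
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return r + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Q-values for three hypothetical tiers (e.g. hot, warm, cold).
    q_online = [1.0, 3.0, 2.0]
    q_target = [0.5, 1.0, 4.0]
    r = reward(cost=0.2, latency=0.1, unavailable=False)
    print("DQN target: ", dqn_target(r, q_target))
    print("DDQN target:", ddqn_target(r, q_online, q_target))
```

In this sketch the DQN target follows the target network's largest estimate, while the DDQN target uses the target network's value for the action the online network prefers. Decoupling selection from evaluation in this way is the usual explanation for the more stable learning attributed to DDQN.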
DOI/handle
http://hdl.handle.net/10576/69607
Collections
- Computing [117 items]

