Persian Gulf University Research System

Title	Green Speculative Decoding for Energy-Efficient Small Language Models
Research type	Conference papers
Keywords	Green AI, Large Language Models, Energy Efficiency, Speculative Decoding, Sustainable Computing, Model Optimization
Abstract	The rapid expansion of large language models has intensified computational and energy demands, making efficiency a key requirement for sustainable AI deployment. While most prior work targets optimization of very large models, the efficiency of widely deployed small-scale LLMs remains underexplored. This paper introduces a Green Speculative Decoding framework aimed at reducing the energy footprint of compact language models. The proposed approach employs a synchronized speculative pipeline in which a lightweight draft model generates candidate tokens that are selectively verified by a larger, more accurate target model. Experimental results from a simulation-based evaluation demonstrate notable improvements in inference efficiency, achieving higher throughput and significant energy savings without degrading output quality. These findings highlight the potential of speculative decoding as a practical and eco-friendly solution for energy-efficient LLM inference in real-world deployments.
Researchers	Nima Razmjooei (first author), Rezvan Mohammadi Baghmolaei (second author)
Date completed	1404-11-08 (Solar Hijri calendar)
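
The draft-then-verify loop summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' framework or their simulation: it shows only the generic greedy form of speculative decoding, and every name in it (`speculative_decode`, `greedy_decode`, `toy_draft`, `toy_target`, the block size `k`) is a hypothetical stand-in chosen for illustration. The two "models" are toy functions over integer tokens, and the count of target passes is at best a rough proxy for compute, not an energy measurement.

```python
from typing import Callable, List, Tuple

Token = int
NextTokenFn = Callable[[List[Token]], Token]  # greedy next-token predictor


def greedy_decode(next_fn: NextTokenFn, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(next_fn(tokens))
    return tokens


def speculative_decode(
    draft_next: NextTokenFn,
    target_next: NextTokenFn,
    prompt: List[Token],
    max_new_tokens: int,
    k: int = 4,
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, number of target passes)."""
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))

        # 2. The target model checks the proposal. A real system would score
        #    all k positions in one batched forward pass; here that single
        #    pass is simulated with per-position calls and counted once.
        target_passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # Whole proposal accepted: the target supplies one extra token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens], target_passes


def toy_target(context: List[Token]) -> Token:
    """Hypothetical 'accurate' model: counts upward modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[Token]) -> Token:
    """Hypothetical 'lightweight' model: agrees with the target except after a 4."""
    return 0 if context[-1] == 4 else (context[-1] + 1) % 10


if __name__ == "__main__":
    out, passes = speculative_decode(toy_draft, toy_target, [2], 12, k=4)
    print("tokens:", out)
    print("target passes:", passes, "vs. 12 for plain greedy decoding")
```

Because every emitted token is either confirmed or supplied by the target model, the output in this greedy variant matches what the target alone would produce; any saving comes from needing fewer target passes when the draft model agrees often.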

Research details