Assessment of the regression relationship between the wear scar diameter of selected diesel fuel samples and the chosen physical parameters, By Shahriar Osfouri

Research

Title	Assessment of the regression relationship between the wear scar diameter of selected diesel fuel samples and the chosen physical parameters
Type	Article
Keywords	Diesel lubricity; Machine learning; Random forest regression; PySR symbolic regression; SHAP analysis; Industrial data
Journal	FUEL
DOI	https://doi.org/10.1016/j.fuel.2026.139185
Researchers	Hassan Behbahani (First researcher), Reza Azin (Second researcher), Shahriar Osfouri (Third researcher), Erfan Mohammadian (Fourth researcher)

Abstract

While environmentally beneficial, reducing the sulphur content of diesel fuel has compromised its inherent lubricity, increasing wear in fuel injection systems. Conventional lubricity testing methods are often inconsistent and inefficient. This study introduces a data-driven framework that leverages machine learning to estimate diesel lubricity from standard, easily measurable fuel properties. A substantial industrial dataset of over 400 diesel samples from multiple refineries was analyzed using a dual-strategy approach: a high-performance Random Forest (RF) model and an interpretable Python symbolic regression (PySR) model, complemented by Principal Component Analysis (PCA) for dimensionality reduction. The RF model demonstrated high predictive accuracy (R² > 0.96). In contrast, the PySR model generated a transparent, empirically derived equation, identifying distillation-related parameters as the most critical predictors within the analyzed dataset. While these regression models successfully capture statistical patterns, it is recognized that they primarily function as “black-box” estimators that do not account for the specific chemical additives or surface-active polar compounds that fundamentally govern boundary lubrication. SHAP analysis revealed that while parameters such as density and flash point show statistical importance within this specific model, they are not necessarily physically related to Wear Scar Diameter (WSD) in practice. The methods used in the current work offer a refined statistical approach to estimating lubricity, providing a screening tool that complements traditional testing while acknowledging the inherent complexity of diesel fuel chemistry.
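
To make the reported accuracy figure concrete, the following minimal sketch shows how a coefficient of determination (R²) is conventionally computed from measured and predicted WSD values. It is not the authors' code: the function name `r_squared` and the five WSD values are hypothetical, chosen only for illustration, and the abstract does not state whether the reported R² refers to training or test data.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares: squared error between measurement and prediction.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical WSD values in micrometres (illustrative only, not from the study).
measured_wsd = [380.0, 420.0, 455.0, 500.0, 540.0]
predicted_wsd = [390.0, 415.0, 450.0, 510.0, 535.0]

score = r_squared(measured_wsd, predicted_wsd)
print(f"R^2 = {score:.3f}")  # prints "R^2 = 0.983"
```

An R² near 1 means the model's residuals are small relative to the spread of the measured values; as the abstract itself cautions, it does not by itself show that the predictors are physically related to WSD.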