Changes in pH during CO2 injection into oceans can lead to significant negative environmental impacts, making
it particularly important to track these changes. However, previous studies have not comprehensively investigated
the development of machine learning models to estimate pH. To fill this research gap, this study
developed 15 models by pairing each of five machine learning methods (regression trees, support vector regression,
Gaussian process regression, bagged trees, and boosted trees) with each of three optimization algorithms (random search,
grid search, and Bayesian optimization). A total of 170 data points were used to develop these models. After data
preprocessing and model development, the boosted trees model optimized with grid search performed best
(R² = 0.9964, RMSE = 0.0156), while the support vector regression model optimized with random search
had the lowest accuracy (R² = 0.8426, RMSE = 0.1030). The boosted
trees model optimized with grid search estimated all data points with a residual error of less than
0.15 and an absolute relative error of 4.62 %. The 95 % confidence interval for the RMSE of this
best model was 0.009060 to 0.023256. Pearson
correlation analysis was used for sensitivity analysis. In all models, temperature, solubility,
and pressure were negatively correlated with pH, while salinity was positively correlated. Solubility
and salinity exhibited the strongest and weakest correlations, with average magnitudes of 0.9225 and 0.0594,
respectively. Given their accuracy, the developed models can help optimize the operation of
injecting CO2 into oceans to reduce harmful environmental effects.
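
As an illustration of the quantities reported above, the following minimal Python sketch enumerates the 15 method and optimizer pairings and computes RMSE, R², and the Pearson correlation coefficient from their standard definitions. It is not the study's implementation: the study's tooling is not stated here, all names (`METHODS`, `OPTIMIZERS`, `MODEL_GRID`, `rmse`, `r_squared`, `pearson`) are hypothetical, and the numbers in the usage section are made-up placeholders rather than data from the 170-point dataset.

```python
import math
from itertools import product

# The five methods and three optimization algorithms named in the text.
METHODS = [
    "regression trees",
    "support vector regression",
    "Gaussian process regression",
    "bagged trees",
    "boosted trees",
]
OPTIMIZERS = ["random search", "grid search", "Bayesian optimization"]

# Every (method, optimizer) pairing: 5 x 3 = 15 candidate models.
MODEL_GRID = list(product(METHODS, OPTIMIZERS))


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT from the study).
    ph_measured = [4.10, 3.85, 3.62, 3.50]
    ph_predicted = [4.08, 3.88, 3.60, 3.51]
    solubility = [0.4, 0.8, 1.1, 1.3]

    print("candidate models:", len(MODEL_GRID))
    print("RMSE:", rmse(ph_measured, ph_predicted))
    print("R^2:", r_squared(ph_measured, ph_predicted))
    # A negative value here mirrors the sign reported for solubility vs. pH.
    print("Pearson(solubility, pH):", pearson(solubility, ph_measured))
```

The sign of the Pearson coefficient gives the direction of the relationship and its magnitude gives the strength, which is how the solubility and salinity results above can be read.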