Objective: This study aims to improve the accuracy of embryo selection with machine learning, and thereby reduce IVF treatment costs, by addressing the data scarcity common in medical imaging.
Methods: The dataset comprised 48 time-lapse videos (24 good-quality and 24 poor-quality embryos) from the Isfahan Fertility and Infertility Center. Data were collected using EmbryoScope time-lapse imaging systems under ethical approval from the center. Three convolutional neural network architectures (VGG19, InceptionV3, and EfficientNetB3) were evaluated after preprocessing that included frame extraction, embryo-centered cropping, and quality control. A multistage data augmentation strategy combined spatial, temporal, and photometric transformations with sample-mixing methods (MixUp, CutMix). Models were trained with a two-stage transfer learning approach. Evaluation used 5-fold GroupKFold cross-validation with accuracy, AUC, and F1-score as performance metrics; model differences were assessed with t-tests, and effect sizes were reported as Cohen’s d. Analyses were performed in Python using TensorFlow, Keras, Scikit-learn, and Matplotlib. Interpretability was assessed with Grad-CAM, and the resulting attention maps were compared against the judgment of clinical embryologists.
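As an illustration of the grouped cross-validation step, the sketch below shows how scikit-learn's `GroupKFold` keeps all frames from one video on the same side of each split, which avoids leakage between training and test data. The abstract does not state what the grouping variable was or how many frames were extracted per video; grouping by video ID and using 10 frames per video are assumptions made here for illustration, and `grouped_folds` is a hypothetical helper name.

```python
import numpy as np
from sklearn.model_selection import GroupKFold


def grouped_folds(video_ids, labels, n_splits=5):
    """Return (train_idx, test_idx) pairs in which no video spans both sides."""
    video_ids = np.asarray(video_ids)
    labels = np.asarray(labels)
    # GroupKFold only needs the number of samples; features are not inspected.
    placeholder = np.zeros((len(video_ids), 1))
    splitter = GroupKFold(n_splits=n_splits)
    return list(splitter.split(placeholder, labels, groups=video_ids))


# Assumed layout: 48 videos (24 good = 1, 24 poor = 0), 10 frames per video.
frames_per_video = 10
video_ids = np.repeat(np.arange(48), frames_per_video)
labels = np.repeat(np.array([1] * 24 + [0] * 24), frames_per_video)

folds = grouped_folds(video_ids, labels)
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    n_test_videos = len(np.unique(video_ids[test_idx]))
    print(f"fold {fold_number}: {len(train_idx)} train frames, "
          f"{len(test_idx)} test frames from {n_test_videos} videos")
```

Note that `GroupKFold` does not stratify by class, so with only 48 videos individual folds may be imbalanced between good-quality and poor-quality embryos.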
Results: All models performed poorly. VGG19 achieved the highest accuracy (66.26%) and AUC (0.6149), only marginally above chance. InceptionV3 performed worst (48.82% accuracy, AUC 0.5119), while EfficientNetB3 showed intermediate results but the lowest F1-score (0.3886). Advanced augmentations offered minimal gains (ΔAUC < 0.02). Grad-CAM visualizations showed that the models frequently attended to irrelevant background areas rather than clinically relevant embryo features, with low overlap against embryologist annotations (Dice coefficient < 0.42). Wide confidence intervals and low statistical power underscored the uncertainty of these estimates. Joint training improved efficiency but compromised performance.
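For reference, the Dice coefficient between a binarized attention map A and an annotated region B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The sketch below is a minimal version of that computation. The abstract does not describe how Grad-CAM heatmaps were turned into binary masks; the 0.5 threshold, the toy 4×4 arrays, and the function name `dice_coefficient` are assumptions for illustration only.

```python
import numpy as np


def dice_coefficient(pred_mask, ref_mask):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement by convention.
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


# Toy heatmap with values in [0, 1], binarized at an assumed threshold of 0.5.
heatmap = np.zeros((4, 4))
heatmap[0:2, 0:2] = 0.9                      # model attention: 4 pixels
attention_mask = heatmap >= 0.5

annotation_mask = np.zeros((4, 4), dtype=bool)
annotation_mask[1:3, 0:2] = True             # annotated region: 4 pixels

# Two pixels overlap, so Dice = 2 * 2 / (4 + 4) = 0.5
dice = dice_coefficient(attention_mask, annotation_mask)
print(dice)
```

Because the score depends on the binarization threshold, any comparison across models is only meaningful when the same threshold is applied throughout.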
Conclusion: This study provi