This study had two primary objectives: (1) to evaluate the validity of Persian best–worst (BW) norms of valence, arousal,
dominance, and concreteness relative to existing rating scale (RS) norms; and (2) to model and extrapolate these human
BW norms using skip-gram word embeddings. BW data were collected from 1071 Persian speakers for 3000 Persian words
and compared with individual and merged RS datasets and translated BW norms. Human BW norms correlated moderately
to strongly with these norms, despite substantial variability across RS datasets. For both human BW and weighted RS
composite norms, arousal and valence correlated only weakly, and to a similar degree, with lexical decision reaction times.
For extrapolation, emotional and semantic measures were modeled using generalized additive models on principal components of Persian
word embeddings. The models explained substantial variance in all emotional dimensions and concreteness, ranging from
34.5% (dominance) to 68.4% (concreteness). Extrapolated BW estimates showed strong correlations with human BW and
weighted RS composite norms for valence (r = .70 to .83), moderate correlations for arousal, and generally strong
correlations across all dimensions with GPT-5.1 predictions and estimates from other external resources. Qualitative comparisons
of human BW norms and extrapolated BW estimates showed strong semantic alignment at high values across dimensions
but weaker alignment at low arousal and dominance. The findings suggest that BW scaling offers discrimination and
predictive validity comparable to those of RS norms, with greater potential for efficiently expanding emotional and semantic
norms in resource-limited languages such as Persian. Extrapolated norms for 85,716 words are provided.