Our results revealed that the median percentage decrease in serum 25(OH)D levels was approximately 3.4% (for samples stored at -20 °C) and 4.9% (for samples stored at -80 °C) after 7 months of freezing and storage, as determined via liquid chromatography‒mass spectrometry. In women undergoing frozen embryo transfer, the reduction compared with baseline was statistically significant at 7 months for samples stored at both -20 °C and -80 °C, but not at 2 weeks. With immunoassay, the corresponding median percentage decrease after 7 months of freezing and storage was approximately 3.6% (for samples stored at -20 °C) and 4.0% (for samples stored at -80 °C), which was statistically significant for samples stored at -80 °C.
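
As an illustration of how a median percentage change of this kind can be computed, the following is a minimal sketch. It assumes that the change is calculated per woman relative to her own baseline value and then summarized by the median; the function name and the sample values are hypothetical and are not study data.

```python
from statistics import median

def median_percent_change(baseline, stored):
    """Median of per-sample percentage changes from baseline (negative = decrease)."""
    if len(baseline) != len(stored) or not baseline:
        raise ValueError("baseline and stored must be paired, non-empty sequences")
    # Percentage change of each stored measurement relative to its own baseline
    changes = [(s - b) / b * 100.0 for b, s in zip(baseline, stored)]
    return median(changes)

# Hypothetical paired 25(OH)D values in nmol/L (illustrative only, not study data)
baseline_nmol_l = [40.0, 50.0, 80.0]
stored_7_months = [38.0, 49.0, 76.0]
change = median_percent_change(baseline_nmol_l, stored_7_months)
print(f"Median change: {change:.1f}%")
```

Using the median rather than the mean limits the influence of individual outlying samples on the summary.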

Our study addresses a common practical issue in clinical practice and research: serum samples are often frozen and stored for transport to central laboratories, where they are analyzed in batches every 1–2 weeks. In some retrospective studies, serum samples may have been stored for months or even years before analysis [12]. The cryostorage durations of 2 weeks and 7 months in our study were chosen to reflect such practices. The small but statistically significant change in serum 25(OH)D levels with storage in our study is consistent with existing studies. Older studies reported a 10% decrease in serum 25(OH)D levels after storage at approximately -20 °C [17,23]. However, in the study by Ocke et al., some samples showed a 10% increase in mean vitamin D levels, which the authors attributed to possible systematic differences in the laboratory measurements [17]. In another study, Cavalier reported a median increase of 4.7% (95% confidence interval: 3.5–6.3%) in 25(OH)D levels measured by liquid chromatography‒mass spectrometry after 5 years of storage at -80 °C [18]. Our study included reproductive-age women, and women undergoing ovarian stimulation for IVF were analyzed as a separate group. These women presumably had higher serum estradiol levels, which may stimulate an increase in vitamin D-binding protein. Changes in vitamin D-binding protein levels during storage have been postulated to affect measured 25(OH)D levels in certain types of immunoassays, but may have less effect when levels are measured by mass spectrometry [18]. When liquid chromatography‒mass spectrometry was used, we did not find a statistically significant change in serum 25(OH)D levels after 7 months of storage in women undergoing ovarian stimulation for IVF, in contrast to women undergoing frozen embryo transfer. On the other hand, the median percentage decrease in 25(OH)D levels with storage in women receiving ovarian stimulation was as high as 13.9% using immunoassay. Although this decrease was not statistically significant, likely because of the small sample size of only 20 women in this group, it could be clinically important and lead to reclassification of women into different vitamin D categories. Nevertheless, the percentage drop in our samples was much smaller than that reported in a conference abstract, which described an approximately one-third decline in 25(OH)D levels after 7 months at -80 °C [19], but there were 4 outliers with up to a 37% (-20 °C and -80 °C, respectively) difference from the baseline level. The variability in all other cases was lower than the 25% error limit imposed by external evaluation programs such as the Vitamin D External Quality Assurance Scheme (DEQAS).

Our own published data revealed that the prevalence of vitamin D deficiency (less than 50 nmol/L) was 42.2%, and the prevalence of vitamin D insufficiency (50 nmol/L to less than 75 nmol/L) was 44.5% in a cohort of 1178 reproductive-age women who underwent IVF from 2012 to 2016 [5]. However, assuming that the levels in samples stored at -20 °C decreased by 3.4% over time, 31/1178 (2.6%) women, especially those with serum 25(OH)D levels near the cutoff, would be reclassified, leading to different results. After accounting for this potential 3.4% reduction, 39.6% of the women would be classified as vitamin D deficient and 44.9% as vitamin D insufficient. Therefore, even slight variations in levels around the clinical thresholds used to define vitamin D deficiency may affect treatment and fertility outcome analyses.
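
The reclassification reasoning above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the 3.4% loss is applied uniformly to every sample (in the study it is a median, not a per-sample constant), the pre-storage level is back-calculated by dividing by (1 - 0.034), and the function names and example values are hypothetical. The cutoffs of 50 and 75 nmol/L are those quoted in the text.

```python
def classify(level_nmol_l):
    """Vitamin D category using the cutoffs quoted in the text (nmol/L)."""
    if level_nmol_l < 50:
        return "deficient"
    if level_nmol_l < 75:
        return "insufficient"
    return "sufficient"

def correct_for_storage(measured_nmol_l, fractional_loss=0.034):
    """Back-calculate the pre-storage level, assuming a uniform proportional loss."""
    return measured_nmol_l / (1.0 - fractional_loss)

def reclassified(measured_levels, fractional_loss=0.034):
    """List the samples whose category changes once the assumed loss is undone."""
    changed = []
    for measured in measured_levels:
        corrected = correct_for_storage(measured, fractional_loss)
        before, after = classify(measured), classify(corrected)
        if before != after:
            changed.append((measured, round(corrected, 1), before, after))
    return changed

# Hypothetical measured values in nmol/L (illustrative only, not study data)
for row in reclassified([40.0, 49.0, 60.0, 74.0, 90.0]):
    print(row)
```

Only values lying just below a cutoff change category, which is why a small proportional loss affects a small but non-negligible share of a cohort.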

Depending on the assay method used, the 25(OH)D levels in serum samples that had been stored for 40 years were 18% and 42% lower than those in serum samples frozen for 2 years in 2 cohorts of pregnant women [12]. The authors demonstrated that similar racial and seasonal trends were derived from serum stored for 40 years and from serum stored for 2 years, suggesting that even if 25(OH)D levels decreased over time, they did so systematically across samples. Archived serum samples could therefore be used to assess trends.

A strength of our study was the use of mass spectrometry, which is the gold standard, and its comparison with immunoassay. In a recent study, serum samples from patients with various health statuses were stored for 5 years at -80 °C and analyzed with three different immunoassays and mass spectrometry, which revealed different magnitudes of change depending on the method used [18]. An increase of 17–20% was observed in the 25(OH)D levels measured in pregnant women using two immunoassays but not by mass spectrometry; one hypothesis is that the level of vitamin D-binding protein could have changed over the course of storage and may have affected some immunoassays. Notably, the mean 25(OH)D levels measured by mass spectrometry were higher than those measured by immunoassay in our study. In general, the agreement between the serum 25(OH)D levels measured by immunoassay and mass spectrometry in our study was fair among the 55 women, as reflected by the kappa coefficient and the concordance correlation coefficient according to the classification suggested by McBride [22]. The Bland‒Altman plot led to the same conclusion, given the wide range of differences in vitamin D levels between the two methods (approximately -36.6 to 40.6%) in the overall plot. Agreement appeared to be relatively better in the subgroup of women with vitamin D deficiency, but the results were more variable across other vitamin D levels. This is in contrast to existing studies, which have shown good agreement between immunoassays and mass spectrometry, although various studies have reported a negative bias of immunoassays compared with mass spectrometry of as much as 19% [24,25,26]. The Abbott Architect assay is known to have ~80% cross-reactivity with 25(OH)D2, resulting in underestimation of the total 25(OH)D level [26]. Negative bias from immunoassays can potentially lead to unnecessary treatment for some patients. However, vitamin D supplements are low cost and are generally safe within the defined safety limits; therefore, such treatment may not cause significant clinical issues.
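
For readers unfamiliar with the Bland‒Altman comparison mentioned above, the following minimal sketch shows one common percentage-difference formulation. It assumes that each difference is expressed relative to the mean of the paired measurements and that the limits of agreement are the bias plus or minus 1.96 standard deviations; whether the range quoted above was derived in exactly this way is not stated here. The function name and the sample values are hypothetical and are not study data.

```python
from statistics import mean, stdev

def bland_altman_percent(method_a, method_b):
    """Percentage-difference Bland-Altman summary for paired measurements.

    Each difference is 100 * (a - b) / ((a + b) / 2).
    Returns (bias, lower_limit, upper_limit), with limits at bias +/- 1.96 SD.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the percentage differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired values in nmol/L (illustrative only, not study data)
immunoassay = [45.0, 60.0, 70.0, 90.0]
mass_spec = [50.0, 62.0, 80.0, 88.0]
bias, lower, upper = bland_altman_percent(immunoassay, mass_spec)
print(f"bias {bias:.1f}%, limits of agreement {lower:.1f}% to {upper:.1f}%")
```

A negative bias in this sketch means that the first method reads lower on average than the second, and wide limits of agreement indicate that individual results from the two methods are not interchangeable.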

We longitudinally assessed the change in serum 25(OH)D levels in the same subjects over time, and the samples were stored in aliquots that were left undisturbed until assessment. However, a limitation of our study was that we did not assess these changes beyond 7 months. It is not known whether further storage would lead to a greater reduction in serum levels. Our sample population included reproductive-age women with supraphysiological estradiol levels as well as women undergoing natural-cycle frozen embryo transfer, whose estradiol levels were presumably within the physiological range. However, the results may not be generalizable to pregnant women, postmenopausal women or men.