Equivalence and agreement in instrument validation: a practical methodological review


Silvina Dell’Era https://orcid.org/0000-0001-9186-6229
Vanina Pagotto https://orcid.org/0000-0003-0309-2660

Keywords

Methods, Validation Study, Data Accuracy, Evaluation Study, Data Analysis

Abstract

Appropriate statistical analysis is essential in validation studies of instruments that quantify continuous variables against a reference standard. This article describes statistical approaches, combining graphical methods and statistical tests, for evaluating equivalence between measurement instruments. Their application is illustrated with a study that evaluated the accuracy of a physical activity wristband (Xiaomi Mi Band 4) for counting steps during different activities in patients with chronic respiratory diseases, compared against a reference method based on video recording. The analysis used confidence intervals compared against predefined equivalence zones, TOST (two one-sided tests), and indicators of group-level and individual-level agreement: mean error (ME), mean percentage error (MPE), mean absolute percentage error (MAPE), and root mean square error (RMSE). Some frequent errors are also discussed, such as the inappropriate use of scatter plots or correlations to evaluate accuracy. The article concludes that choosing appropriate statistical methods is key to ensuring clinical and methodological validity in studies of equivalence between an instrument that quantifies continuous variables and a reference method.
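The four agreement indicators named in the abstract (ME, MPE, MAPE, RMSE) can be illustrated with a minimal sketch. It is not the authors' analysis code. The function name `agreement_indices`, the sign convention (device minus reference) and the step counts are assumptions made only for illustration; the data are hypothetical and do not come from the study.

```python
import math
from statistics import mean


def agreement_indices(device, criterion):
    """Return ME, MPE, MAPE and RMSE for paired measurements.

    Assumed convention: error = device - criterion, and percentage
    errors are expressed relative to the criterion (reference) value.
    A criterion value of zero would raise ZeroDivisionError.
    """
    if len(device) != len(criterion) or not device:
        raise ValueError("device and criterion must be non-empty and paired")

    errors = [d - c for d, c in zip(device, criterion)]
    pct_errors = [100.0 * e / c for e, c in zip(errors, criterion)]

    return {
        "ME": mean(errors),                               # signed group-level bias
        "MPE": mean(pct_errors),                          # signed bias in percent
        "MAPE": mean(abs(p) for p in pct_errors),         # individual-level error in percent
        "RMSE": math.sqrt(mean(e * e for e in errors)),   # penalizes large errors
    }


# Hypothetical step counts: video-based reference vs. wristband.
criterion_steps = [100, 200, 150, 120]
device_steps = [90, 210, 150, 114]

indices = agreement_indices(device_steps, criterion_steps)
print(indices)
```

With these hypothetical counts, positive and negative errors partly cancel, so ME (-1.5 steps) and MPE (-2.5%) look small while MAPE (5%) and RMSE (about 7.7 steps) are larger. This is why signed indicators are read as group-level agreement and absolute ones as individual-level agreement.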

