JOURNAL ARTICLE
Keywords: AR, VR, learning analytics, logistic regression, elastic net, calibration, mixed effects.
Abstract: Two complementary education datasets, one VR and one AR, are used to test whether standard machine-learning models can classify improvement in learning outcomes and predict survey-based composite scores with transparent, reproducible steps. Locale-aware cleaning handles semicolon delimiters and comma decimals; duplicates are removed; categorical variables are one-hot encoded; continuous variables are standardized where appropriate; targets are never imputed. For the VR task, Logistic Regression, Random Forest, and MLP are trained on a stratified train–validation–test split with probability calibration and decision-threshold tuning. Logistic Regression attains macro-F1 = 0.622 and ROC-AUC = 0.642 on the held-out test set. Setting the operating threshold to t = 0.30 yields accuracy = 0.692 and increases minority-class recall while keeping macro-F1 stable. For the AR task, ElasticNet, Random Forest, and Gradient Boosting are evaluated with 5×10 repeated cross-validation; ElasticNet achieves the lowest error, with MAE = 1.812 ± 0.399. Model explanations indicate that access to VR equipment, habitual VR use, age, and weekly usage hours are the strongest correlates of improvement in the VR dataset, while ES subscales dominate prediction in the AR dataset. The approach emphasizes calibrated outputs, honest validation, and simple models that are easy to audit. A complete, reproducible Colab workflow with figures and tables accompanies the study to support classroom adoption and independent verification. In short, linear methods with calibration suffice for VR classification, and shrinkage methods minimize error for AR prediction on correlated item sets.
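As an illustration of the cleaning step described in the abstract (semicolon delimiters, comma decimals, duplicate removal, targets never imputed), the following is a minimal sketch using only the Python standard library. It is not the study's actual code: the column names, the sample rows, and the helper names `parse_locale_number` and `load_rows` are hypothetical, and the sketch assumes that rows with a missing target are dropped and that values carry no thousands separators.

```python
import csv
import io


def parse_locale_number(text):
    """Convert a comma-decimal string such as '3,5' to a float.

    Returns None for blank cells. Assumes no thousands separators.
    """
    text = text.strip()
    if not text:
        return None
    return float(text.replace(",", "."))


def load_rows(raw_text, numeric_cols, target_col):
    """Parse semicolon-delimited text, drop exact duplicates,
    and skip rows whose target is missing (the target is never filled in)."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=";")
    seen = set()
    rows = []
    for record in reader:
        key = tuple(record.items())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        if not record[target_col].strip():
            continue  # missing target: drop the row instead of imputing
        for col in numeric_cols:
            record[col] = parse_locale_number(record[col])
        rows.append(record)
    return rows


# Hypothetical sample: the second row is a duplicate, the third lacks a
# target, and the fourth has a missing feature value.
SAMPLE = """age;weekly_hours;score
21;3,5;7,25
21;3,5;7,25
34;1,0;
19;;6,0
"""

rows = load_rows(
    SAMPLE,
    numeric_cols=["age", "weekly_hours", "score"],
    target_col="score",
)
for row in rows:
    print(row)
```

In this sketch a missing feature value is kept as `None` so that a later step can decide how to handle it, whereas a missing target removes the row entirely.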
Article Info: Received: 15 Aug 2025, Received in revised form: 07 Sep 2025, Accepted: 09 Sep 2025, Available online: 13 Sep 2025
DOI: 10.22161/ijtle.4.5.1
Page No: 1-9