Smart Interview Stress Detection Using Multimodal AI: A Role-Based WebRTC Platform With Audio-Visual Fusion

ID: 2353

Abstract: Remote interviewing, now ubiquitous in post-pandemic hiring and academic admissions, reduces the physical cues that interviewers rely upon to gauge candidate composure, leading to subjective and inconsistent stress assessments. This paper presents a Smart Interview Stress Detection System, an end-to-end, role-based web platform that integrates user authentication, interview scheduling, real-time WebRTC audio-video communication, multimodal stress inference, and post-session visualization in a single unified workflow. A MobileNetV2-based emotion classifier, fine-tuned on 35,000 images from the FER2013, AffectNet, and CK+ datasets, achieves 88.5% seven-class accuracy and maps facial expressions to stress scores through an empirically weighted emotion-stress mapping (fear: 0.95, angry: 0.90, happy: 0.10). Concurrently, the browser-side Web Audio API extracts three acoustic stress features (RMS energy, spectral centroid, and fundamental pitch via autocorrelation), which are normalized and combined into an audio stress score with fixed weights (energy: 0.30, pitch: 0.25, centroid: 0.20, variability terms: 0.25). A confidence-weighted fusion algorithm with exponential smoothing (β = 0.3) integrates both modalities adaptively, falling back to single-modality estimation when lighting or noise compromises one stream. Strict role-based privacy controls ensure that only interviewers observe the real-time stress gauge and trend chart via Socket.IO events; candidates interact with a standard video interface devoid of any stress indicators. A sliding-window trend-detection algorithm identifies stress peaks (threshold: μ + 1.5σ) and classifies the trajectory as ascending, stable, or descending using the linear-regression slope.
Comprehensive validation yields a 99.2% test pass rate across 122 unit, integration, system, and performance tests; average frame-analysis latency is 250-400 ms; and user acceptance testing with 13 participants scores 4.5/5 overall, with privacy compliance rated 4.8/5.

Keywords: Interview Stress Detection, Multimodal Fusion, Affective Computing, WebRTC, Socket.IO, Emotion Recognition, MobileNetV2, Audio Feature Extraction, Role-Based Privacy, Flask, Real-Time Stress Analysis, Facial Expression Recognition
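The fusion and trend-detection steps described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, confidence inputs, and fallback conditions are assumptions, with only the smoothing factor (β = 0.3) and the peak threshold (μ + 1.5σ) taken from the abstract.

```python
BETA = 0.3  # exponential-smoothing factor stated in the abstract

def fuse_stress(face_score, face_conf, audio_score, audio_conf):
    """Confidence-weighted fusion of the two modality scores,
    falling back to a single modality when the other is unusable."""
    if face_conf <= 0 and audio_conf <= 0:
        return None            # no usable modality this frame
    if face_conf <= 0:
        return audio_score     # e.g. poor lighting disables the visual stream
    if audio_conf <= 0:
        return face_score      # e.g. heavy noise disables the audio stream
    total = face_conf + audio_conf
    return (face_conf * face_score + audio_conf * audio_score) / total

def smooth(prev, current, beta=BETA):
    """Exponential smoothing of the fused score over time."""
    if prev is None:
        return current
    return beta * current + (1 - beta) * prev

def is_peak(score, window):
    """Flag a stress peak when the score exceeds mu + 1.5*sigma
    of a sliding window of recent scores."""
    mu = sum(window) / len(window)
    sigma = (sum((x - mu) ** 2 for x in window) / len(window)) ** 0.5
    return score > mu + 1.5 * sigma

# Example: a low-confidence audio frame pulls less weight than a
# confident face frame, so the fused score leans toward the face stream.
fused = fuse_stress(face_score=0.8, face_conf=0.9,
                    audio_score=0.4, audio_conf=0.3)   # ≈ 0.70
smoothed = smooth(prev=0.5, current=fused)             # ≈ 0.56
```

The confidence weighting gives the adaptive behavior the abstract claims: as one stream degrades, its confidence (and thus its weight) shrinks continuously, and the hard fallback handles the fully compromised case.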
Published: 02-4-2026
Issue: Vol. 26 No. 4 (2026)
Page Nos: 149-154
Section: Articles
License: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
How to Cite: A. Srinivasa Rao, P. Naga Syamala, M. Sarath Chandra, Y. Shilpa, B. Harsha Vardhan, "Smart Interview Stress Detection Using Multimodal AI: A Role-Based WebRTC Platform with Audio-Visual Fusion," International Journal of Engineering Sciences and Advanced Technology, vol. 26, no. 4, pp. 149-154, 2026, ISSN: 2250-3676.