ISSN No:2250-3676 ----- Crossref DOI Prefix: 10.64771 ----- Impact Factor: 9.625
   Email: ijesatj@gmail.com

(Peer Reviewed, Refereed & Indexed Journal)


    An Audio Generation Model Based On Empirical Mode Decomposition And Generative Adversarial Networks For Enhancing Voice Quality And Diversity

    Dr. S. Lakshmikantha Reddy, Boya Mahalakshmi, Gajjala Venkat Varshitha, A Narendra, Dudkula Vali

    Author

    ID: 2363

    DOI: https://doi.org/10.64771/ijesat.2026.v26.i04.2363

    Abstract :

    This paper presents a novel audio generation framework called EMDGAN, which integrates improved complete ensemble empirical mode decomposition (ICEEMD) with generative adversarial networks (GANs) to enhance speech quality and diversity. The proposed system decomposes speech signals into intrinsic mode functions (IMFs) before adversarial training, allowing the model to better capture the non-stationary and nonlinear characteristics of speech. Unlike conventional WaveGAN, the proposed architecture employs multiple generators corresponding to the decomposed signal components and a discriminator optimized using the WGAN-GP loss. Objective evaluation using Inception Score (IS) and Fréchet Inception Distance (FID), along with subjective mean opinion score (MOS) testing, confirms improved clarity and diversity. Furthermore, a two-stage filtering process is introduced to automatically select high-quality generated samples. Experimental results demonstrate that EMDGAN outperforms WaveGAN in both perceptual quality and diversity.

    Keywords: GAN, ICEEMD, Audio Generation, Speech Enhancement, Data Augmentation, WGAN-GP.
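
The Inception Score named in the abstract rewards generated samples that a classifier labels confidently (quality) while the labels vary across samples (diversity). The sketch below illustrates the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]), in plain Python. It is not the authors' evaluation code: the function name `inception_score`, the `eps` smoothing constant, and the toy probability vectors are assumptions made for illustration, and in practice the probabilities would come from a pretrained classifier applied to the generated audio.

```python
import math


def inception_score(probs, eps=1e-12):
    """Inception Score from per-sample class probabilities.

    probs: list of probability vectors p(y|x), one per generated sample
           (hypothetical classifier outputs; each vector sums to 1).
    eps:   small constant to avoid log(0) (an assumption of this sketch).
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y): average of p(y|x) over all samples.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Mean KL divergence KL(p(y|x) || p(y)) over the samples.
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(
            pj * (math.log(pj + eps) - math.log(marginal[j] + eps))
            for j, pj in enumerate(p)
            if pj > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)


# Toy inputs: confident and varied predictions versus collapsed ones.
diverse = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]
print(inception_score(diverse), inception_score(collapsed))
```

With two classes the score ranges from 1 (every sample receives the same prediction, or every prediction is uniform) up to 2 (each sample is classified confidently and the classes are used equally), which is why a higher IS is read as better quality and diversity.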

    Published:

    02-04-2026

    Issue:

    Vol. 26 No. 4 (2026)


    Page Nos:

    226-231


    Section:

    Articles

    License:

    This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

    How to Cite

    Dr. S. Lakshmikantha Reddy, Boya Mahalakshmi, Gajjala Venkat Varshitha, A Narendra, Dudkula Vali, An audio generation model based on empirical mode decomposition and generative adversarial networks for enhancing voice quality and diversity, 2026, International Journal of Engineering Sciences and Advanced Technology, 26(4), Pages 226-231, ISSN No: 2250-3676.

    DOI: https://doi.org/10.64771/ijesat.2026.v26.i04.2363