Improved SEGAN based ATC speech enhancement algorithm for air traffic control

WANG Yuzhe; LI Xin; MOU Rui; ZHOU Jihua; HE Yifu

doi:10.13700/j.bh.1001-5965.2022.0874

Volume 50 Issue 12

Dec. 2024


Journal of Beijing University of Aeronautics and Astronautics > 2024 > 50(12): 3930-3939.

Citation: WANG Y Z, LI X, MOU R, et al. Improved SEGAN based ATC speech enhancement algorithm for air traffic control[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(12): 3930-3939 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0874


WANG Yuzhe^{1, 2}, LI Xin^{1}, MOU Rui^{3, 4}, ZHOU Jihua^{5}, HE Yifu^{6}

1. College of Civil Aviation Safety Engineering, Civil Aviation Flight University of China, Guanghan 618307, China
2. China Eastern Airlines Co., Ltd. Sichuan Branch, Chengdu 610000, China
3. Civil Aviation Flight University of China Guang Han College, Civil Aviation Flight University of China, Guanghan 618307, China
4. Key Laboratory of Civil Aviation Flight Technology and Flight Safety, Civil Aviation Flight University of China, Guanghan 618307, China
5. College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan 618307, China
6. College of Economics and Management, Civil Aviation Flight University of China, Guanghan 618307, China

Funds:

National Natural Science Foundation of China Civil Aviation Joint Fund Key Project (U2033213); Civil Aviation Flight University of China-Safety and security capacity enhancement special (MHAQ2022007); Civil Aviation Flight University of China-Flight Technology Special Project of Key Laboratory of Civil Aviation Flight Technology and Flight Safety (FZ2022ZX10); Key Laboratory of Air Traffic Control Civil Aviation Flight University of China (XM3484)


Corresponding author: E-mail: lixin11236@163.com
Received Date: 30 Oct 2022
Accepted Date: 28 Jul 2023

Available Online: 26 Dec 2024

Publish Date: 08 Sep 2023

Abstract


To improve the quality of radiotelephony communication, an improved Speech Enhancement Generative Adversarial Network (SEGAN) algorithm for air traffic control (ATC) speech is proposed. To address the problem that speech is submerged in noise when the traditional SEGAN is used under low signal-to-noise ratio conditions, a generator network with multi-stage, multi-mapping, multi-dimensional output and a multi-scale, multi-discriminator network are proposed. First, a deep neural network is used to extract speech semantic features and complete the semantic segmentation of ATC speech. Second, multiple generators are set up to further optimize the speech signal. Third, a downsampling module is added to the convolutional layer to improve the model's use of speech information and reduce information loss. Finally, multiple multi-scale discriminators are used to learn the distribution and information of speech samples from multiple directions. The results show that, under low signal-to-noise ratio conditions, the improved SEGAN model raises Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ) by 23.28% and 20.11%, respectively. The method enhances ATC speech effectively and quickly, and it provides preparatory work for subsequent automatic speech recognition of ATC speech.
Keywords:
- speech enhancement
- generative adversarial network
- convolutional neural network
- ATC speech
- low signal-to-noise ratio
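
The abstract reports results under low signal-to-noise ratio conditions and expresses gains as percentages. The sketch below is only an illustration of those two notions, not the authors' evaluation code: it mixes a clean signal with noise at a chosen SNR and computes a relative percentage gain between two scores. The function names, the synthetic 440 Hz tone, the -5 dB level, and the example scores are all assumptions made for illustration; the abstract does not state the SNR levels used or whether the reported percentages are relative gains.

```python
import math
import random


def power(signal):
    """Mean power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    # SNR_dB = 10 * log10(P_clean / P_noise), solved for the noise power.
    target_noise_power = power(clean) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [c + gain * n for c, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """SNR of `noisy` relative to the known clean reference, in dB."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


def relative_improvement(baseline, improved):
    """Percentage gain of `improved` over `baseline` (assumed definition)."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical stand-ins for speech and noise: one second at 16 kHz.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(clean, noise, target_snr_db=-5.0)
measured = measure_snr_db(clean, noisy)

# Hypothetical scores, chosen only to show the arithmetic.
gain_percent = relative_improvement(2.0, 2.5)
```

In practice, STOI and PESQ would be computed by a dedicated implementation on the enhanced and reference waveforms; `relative_improvement` only shows how two such scores could be compared under the assumed definition.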


