家电科技 (Journal of Appliance Science & Technology) ›› 2022, Vol. 0 ›› Issue (zk): 616-618. doi: 10.19784/j.cnki.issn1672-0172.2022.99.137

• Part 5: Health, Age-Friendly Design, and Intelligence •

Research on speech synthesis algorithm for intelligent home appliances

LI Peng, HU Meng, SU Zhongcheng

  1. Wuxi Little Swan Electric Co., Ltd., Wuxi, Jiangsu 214111
  • Published: 2023-03-28
  • Corresponding author: HU Meng, E-mail: humeng3@midea.com.
  • About the author: LI Peng, master's degree. Research interest: voice interaction for intelligent home appliances. Address: No. 18 Changjiang South Road, Xinwu District, Wuxi, Jiangsu Province. E-mail: lipeng90@midea.com.

Abstract: Speech synthesis is one of the key technologies behind voice interaction in intelligent home appliances, and the quality of the synthesized speech directly affects the user's interaction experience. Glow TTS, a current mainstream speech synthesis model, produces speech with fixed durations that lacks prosody. To address this problem, the model is improved with a stochastic duration predictor based on normalizing flows, and experiments are conducted with Japanese as the research object. The results indicate that the improved model performs considerably better in both speech duration prediction and synthesized speech quality, and that it can provide guidance for subsequent optimization of speech synthesis for intelligent home appliances.

Key words: Intelligent home appliances, Speech synthesis, Glow TTS model optimization, Normalizing flow, Stochastic duration predictor
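
The contrast between a fixed and a stochastic duration prediction can be illustrated with a minimal sketch. It is not the authors' implementation: a single invertible affine transform stands in for the multi-layer normalizing flow that a real stochastic duration predictor would use, and the function names, the `noise_scale` parameter, and all numeric values are assumptions made up for illustration.

```python
import math
import random


def deterministic_durations(log_durs):
    """Fixed prediction: every call yields the same frame count per token."""
    return [max(1, math.ceil(math.exp(ld))) for ld in log_durs]


def stochastic_durations(mus, log_sigmas, rng, noise_scale=1.0):
    """Sample frame counts through a one-layer affine flow (toy stand-in).

    Noise z ~ N(0, 1) is mapped to a log-duration by the invertible
    transform log_d = mu + exp(log_sigma) * z, so repeated calls can
    produce different rhythms for the same text.
    """
    durations = []
    for mu, log_sigma in zip(mus, log_sigmas):
        z = rng.gauss(0.0, 1.0) * noise_scale
        log_dur = mu + math.exp(log_sigma) * z
        durations.append(max(1, math.ceil(math.exp(log_dur))))
    return durations


# Hypothetical per-token predictions for a 4-token input (made-up values).
mus = [1.0, 1.5, 0.5, 2.0]
log_sigmas = [-1.0, -1.0, -1.0, -1.0]

rng = random.Random(0)
fixed = deterministic_durations(mus)
sample_a = stochastic_durations(mus, log_sigmas, rng)
sample_b = stochastic_durations(mus, log_sigmas, rng)
```

Setting `noise_scale=0.0` in this sketch collapses the sampler back to the deterministic prediction, which shows that the variation in rhythm comes entirely from the sampled noise.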
