Research on intelligent interactive control system of home appliances based on LLM

doi:10.19784/j.cnki.issn1672-0172.2024.99.026

Journal of Appliance Science & Technology, 2024, Vol. 0, Issue (zk): 125-129.

LI Wei^1,2, JIA Qiwei^1,2, LAO Chunfeng^1, SONG Yujun^1

1. Qingdao Haier Air Conditioner Co., Ltd., Qingdao, Shandong 266101;
2. National Engineering Research Center of Digital Home Networking, Qingdao, Shandong 266101

Online: 2024-12-10 Published: 2024-12-31
About the author: LI Wei, Ph.D. Research interests: intelligent design for robots and home appliances, including intelligent perception, interaction, and decision-making. Address: No. 1 Haier Road, Laoshan District, Qingdao, Shandong Province. E-mail: liwei12@haier.com.
Funding:
Shandong Province Postdoctoral Funding Project: Research on a Smart Home Appliance Control System Based on Large Language Models

Abstract

Abstract: Research on intelligent home appliances continues to advance, yet current intelligent control systems, in both research and practical application, still have many technical deficiencies that degrade the user experience. To address the shortcomings of intelligent voice interaction control, this study proposes an intelligent home appliance control system based on a Large Language Model (LLM). The work focuses on training methods and parameter tuning for the vertical application of LLMs in the home appliance domain, and in parallel builds an LLM-based intelligent interaction system for home appliances. As a novel element, Speech Emotion Recognition (SER) and emotional speech synthesis (Text to Speech, TTS) are introduced to build a complete, anthropomorphic intelligent human-computer interaction system. Applying this system can further enhance the interactive control capability of home appliances, bring interaction closer to natural human communication, shorten the distance between machine and user, and significantly improve the user's experience of the equipment.

Key words: Intelligent home appliances, Large language model, Speech emotion recognition, Emotional text to speech

Chinese Library Classification (CLC) number:

TM925.1
TP3

LI Wei, JIA Qiwei, LAO Chunfeng, SONG Yujun. Research on intelligent interactive control system of home appliances based on LLM[J]. Journal of Appliance Science & Technology, 2024, 0(zk): 125-129.
