A Survey on the Evaluation of Intent Understanding in Smart Home Voice Interaction

doi:10.19784/j.cnki.issn1672-0172.2025.99.004

Abstract

Abstract: With the integration of artificial intelligence (AI) and the Internet of Things (IoT), smart home voice interaction is evolving from executing functional commands toward understanding user needs and providing proactive service. Intent understanding is the core of this service, and its evaluation system is key to measuring voice assistant performance and user experience. This paper reviews the current state of evaluation for intent understanding in smart home voice interaction, systematically summarizing its technical evolution, evaluation datasets, and metric systems. In view of existing limitations, it identifies three major challenges facing evaluation systems: the difficulty of quantifying the degree of intelligence, insufficient coverage of complex interactions, and the lack of corpus standardization. It then looks ahead to a new evaluation framework oriented toward user experience and emotional intelligence, and proposes three construction paths: standardizing data and scripts, unifying metrics and reporting protocols, and validating transfer between simulated and real-world scenarios, providing a reference for achieving human-centered intelligent evaluation.

Key words: Smart home, Voice interaction, Intent understanding, Evaluation framework, Large language model (LLM)

CLC number: TM925; TP18

PAN Yueran, JIAO Limin, YANG Rongxiang, CUI Qingyong, YAO Changsong, XU Yi. A survey on the evaluation of intent understanding in smart home voice interaction[J]. Journal of Appliance Science & Technology, 2025, 0(zk): 17-23.

