TAN X Q, SONG J, ZHANG M M, et al. Generation and recognition of eye movement samples based on generative artificial intelligence[J]. Journal of Henan Polytechnic University (Natural Science), doi:10.16186/j.cnki.1673-9787.2024040012.
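Generation and recognition of eye movement samples based on generative artificial intelligence (online first)
TAN Xueqing1,2, SONG Jun2, ZHANG Manman1, ZANG Chuanli1
(1. Faculty of Psychology, Tianjin Normal University, Tianjin 300387, China; 2. School of Mechanical and Power Engineering, Henan Polytechnic University, Jiaozuo 454000, Henan, China)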
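Abstract: Objective Generative and traditional artificial intelligence models are key tools of the information age. With the help of these technologies, the generation and recognition of eye movement samples have become an important means of studying cognitive mechanisms in depth. This study therefore aims to advance the application of generative artificial intelligence in eye tracking, to address eye movement sample generation and the opacity and lack of interpretability caused by increasing network depth, and to mine eye movement data related to young children's language development. Methods Eye movement data were collected from children aged 4 to 6 as they comprehended different focus structures. A generative artificial intelligence model, the variational autoencoder (VAE), and a traditional model, the multi-layer perceptron (MLP), were used to identify developmental differences in the children's eye movement patterns and to attempt to generate new samples. The generated datasets were interpreted using grey relational analysis and confusion matrices. Results (1) The eye movement datasets generated by the VAE for the 4-year-old, 5-year-old, and 6-year-old groups achieved higher accuracy than the MNIST (Modified National Institute of Standards and Technology database) dataset and were consistent with the MLP analysis; they were accurate, diverse, and interpretable to some degree. (2) The generated eye movement data and the confusion matrices showed that, for sentences without a focus structure, children's comprehension improved in both the 4-5 and 5-6 age intervals, whereas the eye movement features for object-focus and subject-focus structures changed little between ages 4 and 5 and changed markedly between ages 5 and 6. This suggests that age 5 is a critical period for children's understanding of focus structures, in line with the developmental pattern of focus structure comprehension. Conclusion The proposed coupled artificial intelligence analysis method can effectively identify developmental patterns in eye movement features and generate reliable new samples from them. It opens a new path for combining generative artificial intelligence with eye tracking and offers a new way of thinking about complex language comprehension problems.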
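Key words: generative artificial intelligence; variational autoencoder; multi-layer perceptron; eye movement
doi:10.16186/j.cnki.1673-9787.2024040012
Funding: National Natural Science Foundation of China (31800920)
Received: 2024-04-12
Revised: 2024-05-23
Published online first: 2024-06-24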
Generation and recognition of eye movement samples based on generative artificial intelligence
TAN Xueqing1,2, SONG Jun2, ZHANG Manman1, ZANG Chuanli1
(1. Faculty of Psychology, Tianjin Normal University, Tianjin 300387, China;
2. School of Mechanical and Power Engineering, Henan Polytechnic University, Jiaozuo 454000, Henan, China)
Abstract: Objective Generative and traditional artificial intelligence models are pivotal tools of the information age and lay the groundwork for technological advances. Building on these technologies, the generation and recognition of eye movement samples have become important for deeper exploration of cognitive mechanisms. This study therefore aims to promote the application of generative artificial intelligence in eye tracking, to address eye movement sample generation and the opacity and lack of interpretability caused by increasing network depth, and to mine eye tracking data related to children's language development. Methods Eye movement data were collected from 4- to 6-year-old children as they comprehended different focus structures. A generative artificial intelligence model, the variational autoencoder (VAE), and a traditional model, the multi-layer perceptron (MLP), were used to identify developmental differences in the children's eye movement patterns and to attempt to generate new samples. The generated datasets were interpreted using grey relational analysis and confusion matrices. Results (1) The eye movement datasets generated by the VAE for 4-, 5-, and 6-year-old children have higher accuracy than the MNIST dataset and are consistent with the MLP analysis results; they show accuracy, diversity, and a degree of interpretability. (2) The generated eye movement data and the confusion matrices indicate that, for the no-focus structure, children's comprehension improves from age 4 to 5 and from age 5 to 6, whereas the eye movement characteristics for the object-focus and subject-focus structures change little from age 4 to 5 and change more from age 5 to 6. This indicates that age 5 is a critical period for children's understanding of focus structure, in line with the developmental pattern of children's focus structure comprehension. Conclusion The coupled artificial intelligence analysis proposed in this article can identify developmental patterns in eye movement features and generate reliable new samples, providing new ideas for combining generative artificial intelligence with eye movement technology.
Key words: generative artificial intelligence; variational autoencoder; multi-layer perceptron; eye movement
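
The abstract names grey relational analysis as one of the tools used to interpret the generated datasets but does not give the formulation. As an illustration only, the following minimal Python sketch computes grey relational grades in the commonly used Deng form, assuming mean normalization and a distinguishing coefficient of 0.5. The function names, the normalization choice, and the toy series are assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def mean_normalize(seq: Sequence[float]) -> List[float]:
    """Divide a series by its mean so series on different scales are comparable."""
    mean = sum(seq) / len(seq)
    if mean == 0:
        raise ValueError("cannot mean-normalize a series whose mean is zero")
    return [value / mean for value in seq]


def grey_relational_grades(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    rho: float = 0.5,  # distinguishing coefficient; 0.5 is a conventional default
) -> Dict[str, float]:
    """Return one grey relational grade per candidate series (assumed Deng form)."""
    ref = mean_normalize(reference)

    # Absolute differences between the reference and each normalized candidate.
    deltas: Dict[str, List[float]] = {}
    for name, seq in candidates.items():
        if len(seq) != len(ref):
            raise ValueError(f"series '{name}' has a different length than the reference")
        norm = mean_normalize(seq)
        deltas[name] = [abs(r - c) for r, c in zip(ref, norm)]

    # Global minimum and maximum differences over all candidates and positions.
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every candidate matches the reference exactly after normalization.
        return {name: 1.0 for name in deltas}

    grades: Dict[str, float] = {}
    for name, row in deltas.items():
        # Grey relational coefficient at each position, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical toy series, not data from the study.
example = grey_relational_grades(
    reference=[1.0, 2.0, 3.0],
    candidates={"proportional": [2.0, 4.0, 6.0], "reversed": [3.0, 2.0, 1.0]},
)
```

In this toy case the series that is proportional to the reference receives a grade of 1.0, while the reversed series receives a lower grade (5/9), which is the sense in which a higher grade indicates a closer relationship to the reference.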