Research on 3G Mobile Voice Control of a Multimodal Health Information Web Portal

【Author】 罗裕坤 (Luo Yukun)

【Supervisor】 叶志前 (Ye Zhiqian)

【Author Information】 Zhejiang University, Biomedical Engineering, 2010, Master's degree

【Abstract】 Health information systems can now be accessed in a variety of ways, including from mobile devices; however, their use with speech recognition has so far been limited. The main objective of this thesis is to research and provide a method for using speech recognition together with a 3G mobile phone interface to view and access this health information. Unlike previous approaches, a distributed multimodal system is proposed, in which the components are physically distributed yet work together in synchronisation. To achieve this distribution, synchronisation and interoperability, the thesis concentrates on adhering to and implementing international telecommunication, web and multimodal architecture standards; information on these standards and protocols is provided in the references.

The multimodal system consists of two modality components (i.e. modes of interaction). A mobile web browser was chosen to act as the 3G mobile interface and to form the graphical modality. The voice modality consists of a speech framework that performs speech recognition on a remote server rather than on the phone. A simulated, transformed web-based health portal was chosen as the interface to the health information data, and an interaction mechanism was implemented to synchronise viewing of this portal via the graphical modality with speech input from the voice modality. The framework also had to consider supporting technologies for transforming the web portal data and for updating the graphical modality in real time.

The final integrated prototype system is then presented. The results show the voice and graphical modalities being synchronised after a web-initiated session, and demonstrate that distributed components must conform to the standards in order to interoperate with the system. This standards-based multimodal system can now be used for continuing research on speech recognition and health web portals. Suggestions are provided on how the system can be fully integrated with a standardised health information portal and how physician workflow can then be evaluated in future research.

By providing a distributed, standardised multimodal interaction system, components with standardised interfaces and communication achieve mutual interoperability. Components can be developed independently, without knowledge of each other's internal details, as long as the standardised interfaces are conformed to. Distributing the components also allows more powerful components to perform data processing on behalf of components with fewer resources. In future projects, not only the graphical and speech modalities but also sensor modalities from other research teams may cooperate within such a system to allow seamless interaction with health information systems.
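
The abstract does not name the exact multimodal architecture standard, but the synchronisation behaviour it describes (an interaction manager coordinating distributed graphical and voice modality components) matches the life-cycle events of the W3C Multimodal Architecture and Interfaces (MMI) specification. The following is a minimal sketch under that assumption: it serialises an mmi:StartRequest asking a remote voice modality component to start listening against a VoiceXML document derived from the current portal page. All endpoint URLs and identifiers are illustrative placeholders, not values from the thesis.

```python
# Sketch of one W3C MMI life-cycle event exchange (assumed standard):
# the interaction manager (IM) asks the voice modality component (MC)
# to start recognition. Event names and the XML namespace follow the
# W3C MMI Architecture spec; all URLs/ids below are placeholders.
import urllib.request
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
ET.register_namespace("mmi", MMI_NS)


def build_start_request(context, request_id, source, target, content_url):
    """Serialise an mmi:StartRequest as UTF-8 XML bytes."""
    root = ET.Element(f"{{{MMI_NS}}}mmi", {"version": "1.0"})
    start = ET.SubElement(root, f"{{{MMI_NS}}}StartRequest", {
        "Context": context,
        "RequestID": request_id,
        "Source": source,
        "Target": target,
    })
    # Point the voice modality at the dialogue for the current portal page.
    ET.SubElement(start, f"{{{MMI_NS}}}ContentURL", {"href": content_url})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    event = build_start_request(
        context="ctx-0001",
        request_id="req-0001",
        source="http://im.example.org/im",        # hypothetical IM endpoint
        target="http://voice.example.org/mc",     # hypothetical voice MC
        content_url="http://portal.example.org/page1.vxml",
    )
    request = urllib.request.Request(
        "http://voice.example.org/mc",            # hypothetical endpoint
        data=event,
        headers={"Content-Type": "application/xml"},
    )
    # urllib.request.urlopen(request)  # send once a live component exists
    print(event.decode("utf-8"))
```

Under the MMI specification, the component would answer with an mmi:StartResponse carrying the same Context and RequestID, and later return recognition results to the interaction manager in an mmi:DoneNotification; an exchange of this kind is what would let the graphical modality refresh in step with the spoken input.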

  • 【Online Publication Contributor】 Zhejiang University
  • 【Online Publication Year/Issue】 2011, Issue 02
  • 【CLC Number】 TN929.53
  • 【Cited By】 1
  • 【Downloads】 42