
Research on Sports Video Content Analysis Using Player Behavior Information

【Author】 Zhu Guangyu (朱光宇)

【Advisor】 Gao Wen (高文)

【Author information】 Harbin Institute of Technology, Computer Application Technology, 2009, PhD

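【Abstract, translated from the Chinese】 With the rapid development of computer, network, and multimedia technologies, multimedia data is growing exponentially. Video, an important component of multimedia data, has a complex structure and a huge data volume. Because sports video has a wide audience and great market potential, research aimed at sports video content analysis has become a hot topic in video analysis. This dissertation focuses on content analysis techniques for broadcast sports video. To address the problem in current sports video analysis research that low-level video features cannot accurately reflect high-level human semantic concepts, it proposes a multimodal-fusion approach to semantic and tactical analysis of sports video that is based on the analysis of player behavior (trajectories and actions) and combined with audio analysis. It concentrates on several key technical problems: trajectory tracking and action recognition of players in broadcast sports video; construction of semantic/tactical mid-level representations of video content from player trajectory and action information, using multimodal fusion and domain knowledge; and semantic and tactical content analysis of broadcast sports video based on these mid-level representations. The specific research contents are as follows.

A method for detecting and tracking players in broadcast sports video based on support vector machines and particle filtering is proposed. First, support vector classification is combined with playfield segmentation to give an automatic player detection algorithm for sports video, which initializes the subsequent visual object tracking. Second, support vector regression is combined with the sequential Monte Carlo framework to give an improved particle filter for visual object tracking, which lets the conventional particle filter track visual objects robustly with a small particle set and effectively improves the running efficiency of the tracking system.

A method for recognizing player actions in broadcast sports video based on support vector machines and optical flow analysis is proposed. To cope with the poor image quality, moving camera, and low player image resolution of broadcast sports video, the method takes a motion analysis perspective: based on the spatial distribution of the optical flow field in the tracked player region, it extracts descriptive features for action recognition using a grid partition that follows the idea of local analysis. Unlike conventional optical flow analysis, this feature extraction treats the optical flow vector field within the tracked region as spatial distribution information of a motion pattern, which improves the robustness of the optical flow features. A support vector machine is used as the pattern classifier, combined with a temporal voting strategy, to recognize the type of player action. Compared with existing appearance-based recognition methods, the proposed motion descriptor and the recognition algorithm built on it achieve better recognition results.

A highlight ranking method for sports video summaries is proposed based on multimodal fusion of player behavior information and sport-specific audio keywords. First, the trajectory and action information of players in racket sports video is fused with audio keywords to construct a "trajectory-action-audio" mid-level representation of the video content. Computable affective features are extracted from this representation to describe the subjective affective process by which users rank the excitement of sports video summary segments. Considering the current state of physiological and psychological research on human affective thinking, a method for constructing a nonlinear highlight ranking model based on kernel statistical learning is proposed. This construction method not only strengthens the model's robustness to noisy data but also extends its effectiveness and generality. In addition, an objective evaluation criterion for highlight ranking is proposed to measure how well the automatic assessment matches subjectively perceived ground truth. The criterion can be used to assess the effectiveness of the ranking model construction and, combined with a forward search algorithm, to guide affective feature extraction and the selection of effective features.

A tactic analysis method for broadcast sports video based on player trajectory information is proposed. Tactical content analysis of sports video aims to discover the tactical patterns or match strategies used by individual players, or among players, while completing a game action (or task) in a match event. Based on the multi-object trajectories of the players and the ball in a match event, an algorithm for local temporal/spatial interaction analysis based on temporal segment partitioning is first proposed. According to the shape and distance metrics between trajectories within each temporal segment, and the velocity and distance metrics of trajectories between segments, a graph model is used to construct a tactical representation of event video in sports games, namely the interaction trajectory (called the aggregate trajectory in the English abstract). By analyzing the component segments of the interaction trajectory, the tactical patterns of attack events in soccer video are recognized hierarchically from coarse to fine: in coarse recognition, interaction patterns are divided into cooperative attack and individual attack; in the subsequent fine recognition, cooperative attack is subdivided into intercepted and unintercepted attack, and individual attack is subdivided into direct attack and dribbling attack.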
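The tracking contribution above builds on the sequential Monte Carlo (particle filter) framework. As a rough illustration of that framework only, the following is a minimal sketch of one predict-weight-resample step of a classical bootstrap particle filter on a one-dimensional state. It does not reproduce the support vector regression step of the dissertation. The function names, the random-walk motion model, the Gaussian likelihood, and the default noise levels are all assumptions made for this example.

```python
import math
import random

def particle_filter_step(particles, observation, motion_std=1.0, obs_std=1.0, rng=random):
    """One predict-weight-resample step of a bootstrap particle filter (1-D state).

    Hypothetical helper for illustration; the models and defaults are assumed,
    not taken from the dissertation.
    """
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Degenerate case: every likelihood underflowed, so fall back to uniform.
        weights = [1.0] * len(predicted)
    # Resample: draw an equally weighted set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))

def estimate(particles):
    """Point estimate of the state: the mean of the particle set."""
    return sum(particles) / len(particles)
```

With a small particle set, a filter of this plain form tends to lose diversity after resampling, which is the weakness the dissertation's improved filter is said to address.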

【Abstract】 With the rapid development of computer, network, and multimedia technologies, the amount of available multimedia information is growing explosively. Video is one of the most important components of multimedia data and is characterized by huge volume and complex structure. As an important genre of video document, sports video has attracted increasing attention in automatic video analysis due to its wide viewership and tremendous commercial potential.

The research of this dissertation focuses on the problem of broadcast sports video content analysis. To address the problem that low-level video features cannot represent high-level human semantic concepts, this dissertation proposes a novel approach to sports video analysis, in terms of semantics and tactics, based on player behavior (trajectory and action) and the integration of audio analysis. Several key technologies and solutions are studied, concentrating on player trajectory tracking and action recognition; construction of semantic/tactic mid-level representations using player behavior information and multimodal fusion with domain knowledge; and semantic/tactic content analysis of broadcast sports video based on the mid-level representations. The research content of the dissertation is described in detail as follows.

A new player detection and tracking approach for broadcast sports video using support vector machines and particle filtering is proposed. Support vector classification combined with playfield segmentation is employed to automatically detect the players in sports video. Then, an improved particle filter, called the support vector regression particle filter, is proposed as the player tracker by integrating support vector regression into the sequential Monte Carlo framework. The improved particle filter not only enhances the performance of the classical particle filter with a small sample set but also improves the efficiency of the tracking system.

A novel player action recognition approach for broadcast sports video based on support vector machines and optical flow analysis is proposed. Unlike existing appearance-based methods, the approach is based on motion analysis and extracts a motion descriptor from the spatial distribution and grid partition of the optical flow field within the player figure region. The optical flow is treated as a spatial pattern of noisy measurements rather than as precise pixel displacements, which enhances the robustness of the motion descriptor. A support vector machine and a temporal voting strategy are employed to recognize the type of player action in the video clip. The proposed motion descriptor and action recognition approach significantly outperform the existing appearance-based method.

A novel multimodal approach to highlight ranking for sports video summaries in an affective context is proposed, based on player behavior information and audio keywords of the sports game. The mid-level representation "trajectory-action-audio" is constructed for the video content by fusing player trajectory, action, and audio keyword information. Based on "trajectory-action-audio", computational affective features are extracted to describe how users subjectively perceive and rank the highlights of sports video summaries. A kernel-based nonlinear probabilistic method for constructing the ranking model is proposed, which is robust to noisy data and readily extensible. In addition, a new objective evaluation criterion is proposed to guide model construction and feature extraction with the assistance of a forward search algorithm.

A new tactic analysis approach for broadcast sports video is proposed based on player trajectory information. Tactic analysis of sports video aims to recognize and discover the tactic patterns and match strategies that teams and individual players use in games. Based on the trajectories of the players and the ball, an algorithm for local temporal-spatial interaction analysis is first proposed. Using the multi-object trajectories, a weighted graph is constructed by analyzing the temporal-spatial interaction among the players and the ball, based on distance and shape metrics within each temporal interval and on velocity and linking distance metrics between intervals. The aggregate trajectory, a new tactic representation of sports video, is computed from the weighted graph. The interactive relationships within the aggregate trajectory, together with hypothesis testing on the temporal-spatial distribution of trajectories, are employed to discover the tactic patterns of attack events in soccer video in a hierarchical coarse-to-fine framework.
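The action recognition contribution describes a grid-partitioned optical flow descriptor followed by temporal voting over classifier outputs. The sketch below illustrates one plausible reading of those two steps in plain Python. It is an assumption-laden illustration, not the descriptor defined in the dissertation: the function names, the 2x2 grid, the 4 direction bins, the magnitude weighting, and the voting window are all hypothetical choices, and the support vector machine that would sit between the two steps is omitted.

```python
import math
from collections import Counter

def grid_flow_descriptor(flow, grid=(2, 2), bins=4):
    """Concatenate per-cell, magnitude-weighted flow direction histograms.

    flow: list of rows, each a list of (dx, dy) tuples for the player region.
    grid and bins are illustrative defaults, not values from the dissertation.
    """
    rows, cols = len(flow), len(flow[0])
    grid_rows, grid_cols = grid
    hist = [0.0] * (grid_rows * grid_cols * bins)
    for y in range(rows):
        for x in range(cols):
            dx, dy = flow[y][x]
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue  # a zero vector has no direction to vote for
            # Map the pixel to its grid cell (row-major cell index).
            cell = (y * grid_rows // rows) * grid_cols + (x * grid_cols // cols)
            # Quantize the direction in [0, 2*pi) into one of `bins` bins.
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[cell * bins + b] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def temporal_vote(frame_labels, window=5):
    """Smooth per-frame labels by majority vote over a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo, hi = max(0, i - half), min(len(frame_labels), i + half + 1)
        smoothed.append(Counter(frame_labels[lo:hi]).most_common(1)[0][0])
    return smoothed
```

In a full pipeline the descriptor of each frame would be passed to a trained classifier, and `temporal_vote` would then be applied to the resulting label sequence so that isolated misclassified frames do not change the action assigned to a clip.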
