A Study on the Statistical Evaluation of Investor Sentiment and Its Application

【Author】 Cui Liang (崔亮)

【Advisor】 Nie Fuqiang (聂富强)

【Author information】 Southwestern University of Finance and Economics, Statistics, 2013, PhD

【Abstract】 By the end of 2012, China had more than 2,494 listed companies. Many enterprises have restructured and raised capital through listing on their way to becoming modern corporations, and the capital market's capacity to serve the real economy has grown steadily. The market has played an irreplaceable role in transforming the mode of economic development, accelerating structural adjustment and industrial upgrading, and implementing the national innovation strategy. At the same time, the supporting external oversight mechanisms urgently need to be established and improved. The 2007 subprime crisis was a wake-up call: it broke out in the US market and evolved into a worldwide recession. While struggling to respond, governments and international organizations also reviewed and reflected on the crisis. The preliminary consensus is that the behavior of financial institutions, the procyclicality of the financial system, and the lack of safeguards against systemic risk were major causes of the global financial crisis, and credit rating agencies became a target of criticism. An analysis of the current rating system shows that, on the subject side, rating agencies and the entities they rate have moved from opposition toward varying degrees of alignment; on the object side, ratings mostly take an objective perspective, whose drawback is that the underlying data lag. It is therefore both necessary and practically valuable to broaden the existing perspective on rating listed companies by incorporating subjective evaluation, and thereby to improve external oversight of the stock market.

The first question is where to obtain convincing subjective indicator data. This thesis argues that investor sentiment is the bridge linking investors with the stock market and listed companies. Investor sentiment is measured mainly in two ways: indirect indicators and direct indicators. Indirect measurement extracts proxy variables from financial market data; it is persuasive to a degree but does not reflect sentiment directly. Direct measurement does reflect sentiment directly, but it is constrained by survey cost, and respondents (investors) may be unwilling to state their true views, which degrades the quality of the results. This turns attention to social media. Social media, chiefly blogs, microblogs, social news sites, and online forums, is spreading rapidly across the Internet. On social media everyone is at once a publisher, a disseminator, and a consumer of information. Social media has changed not only how information is published and spread but also how people invest. The Internet has evolved from a simple publishing platform into the main carrier of social media, an interactive social network for publishing, sharing, exchanging, and collaborating on information, and it offers a cheap and abundant data source for measuring investor sentiment. Related studies have pointed out that online rumors have become a long-standing problem for China's stock market, seriously affecting stock prices and strongly influencing investors' decisions and confidence. Yet given the volume and complexity of online information, and the limits of cross-disciplinary theory and information technology, sentiment measurement based on online public opinion is often mentioned but rarely studied empirically. Building an investor sentiment index from online public opinion, examining the relationship between investor sentiment and the stock market, and then introducing investor sentiment into the public rating of listed companies (hereinafter "public rating") is thus a challenging topic of real practical value.

First, an investor sentiment index makes it possible to understand in depth the psychological changes and behavioral characteristics of investors in China's stock market, and thus to analyze the linkage between investor sentiment and the stock market. Authoritative publication of such an index would improve investors' awareness of market risk and help them assess the investment value of listed companies more fully. It would also act as an institutional constraint on the conduct of listed companies, discouraging them from distorting financial data to sway market opinion, and so reduce information asymmetry in the stock market. Listed companies, for their part, could use the index to learn promptly what the market and investors are focusing on and respond in time, which in practice would limit the damage that online rumors do to the stock market. Because oversight of online public opinion is currently weak, untrue comments spread excessively. A further aim of this study is therefore to explore ways of strengthening the supervision of stock-market-related online opinion and thereby reduce the systemic risk caused by noisy information.

Second, a public rating system that incorporates investor sentiment can inform improvements to existing rating systems for listed companies. Rating agencies currently rely mainly on objective rating systems that emphasize financial indicators. Because a company's financial condition evolves over time and its disclosure is delayed, rating results lag. Existing systems are also powerless when a listed company's financial data are distorted. And although they try hard to include subjective indicators, they still face difficulties in collecting and quantifying them, which biases the subjective rating data. Analyzing investors' sentiment orientation from online opinion and using it for a public evaluation of listed companies is therefore an effective supplement to existing rating systems. The rating process combines text preprocessing, feature mining, sentiment analysis, and statistical evaluation to quantify the sentiment orientation of online text, and the resulting public rating system broadens the perspective from which listed companies are rated. More importantly, online opinion makes it possible to measure expectations. A quantifiable investor sentiment index for listed companies can track changes in investor sentiment and support analysis of the linkage between those changes and the stock market, offering new ideas and options for strengthening stock market regulation and improving market-stabilizing mechanisms in China. This is especially relevant to macroprudential regulation in a countercyclical setting.

The thesis follows a "pose the problem, analyze the problem, solve the problem" line of inquiry and adopts the economic-statistics paradigm of combining statistical evaluation with empirical testing; the empirical part combines qualitative and quantitative analysis, with the emphasis on the latter. First, it defines investor sentiment as used in this study, analyzes the typical features of investor sentiment in China's stock market in light of core behavioral finance theory, examines the intrinsic links between investor sentiment and online opinion, stock prices, and listed-company ratings, and sets out the premises and theoretical hypotheses for the empirical work. Second, after comparing existing measurement methods, it proposes building the investor sentiment index from online opinion, uses data mining techniques to set up a technical framework for sentiment measurement, and, taking the CSI 300 listed companies as the measurement objects, obtains the investor sentiment index of China's stock market for the sample period. Third, it studies the linkage between investor sentiment under online opinion and the stock market, focusing on the effect of sentiment on stock prices, and tests the three theoretical hypotheses proposed in the theoretical analysis. Finally, it extends the application of investor sentiment to the rating of listed companies, offering a new approach to supplementing existing assessments of their investment value.

The main conclusions are as follows:
(1) The investor sentiment index built from online opinion objectively reflects the sentiment of investors in China's stock market during the observation period, which gives preliminary evidence that extracting a sentiment index from online text is an effective measurement route.
(2) A comparison of several methods shows that the fine-grained sentiment analysis method built on a sentiment dictionary and an attribute lexicon for the financial securities domain is reasonably sensitive when quantifying the sentiment orientation of massive text data.
(3) Analysis of the index's effect on the stock market shows that investor sentiment under online opinion is a systematic factor affecting the stock market; sentiment and stock returns are positively correlated, and the effect has a short-term lag.
(4) The public rating system built from investors' sentiment orientation in text roughly reflects the development status of listed companies over the observation period. Macroeconomic policy adjustments and public emergencies have a marked effect on the public rating results, which supports the timeliness and validity of the system.

The thesis moves beyond the traditional perspective on investor sentiment measurement and its applications by measuring sentiment statistically from online opinion and by exploring the related questions, with an emphasis on the statistical side of this interdisciplinary work. Overall, it brings together theories and methods from behavioral finance, information technology, and statistical evaluation to build an investor sentiment index with real explanatory power, and extends its application in two directions: the effect of investor sentiment on the stock market, and the public rating of listed companies. The aim is to offer a new approach to quantifying complex text data in social science research. The contributions that may be of some novelty are the following:
(1) Drawing on behavioral finance and statistical index theory, the thesis builds an investor sentiment index based on online opinion. After comparing measurement methods and considering how investor sentiment shows up in online opinion, it treats attention as a quantity indicator and sentiment orientation as a quality indicator, and uses the attention indicator as the weight on the sentiment orientation indicator. The resulting index is intuitive, simple, and interpretable, and it captures changes in stock market investor sentiment in an accurate, timely, and comprehensive way. The novelty lies in the source of the measurement indicators and in the way the index is composed.
(2) Combining information mining techniques with statistical evaluation, the thesis explores an effective route to measuring investor sentiment from massive text data. Web text crawling, preprocessing, feature mining, and sentiment analysis are combined into a technical framework for extracting sentiment measurement indicators from unstructured text. In the measurement process, the thesis first provides a simple and effective approach to denoising massive text data; second, it uses feature mining to obtain a listed-company attribute lexicon of 632 words; third, it builds a sentiment dictionary adapted to the securities investment domain (23,333 sentiment words); and finally, it identifies a sentiment analysis method suited to short financial texts. This groundwork solves two key problems in investor sentiment measurement: extracting attention indicators and quantifying the sentiment orientation of text. It should also support deeper study of online public opinion, its market effects, and innovation in financial stability mechanisms.
(3) Using the investor sentiment index obtained in this study, the thesis examines the linkage between investor sentiment in online opinion and characteristic indicators of the stock market. It focuses on the effect of investor sentiment on stock price changes and, using a modified Fama-French three-factor model and a VAR model, confirms three theoretical hypotheses. First, investor sentiment under online opinion is a systematic factor affecting stock market returns. Second, it has a positive effect on stock market returns. Third, its effect on stock market returns has a short-term lag but no long-term lag.
(4) Based on investors' sentiment orientation in online opinion, the thesis broadens the perspective on rating listed companies and explores a new way of evaluating the investment value of stocks in real time. It extracts listed-company attributes from text as rating indicators, quantifies investors' sentiment orientation in text as rating indicator data, and uses factor analysis to build a public rating system for listed companies. This system makes up for the weaknesses of objective rating systems in quantifying subjective indicators and in timeliness, and offers a new way to supplement existing rating systems.

Owing to limits of research capacity and technical means, the thesis has several shortcomings. First, because of technical constraints, only the CSI 300 constituent companies were rated, so not all listed companies are covered, and the only data source crawled was the representative Guba stock forum of Eastmoney (东方财富网股吧). Future work could extend the sources to microblogs, blogs, social networking sites, and other social media, and analyze all listed companies. Second, because of limits in technical means and domain knowledge, the stock investment sentiment dictionary and the listed-company attribute lexicon built during text processing cannot fully cover the complex and changing sentiment vocabulary of online media, which may affect the accuracy of sentiment polarity classification. Future work could keep expanding them toward a widely accepted attribute lexicon and sentiment dictionary. Finally, because of limited data processing capacity, reply posts were not mined in depth, so part of the textual information was lost. Future work could apply further sentiment analysis to reply posts to obtain richer information on investor sentiment. Follow-up research will explore the statistical measurement of investor sentiment and its applications further by refining the techniques for quantifying investor sentiment in online opinion, addressing sentiment polarity shift, and accounting for interference from paid posters (the online "water army").
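The abstract says that sentiment orientation is quantified with a domain sentiment dictionary and an attribute lexicon, and that attention serves as the weight on sentiment orientation in the index, but it gives no formulas. The following is a minimal sketch under stated assumptions, not the thesis's method: the toy English lexicons, the fixed token window, the mean-polarity score, and the weighted-average form of the index are all hypothetical choices made for illustration, and the input is assumed to be already tokenized.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set

# Hypothetical toy lexicons. The thesis's own dictionaries (632 attribute
# words, 23,333 sentiment words) are not reproduced here.
SENTIMENT_LEXICON: Dict[str, int] = {"strong": 1, "bullish": 1, "weak": -1, "risky": -1}
ATTRIBUTE_LEXICON: Set[str] = {"earnings", "management", "debt"}


def attribute_sentiment(tokens: List[str], window: int = 2) -> Dict[str, int]:
    """Sum the polarity of sentiment words within `window` tokens of each attribute word."""
    scores: Dict[str, int] = defaultdict(int)
    for i, tok in enumerate(tokens):
        if tok not in ATTRIBUTE_LEXICON:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for neighbour in tokens[lo:hi]:
            scores[tok] += SENTIMENT_LEXICON.get(neighbour, 0)
    return dict(scores)


def post_polarity(tokens: List[str]) -> float:
    """Mean polarity of the lexicon hits in one post; 0.0 when there are none."""
    hits = [SENTIMENT_LEXICON[t] for t in tokens if t in SENTIMENT_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def attention_weighted_index(attention: Sequence[float], sentiment: Sequence[float]) -> float:
    """Weighted average of per-stock sentiment, with attention (e.g. post counts) as weights.

    Assumed form: sum(a_i * s_i) / sum(a_i).
    """
    if len(attention) != len(sentiment):
        raise ValueError("attention and sentiment must have the same length")
    total = sum(attention)
    if total <= 0:
        raise ValueError("total attention must be positive")
    return sum(a * s for a, s in zip(attention, sentiment)) / total
```

For example, `attention_weighted_index([100, 300], [0.5, -0.1])` weights the second stock three times as heavily as the first because it drew three times as many posts. A fixed window is a crude way of tying sentiment words to attributes, and it ignores negation and polarity shift, which the abstract itself lists as open problems.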

  • 【Classification code】F224; F830.59
  • 【Times cited】5
  • 【Downloads】2868