
Application of Data Mining in Banking Business

Data Mining in the Banking Business

【Author】 Cai Jian (蔡剑)

【Advisor】 Ni Ziwei (倪子伟)

【Author information】 Xiamen University, Computer Software and Theory, 2008, Master's

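【Summary】 This thesis centers on the design and implementation of a bank value-added services project. By analyzing value-added service information around three subjects (customers, products, and competitors), we build a data warehouse for bank value-added services. The warehouse focuses on decision support for analysts and senior managers, so that they can grasp the enterprise's operating status accurately and promptly, understand market demand, and formulate sound business plans. The thesis proposes a new data analysis method for data mining that combines rough set theory with the multiple linear regression model from probability and statistics; with some modifications, it exploits the strengths of both. Applied to a large volume of banking records, the method not only finds business patterns but also reprocesses them with an influence operator to derive intuitive parametric functions, which in turn yield analytical information and a basis for decisions. In the data mining stage, the improved algorithm adds a frequency attribute as a parameter and builds a decision table with a frequency attribute. The frequency attribute F records the number of times an object x appears in the knowledge base; it takes positive integer values and belongs to neither the condition attribute set nor the decision attribute set. The algorithm performs multi-step recursive computation over the collection of feature sets, discards the small feature sets, extracts the large feature sets [Xk] that satisfy the conditions, and adds the rule [Xk]→Y[Sup][Con] to the rule set. At the same time, it recombines the candidate feature sets that meet the support requirement to generate feature sets with more elements, thereby producing decision rules that reflect decision preference information. The new algorithm then reduces the generated decision rules with frequency attributes to obtain the simplest decision rules. These rules are used as samples for multiple linear regression analysis and processed again with the influence operator to build a corresponding multiple linear regression model. A locally optimal regression subset is then found from the samples and the regression model, and a new multiple linear regression model is built from it. Finally, the least squares method is used to estimate the unknown regression coefficients of the new model, yielding the intuitive parametric functions $A_{0i} = \beta_0 + \beta_1 A_1 + \beta_2 A_2 + \dots + \beta_p A_p = f_i(A_1, \dots, A_p)$, $i = 1, 2, \dots, p$. Users can conveniently use this set of functions to view the basis for decisions and obtain intuitive analytical information.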
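As a rough illustration of the decision table with a frequency attribute and of a rule of the form [Xk]→Y[Sup][Con], the following minimal sketch collapses identical records into one row with a count F and computes support and confidence for one candidate rule. The sample records, the attribute values, and the names `build_frequency_table` and `rule_stats` are assumptions made for this example; the thesis's recursive feature-set generation and rule reduction are not reproduced here.

```python
from collections import Counter

# Hypothetical records: (condition attribute 1, condition attribute 2, decision).
records = [
    ("high", "yes", "buy"),
    ("high", "yes", "buy"),
    ("high", "no", "skip"),
    ("low", "yes", "buy"),
    ("low", "no", "skip"),
    ("low", "no", "skip"),
    ("high", "yes", "skip"),
]


def build_frequency_table(rows):
    """Collapse identical rows; the count plays the role of the frequency attribute F."""
    return Counter(rows)


def rule_stats(table, condition, decision):
    """Return (support, confidence) of the rule condition -> decision.

    `condition` maps a column index to the value that column must take.
    The decision value is assumed to be stored in the last column.
    """
    total = sum(table.values())
    cond_count = 0
    both_count = 0
    for row, freq in table.items():
        if all(row[i] == v for i, v in condition.items()):
            cond_count += freq
            if row[-1] == decision:
                both_count += freq
    support = both_count / total
    confidence = both_count / cond_count if cond_count else 0.0
    return support, confidence


table = build_frequency_table(records)
sup, con = rule_stats(table, {0: "high", 1: "yes"}, "buy")
print(table[("high", "yes", "buy")], round(sup, 3), round(con, 3))
```

Weighting every count by F gives the same support and confidence as scanning the raw records, while the table itself holds only the distinct objects.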

【Abstract】 This thesis mainly concerns the design and implementation of a bank value-added service project. By analyzing value-added service information on three subjects (customers, products, and competitors), we construct a data warehouse for bank value-added services. We emphasize decision support for analysts and senior management, so that they can follow the enterprise's operating state accurately and in a timely manner, understand market requirements, and make correct plans. The thesis proposes a new data analysis method for data mining. The method combines rough set theory with the multiple linear regression model from probability and statistics and, after some improvements, makes full use of the merits of both. Using the improved method, we analyze a large volume of banking records. We not only discover business rules but also process them a second time with the influence operator, obtaining intuitive parametric functions and, from them, analysis information and a basis for decision-making. In the data mining step, the improved algorithm adds a frequency attribute parameter and establishes a decision table with a frequency attribute. The frequency attribute F records the number of times the object x appears in the knowledge base; its values are positive integers, and it belongs to neither the condition attribute set nor the decision attribute set. The algorithm carries out multi-step recursive operations on the collection of feature sets, removes the small feature sets, extracts the large feature sets [Xk] that satisfy the conditions, and adds the rule [Xk]→Y[Sup][Con] to the rule set. Meanwhile, it recombines the candidate feature sets that meet the support requirement to generate feature sets with more elements, thereby producing decision rules that reflect decision preference information. The new algorithm then reduces the generated decision rules containing the frequency attribute and obtains the simplest decision rules. These rules are taken as samples, analyzed through multiple linear regression, and reprocessed by means of the influence operator. A corresponding multiple linear regression model is established, a locally optimal regression subset is found from the samples and the model, and a new multiple linear regression model is built from that subset. Finally, the least squares method is applied to the new regression model to estimate its unknown regression coefficients, giving the intuitive parametric functions $A_{0i} = \beta_0 + \beta_1 A_1 + \beta_2 A_2 + \dots + \beta_p A_p = f_i(A_1, \dots, A_p)$, $i = 1, 2, \dots, p$. Users can conveniently use this set of functions to examine the basis for decisions and to obtain intuitive analysis information.
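The final estimation step, fitting the regression coefficients by least squares, can be sketched as follows. This is only an illustration under stated assumptions: the predictor matrix `A`, the synthetic response `y`, and the helper names `fit_least_squares` and `predict` are invented for the example, NumPy is assumed to be available, and neither the influence operator nor the search for a locally optimal regression subset is modeled.

```python
import numpy as np

# Hypothetical sample: each row holds two predictor values (A1, A2).
A = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
# Synthetic response generated from 1 + 2*A1 + 3*A2 so the fit can be checked.
y = 1.0 + 2.0 * A[:, 0] + 3.0 * A[:, 1]


def fit_least_squares(predictors, response):
    """Estimate (beta_0, beta_1, ..., beta_p) by ordinary least squares."""
    # Prepend a column of ones so that beta_0 acts as the intercept.
    X = np.column_stack([np.ones(len(predictors)), predictors])
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    return beta


def predict(beta, predictors):
    """Evaluate the fitted function beta_0 + beta_1*A1 + ... + beta_p*Ap."""
    return beta[0] + predictors @ beta[1:]


beta = fit_least_squares(A, y)
print(np.round(beta, 6))
print(predict(beta, np.array([[2.0, 2.0]])))
```

Because the synthetic response is an exact linear function of the predictors, the recovered coefficients match the generating values up to floating-point error; real rule-derived samples would leave a nonzero residual.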

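  • 【Online publication contributor】 Xiamen University
  • 【Online publication year and issue】 2009, Issue 08
  • 【Classification number】 TP311.13; F830.4
  • 【Times cited】 2
  • 【Downloads】 284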