The Research and Application of Linux Cluster System’s Load Balance Mechanism for Data Retrieval

【Author】 Cui Shuang

【Advisor】 Fang Zhiyi

【Author Information】 Jilin University, Computer Architecture, 2010, Master's

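【Summary】 The research topic of this thesis comes from a project of the Innovation Fund for Technology-Based Small and Medium-Sized Enterprises of the Ministry of Science and Technology: a Linux cluster system for data retrieval. The project is a high-performance, highly reliable cluster system software for data retrieval built on the Linux platform. Because it integrates high-availability software, load balancing, and a cluster file system into one package, it simplifies cluster management, is convenient to use, and provides strong protection for enterprise-level critical business applications. Load balancing is a key technology of Linux cluster systems: it can extend the bandwidth of network devices and servers, increase throughput, and improve network processing capacity, which provides a reliable guarantee for the normal operation of a high-availability cluster system. Drawing on an analysis and comparison of current load balancing algorithms, the thesis proposes a dynamic feedback strategy for load balancing. For load checking, it combines a server performance index with the dynamic load value of the server node as the measure of a node's load capacity. It introduces a load redundancy value for each server node, which can effectively predict the node's current load capacity, helps the load scheduler assign task requests, and avoids overloading any single server node. A load scheduling strategy based on a binary sort tree simplifies the process by which the load balancer assigns tasks. The thesis gives the concrete implementation of the system, builds a Linux-based cluster platform, and debugs and runs the program, achieving load balancing on a Linux cluster system. The project's results have passed testing by the Jilin Branch Center of the China Software Testing Center.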

【Abstract】 With the rapid growth of network services, the nodes that provide them face ever more service requests from users, and data traffic and computing intensity keep increasing, which poses a serious challenge to network bandwidth and servers. More and more bottlenecks are expected to appear on the server side, so building highly available, cost-effective, and scalable network services that can meet the growing load has become an urgent problem. Load balancing technology for Linux-based clusters has emerged in response.

The research topic of this thesis comes from a project of the Innovation Fund for Technology-Based SMEs of the Ministry of Science and Technology: a Linux cluster system for data retrieval. The project's product is a high-performance, highly reliable cluster system software product for data retrieval, built on Linux. Because it integrates high-availability software, load balancing, and a cluster file system into a single package, it simplifies cluster management, is convenient to use, and provides strong protection for an enterprise's critical business applications. Load balancing is a key technology of Linux cluster systems: it can extend the bandwidth of network devices and servers, increase throughput and network processing capacity, and provide a reliable guarantee for the normal operation of a high-availability cluster.

First, the thesis introduces the project's background and the relevant technical knowledge, and discusses the characteristics and classification of cluster systems. Second, it introduces load balancing and, after studying the commonly used load balancing strategies, proposes a strategy based on dynamic, real-time feedback of load information that can predict load capacity, and gives a description of the algorithm.
Finally, it describes how load balancing is implemented on a Linux cluster system and sets up a simulated environment for debugging and running the program.

Cluster technology organizes multiple computers to work together so that they act as one more powerful computer for solving a problem. A cluster system consists of several servers with shared data storage that communicate with each other over the network; when one server fails, its applications are automatically taken over by another server. In most models, all the computers of the cluster share a common name, and a service running on any system in the cluster can be used by all users, so the cluster appears as a single system. According to how tasks are allocated, load balancing can be divided into static and dynamic load balancing. In a network environment, when the load balancer receives task requests from users, it distributes them as evenly as possible across the servers of the cluster according to a particular algorithm, so that the number of user requests assigned to each server stays roughly balanced; this balance, however, does not take the load capacity of the servers themselves into account.
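The even allocation described above can be illustrated with a short sketch. The abstract does not name the algorithm, so round-robin is assumed here purely for illustration, and the server names are hypothetical. The point is that request counts stay balanced while the capacity of each server is never consulted.

```python
from collections import Counter
from itertools import cycle


def round_robin_assign(servers, num_requests):
    """Assign requests to servers in a fixed rotation.

    Round-robin is an assumed example; the source only refers to
    "a particular algorithm". Server capacity is never consulted.
    """
    rotation = cycle(servers)
    return [next(rotation) for _ in range(num_requests)]


# Hypothetical node names: the servers differ in capacity,
# but this policy treats them identically.
assignments = round_robin_assign(["fast", "medium", "slow"], 9)
per_server = Counter(assignments)
print(per_server)
```

Each server receives the same number of requests regardless of how powerful it is, which is the limitation that motivates feeding real load information back to the scheduler.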
Dynamic load balancing has many advantages over static load balancing. It assigns tasks to the more lightly loaded server nodes and keeps a real-time record of each node's load information, which avoids overloading any single node and keeps the load across the cluster's servers as uniform as possible. Because allocation is dynamic, it takes the actual carrying capacity of each server node into account.

This thesis discusses a load balancing strategy based on dynamic feedback of load information. The algorithm has the following main characteristics. First, it gives full consideration to each server node's processing power and current load. In a cluster system the performance of the server nodes may differ, so in practice the server's performance index must be considered: a high-performance server is treated as having high processing capacity, and a server with a low-performance configuration as having low processing capacity. In addition to the server node performance index, the algorithm sets a dynamic load value through real-time monitoring of each server node, which reflects the actual load capacity of that server. Second, the collection and calculation of node information are carried out on the nodes themselves, so that the load balancer is not overloaded and does not become the system bottleneck. Collection work that the central node previously performed for all nodes is now done by each node, which gathers its own status and sends it to the scheduling center. The central scheduling node therefore only needs to make scheduling decisions from the load information the nodes send, instead of collecting information from every node machine, which reduces both the extra communication overhead of load information collection and the burden on the scheduling node.
Finally, the algorithm introduces a binary sort tree. A weight is calculated for each server node by combining its performance index with its real-time load information, and the binary sort tree is built from these weights; a single in-order (LDR) traversal of the tree then lists the server nodes from smallest to largest by current load condition, and the load balancer dispatches tasks according to this load information. The algorithm also introduces a load redundancy value for each server node to predict its real-time spare load capacity, which prevents a single server node from being assigned too many tasks within a short period of time.

A Linux load balancing cluster is simulated in a Linux virtual machine environment. The load balancer's main tasks are to collect information from the server nodes at regular intervals, receive tasks from external requests, and allocate those tasks to the server nodes. Each server node's main tasks are to send its system information to the load balancer at regular intervals and to handle the task requests it receives from the balancer. A web server is set up for the simulation test: requests are sent from the user page, and data is exchanged with the Apache2 web server through XMLHttpRequest. After the server has analyzed the type and number of the requests, a CGI program written in Perl invokes the relevant procedure over socket communication to send the external requests to the load balancer. The user interface for simulated task requests is written in HTML and JavaScript; on this page the user can set the request type, the number of task requests, and the IP address and port number of the load balancer. The Linux cluster is built in this simulated system for debugging and running the program. The system has passed testing by the Jilin Branch Center of the China Software Testing Center and obtained a product test report.
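The dynamic feedback scheduling outlined in the abstract (a weight per node, a binary sort tree traversed in order, and a per-node load redundancy value) can be sketched roughly as follows. This is not the thesis's implementation: the weight formula (load divided by performance index), the meaning of redundancy as a count of further tasks a node can accept, the one-load-unit cost per task, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class NodeReport:
    """Load report that a server node computes locally and sends in."""
    name: str
    performance: float  # static performance index (higher = stronger)
    load: float         # dynamic load value from real-time monitoring
    redundancy: int     # assumed: further tasks the node can still accept

    @property
    def weight(self) -> float:
        # Assumed formula: load relative to performance; lower = lighter.
        return self.load / self.performance


class TreeNode:
    def __init__(self, report: NodeReport) -> None:
        self.report = report
        self.left: Optional["TreeNode"] = None
        self.right: Optional["TreeNode"] = None


def insert(root: Optional[TreeNode], report: NodeReport) -> TreeNode:
    """Insert a report into the binary sort tree, keyed by weight."""
    if root is None:
        return TreeNode(report)
    if report.weight < root.report.weight:
        root.left = insert(root.left, report)
    else:
        root.right = insert(root.right, report)
    return root


def build_tree(reports: List[NodeReport]) -> Optional[TreeNode]:
    root = None
    for report in reports:
        root = insert(root, report)
    return root


def in_order(root: Optional[TreeNode]) -> Iterator[NodeReport]:
    """LDR traversal: left subtree, node, right subtree (ascending weight)."""
    if root is not None:
        yield from in_order(root.left)
        yield root.report
        yield from in_order(root.right)


def dispatch(reports: List[NodeReport], num_tasks: int) -> List[str]:
    """Give each task to the lightest node that still has redundancy."""
    assigned = []
    for _ in range(num_tasks):
        # Rebuilding per task keeps the sketch short; a real scheduler
        # would more likely update the tree when new reports arrive.
        tree = build_tree(reports)
        target = next((r for r in in_order(tree) if r.redundancy > 0), None)
        if target is None:
            break  # every node is saturated; stop rather than overload
        target.redundancy -= 1  # consume predicted spare capacity
        target.load += 1.0      # assumed: one task adds one load unit
        assigned.append(target.name)
    return assigned


# Hypothetical reports from three nodes of differing strength.
reports = [
    NodeReport("node-a", performance=4.0, load=2.0, redundancy=2),
    NodeReport("node-b", performance=2.0, load=2.0, redundancy=3),
    NodeReport("node-c", performance=1.0, load=0.25, redundancy=1),
]
initial_order = [r.name for r in in_order(build_tree(reports))]
assigned = dispatch(reports, 5)
print(initial_order, assigned)
```

In this sketch the redundancy check is what stops the scheduler from sending a burst of tasks to whichever node happened to report the smallest weight: once a node's predicted spare capacity is used up, it is skipped until it reports again.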

  • 【Online Publication Contributor】 Jilin University
  • 【Online Publication Year/Issue】 2010, Issue 09
  • 【Classification Number】 TP391.3
  • 【Downloads】 122