P2P-based Weighted Behavioral Characteristics Of Deep Packet Inspection Algorithm


2010 International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE)


LiJuan Zhang
College of Computer Science and Engineering
Changchun University of Technology
Changchun, 130012, P.R. China

DongMing Li
College of Information Technology
Jilin Agriculture University
Changchun, 130118, P.R. China
Idm0214@

Jing Shi, JunNan Wang
College of Computer Science and Engineering
Changchun University of Technology
Changchun, 130012, P.R. China

Abstract-This paper analyzes in depth the characteristics of P2P flow and the most popular detection algorithms. On this basis, a deep packet inspection algorithm based on the weighted behavioral characteristics of P2P flow is proposed, which combines two practical P2P detection technologies. Weighting the behavioral characteristics of P2P flow addresses the high false detection rate of behavioral detection technology, while deep packet inspection handles the classification of P2P flow. The combination also reduces the delay of deep packet inspection and relieves the impact that inspecting a large number of packets has on network speed.

Keywords-Peer-to-Peer; Deep Packet Inspection; Payload Eigenvalue; Weighted Behavioral Characteristics

I. INTRODUCTION

Peer-to-Peer (P2P) is a network of equal peers in which a user's computer acts as both a server and a client. Each node enjoys services provided by other nodes and also offers services to them. Since the emergence of Napster in 1999, a variety of P2P software has sprung up. According to relevant statistics, P2P data traffic occupies more than 70% of Internet bandwidth, which severely affects other Internet services and damages the interests of ISPs. Therefore, detecting and effectively controlling P2P data flow has become a focus of network management.

II. ANALYSIS OF P2P INSPECTION TECHNOLOGY

At present, practical P2P inspection technologies include port-based recognition, Deep Packet Inspection (DPI), inspection based on transport layer behavioral characteristics, and inspection based on machine learning.

A. Port Identification Technology

Early P2P flows used fixed ports to transfer data, just like general service data flows. As a result, detecting P2P flow only required analyzing the transport layer information of data packets to determine whether they belonged to a P2P data flow. With the development of P2P technology, more and more P2P applications use dynamic ports to avoid port-based detection. Therefore, port identification is no longer suitable for P2P detection.

978-1-4244-7956-6/10/$26.00 ©2010 IEEE

B. Deep Packet Inspection Technology

Deep Packet Inspection technology is currently the most popular P2P detection technology. Its basic principle is that each P2P data packet carries a payload eigenvalue, which consists of a set of continuous or separated strings, as shown in Table 1. The detector analyzes each P2P protocol in advance, finds its payload eigenvalue, and stores it in the feature library. When inspecting a packet, the detector first analyzes the application layer data, extracts the payload eigenvalue from it, and matches that value against the features in the library. A successful match indicates that the packet is a P2P data packet; otherwise it is a non-P2P data packet.

Table 1. Application layer payload eigenvalue of P2P protocol

P2P Protocol   Transport Layer Protocol   Payload eigenvalue
BitTorrent     TCP                        0x13"BitTorrent protocol"
Gnutella       TCP/UDP                    "GNUT", "GIV", "GND"
eDonkey        TCP/UDP                    0xe3, 0xc5
PPLive         TCP/UDP                    0x39000000, 0xe903
QQLive         UDP                        0xfe
FastTrack      TCP/UDP                    "GIVE", 0x270000002980
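As a rough illustration of the matching step described above, the following Python sketch checks an application layer payload against a small feature library. The function name `match_payload`, the library layout, the choice of three entries from Table 1, and the assumption that each eigenvalue sits at the very start of the payload are all illustrative; a real detector may need to search other offsets.

```python
from typing import Optional

# Hypothetical feature library: (protocol name, allowed transports, payload prefix).
# The three entries are taken from Table 1; matching only at the start of the
# payload is an assumption of this sketch.
FEATURE_LIBRARY = [
    ("BitTorrent", {"TCP"}, b"\x13BitTorrent protocol"),
    ("FastTrack", {"TCP", "UDP"}, b"GIVE"),
    ("QQLive", {"UDP"}, b"\xfe"),
]


def match_payload(transport: str, payload: bytes) -> Optional[str]:
    """Return the matching P2P protocol name, or None for a non-P2P packet."""
    for name, transports, prefix in FEATURE_LIBRARY:
        if transport in transports and payload.startswith(prefix):
            return name
    return None
```

A successful lookup returns the protocol name, which corresponds to "the packet is a P2P data packet"; `None` corresponds to a non-P2P packet. A new application stays undetected until an entry for it is added to `FEATURE_LIBRARY`.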

The advantages of DPI technology are as follows: it is easy to understand, convenient to upgrade, simple to maintain, and currently the most commonly applied method; its missed detection rate is rather low, and false detection does not occur.

The disadvantages of DPI technology are as follows: it lags behind new P2P applications, that is, it cannot detect a new P2P application until the feature library has been upgraded, and it can detect the application effectively only after the payload eigenvalue of the new application has been found; its ability to detect encrypted P2P applications is limited; and algorithm performance depends on the complexity of the payload eigenvalue: the more complex the eigenvalue, the higher the detection cost and the worse the algorithm performance.


C. Inspecting technology based on the transport layer behavioral characteristics

The flow characteristics that P2P data flow exhibits at the transport layer differ from those of general network service applications. Detection based on transport layer behavioral characteristics uses information such as IP addresses, port numbers and transport protocols, together with a number of general statistical concepts, to find the behavioral characteristics of P2P flow during transmission. It can determine whether a flow is P2P data flow without analyzing the contents of the data packets. The technology addresses the deficiencies of both port detection technology and DPI detection technology.

Its advantages are as follows: there is no need to analyze data packet contents, which increases detection speed, and it has good detection capability for unknown P2P flow.

III. DPI DETECTION ALGORITHM BASED ON WEIGHTED BEHAVIOR CHARACTERISTICS

From the analysis of the technologies above, we can conclude that DPI detection technology prevents false detection, but inspecting massive numbers of packets affects network transmission speed, and because DPI lags behind new P2P applications, its missed detection rate for them is higher. Detection based on transport layer behavior characteristics is efficient and able to detect new P2P applications, but it sometimes produces false detections. Therefore, combining the two detection technologies, we put forward a deep packet inspection algorithm based on weighted behavior characteristics which, while ensuring QoS, avoids false detection and reduces the missed detection rate as far as possible.

A. The general idea of the algorithm

First, the data flows to be detected are scanned for transport layer behavior characteristics to find, for each behavior characteristic, the collection of data flows that fits it, and the data flows within each collection are weighted. Next, the union set T of the collections is formed and the data flows within T are weighted again. Data flows within T whose weight is less than the appointed weight are inspected with DPI technology to determine whether they are P2P flow; if a flow is not P2P flow, it is handled as normal data flow. Data flows within T whose weight is greater than or equal to the appointed weight are directly defined as P2P flow, without DPI detection.

B. Determining the transport layer behavior characteristics of the P2P flow

Based on an analysis of the data flow of some current P2P software and on relevant literature, the algorithm applies three highly practical behavior characteristics.

Feature 1: Flow between a source IP address and a destination IP address that uses the TCP and UDP transport protocols at the same time may be P2P flow. P2P applications generally use UDP to transmit control information and TCP to transmit data, whereas general applications rarely use the two protocols at the same time.
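Feature 1 can be sketched in Python as a scan over five-tuple flow records. The `Flow` record type and the function name `pairs_using_tcp_and_udp` are hypothetical, and treating a pair as the ordered (source IP, destination IP) combination is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """Hypothetical five-tuple flow record."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "TCP" or "UDP"


def pairs_using_tcp_and_udp(flows: Iterable[Flow]) -> Set[Tuple[str, str]]:
    """Return the (source IP, destination IP) pairs seen with both TCP and UDP."""
    protocols_by_pair = defaultdict(set)
    for flow in flows:
        protocols_by_pair[(flow.src_ip, flow.dst_ip)].add(flow.protocol)
    return {
        pair
        for pair, protocols in protocols_by_pair.items()
        if {"TCP", "UDP"} <= protocols
    }
```

The returned pairs are only candidates: any other application that mixes TCP and UDP between the same two hosts would be returned as well.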

Feature 2: If the number of different IP addresses connected to the same {IP, Port} is generally equal to the number of different ports connected to it, the {IP, Port} may belong to a P2P application. A new host joining a P2P system first sends its IP address and port to a super peer, and other hosts that receive this information then connect to the host through that port. The number of different IP addresses connected to the port is then generally equal to the number of different ports connected to it, because it is unlikely that different hosts connecting to this port use the same port number. For general applications, by contrast, the number of IP addresses that establish connections is far smaller than the number of ports that establish connections.
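A minimal Python sketch of Feature 2 counts, for each destination {IP, Port}, the distinct peer IP addresses and the distinct peer ports. The function name and the `tolerance` and `min_peers` parameters are assumptions added for illustration; the text only says the two counts are generally equal.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

Connection = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


def p2p_like_endpoints(
    connections: Iterable[Connection],
    tolerance: int = 0,   # assumed allowed gap between the two counts
    min_peers: int = 2,   # assumed minimum number of peers before judging
) -> Set[Tuple[str, int]]:
    """Return {IP, Port} endpoints whose distinct peer IPs and peer ports match."""
    peer_ips = defaultdict(set)
    peer_ports = defaultdict(set)
    for src_ip, src_port, dst_ip, dst_port in connections:
        endpoint = (dst_ip, dst_port)
        peer_ips[endpoint].add(src_ip)
        peer_ports[endpoint].add(src_port)

    result = set()
    for endpoint, ips in peer_ips.items():
        gap = abs(len(ips) - len(peer_ports[endpoint]))
        if len(ips) >= min_peers and gap <= tolerance:
            result.add(endpoint)
    return result
```

A server contacted by a few clients over many parallel connections shows far more distinct ports than distinct IP addresses, so it is not returned.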

Feature 3: If, over a period of time, the downlink traffic ratio of a port of an IP address stays within a certain range, the port may carry P2P data flow. The tenet of P2P systems is "all for one, one for all", so each node that downloads data from other nodes simultaneously uploads data to others. Therefore, once a P2P system is running in balance, the downlink traffic ratio of a host running P2P stays within a certain range over a period of time, whereas the downlink traffic ratio of a host running general applications changes greatly.
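The text does not define the downlink traffic ratio or the range precisely, so the following Python sketch of Feature 3 rests on two stated assumptions: the ratio is taken as downlink bytes divided by total bytes in each measurement interval, and the bounds `low` and `high` are placeholder values rather than values from the paper.

```python
from typing import Iterable, Tuple


def downlink_ratio_is_stable(
    samples: Iterable[Tuple[int, int]],
    low: float = 0.3,    # assumed lower bound of the range
    high: float = 0.7,   # assumed upper bound of the range
) -> bool:
    """samples holds (downlink_bytes, uplink_bytes) per measurement interval.

    Returns True when every non-idle interval has a downlink ratio in [low, high].
    """
    ratios = []
    for downlink, uplink in samples:
        total = downlink + uplink
        if total == 0:
            continue  # idle interval, no ratio to compute
        ratios.append(downlink / total)
    return bool(ratios) and all(low <= ratio <= high for ratio in ratios)
```

A host whose ratio swings between mostly downloading and mostly uploading fails the check, which matches the description of general applications above.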

C. Weighting the behavior characteristics

The three transport layer behavior characteristics above produce some false identifications. For example, the DNS system also uses the TCP and UDP protocols at the same time, and e-mail applications and some network games also exhibit Feature 2, although these data flows are non-P2P flows. The results of behavior characteristic detection are therefore uncertain. A weight is set for each data flow that fits some behavior characteristic in order to evaluate this uncertainty: the larger the error detection rate of a behavior characteristic, the smaller the weight. In the algorithm, the weights are set as follows:

(1) If the error detection rate of feature n is $a_n$, then a data flow that complies with feature n has weight $1/a_n$.

(2) If the same data flow complies with feature n and feature m at the same time, its weight is $1/(a_n \cdot a_m)$.
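The two weighting rules can be sketched in Python as the reciprocal of the product of the error detection rates of the matched features. The function name `flow_weight` and the input validation are assumptions of this sketch, and the rates used in the usage note are invented for illustration, not measured values.

```python
from math import prod
from typing import Sequence


def flow_weight(error_rates: Sequence[float]) -> float:
    """Weight of a data flow from the error detection rates of the features it matches.

    One matched feature gives 1/a_n; two matched features give 1/(a_n * a_m).
    """
    if not error_rates:
        raise ValueError("the flow must match at least one feature")
    if any(rate <= 0 or rate > 1 for rate in error_rates):
        # A rate of 0 would give an infinite weight; rejecting it is a choice
        # made in this sketch, not something stated in the text.
        raise ValueError("error detection rates must lie in (0, 1]")
    return 1.0 / prod(error_rates)
```

With illustrative rates of 0.2 and 0.5, a flow matching only the first feature gets weight 5, and a flow matching both gets weight 10, so matching more features raises the weight.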

D. The description of deep packet inspection algorithm based on weighted behavioral characteristics

The flow chart of the deep packet inspection algorithm based on weighted behavioral characteristics is shown in Figure 1:

[Figure 1: flow chart. Get the network data flow; scan for the sets T1, T2 and T3 of data flows that conform to the first, second and third features; T = T1 ∪ T2 ∪ T3; weight the data flows in T.]

Figure 1. Flow chart of heuristic DPI detection based on the transport layer behavioral characteristics

The specific steps are as follows:

a) Use NetFlow software to obtain the network data flow.

b) Analyze the data flow and collect its five-tuple information.

c) Scan the five-tuple information of the data flow to identify the data flows that exhibit feature 1, feature 2 and feature 3, respectively, and add them to the sets T1, T2 and T3.

d) Calculate the union set T of the sets T1, T2 and T3, and calculate the weight of each data flow in set T.

e) Determine whether the weight of each data flow in set T is less than the specified weight. If it is, go to step f); otherwise, identify the data flow as P2P traffic.

f) For each data flow whose weight is less than the specified weight, use DPI detection to determine whether it is P2P flow.

g) If DPI identifies the data flow as P2P flow, it is determined to be P2P flow; otherwise, it is identified as normal data flow.
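Steps d) to g) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `classify_flows`, the dictionary-based inputs, the label strings, and the `dpi_is_p2p` callback standing in for the DPI detector are all hypothetical, and flows outside the set T are simply left unlabeled.

```python
from math import prod
from typing import Callable, Dict, Hashable, Set


def classify_flows(
    feature_sets: Dict[int, Set[Hashable]],    # feature number -> T1, T2, T3
    error_rates: Dict[int, float],             # feature number -> a_n
    threshold: float,                          # the specified weight
    dpi_is_p2p: Callable[[Hashable], bool],    # stand-in for DPI detection
) -> Dict[Hashable, str]:
    """Label every flow in T = T1 U T2 U T3 as "P2P" or "normal"."""
    union = set().union(*feature_sets.values())            # step d: set T
    labels = {}
    for flow in union:
        matched = [n for n, flows in feature_sets.items() if flow in flows]
        weight = 1.0 / prod(error_rates[n] for n in matched)  # step d: weight
        if weight >= threshold:
            labels[flow] = "P2P"        # step e: no DPI needed
        elif dpi_is_p2p(flow):          # step f: DPI on low-weight flows only
            labels[flow] = "P2P"        # step g: DPI match
        else:
            labels[flow] = "normal"     # step g: no DPI match
    return labels
```

Only flows whose weight falls below the threshold reach the `dpi_is_p2p` callback, which is how the combination limits the number of packets that need deep inspection.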

IV. CONCLUSION

The deep packet inspection algorithm based on weighted behavioral characteristics combines two practical P2P detection technologies. Weighting the behavioral characteristics effectively addresses the high error detection rate of transport layer behavior characteristic detection. DPI technology handles P2P traffic classification and related issues. Meanwhile, the detection lag of DPI and the impact on network speed of inspecting large numbers of packets are also well addressed.

As the algorithm has high scalability, it can be improved in two main ways. First, more behavior characteristic detection methods can be added and given appropriate weights in order to improve the detection of unknown P2P flow. Second, the DPI payload eigenvalues can be updated in order to reduce the error detection rate and the missed alarm rate.

