IJRIT International Journal of Research in Information Technology, Volume 2, Issue 4, April 2014, Pg: 80- 88

International Journal of Research in Information Technology (IJRIT) www.ijrit.com

ISSN 2001-5569

String Pattern Matching For High Speed in NIDS

Mr. B.VARUNKUMAR, Mrs. S.V.SURYAKALA

M.Tech Student, Department of Electronics and Communication Engineering, SRM University, Chennai, India
Asst. Professor, Department of Electronics and Communication Engineering, SRM University, Chennai, India
[email protected] , [email protected]

Abstract: This paper presents a Network Intrusion Detection System (NIDS) implemented on an FPGA. Networked systems are increasingly affected by malicious patterns, and pattern matching in a NIDS demands exceptionally high performance, which in turn calls for hardware support. The aim of this work is to increase the throughput and speed of the system while reducing the memory and area required. String pattern matching is performed on ASCII-based input using the Aho-Corasick algorithm and on bit-based input using a short-pattern matching algorithm. The ASCII-based design runs at 160.21 MHz, whereas the bit-based design runs at 360.01 MHz. The implementation has a throughput of 1 Gbps. Malicious patterns can be detected at this rate with low power consumption. A Finite State Machine (FSM) steps through sequential states to perform the pattern matching.

Keywords: NIDS, FSM, Aho-Corasick algorithm, Frequency, Power Consumption.

1. INTRODUCTION

Pattern matching for network security and intrusion detection demands exceptionally high performance. Much work has been done in this field, yet there is still significant room for more efficient, flexible, and powerful systems. Methods commonly used to protect against security breaches include firewalls, whose filtering mechanisms screen out obviously dangerous packets, and intrusion detection systems, which use much more sophisticated rules and pattern matching to sense potentially malicious packets. These techniques require significant computational resources, and highly parallel, flexible fabrics such as FPGAs provide opportunities for dramatic improvements.

The Internet has grown explosively into a giant open network. Internet attacks require little effort or monetary investment to create, are difficult to trace, and can be launched from virtually anywhere in the world. Therefore, computer networks are constantly assailed by attacks and scams, ranging from nuisance hacking to more nefarious probes and attacks. The most commonly used network protection systems are the firewall and the Network Intrusion Detection System (NIDS). They are critical network security tools that help protect high-speed computer networks from malicious users. Both are installed at the border of a network to inspect and monitor incoming and outgoing traffic. A firewall, which performs only layer-3 or layer-4 filtering, processes packets based on their headers. A NIDS, in contrast, provides not only layer-3 or layer-4 but also layer-7 filtering: it searches both packet headers and payloads to identify attack patterns (or signatures). Hence, a NIDS can detect and prevent harmful content, such as computer worms, malicious code, or attacks being transmitted over the network. Such systems examine network communications, identify patterns of computer attacks, and then take action to either terminate the connections or alert system administrators.

With the rapid expansion of the Internet and the explosion in the number of attacks, the design of Network Intrusion Detection Systems has become a major challenge. Advances in optical networking technology are pushing link rates beyond OC-768 (40 Gbps). This throughput is impossible to achieve using existing software-based solutions [9], and thus string matching must be performed in hardware.

Most hardware-based solutions for high-speed string matching in NIDS fall into three main categories: ternary content addressable memory (TCAM)-based, dynamic/static random access memory (DRAM/SRAM)-based, and SRAM-logic-based solutions. Although TCAM-based engines can retrieve results in just one clock cycle, they are power hungry and their throughput is limited by the relatively low speed of TCAMs. SRAM-based and SRAM-logic-based solutions, on the other hand, require multiple cycles to perform a search, so pipelining techniques are commonly used to improve the throughput. In the SRAM-logic-based approach, a portion of the dictionary is implemented using logic resources, which makes this approach logic bound and hard to scale to larger dictionaries. The SRAM-based approaches, which are memory bound, use memory inefficiently, and this inefficiency limits the size of the supported dictionary. In addition, it is difficult to use external SRAM in these architectures because of the constraint on the number of I/O pins. This constraint restricts the number of external stages, while the amount of on-chip memory bounds the size of the memory for each pipeline stage. Because of these two limitations, state-of-the-art SRAM-based solutions do not scale well to larger dictionaries. This scalability has been a dominant issue for the implementation of NIDSes in hardware.

The key issues to be addressed in designing an architecture for string pattern matching engines are:
1. size of the supported dictionary,
2. throughput,
3. scalability with respect to the size of the dictionary, and
4. dictionary update.

To address these challenges, we propose a preprocessing algorithm and a scalable, high-throughput, Memory-efficient Architecture for large-scale String Matching (MASM). This architecture utilizes a binary search tree (BST) structure to improve the storage efficiency. MASM also provides a fixed latency due to its linear pipelined architecture. This paper makes the following contributions:

• The Aho-Corasick algorithm is applied to ASCII-based string pattern matching. It requires more area, and the speed of the system is increased compared with normal performance.
• The short-pattern matching algorithm is applied to bit-based string pattern matching. It consumes less area and further increases the speed.

2. BACKGROUND AND RELATED WORK

2.1. String Pattern Matching

String pattern matching (or simply string matching) is one of the most important functions of a NIDS, as it provides the content-search capability. A string matching algorithm compares all the string patterns in a given dictionary (or database) to the traffic passing through the device. Note that string matching is also referred to as exact string matching. Among currently available NIDS solutions, Snort [2] is a popular open-source and cross-platform NIDS. Snort uses signatures and packet headers to detect malicious Internet activities. As Snort is an open-source system, its rules are contributed by the network security community to make widely accepted and effective rule-sets. These rule-sets, which include both packet headers and signatures (strings and regular expressions), have grown quite rapidly, as rules are added as soon as they are extracted by network security experts. The string patterns constitute the largest portion of the signatures in a Snort database. There are over 8K string signatures in the current Snort database.
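To make the idea of comparing a whole dictionary against a data stream concrete, the following is a minimal software sketch of the Aho-Corasick automaton in Python. It is an illustration only, not the FPGA design described in this paper; the function names `build_automaton` and `search` and the sample dictionary are assumptions chosen for the example.

```python
from collections import deque

def build_automaton(patterns):
    """Build Aho-Corasick goto, failure, and output tables (illustrative)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognized on entering each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def search(text, automaton):
    """Scan text once and return (start_index, pattern) for every hit."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits

automaton = build_automaton(["he", "she", "his", "hers"])
hits = search("ushers", automaton)
```

Each input character causes one state transition (plus occasional failure-link hops), which is why the automaton can report any dictionary string anywhere in the stream in a single pass.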

Fig 1. Multiple string matching, where the state machine recognizes the appearance of any of the search strings anywhere in the entire data stream.

To address some of the challenges in pattern matching, a pre-processing algorithm and a scalable, high-throughput, Memory-efficient Architecture for large-scale String Matching (MASM) have been proposed. This architecture utilizes a binary search tree (BST) structure to improve the storage efficiency. MASM also provides a fixed latency due to its linear pipelined architecture. Its contributions are the following:

• An algorithm called leaf-attaching that efficiently disjoins a given dictionary without increasing the number of patterns.
• An architecture that achieves a memory efficiency of 0.56 byte/char (for Rogets) and 1.32 byte/char (for Snort). State-of-the-art designs can only achieve a memory efficiency of over 2 byte/char in the best case.
• An implementation on ASIC and FPGA that shows a sustained aggregated throughput of 24 and 3.2 Gbps, respectively.
• A simple architecture, so the design can be duplicated to improve the throughput.


2.2. SPLIT ALGORITHM FOR PATTERN MATCHING

Most patterns use only a small subset of the 256 possible characters. Some pattern characters are frequent and appear in a transition in almost every state, while others appear infrequently. The pattern matching module shown in Fig. 2 operates as a pipeline of stages.

2.3. ARCHITECTURE

There are two matching steps in the architecture: 1) pattern matching and 2) label matching, handled by the pattern matching module (PMM) and the label matching module (LMM), respectively. The input data stream is fed into the PMM L bytes at a time, and this input window is advanced 1 byte per clock cycle. The PMM matches the input string against the pattern database, while the LMM matches the {prefix; suffix; match vector} combination to validate a long pattern and outputs the matching result. In the LMM, all entries are uniquely defined; hence, any matching mechanism can be utilized. The critical point is the relationship between the size of the input window L and the number of entries in the LMM. The window size L should be greater than or equal to the matching latency of the LMM. For this reason, L should be chosen according to the size of the dictionary.

The block diagram of the basic pipeline and a single stage of a BST are shown in Fig. 2. To take advantage of the dual-port feature offered by SRAM, the architecture is configured as dual linear pipelines. This configuration doubles the matching rate. At each stage, the memory has two sets of Read/Write ports so that two strings can be input every clock cycle. The content of each entry in the memory includes: 1) a pattern P, 2) a match vector MV, and 3) a pattern label PL. In each pipeline stage, there are four data fields forwarded from the previous stage:
1. The input string SI,
2. The matching status vector MSV,
3. The memory access address Addr, and
4. The matched label ML.
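The pipelined BST search can be illustrated with a small software model. The sketch below is an assumption-laden simplification: it treats every pattern as exactly L bytes long, models only one of the two pipelines, and ignores the match vector. Each stage holds its own memory, and the next-stage address is formed by appending one comparison bit (1 if the input is greater than the node's pattern, 0 otherwise), as the paper describes for the pipeline in Fig. 2. The names `build_stages`, `lookup`, and `scan`, and the sample patterns, are hypothetical.

```python
def build_stages(patterns):
    """Place (pattern, label) pairs into per-stage memories of a balanced BST."""
    stages = []

    def place(items, depth, addr):
        if not items:
            return
        while len(stages) <= depth:
            stages.append({})
        mid = len(items) // 2
        stages[depth][addr] = items[mid]                      # node stored at this stage
        place(items[:mid], depth + 1, addr << 1)              # left child: append bit 0
        place(items[mid + 1:], depth + 1, (addr << 1) | 1)    # right child: append bit 1

    place(sorted(patterns), 0, 0)
    return stages

def lookup(stages, window):
    """Pass one input window through the stages; return the matched label or None."""
    addr, matched_label = 0, None
    for memory in stages:
        entry = memory.get(addr)
        if entry is None:
            break
        pattern, label = entry
        if window == pattern:
            matched_label = label
        # Comparison bit: 1 if the input string is greater than the node's pattern.
        bit = 1 if window > pattern else 0
        addr = (addr << 1) | bit
    return matched_label

def scan(stages, data, L):
    """Slide an L-byte window over the data, advancing 1 byte per step."""
    hits = []
    for i in range(len(data) - L + 1):
        label = lookup(stages, data[i:i + L])
        if label is not None:
            hits.append((i, label))
    return hits

stages = build_stages([(b"root", 1), (b"exec", 2), (b"wget", 3)])
hits = scan(stages, b"GET /wget?exec", 4)
```

Because every lookup visits the stages in the same order, a hardware version can accept a new window each clock cycle while earlier windows are still in flight, which is the source of the fixed latency mentioned earlier.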


Fig 2. Basic pattern matching module with pipelining stage

The forwarded memory address is used to retrieve the pattern and its associated data stored in the local memory. This information is compared with the input string to determine the matching status. In case of a match, the matched label (ML) and the matching status vector (MSV) are updated. The comparison result (1 if the input string is greater than the node’s pattern, 0 otherwise) is appended to the current memory address and forwarded to the next stage.

2.4. Comparator Module

There is one comparator in each stage of the pipeline. It compares the input string with the node’s pattern and uses the node’s match vector (MV) to produce the matching status vector (MSV). Fig. 4 depicts the block diagram of an 8-byte comparator. The inputs include:
1. An input string SI,
2. A pattern P,
3. A match vector MV,
4. A pattern label PL, and
5. A match label.

Fig 4. Block diagram of 8 byte comparator

Fig. 4.1. Operation table of the 8-bit matching vector decoder

The input string and the pattern P go into the “byte comparator,” which performs byte-wise comparisons of the two inputs. The results (M7-M0) are fed into the “matching vector decoder,” which operates based on the truth table shown in Fig. 4.1. The output of the decoder is AND-ed with the node’s match vector. The result is then AND-ed with the output of the “string comparator,” which compares the pattern label and the match label, to produce an 8-bit matching status vector (MSV).
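The comparator data path can be sketched in software as follows. The decoder truth table is given only as a figure, so its behavior here is an assumption: the sketch treats it as a prefix decoder whose output bit k is set only when bytes 0 through k all matched. Applying the label comparison as an all-ones or all-zeros mask is likewise an assumption. The function names are hypothetical.

```python
def byte_compare(s, p):
    """Byte-wise comparison: bit i of the result (M_i) is 1 when s[i] == p[i]."""
    m = 0
    for i in range(8):
        if s[i] == p[i]:
            m |= 1 << i
    return m

def decode_prefix(m):
    """Assumed decoder: output bit k is 1 only if bytes 0..k all matched."""
    out = 0
    for k in range(8):
        if not (m >> k) & 1:
            break
        out |= 1 << k
    return out

def comparator(s, p, mv, pattern_label, match_label):
    """Combine decoder output, match vector, and label comparison into an MSV."""
    decoded = decode_prefix(byte_compare(s, p))
    # Assumed: the string comparator gates all 8 bits at once.
    label_mask = 0xFF if pattern_label == match_label else 0x00
    return decoded & mv & label_mask

# A 6-byte pattern stored in an 8-byte entry; MV marks length 6 as valid.
msv = comparator(b"clientXY", b"clientAB", 0b00100000, 7, 7)
```

Under these assumptions, a set bit k in the resulting MSV indicates that a valid pattern of length k + 1 matched the start of the input window.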

3. FSM STATES

Pattern matching is performed by an FSM that moves through a sequence of states; the number of states depends on the complexity of the patterns.

Fig 5. FSM state block

The memory efficiency is analyzed in two ways:

• the number of states used in matching;
• the number of logic elements utilized after synthesis.

Fig. 6. FSM flow for ASCII-based bits

Fig. 7. FSM flow for bit-based approach

4. AREA REPORT OF THE DESIGN

Table 4.1. Area Report of Pattern Matching

| NAME OF BITS | TOP LEVEL ENTITY NAME | TOTAL LOGIC ELEMENTS | TOTAL COMBINATIONAL FUNCTIONS | AREA CONSUMED |
|---|---|---|---|---|
| ASCII bits | AHOCORASICK | 137 | 137 | 3% |
| BIT BASED | SHORT TOP MODULE | 128 | 120 | 2% |

4.1. Fmax REPORT

Table 4.2. Comparison of Fmax

| | ASCII BASED | BIT BASED |
|---|---|---|
| Fmax | 160.21 MHz | 361.01 MHz |
| Clock | CLK | CLK |
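As a rough cross-check of how a clock frequency relates to throughput, the short sketch below multiplies Fmax by the number of input bits consumed per clock cycle. The one-byte-per-cycle figure for the ASCII-based design is an assumption taken from the 1-byte-per-clock window advance described in Section 2.3; the paper does not state the per-cycle input width of the bit-based design, so it is left as a parameter. The helper name `throughput_gbps` is hypothetical.

```python
def throughput_gbps(fmax_mhz, bits_per_cycle):
    """Peak throughput in Gbps for a design that consumes bits_per_cycle each clock."""
    return fmax_mhz * 1e6 * bits_per_cycle / 1e9

# Assumption: the ASCII-based design consumes one 8-bit character per cycle.
ascii_gbps = throughput_gbps(160.21, 8)
```

This is an upper bound under the stated assumption; sustained throughput in a real system also depends on how the input stream is delivered to the matcher.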

5. SIMULATION RESULTS

Simulation result of pattern matching

Simulation result for a series of patterns

The characters of the text c, l, i, e, n, t are matched one per clock cycle, i.e., one at each state.
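The state-by-state behavior seen in the simulation can be mimicked with a small software FSM. This is a minimal sketch, not the synthesized design: it assumes a single pattern, "client", one character consumed per step, and a simple restart rule that is valid here because the letter c occurs only at the start of the pattern. The names `step` and `run` are hypothetical.

```python
PATTERN = "client"

def step(state, ch):
    """Return the next FSM state after consuming one character."""
    if ch == PATTERN[state]:
        return state + 1
    # On a mismatch, restart; a 'c' can still begin a new match.
    return 1 if ch == PATTERN[0] else 0

def run(stream):
    """Feed the stream one character per cycle; record states and match positions."""
    state, trace, matches = 0, [], []
    for i, ch in enumerate(stream):
        state = step(state, ch)
        trace.append(state)
        if state == len(PATTERN):
            matches.append(i - len(PATTERN) + 1)
            state = 0   # return to the idle state after reporting a match
    return trace, matches

trace, matches = run("xclclient")
```

The trace shows the state advancing by one for each matched character and falling back on a mismatch, which mirrors the per-cycle progression through c, l, i, e, n, t.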

6. CONCLUSION

Fixed-length and arbitrary-length string matching algorithms have been implemented. The design achieves better memory efficiency than the state of the art, and the operating speed of the system has been increased.


7. REFERENCES

[1] Z.K. Baker and V.K. Prasanna, “A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs,” FCCM ’04: Proc. 12th Ann. IEEE Symp. Field-Programmable Custom Computing Machines, pp. 135-144, 2004.
[2] A. Basu and G. Narlikar, “Fast Incremental Updates for Pipelined Forwarding Engines,” Proc. IEEE INFOCOM ’03, pp. 64-74, 2003.
[6] CACTI Tool, http://quid.hpl.hp.com:9081/cacti/, 2012.
[3] C.R. Clark and D.E. Schimmel, “Scalable Pattern Matching for High Speed Networks,” FCCM ’04: Proc. 12th Ann. IEEE Symp. Field-Programmable Custom Computing Machines, pp. 249-257, 2004.
[4] P. Gupta and N. McKeown, “Algorithms for Packet Classification,” IEEE Network, vol. 15, no. 2, pp. 24-32, Mar/Apr. 2001.
[5] N. Hua, H. Song, and T.V. Lakshman, “Variable-Stride Multi-Pattern Matching for Scalable Deep Packet Inspection,” Proc. IEEE INFOCOM ’09, Apr. 2009.
[6] H.-J. Jung, Z. Baker, and V. Prasanna, “Performance of FPGA Implementation of Bit-Split Architecture for Intrusion Detection Systems,” Proc. Int’l Parallel and Distributed Processing Symp., p. 177, 2006.
[7] S. Dharmapurikar and J. Lockwood, “Fast and Scalable Pattern Matching for Network Intrusion Detection Systems,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 10, pp. 1781-1792, 2006.
[8] M. French, E. Anderson, and D.-I. Kang, “Autonomous System on a Chip Adaptation through Partial Runtime Reconfiguration,” FCCM ’08: Proc. 16th Int’l Symp. Field-Programmable Custom Computing Machines, pp. 77-86, 2008.
[9] P. Gupta and N. McKeown, “Algorithms for Packet Classification,” IEEE Network, vol. 15, no. 2, pp. 24-32, 2001.
[10] N. Hua, H. Song, and T.V. Lakshman, “Variable-Stride Multi-Pattern Matching for Scalable Deep Packet Inspection,” INFOCOM 2009: The 28th Conf. Computer Communications, IEEE, Apr. 2009.
[11] R. Scrofano, M.B. Gokhale, F. Trouw, and V.K. Prasanna, “Accelerating Molecular Dynamics Simulations with Reconfigurable Computers,” IEEE Trans. Parallel Distrib. Syst., vol. 19, no. 6, pp. 764-778, 2008.
[12] I. Sourdis and D. Pnevmatikatos, “Fast, Large-Scale String Match for a 10Gbps FPGA-Based Network Intrusion,” FPL, pp. 880-889, 2003.
[13] L. Tan, B. Brotherton, and T. Sherwood, “Bit-Split String-Matching Engines for Intrusion Detection and Prevention,” ACM Trans. Archit. Code Optim., vol. 3, no. 1, pp. 3-34, 2006.
[14] L. Tan and T. Sherwood, “A High Throughput String Matching Architecture for Intrusion Detection and Prevention,” ISCA ’05: Proc. 32nd Ann. Int’l Symp. Computer Architecture, pp. 112-122, 2005.
[15] Y.-H.E. Yang and V.K. Prasanna, “Memory-Efficient Pipelined Architecture for Large-Scale String Matching,” FCCM ’09: Proc. 17th Ann. IEEE Symp. Field-Programmable Custom Computing Machines, 2009.
[16] F. Yu, R.H. Katz, and T.V. Lakshman, “Gigabit Rate Packet Pattern-Matching Using TCAM,” ICNP ’04: Proc. 12th IEEE Int’l Conf. Network Protocols, pp. 174-183, 2004.
