Innovating R Tree for Message Forwarding Technique and Hierarchical Network Clustering in Service Based Routing
Nguyen Thanh Long^{1}, Nguyen Duc Thuy^{2}, Pham Huy Hoang^{3}^{, *}
^{1}Informatic Center of Hanoi Telecommunications, Hoan Kiem, Hanoi, VietNam
^{2}Post and Telecommunications Institute, Cau Giay, Hanoi, VietNam
^{3}Hanoi University of Science Technology, Hanoi, VietNam
Email address:
To cite this article:
Nguyen Thanh Long, Nguyen Duc Thuy, Pham Huy Hoang. Innovating R Tree for Message Forwarding Technique and Hierarchical Network Clustering in Service Based Routing. Internet of Things and Cloud Computing. Vol.3, No. 3, 2015, pp. 59-65. doi: 10.11648/j.iotcc.s.2015030601.17
Abstract: In service based routing (SBR) [5], the problem for storing forwarding table that includes searching predicates received from subscribers through subscription messages is an important job. When a content request happens, the subscriber will create a subscription message. The subscription message stores several kinds of information, the most important content is a filter that is a conjunction of some constraints [5]. A filter is denoted by F character, that has mathematical formula: F = , in which is service request, has the format: = ‘Service_name = requested service name’. Every (i=) is a constraint that is formed by three components: (Key, op, Value), Key is a keyword for searching, op is an operator, Value is searching condition. Key belongs to the set of name of properties of content messages that the requested service supplies. Op is an operator that depends on the type of data of the Key. Therefore the forwarding table is a set of filters which are received from all subscribers on networks. The algorithms for inserting, updating, deleting and finding filters that match content messages have been published by service providers are very important. In this paper, We mention a technique for forwarding technique on the basic of summary filter for storing and searching filter quickly. This technique is based on some previous researches. In section 8, give an algorithm for finding all network nodes that have matched filters with a content message. Section 7 introduces a cluster routing technique based on summary filter.
Keywords: Service Based Routing, Filter,Constraint, Service, Routing, Tree, Split, Summary, R, Hierarchical, Cluster, Head
1. Introduction
1.1. Subscription/ Un-Subscription Messages
The subscription/ un-subscription message consists of subscriber’s address and its content request. In which, content request is a set of some constraints that have a common form: (key, operator, value). A constraint is denoted by C: C = (Key, Operator, Value).
In which: Key is a keyword to search content in content message. Operator determines the operation between a pair of a Key and a value. Value is a condition on a property of content message.
So each subscription/ un-subscription message contains a set of constraints that is called a filter that is denoted by F character. Each filter is presented by the formula: F = , n is number of constraints.
Service based address of a network node is a set of some filters received from this node that is called a predicate that is denoted by P: P = . n is number of filters. The service based addresses are stored in service based routing tables.
1.2. Service Based Routing Table
The service based routing table stores service based addresses of network nodes. Service based address is extracted from subscription message. These service based addresses are updated regularly by subscription and un-subscription messages: Routing_Table = {P_{1}, P_{2}, …, P_{n}}.
1.3. Forwarding Technique
Forwarding technique is a searching technique that finds in service based routing table of each service the nodes that have matched subscription messages with each content message received from network.
How to do: i) Standard method by using ternary search tree and binary search: (a) scan each property of a content message; (b) For each property [p], find all matched constraints with this p; (c) After this loop, get the set of some matched constraints: {C_{1}, C_{2}, …, C_{n}}. (d) Use the SBR_Couting algorithms to find all nodes that have made filters that are satisfied; ii) By using summary filter, it will be done by algorithm that is introduced in section 7.
1.4. Summary Filter
Definition:
Assume there are some filters: F_{1}, F_{2}, …, F_{n}, the summary filter of these filters is a filter that is denoted by . The summary filter is determined by the formula: = .
So this summary filter satifies:
(1)
Therefore the value domain of covers all value domains of each filter: : i=. Operator is introduced in the next section.
1.5. Level of a Filter
The number of constraints of a filter is the level of this filter. For example: F = : Level(F) = n.
2. Some Previous Researches
There some researches on summary filters, that focuses on increasing searching speed and network throughput. There are some researches use summary filters for network clustering [6]. They use summary filter for partitioning subscription messages on some searching nodes. It uses filter summary for maintaing the locality of subscription information. As in [6], they use inaccuracy summary filter to increase network thoughput by 200% compares to using accuracy summary filter. In their network, the node automatically adjusts the level of inaccuracy for reducing network congestion. On the other hand, a node can forward subscription messages to other nodes for avoiding repartitioning its filters. R tree have been used in some fields, for example, use R tree for spatial objects searching in Google Map. R tree also been used for storing, searching content based addresses in content based routing protocol.
3. Overview of Summary Filter Technique
3.1. Overarching Concept
Considering the overarching concept, assume that there are two filters and are content requirements of one service that arising from network. We suppose that covers _{ }by the notation:
(2)
if all content messages satisfy conditions of F_{2 }then they satisfy . Symbolize by:
(3)
In that: =[] i = , =[] j = . (4)
For satisfying condition (2) must have: {, , …, } {, , …, }.
Suppose the constraint: P = [K op V], X is called the value domain of this constraint. If the set of values {v} are belonged to X that will satisfy the constraint that means the expression: Validate(v op V)=true.
If the condition (2) is true, at the same time with the conditions: nm and , ,
must have: i = .
That means all the keys and operators are contained in F_{1} that are also contained in F_{2}, domain value of each of constraint of _{ }covers domain value of each constraint of with the same key.
We suppose that F_{1} and F_{2} overlap each other if the following terms are satisfied:
(5)
Assume specific domain of each constraint has magnitude of , so that specific domain of the filter has magnitude of:
|Domain(F)|= (6)
3.2. Application of R Tree
When receiving a new subscription message from network interface (I) which requests service S, it has a filter that is denoted by the formula: F=. We have to check whether F is covered by an existing filter. If there is no filter, have to check whether it overlaps any existing filter. For doing this task efficiently have to apply and innovate R tree, in this chapter it is called R^{+}, this tree nowadays is used in many different fields. For example R tree is applied for storing space objects in Google MAP, digital Map, System Paging file with high efficiently in storing and searching.
4. Improving R+Tree to Store and Search Filter
4.1. Structure of R+ Tree
Structure of leaf node: it is a set S that consists of n elements, each element consists of two items P and FS: P is a pointer that points to a filter node F that is indexed in this R^{+} tree and FS is a summary filter that covers filter F: FS F.
The filters are added to R^{+} tree by the algorithm that will be presented in the next section.
a) Each filter is indexed in R^{+} tree that belonged to a filter node that consists of two components: IL and F, in which IL is a list of addresses of routers that have sent subscription messages which have the same filter F, F is a filter that consists of some constraints.
b) Similar to the leaf node, the structure of an inner node is presented in Fig. 2:
c) Each inner node stores a set of n elements, each element consists of two components: P and FS. P is a pointer that points to an inner or leaf node. FS is a summary filter that covers the whole set of n summary filters of the nodes pointed by P. This means: FS=. The following diagram describes the covering relationship with n=6: Filter of node R covers its children’s nodes’ filters (R_{1} to R_{6}):
So We have:
1. Therefore the summary filter of the root node covers the whole forwarding table. So that when checking if a filter belongs to the forwarding table. At first examine whether it is covered by summary filter of the root node. If it is not covered, We conclude this filter doesn’t belong to the forwarding table.
2. Each inner node has a number of child nodes which is bound by a lower bound m and a upper bound M: , m and MN. Root node and leaf node have number of child nodes which is less than M. Usually We choose m= or m= because when number of child nodes of one node reaches (M+1), We have to split this node into two nodes so that each node has an approximately equal number of child nodes. When number of child nodes of a node is less than m, have to remove this node and adjust this tree by an algorithm that will be introduce in the next section.
3. R^{+} tree is a balance tree, this means that the height from any leaf node to the root node of the tree is equal. The proving is based on building tree.
4. Thus the height of the tree from a leaf node to the root node is satisfied the following un-equation system: , in which N is number of nodes of tree. Prove: N and NN [n.
4.2. Establishing and Searching Algorithms
4.2.1. The Establishing Algorithm
At first R^{+}tree has only one node, this is both leaf and root of tree, this node points to an indexed filter node that is created to store the filter of a new subscription message that has been received at current router. When adding a new filter F to R^{+} tree, We have to find a leaf node N, add F to N. The process for finding node N begins from root node R:
1) If R has its child nodes are indexed filter nodes then: N=R;
2) If R has some child nodes. Assume that R consists of n components: S = {, ,..., }, C_{i} = (, ). For each : i) Find minimum filter _{ }satisfies: () ; ii) After doing this loop, will find out = min{}, it is respective to C_{min}. iii) If there are many _{ }satisfy this condition, choose the component that has is minimum.
3) Assume C_{min} has pointer points to its child node . Assign temporary node = .
4) Apply the step (2) for _{ }continuously until _{ }is leaf node.
5) Assign: N = . Check filter in each indexed filter node (, ) pointed by a component of N. If F, end the algorithm with conclusion: filter F has already existed in this tree, . Otherwise:
6) Add new indexed filter node contains F to N.
4.2.2. Searching Algorithm
a) The searching algorithm finds a leaf node N that has a filter that conforms most to a given filter F. Starting from the root R of tree, assume: S={, ,..., }. Assign R to and execute step (b).
b) For each of : = {, }, points to , _{ }is summary filter of .
Check: . i) If it is true then continue to check. If is leaf node then execute (c) else assign to and execute (b) recursively; ii) Otherwise, continue (b) with next component.
c) Calculate: =FS F. i) If F: check filter of each indexed filter node of : if end the algorithm with conclusion: F is found in this R^{+} tree; ii) Otherwise, denote is a set of some found. Add to . Go to (b) for next component of .
d) Find some maximum items {F} from , that belong to some leaf nodes {N_{max}}. Thus the set {N_{max}} is result of algorithm.
4.2.3. Algorithm for Adding a Filter to Tree
This algorithm adds a new filter F to a tree: check if this filter exists in this tree. If it’s true then add address of node that emitted F to address list of found indexed filter node and exit the algorithm. Otherwise find a leaf node in this tree to store the filter. The algorithm to insert a filter to a leaf node:
1) Create a filter node to store filter F and I, I is interface address of the router has emitted the subscription message which contained the filter F. i) create a filter node FN, ii) FN.filter=F; FN.IL.add(I);
2) Assume the leaf node for adding F is N. Create new component _{ }of N, it has two elements P and FS; Calculate: N.FS=Summary(F); N.P=FN. Check: (a) if number of components of N is lower than M then add to N, go to step (4). (b) Otherwise go to step (3).
3) Have to do: i) create new node , ii) split the set of all components of N with _{ }into two subsets fairly, each of which contains components of one node N or . Check the condition, whether N has no parent node: (a) If true, new node _{ }is created, this node has two components and . Assign: .FS=Summary(), .P=, .FS=Summary() and .P=. Now is the root of tree, finish algorithm (this is the proving that the root of tree has at least two children). (b) Otherwise, go to step (4).
4) N has a parent node, this node is denoted by , stores component . has two elements FS and P, P points to N. There are two cases: i) if hasn’t been created in (3) then recalculate filter summary for N, assign to FS then go to step (6); ii) otherwise go to step (5).
5) Create a new component _{ }of , has two elements P, FS. Execute two tasks: i) calculate filter summary of , assign to FS: .FS=Summary(); ii) assign P to : .P=. Go to step (7).
6) Check filter summary of , if it has changed, have to assign N=, then go to step (4). Continue until N is root node.
7) Check number of components of , if this number more than M, assign N=, go to step (3). Continue until go to the root node then finish algorithm.
4.2.4. Algorithm for Deleting a Filter From R+ Tree
Assume need to remove a filter F from R^{+} tree.
1) Find F from R^{+} tree by above searching algorithm. If F is found in the tree and leaf node contains F, continue next step. Otherwise end algorithm.
2) N has n components: {, , ..., }, for each component _{ }of N (_{ }has two elements and ): if F F, go to next step.
3) Assume is filter of indexed filter node () pointed by . There are two case: (i) If F then remove the address of the node that have made the request from the list of addresses (IL) of . If this list is empty (IL): remove from this tree. Adjust by following regulating algorithm, end the algorithm. (ii) otherwise go to step (2).
4.2.5. Algorithm for Regulating R+ Tree
1) Starting from the leaf node S from which a filter is removed.
2) If the number of components of S isn’t satisfied the condition (3) then remove S from R^{+}, add S to a set Q which stores temporarily all nodes have removed of ^{ }and go to step (c). If it is satisfied then finish algorithm.
3) If S has parent node then assign S=S.parent. There are two cases: i) if S is not root of the tree then go to step (2); ii) otherwise add root of the tree to Q.
4) Get the last node S that has just been put to the set Q. Scan the sub-trees of tree with root node is S for adding its filters to original tree by the above adding algorithm.
5. Algorithm for Splitting a Node of Tree into Two Nodes
5.1. Processing Steps of the Algorithm
When number of components of one node is more than upper bound M, thus having to split this node into two nodes with each node has about components. Assume current processing node is N:
1. Create new node , get components from N to by one of two algorithms that are described more specific below.
2. If parent of N exists then add N_{1} to the set of children of parent node of N and continue step (c); otherwise create new node that will be new root of tree R. Add N and N_{1} to the set of children of R, finish the algorithm.
3. Assign N=N.Parent, if number of child nodes of N is more than M then go to step (1), otherwise finish the algorithm.
5.2. Algorithm Evaluation
The algorithm has maximum complexity in case of regulating tree from one leaf node to the root node of the tree. This is O[], in which N is number nodes of the tree, K is complexity of the splitting algorithm.
6. Algorithm for Splitting Components of a Node into Two Subsets
6.1. Exhaustive Algorithm
6.1.1. Algorithm Specification
Assume number of components of a node is n=M+1: S={, , ..., }, choose a random sub-set ={, , ..., } from S. There are: Choices to choose the set . The remaining elements of S are put to . For each pair of two sub-sets: calculate and are the summary filters for , respective. Choose the pair of sub-sets that have: is maximum.
6.1.2. Algorithm Evaluation
The exhaustive algorithm has complexity that is in proportion to the number of choices the pair of sub-sets and , so it is .
6.2. Quadratic Algorithm
6.2.1. Algorithm Specification
Assume S is a set of summary filters of current processing node that needs to be split: S={, , ..., } satisfies equation: n=M+1. The requirement of algorithm is: splitting S into two subsets and with almost the same size. Require to choose two beginning elements for these two subsets by the following algorithm:
Algorithm 1:
1. Give filter _{ }is temporary filter, , are two integers for storing two indexes of choosen filters, assign filter = , =-1, =-1;
2. For each filter in the set S of filters execute (3);
3. For each filter in the set S execute (4);
4. Calculate filter that covers both filter and : ; calculate: , , if (in which is filter subtract operator) then assign: =i, =j and = .
When this algorithm finished, gain and are indexes of two filters ({_{, }} = {, }) for beginning filters of two sets and .
Choose next elements for each set and _{ }according to following algorithms:
Algorithm 2:
1. Assume _{ }and _{ }are summary filters of and _{ }respectively, assign =, =.
2. For each filter _{ }of the set: S {_{, }} execute (c).
3. Calculate F_{1}’=() , F_{2}’=(), in which denote ( ) is filter that is result of extending to cover . Go to step (d).
4. If > : put into S_{2} and assign =_{ }(4).
5. If < : put into S_{1} and assign = (5).
6. If = : put into a set that has the number of elements is less than the other set. If two sets have equal number of elements then put _{ }into a random set, recalculate (4) or (5) respectively. Go to (b) for the next item of S.
6.2.2. Evaluating Algorithm
The algorithm 1 has complexity of O(n^{2}), the algorithm 2 has complexity of O(n). Therefore the complexity of Quadratic algorithm is O(n^{2}).
6.3. Linear Algorithm
6.3.1. Algorithm Specification
We have to split the set S of summary filters of the current processing node into two approximate equal sets and .
Some assumptions:
a) Each filter of S has a set of predicates, each predicate has a value domain D.
b) Each value domain has lower bound and upper bound , so that D=[, ].
The algorithm for finding two first elements of two subsets and
a) For each key K, find maximum lower bound _{ }and minimum upper bound _{ }of all predicates of S which have key K. These bounds are belonged to two filters , _{ }respectively.
b) For each K, also find the minimum lower bound and maximum upper bound of all predicates of S that have key K. These bounds are belonged to two filters and respectively.
c) Calculate for each key: = (6).
d) Choose two first elements of two subsets S_{1} and S_{2} from a pair of filters (, ) which has that is maximum on all their keys.
e) With (n-2) remaining filters, put each filter to one of two sets S_{1} and S_{2}in turn.
6.3.2. Evaluating Algorithm
The complexity of algorithm to choose two first filters of two sets S_{1} and S_{2 }is rated by formula *), is number of predicates of filter F, is number of filters of S.
The complexity of algorithm to put (n-2) remaining filters to two sets S_{1} and S_{2 }is O(n), n is number of filters of the set of filters S.
7. Cluster Routing Based on Summary Filter
7.1. Cluster routing Concept
Each node builds it’s own R^{+} tree [6] when it receives subscription filter from network. The root of each tree will store filter summary of its routing table. Define a point (P) is centroid of the rectangle that is established from each filter summary. To cluster network based on centroid of subscription filter, assume that: i) the space of all filter is S, ii) S consists of some sub-spaces {S_{1}, S_{2}, ..., S_{n}}. So when a node on a cluster receives one filter, it will forward this filter to appropriate cluster head based on point P of this filter.
7.2. Cluster Routing Performance
Use this cluster routing that will reduce time required to find matched subscriptions with each received content message. When a node receives a content message, this content message is a point on space S. It checks to find sub-space that P belongs to and S_{i} has cluster head () respectively. Then the node forwards this content message to . In all matched subscriptions will be found using above search algorithm. This cluster algorithm has complexity is O(n*m), n is number of sub-spaces and m is number of constraints of each filter, it is nearly a constant.
8. The Algorithm to Find Nodes Whose Filters Matched a Content Message
Define level of a filter is number of constraints of this filter. In SBR routing protocol, usually uses SBR_Counting to find nodes whose filters matched a content message (M). So the purpose of forwarding algorithm is to find filters that match M. In this section introduce an algorithm to find nodes have filters that match M based on R^{+} tree: (i) start from root of R^{+} tree, assign N to current node of tree. (ii) Scan each component of N, check if .FS matches M; if it is true then assign N=.P, N is node pointed by . If not N is leaf node, continue apply (ii) for N. (iii) Scan each filter of N, check matches M, if it is true add .I to the set of matched node S: S = S .I. Continue algorithm until S contains all network nodes or scan all the matched node of tree.
9. Evaluate Results and Future Development
9.1. The Results of Executing above Mentioned Algorithms
a) Build R^{+} with number of filters changes from 100000 to 290000, measure the time in ms required to execute for each number of filters to draw the diagram based on these results as below:
As the diagram shows that the time required to build R^{+} does not increase continuously when number of filters increases.
b) Build a R^{+}tree and use the Quadratic algorithm for splitting node with the number of filters changes from 100000 to 290000 and delete filters, measure the time required in milli second to delete filters and regulate the tree to draw diagram based on the measured results:
As the diagram shows that the time required to delete filters and regulate R^{+} does not increase continuously when number of filters increases.
9.2. Simulation for Applying R+ Tree for Hierarchical Network Clustering
1. Establish R^{+} tree for hierarchical network clustering:
The algorithms for establishing, updating R^{+} to hierarchically cluster network as above algorithms for building R^{+} to manage service based routing table.
Some modifications: i) A leaf node stores n components (, , …, ), each component stores a pointer P to an indexed item and a summary filter FS of its child node. Each indexed item stores a pair of two elements node identifier and its filter. Leaf node stores a summary filter of its components and cluster head’s ID of its group. So each leaf node give information about one basic group of the network; ii) Similarly, each inner or root node stores information about a upper group of the network. That is a cluster head’s ID of its group, some components that have pointers to its child nodes and their summary filters; iii) Condition for inserting a node N to a child group GC of a current group G is: check each components of G, calculate the formula: FT = , is FS of GC, F is the filter of the inserted node. Put N to GC that gives FT is minimum result. Starting this algorithm at root node of R to a leaf node, at the leaf node actually insert N to this basic group and regulate the R tree to the root node for satisfying condition: number of elements of each group in the range [m, M].
2. Simulate these algorithms of applying R^{+} tree to cluster network
Execute the process to insert node to the network until there are 1000 nodes. Perform this process by the regulation: when adding enough 10 nodes, then remove 1 node. Measure time in ms needed to add or remove a node, then collect the results to make a below graph:
As the graph shows that the time required to execute does not increase when the number of nodes increases. It is approximate one milli second. Sometimes this time increases for regulating R^{+} tree, but it is lower than one second.
9.3. Future Developing
At the current time above algorithms have been simulated to get above operational results are on numeric value type of predicate, in the future applying above algorithms for value type of string.
References