Extermination of Obsolete Relationship Through KTMIN-JAK-MAXAM Algorithm in Confusion Mining

The web currently contains many near-identical documents. Although much effort has gone into investigating duplicate-detection algorithms, retrieved web documents still contain outdated links. The proposed system identifies and minimizes redundant information and duplicate links in web documents. We introduce a graph-theory-based algorithm, KTMIN-JAK-MAXAM, that filters out redundant links. The proposed system returns relevant information with greater accuracy, and the KTMIN-JAK-MAXAM algorithm reduces the time and space complexity of accessing web pages.


Introduction
The amount of data and information available on the web is growing exponentially, and duplication of web content is increasing along with it [1,2]. Retrieving relevant information from the web without redundancy is a challenging task for the web mining community [3,4]. Web content mining is the process of extracting useful information, data, and knowledge from the World Wide Web. It uses traditional information retrieval [5] and data mining techniques to reach both known and unknown data in web content. Web mining is categorized into three groups: web content mining [6], web structure mining, and web usage mining. Traditional web mining algorithms handle structured documents [7][8][9][10][11][12][13][14][15], whereas more advanced mining algorithms can deal with fully heterogeneous documents comprising images [9], graphs, videos [16], etc.

Architecture of Proposed System
A query is submitted to a web search engine to retrieve the significant data that the user requires. Whether or not the user fully understands the query, the engine is expected to reply with relevant rather than redundant data; however, neither the relevance nor the non-redundancy of the reply can be guaranteed. Once the query is submitted, the search engine returns a document containing multiple web pages along with their links. The user is unaware of the content of these pages, and the extracted web documents may or may not be redundant.
The retrieved document must satisfy constraints of low time and space requirements. To meet these criteria, the extracted web document is preprocessed. Preprocessing and information selection apply techniques such as stop-word removal, stemming, phrasing, and token normalization. Once the document is preprocessed, the normalized tokens are used to process the web content document further.
The proposed algorithm eliminates indistinguishable pages in the set-up (network) of web pages. First, compute the degree measure of every vertex and maintain a set U that records the minimum-degree vertices, the maximum-degree vertices, and the isolated vertices. Vertices are then added repeatedly according to their measure, starting from the minimum, and each vertex is included only once. After these steps, the set U contains every vertex, free of redundant information.
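The procedure described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `prune_redundant_links` is hypothetical, the degree measure is read as the number of neighbours not yet in U, an "isolated" vertex is read as one whose neighbours are all already in U, ties are broken by label, and the walk backtracks when the current vertex has no unvisited neighbour (the text does not specify that case).

```python
from collections import defaultdict


def prune_redundant_links(edges):
    """Return (visit order, kept links) for an undirected link graph.

    Hypothetical helper illustrating one reading of the procedure.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            adj[a].add(b)
            adj[b].add(a)

    # Degree measure (assumption): number of neighbours not yet in U.
    remaining = {v: len(ns) for v, ns in adj.items()}
    in_u, order, kept = set(), [], []

    def include(vertex, parent=None):
        in_u.add(vertex)
        order.append(vertex)
        if parent is not None:
            kept.append((parent, vertex))
        for w in adj[vertex]:  # update the degree of every neighbour
            remaining[w] -= 1

    def outside(v):
        # Neighbours of v that are not in U, in a deterministic order.
        return sorted((w for w in adj[v] if w not in in_u), key=str)

    while len(in_u) < len(adj):
        # Start (or restart, for a disconnected graph) at a minimum-degree vertex.
        start = min(sorted((v for v in adj if v not in in_u), key=str),
                    key=lambda v: remaining[v])
        include(start)
        trail = [start]
        while trail:
            v = trail[-1]
            # Attach neighbours that have become isolated (assumed reading).
            for w in outside(v):
                if remaining[w] == 0:
                    include(w, v)
            candidates = outside(v)
            if not candidates:
                trail.pop()  # assumption: backtrack when v is exhausted
                continue
            # Move to the unvisited neighbour of maximum degree.
            u = max(candidates, key=lambda w: remaining[w])
            include(u, v)
            trail.append(u)
    return order, kept


links = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
order, kept = prune_redundant_links(links)
print(order)  # ['d', 'c', 'a', 'b']
print(kept)   # [('d', 'c'), ('c', 'a'), ('a', 'b')]
```

In this small example the triangle a, b, c loses the link (b, c), so the kept links contain no cycle. Because every vertex is attached through exactly one link when it joins U, the kept links always form a tree on each connected component under these assumptions.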
Pseudo Code for the Proposed Algorithm KTMIN-JAK-MAXAM
Step 1: Compute the degree measure for all vertices in the set-up.
Step 2: Pick the minimum-degree vertex 'v' in the set-up and include it in the set U.
Step 3: While U doesn't include all vertices:
Step 3A: Include in U every isolated vertex that is adjacent to the vertex 'v'.
Step 3B: Find the vertex 'u' adjacent to 'v' that is not in U and has maximum degree. Add 'u' to U.
Step 3C: Update the degree value of all vertices adjacent to 'u'. Iterate through all adjacent vertices if possible.
Repeat Step 3 until all nodes are included in the set U.
Step 4: Finally, the network U contains no cyclic information.

Case I: Regular set-up (Connected Regular Set-up)
Case I: A
Consider the connected set-up G1 in figure 2, which has 12 nodes, each of degree 3, along with redundant links. Apply the proposed KTMIN-JAK-MAXAM algorithm to G1.
By step 1: deg(all nodes of G1) = 3.
By step 2: Mark the node A as visited and put it into the set U.
By step 3: There is no isolated vertex in the given graph G1.
By step 4:
4.1 Investigate the unvisited nodes adjacent to A. There are 3 such nodes (B, 1 and 5), and we can pick the minimum-degree node. Here all nodes have equal degree (3), so pick any one of the 3 adjacent nodes. Now the set U consists of the nodes A, B.
4.2 Explore node B: its unvisited adjacent nodes are 2 and 7. Now the set U consists of the nodes A, B, 2.
4.3 Explore node 2: its unvisited adjacent nodes are 1 and 4. After including node 1, the set U consists of the nodes A, B, 2, 1.
4.4 Explore node 1: its only unvisited adjacent node is 3. Now the set U consists of the nodes A, B, 2, 1, 3.
4.5 Explore node 3: its unvisited adjacent nodes are 4 and D. Now the set U consists of the nodes A, B, 2, 1, 3, 4.
4.6 Explore node 4: its only unvisited adjacent node is C. Now the set U consists of the nodes A, B, 2, 1, 3, 4, C.
4.7 Explore node C: its unvisited adjacent nodes are 7 and 8. Now the set U consists of the nodes A, B, 2, 1, 3, 4, C, 7.
4.8 Explore node 7: its only unvisited adjacent node is 6. Now the set U consists of the nodes A, B, 2, 1, 3, 4, C, 7, 6.
4.10 Explore node 5: its only unvisited adjacent node is D. Now the set U consists of the nodes A, B, 2, 1, 3, 4, C, 7, 6, 5, D.
4.11 Finally, explore node D: its only unvisited adjacent node is 8. Now the set U consists of the nodes A, B, 2, 1, 3, 4, C, 7, 6, 5, D, 8.
By step 5: Finally, the network U contains no cyclic information.

Case I: B
Apply the proposed KTMIN-JAK-MAXAM algorithm to G2.
By step 1: deg(all nodes of G2) = 3.
By step 2: Mark the node F as visited and put it into the set U.
By step 3: There is no isolated vertex in the given graph G2.
By step 4:
4.1 Investigate the unvisited nodes adjacent to F. There are 3 such nodes (A, E and D), and we can pick the minimum-degree node. Here all nodes have equal degree (3), so pick any one of the 3 adjacent nodes. Now the set U consists of the nodes F and A.
4.2 Explore node A: its unvisited adjacent nodes are B and C. Now the set U consists of the nodes F, A, and C.
4.3 Explore node C: its unvisited adjacent nodes are B and D. After including node B, the set U consists of the nodes F, A, C, and B.
4.4 Explore node B: its only unvisited adjacent node is E. Now the set U consists of the nodes F, A, C, B, and E.
4.5 Explore node E: its only unvisited adjacent node is D. Now the set U consists of the nodes F, A, C, B, E, and D.
By step 5: Finally, the network U contains no cyclic information.
After applying the proposed KTMIN-JAK-MAXAM algorithm to G2, we get the result shown in figure 5.
Note: With reference to figure 5, it is concluded that the solution is unique.
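The visiting order reported for G2 can be checked mechanically. The sketch below assumes an adjacency list inferred from the walkthrough text alone (the neighbours named at each step, completed so that every node has degree 3), so it may differ from the original figure. The helper name `is_spanning_path` is hypothetical.

```python
# Adjacency of G2 as inferred from the walkthrough text (an assumption:
# it is reconstructed from the neighbours named at each step).
G2 = {
    "F": {"A", "E", "D"},
    "A": {"F", "B", "C"},
    "C": {"A", "B", "D"},
    "B": {"A", "C", "E"},
    "E": {"B", "F", "D"},
    "D": {"F", "C", "E"},
}

# Order in which the walkthrough adds nodes to the set U.
visit_order = ["F", "A", "C", "B", "E", "D"]


def is_spanning_path(adj, order):
    """True if `order` visits every node once and consecutive nodes are linked."""
    covers_all = len(order) == len(adj) and set(order) == set(adj)
    linked = all(b in adj[a] for a, b in zip(order, order[1:]))
    return covers_all and linked


total_links = sum(len(neighbours) for neighbours in G2.values()) // 2
kept_links = len(visit_order) - 1

print(is_spanning_path(G2, visit_order))  # True
print(total_links, kept_links)            # 9 5
```

Under this reconstruction, the walk keeps 5 of the 9 links and visits each of the 6 nodes exactly once, so the retained links form a path with no cycle.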

Case I: C
Consider the connected set-up G3 in figure 6, which has n = 10 nodes, each of degree 3, along with redundant links. Note that applying the proposed KTMIN-JAK-MAXAM algorithm to the regular connected network G3 yields a path of length 9, shown in figure 7.

Case II: Connected Complete Network
Consider the complete network G4 in figure 8. Note that for every complete connected network G on n nodes, applying the proposed KTMIN-JAK-MAXAM algorithm to G yields a linear path of length (n-1).
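The (n-1) claim for complete networks can be illustrated with a simplified walk. The sketch below is not the full algorithm: since all nodes of a complete network have equal degree, it skips the degree comparison and simply steps to an unvisited neighbour, breaking ties by smallest label (an assumption). The function name `greedy_walk_complete` is hypothetical.

```python
def greedy_walk_complete(n):
    """Walk the complete network on n nodes; return the visit order."""
    adj = {v: {w for w in range(n) if w != v} for v in range(n)}
    order = [0]      # all degrees are equal, so any start node qualifies
    visited = {0}
    while len(visited) < n:
        current = order[-1]
        unvisited = sorted(adj[current] - visited)
        # In a complete network the current node is linked to every other
        # node, so `unvisited` is non-empty until all nodes are visited.
        nxt = unvisited[0]  # tie-break by smallest label (an assumption)
        order.append(nxt)
        visited.add(nxt)
    return order


for n in range(2, 8):
    order = greedy_walk_complete(n)
    total_links = n * (n - 1) // 2
    # Columns: n, links in the complete network, links kept by the walk.
    print(n, total_links, len(order) - 1)
```

The walk never needs to backtrack, which is why the kept links form a single linear path of length n-1 while the remaining n(n-1)/2 - (n-1) links are dropped.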

Case III: Connected Irregular Network
Case III: A
Consider the irregular set-up G5 in figure 10. After applying the proposed KTMIN-JAK-MAXAM algorithm to G5, we get the links shown in figure 11, without repeated links.
After applying the proposed KTMIN-JAK-MAXAM algorithm to G6, we get the links shown in figure 13, without repeated links.
The proposed KTMIN-JAK-MAXAM algorithm is likewise applied to the graph G7 in figure 15.

Conclusion
Search engines return relevant information, but most of the time the results are not free of redundancy. When a single search query returns many redundant web pages, it becomes more difficult to recognize the redundant links. We propose a graph-theoretic algorithm for detecting and eliminating redundant links. We also observed that the derived linked graph need not be unique; nevertheless, this approach is expected to support optimized cost analysis in the data science field in future. Future work aims to create a finite-automata tool that produces only relevant, non-redundant information from web documents in data mining.