Class information for:
Level 1: INFORMATION EXTRACTION//WRAPPER INDUCTION//WEB DATA EXTRACTION

Basic class information

[Figure: bar chart of publications by Publication_year. The most recent years may be incomplete.]

Hierarchy of classes

The table includes all classes above and classes immediately below the current class.



Cluster id Level Cluster label #P
12 4 COMPUTER SCIENCE, THEORY & METHODS//COMPUTER SCIENCE, INFORMATION SYSTEMS//COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE 1181119
31 3       COMPUTER SCIENCE, SOFTWARE ENGINEERING//COMPUTER SCIENCE, INFORMATION SYSTEMS//COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE 113161
355 2             RECOMMENDER SYSTEMS//COLLABORATIVE FILTERING//INFORMATION PROCESSING & MANAGEMENT 17972
17141 1                   INFORMATION EXTRACTION//WRAPPER INDUCTION//WEB DATA EXTRACTION 642

Terms with highest relevance score



rank Category termType chi_square shrOfCwithTerm shrOfTermInClass termInClass
1 INFORMATION EXTRACTION authKW 848133 16% 16% 105
2 WRAPPER INDUCTION authKW 641445 2% 93% 14
3 WEB DATA EXTRACTION authKW 506398 2% 74% 14
4 WRAPPER GENERATION authKW 409083 2% 83% 10
5 DEEP WEB authKW 267252 2% 39% 14
6 WEB INFORMATION EXTRACTION authKW 261810 1% 67% 8
7 HIDDEN WEB authKW 165348 1% 42% 8
8 SEMANTIC RELATION EXTRACTION authKW 157087 1% 80% 4
9 AUTOMATIC WRAPPER GENERATION authKW 147271 0% 100% 3
10 DATA MINING WEB BASED INFORMATION authKW 147271 0% 100% 3
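The page does not define the share columns, so the following is only a reading inferred from the numbers. It assumes that shrOfCwithTerm is the share of the class's publications that carry the term, that is termInClass divided by the class size (#P = 642 in the hierarchy table). The function name and the rounding rule are assumptions made for this sketch. Under that reading, shrOfTermInClass would be the share of all occurrences of the term that fall inside this class, which the page gives no totals to check.

```python
# Class size (#P) for cluster 17141, taken from the hierarchy table.
CLASS_SIZE = 642

# (term, termInClass, reported shrOfCwithTerm in percent), copied from the table above.
ROWS = [
    ("INFORMATION EXTRACTION", 105, 16),
    ("WRAPPER INDUCTION", 14, 2),
    ("WEB DATA EXTRACTION", 14, 2),
    ("WRAPPER GENERATION", 10, 2),
    ("DEEP WEB", 14, 2),
    ("WEB INFORMATION EXTRACTION", 8, 1),
    ("HIDDEN WEB", 8, 1),
    ("SEMANTIC RELATION EXTRACTION", 4, 1),
    ("AUTOMATIC WRAPPER GENERATION", 3, 0),
    ("DATA MINING WEB BASED INFORMATION", 3, 0),
]


def share_of_class_with_term(term_in_class, class_size=CLASS_SIZE):
    """Assumed definition: percent of class publications that carry the term."""
    return round(100 * term_in_class / class_size)


# Compare the assumed definition with the reported column, row by row.
for term, term_in_class, reported_pct in ROWS:
    computed_pct = share_of_class_with_term(term_in_class)
    status = "matches" if computed_pct == reported_pct else "differs"
    print(f"{term}: computed {computed_pct}%, reported {reported_pct}% ({status})")
```

The same check can be run against the Web of Science journal categories table, where for example 323 of 642 publications is reported as 50%.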

Web of Science journal categories



chi_square_rank Category chi_square shrOfCwithTerm shrOfTermInClass termInClass
1 Computer Science, Information Systems 23886 50% 0% 323
2 Computer Science, Artificial Intelligence 17484 43% 0% 278
3 Computer Science, Software Engineering 6027 23% 0% 145
4 Computer Science, Theory & Methods 4782 24% 0% 154
5 Computer Science, Hardware & Architecture 799 7% 0% 42
6 Information Science & Library Science 764 6% 0% 36
7 Engineering, Electrical & Electronic 315 15% 0% 98
8 Automation & Control Systems 74 3% 0% 17
9 Computer Science, Interdisciplinary Applications 67 3% 0% 22
10 Telecommunications 54 3% 0% 20

Address terms



chi_square_rank term chi_square shrOfCwithTerm shrOfTermInClass termInClass
1 COMP SCI TECHNOL PROGRAM 88360 0% 60% 3
2 INFORMAT COMMUN COMP TECHNOL 65452 0% 67% 2
3 AD T INTELLIGENT INTERNET AGENT 49090 0% 100% 1
4 CELLULAR AUTOMATA KNOWLEDGE ENGN CAKE 49090 0% 100% 1
5 CITESEER PROJECT 49090 0% 100% 1
6 CITIZEN OBSERV 49090 0% 100% 1
7 CLUSTER EXCELLENCE ASIA EUROPE 49090 0% 100% 1
8 COMP EGNN 49090 0% 100% 1
9 COMP INTELLIGENT 49090 0% 100% 1
10 COMP SCI SCI 49090 0% 100% 1

Journals



chi_square_rank term chi_square shrOfCwithTerm shrOfTermInClass termInClass
1 ACM TRANSACTIONS ON THE WEB 28265 2% 5% 11
2 SIGMOD RECORD 27772 3% 3% 18
3 WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS 26765 2% 3% 16
4 DATA & KNOWLEDGE ENGINEERING 20864 4% 2% 23
5 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 11365 4% 1% 27
6 LECTURE NOTES IN COMPUTER SCIENCE 6985 18% 0% 113
7 LECTURE NOTES IN ARTIFICIAL INTELLIGENCE 6802 7% 0% 48
8 INFORMATION PROCESSING & MANAGEMENT 5953 2% 1% 16
9 JOURNAL OF UNIVERSAL COMPUTER SCIENCE 5508 2% 1% 14
10 AI MAGAZINE 5270 1% 1% 9

Author Key Words



chi_square_rank term chi_square shrOfCwithTerm shrOfTermInClass termInClass
1 INFORMATION EXTRACTION 848133 16% 16% 105
2 WRAPPER INDUCTION 641445 2% 93% 14
3 WEB DATA EXTRACTION 506398 2% 74% 14
4 WRAPPER GENERATION 409083 2% 83% 10
5 DEEP WEB 267252 2% 39% 14
6 WEB INFORMATION EXTRACTION 261810 1% 67% 8
7 HIDDEN WEB 165348 1% 42% 8
8 SEMANTIC RELATION EXTRACTION 157087 1% 80% 4
9 AUTOMATIC WRAPPER GENERATION 147271 0% 100% 3
10 DATA MINING WEB BASED INFORMATION 147271 0% 100% 3

Core articles

The table includes core articles in the class. The following variables are taken into account for the relevance score of an article in a cluster c:
(1) Number of references referring to publications in the class.
(2) Share of total number of active references referring to publications in the class.
(3) Age of the article. Newer articles get a higher score than older articles.
(4) Citation rate, normalized to year.
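The four variables listed above can be sketched as inputs computed per article. The page does not say how they are weighted or combined, so this sketch stops at the inputs and does not produce a score. Everything named here is hypothetical: the `Article` fields, the `score_inputs` function, the reading of "active references" as references that resolve to a classified publication, and the reading of "normalized to year" as citations divided by the mean citation count of publications from the same year.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical record; the field names are assumptions, not the database schema."""
    year: int
    citations: int
    # Class id of each active reference (assumed: one entry per resolved reference).
    reference_classes: list = field(default_factory=list)


def score_inputs(article, class_id, current_year, mean_citations_by_year):
    """Compute the four variables named in the list above for one article.

    How they are weighted into a single relevance score is not stated in the
    source, so no combined score is returned.
    """
    refs_in_class = sum(1 for c in article.reference_classes if c == class_id)
    total_refs = len(article.reference_classes)
    share_in_class = refs_in_class / total_refs if total_refs else 0.0

    age = current_year - article.year

    # Assumed normalization: citations relative to the mean for the same year.
    baseline = mean_citations_by_year.get(article.year, 0.0)
    normalized_citations = article.citations / baseline if baseline else 0.0

    return {
        "refs_in_class": refs_in_class,                # variable (1)
        "share_in_class": share_in_class,              # variable (2)
        "age": age,                                    # variable (3)
        "normalized_citations": normalized_citations,  # variable (4)
    }


# Usage with made-up numbers, for illustration only.
example = Article(year=2010, citations=30,
                  reference_classes=[17141, 17141, 16078, 17141])
inputs = score_inputs(example, class_id=17141, current_year=2015,
                      mean_citations_by_year={2010: 10.0})
print(inputs)
```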

Classes with closest relation at Level 1



rank cluster_id2 link
1 37426 STRING KERNELS//LEARNING WITH SEQUENTIAL DATA//SEQUENCE KERNELS
2 16078 PAGERANK//GOOGLE MATRIX//FOCUSED CRAWLER
3 26127 LINKED DATA//SEMANTIC WEB//SEMANT COMP GRP SECO
4 32954 DOMAIN SPECIFIC ONTOLOGY//TEXT KNOWLEDGE ENGN//KNOWLEDGE ACQUISIT SHARING GRP
5 2135 ONTOLOGY//SEMANTIC WEB//ONTOLOGY MATCHING
6 34343 KEYPHRASE EXTRACTION//DEWEY DECIMAL CLASSIFICATION DDC//GRAPH BASED KEYWORD EXTRACTION
7 13623 TEXT CATEGORIZATION//TEXT CLASSIFICATION//TERM WEIGHTING
8 30315 XML RETRIEVAL//STRUCTURED DOCUMENT RETRIEVAL//STRUCTURED INFORMATION RETRIEVAL
9 21641 WEB USAGE MINING//WEB LOG MINING//WEB ROBOT DETECTION
10 21063 TREE EDIT DISTANCE//TREE MINING//UNORDERED TREES
