A Unique Book on Data Mining and Rule Induction.
Data Mining and Knowledge Discovery Approaches Based on Rule Induction Techniques


by Evangelos Triantaphyllou and Giovanni Felici (Editors)


An edited book published in June 2006 by Springer-Verlag, New York, NY, U.S.A., in its Massive Computing series, Vol. 6.
ISBN 0-387-34294-X





TABLE OF CONTENTS
List of Figures...............................................xxiii 
List of Tables.................................................xxix 
Foreword....................................................xxxvii 
Preface.......................................................xxxix 
Acknowledgments................................................xlvii 


Chapter 1
A COMMON LOGIC APPROACH TO DATA MINING 
AND PATTERN RECOGNITION, by A. Zakrevskij.........................1
1.   	Introduction..............................................2
1.1  		Using Decision Functions..........................2
1.2        	Characteristic Features of the New Approach.......4
2.      Data and Knowledge........................................6
2.1        	General Definitions...............................6
2.2	Data and Knowledge Representation –
                    	the Case of Boolean Attributes............9  
2.3	Data and Knowledge Representation –
                     	the Case of Multi-Valued Attributes......10
3. 	Data Mining – Inductive Inference........................12 
3.1	Extracting Knowledge from the Boolean Space 
                     	of Attributes............................12
3.2         	The Screening Effect.............................18
3.3        	Inductive Inference from Partial Data............20
3.4        	The Case of Multi-Valued Attributes..............21
4.  	Knowledge Analysis and Transformations...................23
4.1        	Testing for Consistency..........................23
4.2        	Simplification...................................27
5.  	Pattern Recognition – Deductive Inference................28
5.1        	Recognition in the Boolean Space.................28
5.2	Appreciating the Asymmetry in Implicative Regularities...31
5.3        	Deductive Inference in Finite Predicates.........34
5.4 		Pattern Recognition in the Space 
                of Multi-Valued Attributes.......................36
6.    	Some Applications........................................38
7. 	Conclusions..............................................40
References.......................................................41
Author’s Biographical Statement..................................43



Chapter 2
THE ONE CLAUSE AT A TIME (OCAT) 
APPROACH TO DATA MINING AND 
KNOWLEDGE DISCOVERY, by E. Triantaphyllou........................45
1. 	Introduction.............................................46
2. 	Some Background Information..............................49
3. 	Definitions and Terminology..............................52
4. 	The One Clause at a Time (OCAT) Approach.................54
4.1 		Data Binarization................................54
4.2		The One Clause at a Time (OCAT) Concept..........58
4.3		A Branch-and-Bound Approach for 
                      	Inferring Clauses........................59
4.4		Inference of the Clauses for 
                      	the Illustrative Example.................62
4.5		A Polynomial Time Heuristic for 
                      	Inferring Clauses........................65
5.	A Guided Learning Approach...............................70
6.	The Rejectability Graph of Two Collections of Examples...72
6.1	The Definition of the Rejectability Graph................72
6.2	Properties of the Rejectability Graph....................74
6.3	On the Minimum Clique Cover 
                     	of the Rejectability Graph...............76
7.	Problem Decomposition....................................77
7.1	Connected Components.....................................77
7.2	Clique Cover.............................................78
8.	An Example of Using the Rejectability Graph..............79
9.	Conclusions..............................................82
References.......................................................83
Author’s Biographical Statement..................................87



Chapter 3
AN INCREMENTAL LEARNING ALGORITHM FOR 
INFERRING LOGICAL RULES FROM EXAMPLES IN
THE FRAMEWORK OF THE COMMON REASONING 
PROCESS, by X. Naidenova.........................................89
1.	Introduction.............................................90
2.	A Model of Rule-Based Logical Inference..................96
2.1 		Rules Acquired from Experts or Rules of 
                      	the First Type...........................97
2.2 		Structure of the Knowledge Base..................98
2.3		Reasoning Operations for Using Logical Rules of 
                      	the First Type..........................100
2.4 		An Example of the Reasoning Process.............102
3.	Inductive Inference of Implicative Rules From Examples..103
3.1 		The Concept of a Good Classification Test.......103
3.2 		The Characterization of Classification Tests....105
3.3 		An Approach for Constructing Good 
                       	Irredundant Tests.......................106
3.4 		Structure of Data for Inferring Good 
			Diagnostic Tests........................107
3.5 		The Duality of Good Diagnostic Tests............109
3.6	Generation of Dual Objects with the Use              
			of Lattice Operations...................110
3.7 		Inductive Rules for Constructing Elements of 
			a Dual Lattice..........................111
3.8 		Special Reasoning Operations for Constructing 
			Elements of a Dual Lattice..............112
3.8.1  		The Generalization Rule.........................112
3.8.2 		The Diagnostic Rule.............................113
3.8.3 		The Concept of an Essential Example.............114
4. 	Algorithms for Constructing All
       		Good Maximally Redundant Tests..................115
4.1	NIAGaRa: A Non-Incremental Algorithm for Constructing 
			All Good Maximally Redundant Tests......115
4.2	Decomposition of Inferring Good Classification 
			Tests into Subtasks.....................122
4.2.1 		Forming the Subtasks............................123
4.2.2 		Reducing the Subtasks...........................125
4.2.3 		Choosing Examples and Values for the Formation 
          		of Subtasks.............................127
4.2.4 		An Approach for Incremental Algorithms..........129
4.3 		DIAGaRa: An Algorithm for Inferring All GMRTs  
			with the Decomposition into Subtasks of 
			the First Kind..........................130
4.3.1	The Basic Recursive Algorithm for Solving a Subtask 
			of the First Kind.......................130
4.3.2 		An Approach for Forming the Set STGOOD..........131
4.3.3 		The Estimation of the Number of Subtasks to 
			Be Solved...............................131
4.3.4 		CASCADE: Incrementally Inferring GMRTs 
			Based on the Procedure DIAGaRa..........132
4.4  	INGOMAR: An Incremental Algorithm for 
			Inferring All GMRTs.....................132
5. 	Conclusions.............................................138
Acknowledgments.................................................138
Appendix........................................................139
References......................................................143
Author’s Biographical Statement.................................147



Chapter 4
DISCOVERING RULES THAT GOVERN MONOTONE 
PHENOMENA, by V.I. Torvik and E. Triantaphyllou.................149
1.  	Introduction............................................150 
2.  	Background Information..................................152
2.1		Problem Descriptions............................152
2.2  		Hierarchical Decomposition of Variables.........155
2.3  		Some Key Properties of Monotone Boolean 
			Functions...............................157
2.4 		Existing Approaches to Problem 1................160
2.5 		An Existing Approach to Problem 2...............162
2.6 		Existing Approaches to Problem 3................162
2.7 		Stochastic Models for Problem 3.................162
3. 	Inference Objectives and Methodology....................165
3.1 		The Inference Objective for Problem 1...........165
3.2  		The Inference Objective for Problem 2...........166
3.3		The Inference Objective for Problem 3...........166
3.4	Incremental Updates for the Fixed Misclassification 
			Probability Model.......................167
3.5 		Selection Criteria for Problem 1................167
3.6 		Selection Criteria for 
			Problems 2.1, 2.2, and 2.3..............168
3.7 		Selection Criterion for Problem 3...............169
4. 	Experimental Results....................................174
4.1 		Experimental Results for Problem 1..............174
4.2 		Experimental Results for Problem 2..............176
4.3 		Experimental Results for Problem 3..............179
5. 	Summary and Discussion..................................183
5.1		Summary of the Research Findings................183
5.2		Significance of the Research Findings...........186
5.3		Future Research Directions......................187
6. 	Concluding Remarks......................................187
References......................................................188
Authors’ Biographical Statements................................191



Chapter 5
LEARNING LOGIC FORMULAS AND RELATED ERROR 
DISTRIBUTIONS, by G. Felici, F. Sun, and K. Truemper............193
1. 	Introduction............................................194
2. 	Logic Data and Separating Set...........................197
2.1 		Logic Data......................................197
2.2 		Separating Set..................................198
3. 	Problem Formulation.....................................200
3.1 		Logic Variables.................................201
3.2 		Separation Condition for Records in A...........201
3.3 		Separation Condition for Records in B...........201
3.4 		Selecting a Largest Subset......................202
3.5 		Selecting a Separating Vector...................203
3.6 		Simplification for 0/1 Records..................204
4. 	Implementation of Solution Algorithm....................204
5. 	Leibniz System..........................................205
6. 	Simple-Minded Control of Classification Errors..........206
7. 	Separations for Voting Process..........................207
8. 	Probability Distribution of Vote-Total..................208
8.1 		Mean and Variance for ZA........................209
8.2 		Random Variables Yi.............................211
8.3 		Distribution for Y..............................212
8.4 		Distribution for ZA.............................213
8.5 		Probabilities of Classification Errors..........213
8.6 		Summary of Algorithm............................216
9. 	Computational Results...................................216
9.1 		Breast Cancer Diagnosis.........................218
9.2 		Australian Credit Card..........................219
9.3 		Congressional Voting............................219
9.4 		Diabetes Diagnosis..............................219
9.5 		Heart Disease Diagnosis.........................220
9.6 		Boston Housing..................................221
10. 	Conclusions.............................................221
References......................................................222
Authors’ Biographical Statements................................226



Chapter 6
FEATURE SELECTION FOR DATA MINING 
by V. de Angelis, G. Felici, and G. Mancinelli..................227
1.	Introduction............................................228
2. 	The Many Routes to Feature Selection....................229
2.1 		Filter Methods..................................232
2.2 		Wrapper Methods.................................234
3. 	Feature Selection as a Subgraph Selection Problem.......237
4. 	Basic IP Formulation and Variants.......................238
5. 	Computational Experience................................241
5.1 		Test on Generated Data..........................242
5.2 		An Application..................................246
6. 	Conclusions.............................................248
References......................................................249
Authors’ Biographical Statements................................252



Chapter 7
TRANSFORMATION OF RATIONAL AND SET DATA 
TO LOGIC DATA, by S. Bartnikowski, M. Granberry,  
J. Mugan, and K. Truemper.......................................253
1. 	Introduction............................................254
1.1	Transformation of Set Data..............................254
1.2	Transformation of Rational Data.........................254
1.3	Computational Results...................................256
1.4	Entropy-Based Approaches................................257
1.5	Bottom-up Methods.......................................258
1.6	Other Approaches........................................258
2. 	Definitions.............................................259
2.1	Unknown Values..........................................259
2.2	Records.................................................260
2.3	Populations.............................................260
2.4	DNF Formulas............................................260
2.5	Clash Condition.........................................261
3. 	Overview of Transformation Process......................262
4. 	Set Data to Logic Data..................................262
4.1	Case of Element Entries.................................262
4.2	Case of Set Entries.....................................264
5. 	Rational Data to Logic Data.............................264
6. 	Initial Markers.........................................265
6.1	Class Values............................................265
6.2	Smoothed Class Values...................................266
6.3	Selection of Standard Deviation.........................266
6.4	Definition of Markers...................................269
6.5	Evaluation of Markers...................................271
7. 	Additional Markers......................................271
7.1	Critical Interval.......................................272
7.2	Attractiveness of Pattern Change........................272
7.3	Selection of Marker.....................................273
8. 	Computational Results...................................274
9. 	Summary.................................................275
References......................................................276
Authors’ Biographical Statements................................278



Chapter 8
DATA FARMING: CONCEPTS AND METHODS, by A. Kusiak................279
1. 	Introduction............................................280
2. 	Data Farming Methods....................................281
2.1 		Feature Evaluation..............................282
2.2 		Data Transformation.............................282
2.2.1 		Filling in Missing Values.......................282
2.2.2		Discretization..................................283
2.2.3		Feature Content Modification....................283
2.2.4 		Feature Transformation..........................286
2.2.5 		Data Evolution..................................289
2.3 		Knowledge Transformation........................290
2.4 		Outcome Definition..............................295
2.5 		Feature Definition..............................297
3.	The Data Farming Process................................298
4.	A Case Study............................................299
5. 	Conclusions.............................................301
References......................................................302
Author’s Biographical Statement.................................304



Chapter 9
RULE INDUCTION THROUGH DISCRETE SUPPORT 
VECTOR DECISION TREES, by C. Orsenigo and C. Vercellis..........305
1.	Introduction............................................306
2.	Linear Support Vector Machines..........................308
3.	Discrete Support Vector Machines with Minimum Features..312 
4.	A Sequential LP-based Heuristic for 
       			Problems LDVM and FDVM..................314
5.	Building a Minimum Features Discrete Support 	
   	 		Vector Decision Tree....................316
6.	Discussion and Validation of the Proposed Classifier....319
7.	Conclusions.............................................322
References......................................................324
Authors’ Biographical Statements................................326



Chapter 10
MULTI-ATTRIBUTE DECISION TREES AND 
DECISION RULES, by J.-Y. Lee and S. Olafsson....................327
1.	Introduction............................................328
2.	Decision Tree Induction.................................329
2.1		Attribute Evaluation Rules......................330
2.2		Entropy-Based Algorithms........................332
2.3		Other Issues in Decision Tree Induction.........333
3.	Multi-Attribute Decision Trees..........................334
3.1		Accounting for Interactions between Attributes..334
3.2		Second Order Decision Tree Induction............335
3.3		The SODI Algorithm..............................339
4.	An Illustrative Example.................................334
5.	Numerical Analysis......................................347
6.	Conclusions.............................................349
Appendix: Detailed Model Comparison.............................351
References......................................................355
Authors’ Biographical Statements................................358



Chapter 11
KNOWLEDGE ACQUISITION AND UNCERTAINTY IN 
FAULT DIAGNOSIS: A ROUGH SETS PERSPECTIVE,
by L.-Y. Zhai, L.-P. Khoo, and S.-C. Fok........................359
1.	Introduction............................................360
2.	An Overview of Knowledge Discovery and Uncertainty......361
2.1		Knowledge Acquisition and Machine Learning......361
2.1.1		Knowledge Representation........................361
2.1.2		Knowledge Acquisition...........................362
2.1.3	Machine Learning and Automated 
			Knowledge Extraction....................362
2.1.4	Inductive Learning Techniques for Automated 
 		Knowledge Extraction............................364
2.2		Uncertainties in Fault Diagnosis................366
2.2.1		Inconsistent Data...............................366
2.2.2		Incomplete Data.................................367
2.2.3		Noisy Data......................................368
2.3		Traditional Techniques for Handling Uncertainty.369
2.3.1		MYCIN’s Model of Certainty Factors..............369
2.3.2		Bayesian Probability Theory.....................370
2.3.3		The Dempster-Shafer Theory of Belief Functions..371
2.3.4		The Fuzzy Sets Theory...........................372
2.3.5	Comparison of Traditional Approaches for
  			Handling Uncertainty....................373
2.4		The Rough Sets Approach.........................374
2.4.1		Introductory Remarks............................374
2.4.2		Rough Sets and Fuzzy Sets.......................375
2.4.3		Development of Rough Set Theory.................376
2.4.4	Strengths of Rough Sets Theory and Its
			Applications in Fault Diagnosis.........376
3.	Rough Sets Theory in Classification and 
  			Rule Induction under Uncertainty........378
3.1		Basic Notions of Rough Sets Theory..............378
3.1.1		The Information System..........................378
3.1.2		Approximations..................................379
3.2		Rough Sets and Inductive Learning...............381
3.2.1		Inductive Learning, Rough Sets and the RClass...381
3.2.2		Framework of the RClass.........................382
3.3		Validation and Discussion.......................384
3.3.1		Example 1: Machine Condition Monitoring.........385
3.3.2		Example 2: A Chemical Process...................386
4.	Conclusions.............................................388
References......................................................389
Authors’ Biographical Statements................................394



Chapter 12
DISCOVERING KNOWLEDGE NUGGETS WITH A GENETIC
ALGORITHM, by E. Noda and A.A. Freitas..........................395
1.	Introduction............................................396
2.	The Motivation for Genetic
			Algorithm-Based  Rule Discovery.........399
2.1		An Overview of Genetic Algorithms (GAs).........400
2.2		Greedy Rule Induction...........................402
2.3		The Global Search of Genetic Algorithms (GAs)...404
3.	GA-Nuggets..............................................404
3.1 		Single-Population GA-Nuggets....................404
3.1.1		Individual Representation.......................405
3.1.2		Fitness Function................................406
3.1.3		Selection Method and Genetic Operators..........410
3.2		Distributed-Population GA-Nuggets...............411
3.2.1		Individual Representation.......................411
3.2.2		Distributed Population..........................412
3.2.3		Fitness Function................................414
3.2.4		Selection Method and Genetic Operators..........415
4.	A Greedy Rule Induction Algorithm 
			for Dependence Modeling.................415
5.	Computational Results...................................416
5.1		The Data Sets Used in the Experiments...........416
5.2		Results and Discussion..........................417
5.2.1		Predictive Accuracy.............................419
5.2.2		Degree of Interestingness.......................422
5.2.3		Summary of the Results..........................426
6.	Conclusions.............................................428
References......................................................429
Authors’ Biographical Statements................................432



Chapter 13
DIVERSITY MECHANISMS IN PITT-STYLE 
EVOLUTIONARY CLASSIFIER SYSTEMS, by M. Kirley, 
H.A. Abbass, and R.I. McKay.....................................433
1.	Introduction............................................434
2. 	Background – Genetic Algorithms.........................436
3.	Evolutionary Classifier Systems.........................439
3.1 		The Michigan Style Classifier System............439
3.2 		The Pittsburgh Style Classifier System..........440
4. 	Diversity Mechanisms in Evolutionary Algorithms.........440
4.1 		Niching.........................................441
4.2 		Fitness Sharing.................................441
4.3 		Crowding........................................443
4.4  	Isolated Populations....................................444
5. 	Classifier Diversity....................................446
6.  	Experiments.............................................448
6.1 		Architecture of the Model.......................448
6.2  		Data Sets.......................................449
6.3 		Treatments......................................449
6.4 		Model Parameters................................449
7. 	Results.................................................450
8. 	Conclusions.............................................452
References......................................................454
Authors’ Biographical Statements................................457



Chapter 14
FUZZY LOGIC IN DISCOVERING ASSOCIATION 
RULES: AN OVERVIEW, by G. Chen, Q. Wei, and E.E. Kerre..........459
1.	Introduction............................................460
1.1		Notions of Associations.........................460
1.2		Fuzziness in Association Mining.................462
1.3   	Main Streams of Discovering Associations with 
			Fuzzy Logic.............................464
2.	Fuzzy Logic in Quantitative Association Rules...........465
2.1		Boolean Association Rules.......................465
2.2		Quantitative Association Rules..................466
2.3		Fuzzy Extensions of 
			Quantitative Association Rules..........468
3.	Fuzzy Association Rules with Fuzzy Taxonomies...........469
3.1		Generalized Association Rules...................470
3.2		Generalized Association Rules with 
			Fuzzy Taxonomies........................471
3.3		Fuzzy Association Rules with 
			Linguistic Hedges.......................473
4.	Other Fuzzy Extensions and Considerations...............474
4.1		Fuzzy Logic in Interestingness Measures.........474
4.2		Fuzzy Extensions of Dsupport / Dconfidence......476
4.3		Weighted Fuzzy Association Rules................478
5.	Fuzzy Implication Based Association Rules...............480
6.	Mining Functional Dependencies with Uncertainties.......482
6.1		Mining Fuzzy Functional Dependencies............482
6.2		Mining Functional Dependencies with Degrees.....483
7.	Fuzzy Logic in Pattern Associations.....................484
8.	Conclusions.............................................486
References......................................................487
Authors’ Biographical Statements................................493



Chapter 15
MINING HUMAN INTERPRETABLE KNOWLEDGE WITH 
FUZZY MODELING METHODS:  AN OVERVIEW, by T.W. Liao..............495
1.	Background..............................................496
2. 	Basic Concepts..........................................498
3. 	Generation of Fuzzy If-Then Rules.......................500
3.1 		Grid Partitioning...............................501
3.2 		Fuzzy Clustering................................506
3.3 		Genetic Algorithms..............................509
3.3.1 		Sequential Pittsburgh Approach..................510
3.3.2 		Sequential IRL+Pittsburgh Approach..............511
3.3.3 		Simultaneous Pittsburgh Approach................513
3.4 		Neural Networks.................................517
3.4.1 		Fuzzy Neural Networks...........................518
3.4.2 		Neural Fuzzy Systems............................519
3.4.2.1			Starting Empty..........................519
3.4.2.2			Starting Full...........................520
3.4.2.3			Starting with an Initial Rule Base......524
3.5 		Hybrids.........................................526
3.6 		Others..........................................526
3.6.1 		From Exemplar Numeric Data......................527
3.6.2 		From Exemplar Fuzzy Data........................527
4. 	Generation of Fuzzy Decision Trees......................527
4.1	Fuzzy Interpretation of Crisp Trees with 
			Discretized Intervals...................528
4.2 		Fuzzy ID3 Variants..............................529
4.2.1 	From Fuzzy Vector-Valued Examples.......................529
4.2.2 	From Nominal-Valued and Real-Valued Examples............530
5. 	Applications............................................532
5.1		Function Approximation Problems.................532
5.2 		Classification Problems.........................532
5.3 		Control Problems................................533
5.4 		Time Series Prediction Problems.................534
5.5 		Other Decision-Making Problems..................534
6. 	Discussion..............................................534
7. 	Conclusions.............................................537
References......................................................538
Appendix 1: 	A Summary of Grid Partitioning Methods 
			for Fuzzy Modeling......................545
Appendix 2: 	A Summary of Fuzzy Clustering Methods
			for Fuzzy Modeling......................546
Appendix 3:	A Summary of GA Methods for Fuzzy Modeling......547
Appendix 4:	A Summary of Neural Network Methods for 
			Fuzzy Modeling..........................548
Appendix 5:	A Summary of Fuzzy Decision Tree Methods for 
			Fuzzy Modeling..........................549
Author’s Biographical Statement.................................550



Chapter 16
DATA MINING FROM MULTIMEDIA PATIENT RECORDS,
by A.S. Elmaghraby, M.M. Kantardzic, and M.P. Wachowiak.........551
1.	Introduction............................................552
2.	The Data Mining Process.................................554
3. 	Clinical Patient Records:  A Data Mining Source.........556
3.1 		Distributed Data Sources........................560
3.2 		Patient Record Standards........................560
4. 	Data Preprocessing......................................563
5. 	Data Transformation.....................................567
5.1 		Types of Transformation.........................567
5.2	An Independent Component Analysis:  
			Example of an EMG/ECG Separation........571
5.3	Text Transformation and Representation: 
			A Rule-Based Approach...................573
5.4	Image Transformation and Representation: 
			A Rule-Based Approach...................575
6. 	Dimensionality Reduction................................579
6.1 		The Importance of Reduction.....................579
6.2		Data Fusion.....................................581
6.3 		Example 1: Multimodality Data Fusion............584
6.4 		Example 2: Data Fusion in Data Preprocessing....584
6.5 		Feature Selection Supported By Domain Experts...588
7. 	Conclusions.............................................589
References......................................................591
Authors’ Biographical Statements................................595



Chapter 17
LEARNING TO FIND CONTEXT BASED SPELLING 
ERRORS, by H. Al-Mubaid and K. Truemper.........................597
1. 	Introduction............................................598
2. 	Previous Work...........................................600
3. 	Details of Ltest........................................601
3.1 		Learning Step...................................602
3.2 	 	Testing Step....................................605
3.2.1 		Testing Regular Cases...........................605
3.2.2 		Testing Special Cases...........................606
3.2.3 		An Example......................................607
4. 	Implementation and Computational Results................607
5. 	Extensions..............................................614
6.  	Summary.................................................616
References......................................................616
Appendix A:  Construction of Substitutions......................619
Appendix B:  Construction of Training and History Texts.........620
Appendix C:  Structure of Characteristic Vectors................621
Appendix D:  Classification of Characteristic Vectors...........624
Authors’ Biographical Statements................................627



Chapter 18
INDUCTION AND INFERENCE WITH FUZZY RULES 
FOR TEXTUAL INFORMATION RETRIEVAL, by J. Chen, 
D.H. Kraft, M.J. Martin-Bautista, and M.-A. Vila................629
1.	Introduction............................................630
2.  	Preliminaries...........................................632
2.1 		The Vector Space Approach to 
			Information Retrieval...................632
2.2		Fuzzy Set Theory Basics.........................634
2.3		Fuzzy Hierarchical Clustering...................634
2.4 		Fuzzy Clustering by the 
			Fuzzy C-means Algorithm.................634
3. 	Fuzzy Clustering, Fuzzy Rule Discovery and
			Fuzzy Inference for Textual Retrieval...635
3.1		The Air Force EDC Data Set......................636
3.2 		Clustering Results..............................637
3.3		Fuzzy Rule Extraction from Fuzzy Clusters.......638
3.4	Application of Fuzzy Inference for 
			Improving Retrieval Performance.........639
4.  	Fuzzy Clustering, Fuzzy Rules and User Profiles for
			Web Retrieval...........................640
4.1 		Simple User Profile Construction................641
4.2		Application of Simple User Profiles in
				Web Information Retrieval.......642
4.2.1			Retrieving Interesting Web Documents....642
4.2.2 			User Profiles for Query Expansion by 
				Fuzzy Inference.................643
4.3		Experiments of Using User Profiles..............644
4.4		Extended Profiles and Fuzzy Clustering..........646
5. 	Conclusions.............................................646
Acknowledgements................................................647
References......................................................648
Authors’ Biographical Statements................................652




Chapter 19
STATISTICAL RULE INDUCTION IN THE PRESENCE OF 
PRIOR INFORMATION:  THE BAYESIAN RECORD 
LINKAGE PROBLEM, by D.H. Judson.................................655
1. 	Introduction............................................656
2. 	Why is Record Linkage Challenging?......................657
3. 	The Fellegi-Sunter Model of Record Linkage..............658
4. 	How Estimating Match Weights and Setting Thresholds is
			Equivalent to Specifying a 
			Decision Rule...........................660
5. 	Dealing with Stochastic Data:  
			A Logistic Regression Approach..........661
5.1 		Estimation of the Model.........................665
5.2 		Finding the Implied Threshold and 
			Interpreting Coefficients...............665
6.  	Dealing with Unlabeled Data in the 
			Logistic Regression Approach............668
7. 	Brief Description of the Simulated Data.................669
8. 	Brief Description of the CPS/NHIS to 
			Census Record  Linkage Project..........670
9.	Results of the Bayesian Latent Class Method with 
			Simulated Data..........................672
9.1 		Case 1: Uninformative...........................673
9.2		Case 2:	Informative.............................677
9.3	False Link and Non-Link Rates in the 
			Population of All Possible Pairs........678
10.	Results from the Bayesian Latent Class Method with
			Real Data...............................679
10.1 		Steps in Preparing the Data.....................679
10.2 		Priors and Constraints..........................681
10.3  		Results.........................................682
11. 	Conclusions and Future Research.........................690
References......................................................691
Author’s Biographical Statement.................................694




Chapter 20 
FUTURE TRENDS IN SOME DATA MINING AREAS, 
by X. Wang, P. Zhu, G. Felici, and E. Triantaphyllou............695
1.	Introduction............................................696
2.	Web Mining..............................................696
2.1		Web Content Mining..............................697
2.2 		Web Usage Mining................................698
2.3 		Web Structure Mining............................698
2.4 		Current Obstacles and Future Trends.............699
3. 	Text Mining.............................................700
3.1 		Text Mining and Information Access..............700
3.2 		A Simple Framework of Text Mining...............701
3.3 		Fields of Text Mining...........................701
3.4 		Current Obstacles and Future Trends.............702
4. 	Visual Data Mining......................................703
4.1 		Data Visualization..............................704
4.2 		Visualizing Data Mining Models..................705
4.3 		Current Obstacles and Future Trends.............705
5. 	Distributed Data Mining.................................706
5.1		The Basic Principle of DDM......................707
5.2 		Grid Computing..................................707
5.3 		Current Obstacles and Future Trends.............708
6. 	Summary.................................................708
References......................................................710
Authors’ Biographical Statements................................715




Subject Index...................................................717


Author Index....................................................727


Contributor Index...............................................739


About the Editors...............................................747




Send suggestions / comments to Dr. E. Triantaphyllou (trianta@lsu.edu).