Contents

1 Introduction 1
  1.1 The Background and Applications 1
  1.2 The Evolution and Development 5
  1.3 The Challenges and Issues 7
  1.4 Content and Organization of the Book 8

2 Maximal Prevalent Co-location Patterns 11
  2.1 Introduction 11
  2.2 Why the MCHT Method Is Proposed for Mining MPCPs 12
  2.3 Formal Problem Statement and Appropriate Mining Framework 17
    2.3.1 Co-Location Patterns 17
    2.3.2 Related Work 19
    2.3.3 Contributions and Novelties 21
  2.4 The Novel Mining Solution 22
    2.4.1 The Overall Mining Framework 22
    2.4.2 Bit-String-Based Maximal Clique Enumeration 23
    2.4.3 Constructing the Participating Instance Hash Table 28
    2.4.4 Calculating Participation Indexes and Filtering MPCPs 30
    2.4.5 The Analysis of Time and Space Complexities 32
  2.5 Experiments 33
    2.5.1 Data Sets 33
    2.5.2 Experimental Objectives 34
    2.5.3 Experimental Results and Analysis 34
  2.6 Chapter Summary 47

3 Maximal Sub-prevalent Co-location Patterns 49
  3.1 Introduction 49
  3.2 Basic Concepts and Properties 51
  3.3 A Prefix-Tree-Based Algorithm (PTBA) 54
    3.3.1 Basic Idea 54
    3.3.2 Algorithm 56
    3.3.3 Analysis and Pruning 57
  3.4 A Partition-Based Algorithm (PBA) 58
    3.4.1 Basic Idea 58
    3.4.2 Algorithm 62
    3.4.3 Analysis of Computational Complexity 64
  3.5 Comparison of PBA and PTBA 64
  3.6 Experimental Evaluation 66
    3.6.1 Synthetic Data Generation 67
    3.6.2 Comparison of Computational Complexity Factors 67
    3.6.3 Comparison of Expected Costs Involved in Identifying Candidates 69
    3.6.4 Comparison of Candidate Pruning Ratio 69
    3.6.5 Effects of the Parameter Clumpy 70
    3.6.6 Scalability Tests 70
    3.6.7 Evaluation with Real Data Sets 72
  3.7 Related Work 75
  3.8 Chapter Summary 77

4 SPI-Closed Prevalent Co-location Patterns 79
  4.1 Introduction 79
  4.2 Why SPI-Closed Prevalent Co-locations Improve Mining 81
  4.3 The Concept of SPI-Closed and Its Properties 83
    4.3.1 Classic Co-location Pattern Mining 83
    4.3.2 The Concept of SPI-Closed 85
    4.3.3 The Properties of SPI-Closed 86
  4.4 SPI-Closed Miner 89
    4.4.1 Preprocessing and Candidate Generation 89
    4.4.2 Computing Co-location Instances and Their PI Values 93
    4.4.3 The SPI-Closed Miner 93
  4.5 Qualitative Analysis of the SPI-Closed Miner 95
    4.5.1 Discovering the Correct SPI-Closed Co-location Set Ω 96
    4.5.2 The Running Time of SPI-Closed Miner 96
  4.6 Experimental Evaluation 96
    4.6.1 Experiments on Real-life Data Sets 97
    4.6.2 Experiments with Synthetic Data Sets 100
  4.7 Related Work 104
  4.8 Chapter Summary 105

5 Top-k Probabilistically Prevalent Co-location Patterns 107
  5.1 Introduction 107
  5.2 Why Mining Top-k Probabilistically Prevalent Co-location Patterns (Top-k PPCPs) 108
  5.3 Definitions 110
    5.3.1 Spatially Uncertain Data 110
    5.3.2 Prevalent Co-locations 112
    5.3.3 Prevalence Probability 113
    5.3.4 Min_PI-Prevalence Probabilities 114
    5.3.5 Top-k PPCPs 115
  5.4 A Framework of Mining Top-k PPCPs 115
    5.4.1 Basic Algorithm 115
    5.4.2 Analysis and Pruning of Algorithm 5.1 116
  5.5 Improved Computation of P(c, min_PI) 117
    5.5.1 0-1-Optimization 117
    5.5.2 The Matrix Method 118
    5.5.3 Polynomial Matrices 122
  5.6 Approximate Computation of P(c, min_PI) 125
  5.7 Experimental Evaluations 128
    5.7.1 Evaluation on Synthetic Data Sets 128
    5.7.2 Evaluation on Real Data Sets 134
  5.8 Chapter Summary 136

6 Non-redundant Prevalent Co-location Patterns 137
  6.1 Introduction 137
  6.2 Why We Need to Explore Non-redundant Prevalent Co-locations 139
  6.3 Problem Definition 141
    6.3.1 Semantic Distance 141
    6.3.2 δ-Covered 143
    6.3.3 The Problem Definition and Analysis 145
  6.4 The RRclosed Method 148
  6.5 The RRnull Method 150
    6.5.1 The Method 150
    6.5.2 The Algorithm 153
    6.5.3 The Correctness Analysis 155
    6.5.4 The Time Complexity Analysis 156
    6.5.5 Comparative Analysis 157
  6.6 Experimental Results 158
    6.6.1 On the Three Real Data Sets 158
    6.6.2 On the Synthetic Data Sets 161
  6.7 Related Work 165
  6.8 Chapter Summary 166

7 Dominant Spatial Co-location Patterns 167
  7.1 Introduction 167
  7.2 Why Dominant SCPs Are Useful to Mine 168
  7.3 Related Work 171
  7.4 Preliminaries and Problem Formulation 172
    7.4.1 Preliminaries 173
    7.4.2 Definitions 174
    7.4.3 Formal Problem Formulation 179
    7.4.4 Discussion of Progress 179
  7.5 Proposed Algorithm for Mining Dominant SCPs 180
    7.5.1 Basic Algorithm for Mining Dominant SCPs 180
    7.5.2 Pruning Strategies 182
    7.5.3 An Improved Algorithm 186
    7.5.4 Comparison of Complexity 187
  7.6 Experimental Study 188
    7.6.1 Data Sets 188
    7.6.2 Efficiency 189
    7.6.3 Effectiveness 193
    7.6.4 Real Applications 196
  7.7 Chapter Summary 198

8 High Utility Co-location Patterns 201
  8.1 Introduction 201
  8.2 Why We Need High Utility Co-location Pattern Mining 202
  8.3 Related Work 204
    8.3.1 Spatial Co-location Pattern Mining 204
    8.3.2 Utility Itemset Mining 205
  8.4 Problem Definition 206
  8.5 A Basic Mining Approach 208
  8.6 Extended Pruning Approach 208
    8.6.1 Related Definitions 209
    8.6.2 Extended Pruning Algorithm (EPA) 210
  8.7 Partial Pruning Approach 212
    8.7.1 Related Definitions 212
    8.7.2 Partial Pruning Algorithm (PPA) 217
  8.8 Experiments 218
    8.8.1 Differences Between Mining Prevalent SCPs and High Utility SCPs 218
    8.8.2 Effect of the Number of Total Instances n 219
    8.8.3 Effect of the Distance Threshold d 219
    8.8.4 Effect of the Pattern Utility Ratio Threshold ξ 219
    8.8.5 Effect of s in vss 219
    8.8.6 Comparing PPA and EPA with a Different Utility Ratio Threshold ξ 220
  8.9 Chapter Summary 221

9 High Utility Co-location Patterns with Instance Utility 223
  9.1 Introduction 223
  9.2 Why We Need Instance Utility with Spatial Data 224
  9.3 Related Work 226
  9.4 Related Concepts 228
  9.5 A Basic Algorithm 231
  9.6 Pruning Strategies 232
  9.7 Experimental Analysis 236
    9.7.1 Data Sets 236
    9.7.2 The Quality of Mining Results 236
    9.7.3 Evaluation of Pruning Strategies 237
  9.8 Chapter Summary 240

10 Interactively Post-mining User-Preferred Co-location Patterns with a Probabilistic Model 241
  10.1 Introduction 241
  10.2 Why We Need Interactive Probabilistic Post-mining 242
  10.3 Related Work 245
  10.4 Problem Statement 246
    10.4.1 Basic Concept 246
    10.4.2 Subjective Preference Measure 247
    10.4.3 Formal Problem Statement 247
  10.5 Probabilistic Model 248
    10.5.1 Basic Assumptions 248
    10.5.2 Probabilistic Model 248
    10.5.3 Discussion 251
  10.6 The Complete Algorithm 252
    10.6.1 The Algorithm 252
    10.6.2 Two Optimization Strategies 253
    10.6.3 The Time Complexity Analysis 254
  10.7 Experimental Results 255
    10.7.1 Experimental Setting 255
    10.7.2 The Simulator 255
    10.7.3 Accuracy Evaluation on Real Data Sets 257
    10.7.4 Accuracy Evaluation on Synthetic Data Sets 262
    10.7.5 Sample Co-location Selection 263
  10.8 Chapter Summary 264

11 Vector-Degree: A General Similarity Measure for Co-location Patterns 265
  11.1 Introduction 265
  11.2 Why We Measure the Similarity Between SCPs 266
  11.3 Preliminaries 268
    11.3.1 Spatial Co-location Pattern (SCP) 268
    11.3.2 A Toy Example 269
    11.3.3 Problem Statement 270
  11.4 The Method 270
    11.4.1 Maximal Cliques Enumeration Algorithm 270
    11.4.2 A Representation Model of SCPs 274
    11.4.3 Vector-Degree: the Similarity Measure of SCPs 278
    11.4.4 Grouping SCPs Based on Vector-Degree 279
  11.5 Experimental Evaluations 279
    11.5.1 Data Sets 280
    11.5.2 Results 280
  11.6 Chapter Summary 284

References 285