Book Information
Mining of Massive Datasets (PDF e-book)
- Authors: Anand Rajaraman; Jeffrey D. Ullman
- Publisher: Cambridge University Press
- ISBN: 1107015357
- Publication year: 2012
- Stated page count: 316
- File size: 39 MB
- File page count: 328
Table of Contents
1 Data Mining 1
1.1 What is Data Mining? 1
1.2 Statistical Limits on Data Mining 4
1.3 Things Useful to Know 7
1.4 Outline of the Book 15
1.5 Summary of Chapter 1 16
1.6 References for Chapter 1 17
2 Large-Scale File Systems and Map-Reduce 18
2.1 Distributed File Systems 18
2.2 Map-Reduce 21
2.3 Algorithms Using Map-Reduce 26
2.4 Extensions to Map-Reduce 37
2.5 Efficiency of Cluster-Computing Algorithms 42
2.6 Summary of Chapter 2 49
2.7 References for Chapter 2 51
3 Finding Similar Items 53
3.1 Applications of Near-Neighbor Search 53
3.2 Shingling of Documents 57
3.3 Similarity-Preserving Summaries of Sets 60
3.4 Locality-Sensitive Hashing for Documents 67
3.5 Distance Measures 71
3.6 The Theory of Locality-Sensitive Functions 77
3.7 LSH Families for Other Distance Measures 83
3.8 Applications of Locality-Sensitive Hashing 88
3.9 Methods for High Degrees of Similarity 96
3.10 Summary of Chapter 3 104
3.11 References for Chapter 3 106
4 Mining Data Streams 108
4.1 The Stream Data Model 108
4.2 Sampling Data in a Stream 112
4.3 Filtering Streams 115
4.4 Counting Distinct Elements in a Stream 118
4.5 Estimating Moments 122
4.6 Counting Ones in a Window 127
4.7 Decaying Windows 133
4.8 Summary of Chapter 4 136
4.9 References for Chapter 4 137
5 Link Analysis 139
5.1 PageRank 139
5.2 Efficient Computation of PageRank 153
5.3 Topic-Sensitive PageRank 159
5.4 Link Spam 163
5.5 Hubs and Authorities 167
5.6 Summary of Chapter 5 172
5.7 References for Chapter 5 175
6 Frequent Itemsets 176
6.1 The Market-Basket Model 176
6.2 Market Baskets and the A-Priori Algorithm 183
6.3 Handling Larger Datasets in Main Memory 192
6.4 Limited-Pass Algorithms 199
6.5 Counting Frequent Items in a Stream 205
6.6 Summary of Chapter 6 209
6.7 References for Chapter 6 211
7 Clustering 213
7.1 Introduction to Clustering Techniques 213
7.2 Hierarchical Clustering 217
7.3 K-means Algorithms 226
7.4 The CURE Algorithm 234
7.5 Clustering in Non-Euclidean Spaces 237
7.6 Clustering for Streams and Parallelism 241
7.7 Summary of Chapter 7 247
7.8 References for Chapter 7 250
8 Advertising on the Web 252
8.1 Issues in On-Line Advertising 252
8.2 On-Line Algorithms 255
8.3 The Matching Problem 258
8.4 The Adwords Problem 261
8.5 Adwords Implementation 270
8.6 Summary of Chapter 8 273
8.7 References for Chapter 8 275
9 Recommendation Systems 277
9.1 A Model for Recommendation Systems 277
9.2 Content-Based Recommendations 281
9.3 Collaborative Filtering 291
9.4 Dimensionality Reduction 297
9.5 The NetFlix Challenge 305
9.6 Summary of Chapter 9 306
9.7 References for Chapter 9 308
Index 310
Featured Recommendations
- Northanger Abbey (1818)
- Emma (1815)
- Sense and Sensibility (1811)
- Mansfield Park (1814)
- HUMANITIES THE EVOLUTION OF VALUES
- Pride and Prejudice (1812)
- English
- The Penguin Dictionary of Economics (企鹅经济学词典 经济学)
- Adult Friendship: Hayao Kawai on Friendship (大人的友情 河合隼雄谈友谊)
- Computing Concepts
- Advanced Compiler Design and Implementation
- Overview of Chinese Commercial Law (中国商事法律要览)
- Introduction to polymers
- CONFLICT OF LAWS IN THE WESTERN SOCIALIST AND DEVELOPING COUNTRIES
- Measurement and Research Methods in International Marketing