About the Book

Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition) | PDF | EPUB | MOBI | Kindle e-book download via Baidu Cloud

Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition)
  • Authors: Witten (New Zealand) and Frank (New Zealand)
  • Publisher: Beijing: China Machine Press
  • ISBN: 9787111374176
  • Publication year: 2012
  • Listed page count: 629 pages
  • File size: 158 MB
  • File page count: 655 pages
  • Subject heading: Data collection (English)

PDF Download


Click here for the online PDF e-book download of this title [recommended: cloud extraction, quick and convenient]. Direct PDF download; works on both mobile and desktop.
Torrent download [BT, faster]. Tip: please use the BT download client FDM (see its download page). Direct-link download [convenient but slower]. [Read this book online] [Get the extraction code online]

Download Notes

Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition), PDF e-book download

The downloaded file is a RAR archive; use extraction software to unpack it and obtain the PDF.

We recommend downloading with Free Download Manager (FDM), a free, ad-free, cross-platform BT client. All resources on this site are packaged as BT torrents, so a dedicated BT client is required, such as BitComet, qBittorrent, or uTorrent. Thunder (Xunlei) is currently not recommended because this site's resources are not popular; once a resource becomes popular, Thunder will work for it as well.

(The file page count should exceed the listed page count, except for multi-volume e-books.)

Note: all archives on this site require an extraction code. Click to download the extraction tool.

Table of Contents

PART Ⅰ INTRODUCTION TO DATA MINING … 3
CHAPTER 1 What's It All About? … 3
1.1 Data Mining and Machine Learning … 3
Describing Structural Patterns … 5
Machine Learning … 7
Data Mining … 8
1.2 Simple Examples: The Weather Problem and Others … 9
The Weather Problem … 9
Contact Lenses: An Idealized Problem … 12
Irises: A Classic Numeric Dataset … 13
CPU Performance: Introducing Numeric Prediction … 15
Labor Negotiations: A More Realistic Example … 15
Soybean Classification: A Classic Machine Learning Success … 19
1.3 Fielded Applications … 21
Web Mining … 21
Decisions Involving Judgment … 22
Screening Images … 23
Load Forecasting … 24
Diagnosis … 25
Marketing and Sales … 26
Other Applications … 27
1.4 Machine Learning and Statistics … 28
1.5 Generalization as Search … 29
1.6 Data Mining and Ethics … 33
Reidentification … 33
Using Personal Information … 34
Wider Issues … 35
1.7 Further Reading … 36

CHAPTER 2 Input: Concepts, Instances, and Attributes … 39
2.1 What's a Concept? … 40
2.2 What's in an Example? … 42
Relations … 43
Other Example Types … 46
2.3 What's in an Attribute? … 49
2.4 Preparing the Input … 51
Gathering the Data Together … 51
ARFF Format … 52
Sparse Data … 56
Attribute Types … 56
Missing Values … 58
Inaccurate Values … 59
Getting to Know Your Data … 60
2.5 Further Reading … 60

CHAPTER 3 Output: Knowledge Representation … 61
3.1 Tables … 61
3.2 Linear Models … 62
3.3 Trees … 64
3.4 Rules … 67
Classification Rules … 69
Association Rules … 72
Rules with Exceptions … 73
More Expressive Rules … 75
3.5 Instance-Based Representation … 78
3.6 Clusters … 81
3.7 Further Reading … 83

CHAPTER 4 Algorithms: The Basic Methods … 85
4.1 Inferring Rudimentary Rules … 86
Missing Values and Numeric Attributes … 87
Discussion … 89
4.2 Statistical Modeling … 90
Missing Values and Numeric Attributes … 94
Naïve Bayes for Document Classification … 97
Discussion … 99
4.3 Divide-and-Conquer: Constructing Decision Trees … 99
Calculating Information … 103
Highly Branching Attributes … 105
Discussion … 107
4.4 Covering Algorithms: Constructing Rules … 108
Rules versus Trees … 109
A Simple Covering Algorithm … 110
Rules versus Decision Lists … 115
4.5 Mining Association Rules … 116
Item Sets … 116
Association Rules … 119
Generating Rules Efficiently … 122
Discussion … 123
4.6 Linear Models … 124
Numeric Prediction: Linear Regression … 124
Linear Classification: Logistic Regression … 125
Linear Classification Using the Perceptron … 127
Linear Classification Using Winnow … 129
4.7 Instance-Based Learning … 131
Distance Function … 131
Finding Nearest Neighbors Efficiently … 132
Discussion … 137
4.8 Clustering … 138
Iterative Distance-Based Clustering … 139
Faster Distance Calculations … 139
Discussion … 141
4.9 Multi-Instance Learning … 141
Aggregating the Input … 142
Aggregating the Output … 142
Discussion … 142
4.10 Further Reading … 143
4.11 Weka Implementations … 145

CHAPTER 5 Credibility: Evaluating What's Been Learned … 147
5.1 Training and Testing … 148
5.2 Predicting Performance … 150
5.3 Cross-Validation … 152
5.4 Other Estimates … 154
Leave-One-Out Cross-Validation … 154
The Bootstrap … 155
5.5 Comparing Data Mining Schemes … 156
5.6 Predicting Probabilities … 159
Quadratic Loss Function … 160
Informational Loss Function … 161
Discussion … 162
5.7 Counting the Cost … 163
Cost-Sensitive Classification … 166
Cost-Sensitive Learning … 167
Lift Charts … 168
ROC Curves … 172
Recall-Precision Curves … 174
Discussion … 175
Cost Curves … 177
5.8 Evaluating Numeric Prediction … 180
5.9 Minimum Description Length Principle … 183
5.10 Applying the MDL Principle to Clustering … 186
5.11 Further Reading … 187

PART Ⅱ ADVANCED DATA MINING … 191
CHAPTER 6 Implementations: Real Machine Learning Schemes … 191
6.1 Decision Trees … 192
Numeric Attributes … 193
Missing Values … 194
Pruning … 195
Estimating Error Rates … 197
Complexity of Decision Tree Induction … 199
From Trees to Rules … 200
C4.5: Choices and Options … 201
Cost-Complexity Pruning … 202
Discussion … 202
6.2 Classification Rules … 203
Criteria for Choosing Tests … 203
Missing Values, Numeric Attributes … 204
Generating Good Rules … 205
Using Global Optimization … 208
Obtaining Rules from Partial Decision Trees … 208
Rules with Exceptions … 212
Discussion … 215
6.3 Association Rules … 216
Building a Frequent-Pattern Tree … 216
Finding Large Item Sets … 219
Discussion … 222
6.4 Extending Linear Models … 223
Maximum-Margin Hyperplane … 224
Nonlinear Class Boundaries … 226
Support Vector Regression … 227
Kernel Ridge Regression … 229
Kernel Perceptron … 231
Multilayer Perceptrons … 232
Radial Basis Function Networks … 241
Stochastic Gradient Descent … 242
Discussion … 243
6.5 Instance-Based Learning … 244
Reducing the Number of Exemplars … 245
Pruning Noisy Exemplars … 245
Weighting Attributes … 246
Generalizing Exemplars … 247
Distance Functions for Generalized Exemplars … 248
Generalized Distance Functions … 249
Discussion … 250
6.6 Numeric Prediction with Local Linear Models … 251
Model Trees … 252
Building the Tree … 253
Pruning the Tree … 253
Nominal Attributes … 254
Missing Values … 254
Pseudocode for Model Tree Induction … 255
Rules from Model Trees … 259
Locally Weighted Linear Regression … 259
Discussion … 261
6.7 Bayesian Networks … 261
Making Predictions … 262
Learning Bayesian Networks … 266
Specific Algorithms … 268
Data Structures for Fast Learning … 270
Discussion … 273
6.8 Clustering … 273
Choosing the Number of Clusters … 274
Hierarchical Clustering … 274
Example of Hierarchical Clustering … 276
Incremental Clustering … 279
Category Utility … 284
Probability-Based Clustering … 285
The EM Algorithm … 287
Extending the Mixture Model … 289
Bayesian Clustering … 290
Discussion … 292
6.9 Semisupervised Learning … 294
Clustering for Classification … 294
Co-training … 296
EM and Co-training … 297
Discussion … 297
6.10 Multi-Instance Learning … 298
Converting to Single-Instance Learning … 298
Upgrading Learning Algorithms … 300
Dedicated Multi-Instance Methods … 301
Discussion … 302
6.11 Weka Implementations … 303

CHAPTER 7 Data Transformations … 305
7.1 Attribute Selection … 307
Scheme-Independent Selection … 308
Searching the Attribute Space … 311
Scheme-Specific Selection … 312
7.2 Discretizing Numeric Attributes … 314
Unsupervised Discretization … 316
Entropy-Based Discretization … 316
Other Discretization Methods … 320
Entropy-Based versus Error-Based Discretization … 320
Converting Discrete Attributes to Numeric Attributes … 322
7.3 Projections … 322
Principal Components Analysis … 324
Random Projections … 326
Partial Least-Squares Regression … 326
Text to Attribute Vectors … 328
Time Series … 330
7.4 Sampling … 330
Reservoir Sampling … 330
7.5 Cleansing … 331
Improving Decision Trees … 332
Robust Regression … 333
Detecting Anomalies … 334
One-Class Learning … 335
7.6 Transforming Multiple Classes to Binary Ones … 338
Simple Methods … 338
Error-Correcting Output Codes … 339
Ensembles of Nested Dichotomies … 341
7.7 Calibrating Class Probabilities … 343
7.8 Further Reading … 346
7.9 Weka Implementations … 348

CHAPTER 8 Ensemble Learning … 351
8.1 Combining Multiple Models … 351
8.2 Bagging … 352
Bias-Variance Decomposition … 353
Bagging with Costs … 355
8.3 Randomization … 356
Randomization versus Bagging … 357
Rotation Forests … 357
8.4 Boosting … 358
AdaBoost … 358
The Power of Boosting … 361
8.5 Additive Regression … 362
Numeric Prediction … 362
Additive Logistic Regression … 364
8.6 Interpretable Ensembles … 365
Option Trees … 365
Logistic Model Trees … 368
8.7 Stacking … 369
8.8 Further Reading … 371
8.9 Weka Implementations … 372

CHAPTER 9 Moving On: Applications and Beyond … 375
9.1 Applying Data Mining … 375
9.2 Learning from Massive Datasets … 378
9.3 Data Stream Learning … 380
9.4 Incorporating Domain Knowledge … 384
9.5 Text Mining … 386
9.6 Web Mining … 389
9.7 Adversarial Situations … 393
9.8 Ubiquitous Data Mining … 395
9.9 Further Reading … 397

PART Ⅲ THE WEKA DATA MINING WORKBENCH … 403
CHAPTER 10 Introduction to Weka … 403
10.1 What's in Weka? … 403
10.2 How Do You Use It? … 404
10.3 What Else Can You Do? … 405
10.4 How Do You Get It? … 406
CHAPTER 11 The Explorer … 407
11.1 Getting Started … 407
Preparing the Data … 407
Loading the Data into the Explorer … 408
Building a Decision Tree … 410
Examining the Output … 411
Doing It Again … 413
Working with Models … 414
When Things Go Wrong … 415
11.2 Exploring the Explorer … 416
Loading and Filtering Files … 416
Training and Testing Learning Schemes … 422
Do It Yourself: The User Classifier … 424
Using a Metalearner … 427
Clustering and Association Rules … 429
Attribute Selection … 430
Visualization … 430
11.3 Filtering Algorithms … 432
Unsupervised Attribute Filters … 432
Unsupervised Instance Filters … 441
Supervised Filters … 443
11.4 Learning Algorithms … 445
Bayesian Classifiers … 451
Trees … 454
Rules … 457
Functions … 459
Neural Networks … 469
Lazy Classifiers … 472
Multi-Instance Classifiers … 472
Miscellaneous Classifiers … 474
11.5 Metalearning Algorithms … 474
Bagging and Randomization … 474
Boosting … 476
Combining Classifiers … 477
Cost-Sensitive Learning … 477
Optimizing Performance … 478
Retargeting Classifiers for Different Tasks … 479
11.6 Clustering Algorithms … 480
11.7 Association-Rule Learners … 485
11.8 Attribute Selection … 487
Attribute Subset Evaluators … 488
Single-Attribute Evaluators … 490
Search Methods … 492

CHAPTER 12 The Knowledge Flow Interface … 495
12.1 Getting Started … 495
12.2 Components … 498
12.3 Configuring and Connecting the Components … 500
12.4 Incremental Learning … 502
CHAPTER 13 The Experimenter … 505
13.1 Getting Started … 505
Running an Experiment … 506
Analyzing the Results … 509
13.2 Simple Setup … 510
13.3 Advanced Setup … 511
13.4 The Analyze Panel … 512
13.5 Distributing Processing over Several Machines … 515
CHAPTER 14 The Command-Line Interface … 519
14.1 Getting Started … 519
14.2 The Structure of Weka … 519
Classes, Instances, and Packages … 520
The weka.core Package … 520
The weka.classifiers Package … 523
Other Packages … 525
Javadoc Indexes … 525
14.3 Command-Line Options … 526
Generic Options … 526
Scheme-Specific Options … 529

CHAPTER 15 Embedded Machine Learning … 531
15.1 A Simple Data Mining Application … 531
MessageClassifier() … 536
updateData() … 536
classifyMessage() … 537
CHAPTER 16 Writing New Learning Schemes … 539
16.1 An Example Classifier … 539
buildClassifier() … 540
makeTree() … 540
computeInfoGain() … 549
classifyInstance() … 549
toSource() … 550
main() … 553
16.2 Conventions for Implementing Classifiers … 555
Capabilities … 555

CHAPTER 17 Tutorial Exercises for the Weka Explorer … 559
17.1 Introduction to the Explorer Interface … 559
Loading a Dataset … 559
The Dataset Editor … 560
Applying a Filter … 561
The Visualize Panel … 562
The Classify Panel … 562
17.2 Nearest-Neighbor Learning and Decision Trees … 566
The Glass Dataset … 566
Attribute Selection … 567
Class Noise and Nearest-Neighbor Learning … 568
Varying the Amount of Training Data … 569
Interactive Decision Tree Construction … 569
17.3 Classification Boundaries … 571
Visualizing 1R … 571
Visualizing Nearest-Neighbor Learning … 572
Visualizing Naïve Bayes … 573
Visualizing Decision Trees and Rule Sets … 573
Messing with the Data … 574
17.4 Preprocessing and Parameter Tuning … 574
Discretization … 574
More on Discretization … 575
Automatic Attribute Selection … 575
More on Automatic Attribute Selection … 576
Automatic Parameter Tuning … 577
17.5 Document Classification … 578
Data with String Attributes … 579
Classifying Actual Documents … 580
Exploring the StringToWordVector Filter … 581
17.6 Mining Association Rules … 582
Association-Rule Mining … 582
Mining a Real-World Dataset … 584
Market Basket Analysis … 584

REFERENCES … 587
INDEX … 607