Book Introduction
Database System Implementation (English Edition)
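- Authors: Hector Garcia-Molina, Jennifer Widom, Jeffrey D. Ullman (USA)
- Publisher: Beijing: China Machine Press
- ISBN: 9787111288602
- Publication date: 2010
- Listed page count: 1184 pages
- File size: 28 MB
- File page count: 666 pages
- Subject: Database systems, English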
Table of Contents
1 The Worlds of Database Systems1
1.1 The Evolution of Database Systems1
1.1.1 Early Database Management Systems2
1.1.2 Relational Database Systems3
1.1.3 Smaller and Smaller Systems3
1.1.4 Bigger and Bigger Systems4
1.1.5 Information Integration4
1.2 Overview of a Database Management System5
1.2.1 Data-Definition Language Commands5
1.2.2 Overview of Query Processing5
1.2.3 Storage and Buffer Management7
1.2.4 Transaction Processing8
1.2.5 The Query Processor9
1.3 Outline of Database-System Studies10
1.4 References for Chapter 112
Ⅰ Relational Database Modeling15
2 The Relational Model of Data17
2.1 An Overview of Data Models17
2.1.1 What is a Data Model?17
2.1.2 Important Data Models18
2.1.3 The Relational Model in Brief18
2.1.4 The Semistructured Model in Brief19
2.1.5 Other Data Models20
2.1.6 Comparison of Modeling Approaches21
2.2 Basics of the Relational Model21
2.2.1 Attributes22
2.2.2 Schemas22
2.2.3 Tuples22
2.2.4 Domains23
2.2.5 Equivalent Representations of a Relation23
2.2.6 Relation Instances24
2.2.7 Keys of Relations25
2.2.8 An Example Database Schema26
2.2.9 Exercises for Section 2.228
2.3 Defining a Relation Schema in SQL29
2.3.1 Relations in SQL29
2.3.2 Data Types30
2.3.3 Simple Table Declarations31
2.3.4 Modifying Relation Schemas33
2.3.5 Default Values34
2.3.6 Declaring Keys34
2.3.7 Exercises for Section 2.336
2.4 An Algebraic Query Language38
2.4.1 Why Do We Need a Special Query Language?38
2.4.2 What is an Algebra?38
2.4.3 Overview of Relational Algebra39
2.4.4 Set Operations on Relations39
2.4.5 Projection41
2.4.6 Selection42
2.4.7 Cartesian Product43
2.4.8 Natural Joins43
2.4.9 Theta-Joins45
2.4.10 Combining Operations to Form Queries47
2.4.11 Naming and Renaming49
2.4.12 Relationships Among Operations50
2.4.13 A Linear Notation for Algebraic Expressions51
2.4.14 Exercises for Section 2.452
2.5 Constraints on Relations58
2.5.1 Relational Algebra as a Constraint Language59
2.5.2 Referential Integrity Constraints59
2.5.3 Key Constraints60
2.5.4 Additional Constraint Examples61
2.5.5 Exercises for Section 2.562
2.6 Summary of Chapter 263
2.7 References for Chapter 265
3 Design Theory for Relational Databases67
3.1 Functional Dependencies67
3.1.1 Definition of Functional Dependency68
3.1.2 Keys of Relations70
3.1.3 Superkeys71
3.1.4 Exercises for Section 3.171
3.2 Rules About Functional Dependencies72
3.2.1 Reasoning About Functional Dependencies72
3.2.2 The Splitting/Combining Rule73
3.2.3 Trivial Functional Dependencies74
3.2.4 Computing the Closure of Attributes75
3.2.5 Why the Closure Algorithm Works77
3.2.6 The Transitive Rule79
3.2.7 Closing Sets of Functional Dependencies80
3.2.8 Projecting Functional Dependencies81
3.2.9 Exercises for Section 3.283
3.3 Design of Relational Database Schemas85
3.3.1 Anomalies86
3.3.2 Decomposing Relations86
3.3.3 Boyce-Codd Normal Form88
3.3.4 Decomposition into BCNF89
3.3.5 Exercises for Section 3.392
3.4 Decomposition: The Good, Bad, and Ugly93
3.4.1 Recovering Information from a Decomposition94
3.4.2 The Chase Test for Lossless Join96
3.4.3 Why the Chase Works99
3.4.4 Dependency Preservation100
3.4.5 Exercises for Section 3.4102
3.5 Third Normal Form102
3.5.1 Definition of Third Normal Form102
3.5.2 The Synthesis Algorithm for 3NF Schemas103
3.5.3 Why the 3NF Synthesis Algorithm Works104
3.5.4 Exercises for Section 3.5105
3.6 Multivalued Dependencies105
3.6.1 Attribute Independence and Its Consequent Redundancy106
3.6.2 Definition of Multivalued Dependencies107
3.6.3 Reasoning About Multivalued Dependencies108
3.6.4 Fourth Normal Form110
3.6.5 Decomposition into Fourth Normal Form111
3.6.6 Relationships Among Normal Forms113
3.6.7 Exercises for Section 3.6113
3.7 An Algorithm for Discovering MVD's115
3.7.1 The Closure and the Chase115
3.7.2 Extending the Chase to MVD's116
3.7.3 Why the Chase Works for MVD's118
3.7.4 Projecting MVD's119
3.7.5 Exercises for Section 3.7120
3.8 Summary of Chapter 3121
3.9 References for Chapter 3122
4 High-Level Database Models125
4.1 The Entity/Relationship Model126
4.1.1 Entity Sets126
4.1.2 Attributes126
4.1.3 Relationships127
4.1.4 Entity-Relationship Diagrams127
4.1.5 Instances of an E/R Diagram128
4.1.6 Multiplicity of Binary E/R Relationships129
4.1.7 Multiway Relationships130
4.1.8 Roles in Relationships131
4.1.9 Attributes on Relationships134
4.1.10 Converting Multiway Relationships to Binary134
4.1.11 Subclasses in the E/R Model135
4.1.12 Exercises for Section 4.1138
4.2 Design Principles140
4.2.1 Faithfulness140
4.2.2 Avoiding Redundancy141
4.2.3 Simplicity Counts142
4.2.4 Choosing the Right Relationships142
4.2.5 Picking the Right Kind of Element144
4.2.6 Exercises for Section 4.2145
4.3 Constraints in the E/R Model148
4.3.1 Keys in the E/R Model148
4.3.2 Representing Keys in the E/R Model149
4.3.3 Referential Integrity150
4.3.4 Degree Constraints151
4.3.5 Exercises for Section 4.3151
4.4 Weak Entity Sets152
4.4.1 Causes of Weak Entity Sets152
4.4.2 Requirements for Weak Entity Sets153
4.4.3 Weak Entity Set Notation155
4.4.4 Exercises for Section 4.4156
4.5 From E/R Diagrams to Relational Designs157
4.5.1 From Entity Sets to Relations157
4.5.2 From E/R Relationships to Relations158
4.5.3 Combining Relations160
4.5.4 Handling Weak Entity Sets161
4.5.5 Exercises for Section 4.5163
4.6 Converting Subclass Structures to Relations165
4.6.1 E/R-Style Conversion166
4.6.2 An Object-Oriented Approach167
4.6.3 Using Null Values to Combine Relations168
4.6.4 Comparison of Approaches169
4.6.5 Exercises for Section 4.6171
4.7 Unified Modeling Language171
4.7.1 UML Classes172
4.7.2 Keys for UML classes173
4.7.3 Associations173
4.7.4 Self-Associations175
4.7.5 Association Classes175
4.7.6 Subclasses in UML176
4.7.7 Aggregations and Compositions177
4.7.8 Exercises for Section 4.7179
4.8 From UML Diagrams to Relations179
4.8.1 UML-to-Relations Basics179
4.8.2 From UML Subclasses to Relations180
4.8.3 From Aggregations and Compositions to Relations181
4.8.4 The UML Analog of Weak Entity Sets181
4.8.5 Exercises for Section 4.8183
4.9 Object Definition Language183
4.9.1 Class Declarations184
4.9.2 Attributes in ODL184
4.9.3 Relationships in ODL185
4.9.4 Inverse Relationships186
4.9.5 Multiplicity of Relationships186
4.9.6 Types in ODL188
4.9.7 Subclasses in ODL190
4.9.8 Declaring Keys in ODL191
4.9.9 Exercises for Section 4.9192
4.10 From ODL Designs to Relational Designs193
4.10.1 From ODL Classes to Relations193
4.10.2 Complex Attributes in Classes194
4.10.3 Representing Set-Valued Attributes195
4.10.4 Representing Other Type Constructors196
4.10.5 Representing ODL Relationships198
4.10.6 Exercises for Section 4.10198
4.11 Summary of Chapter 4200
4.12 References for Chapter 4202
Ⅱ Relational Database Programming203
5 Algebraic and Logical Query Languages205
5.1 Relational Operations on Bags205
5.1.1 Why Bags?206
5.1.2 Union, Intersection, and Difference of Bags207
5.1.3 Projection of Bags208
5.1.4 Selection on Bags209
5.1.5 Product of Bags210
5.1.6 Joins of Bags210
5.1.7 Exercises for Section 5.1212
5.2 Extended Operators of Relational Algebra213
5.2.1 Duplicate Elimination214
5.2.2 Aggregation Operators214
5.2.3 Grouping215
5.2.4 The Grouping Operator216
5.2.5 Extending the Projection Operator217
5.2.6 The Sorting Operator219
5.2.7 Outerjoins219
5.2.8 Exercises for Section 5.2222
5.3 A Logic for Relations222
5.3.1 Predicates and Atoms223
5.3.2 Arithmetic Atoms223
5.3.3 Datalog Rules and Queries224
5.3.4 Meaning of Datalog Rules225
5.3.5 Extensional and Intensional Predicates228
5.3.6 Datalog Rules Applied to Bags228
5.3.7 Exercises for Section 5.3230
5.4 Relational Algebra and Datalog230
5.4.1 Boolean Operations231
5.4.2 Projection232
5.4.3 Selection232
5.4.4 Product235
5.4.5 Joins235
5.4.6 Simulating Multiple Operations with Datalog236
5.4.7 Comparison Between Datalog and Relational Algebra238
5.4.8 Exercises for Section 5.4238
5.5 Summary of Chapter 5240
5.6 References for Chapter 5241
6 The Database Language SQL243
6.1 Simple Queries in SQL244
6.1.1 Projection in SQL246
6.1.2 Selection in SQL248
6.1.3 Comparison of Strings250
6.1.4 Pattern Matching in SQL250
6.1.5 Dates and Times251
6.1.6 Null Values and Comparisons Involving NULL252
6.1.7 The Truth-Value UNKNOWN253
6.1.8 Ordering the Output255
6.1.9 Exercises for Section 6.1256
6.2 Queries Involving More Than One Relation258
6.2.1 Products and Joins in SQL259
6.2.2 Disambiguating Attributes260
6.2.3 Tuple Variables261
6.2.4 Interpreting Multirelation Queries262
6.2.5 Union, Intersection, and Difference of Queries265
6.2.6 Exercises for Section 6.2267
6.3 Subqueries268
6.3.1 Subqueries that Produce Scalar Values269
6.3.2 Conditions Involving Relations270
6.3.3 Conditions Involving Tuples271
6.3.4 Correlated Subqueries273
6.3.5 Subqueries in FROM Clauses274
6.3.6 SQL Join Expressions275
6.3.7 Natural Joins276
6.3.8 Outerjoins277
6.3.9 Exercises for Section 6.3279
6.4 Full-Relation Operations281
6.4.1 Eliminating Duplicates281
6.4.2 Duplicates in Unions, Intersections, and Differences282
6.4.3 Grouping and Aggregation in SQL283
6.4.4 Aggregation Operators284
6.4.5 Grouping285
6.4.6 Grouping, Aggregation, and Nulls287
6.4.7 HAVING Clauses288
6.4.8 Exercises for Section 6.4289
6.5 Database Modifications291
6.5.1 Insertion291
6.5.2 Deletion292
6.5.3 Updates294
6.5.4 Exercises for Section 6.5295
6.6 Transactions in SQL296
6.6.1 Serializability296
6.6.2 Atomicity298
6.6.3 Transactions299
6.6.4 Read-Only Transactions300
6.6.5 Dirty Reads302
6.6.6 Other Isolation Levels304
6.6.7 Exercises for Section 6.6306
6.7 Summary of Chapter 6307
6.8 References for Chapter 6308
7 Constraints and Triggers311
7.1 Keys and Foreign Keys311
7.1.1 Declaring Foreign-Key Constraints312
7.1.2 Maintaining Referential Integrity313
7.1.3 Deferred Checking of Constraints315
7.1.4 Exercises for Section 7.1318
7.2 Constraints on Attributes and Tuples319
7.2.1 Not-Null Constraints319
7.2.2 Attribute-Based CHECK Constraints320
7.2.3 Tuple-Based CHECK Constraints321
7.2.4 Comparison of Tuple- and Attribute-Based Constraints323
7.2.5 Exercises for Section 7.2323
7.3 Modification of Constraints325
7.3.1 Giving Names to Constraints325
7.3.2 Altering Constraints on Tables326
7.3.3 Exercises for Section 7.3327
7.4 Assertions328
7.4.1 Creating Assertions328
7.4.2 Using Assertions329
7.4.3 Exercises for Section 7.4330
7.5 Triggers332
7.5.1 Triggers in SQL332
7.5.2 The Options for Trigger Design334
7.5.3 Exercises for Section 7.5337
7.6 Summary of Chapter 7339
7.7 References for Chapter 7339
8 Views and Indexes341
8.1 Virtual Views341
8.1.1 Declaring Views341
8.1.2 Querying Views343
8.1.3 Renaming Attributes343
8.1.4 Exercises for Section 8.1344
8.2 Modifying Views344
8.2.1 View Removal345
8.2.2 Updatable Views345
8.2.3 Instead-Of Triggers on Views347
8.2.4 Exercises for Section 8.2349
8.3 Indexes in SQL350
8.3.1 Motivation for Indexes350
8.3.2 Declaring Indexes351
8.3.3 Exercises for Section 8.3352
8.4 Selection of Indexes352
8.4.1 A Simple Cost Model352
8.4.2 Some Useful Indexes353
8.4.3 Calculating the Best Indexes to Create355
8.4.4 Automatic Selection of Indexes to Create357
8.4.5 Exercises for Section 8.4359
8.5 Materialized Views359
8.5.1 Maintaining a Materialized View360
8.5.2 Periodic Maintenance of Materialized Views362
8.5.3 Rewriting Queries to Use Materialized Views362
8.5.4 Automatic Creation of Materialized Views364
8.5.5 Exercises for Section 8.5365
8.6 Summary of Chapter 8366
8.7 References for Chapter 8367
9 SQL in a Server Environment369
9.1 The Three-Tier Architecture369
9.1.1 The Web-Server Tier370
9.1.2 The Application Tier371
9.1.3 The Database Tier372
9.2 The SQL Environment372
9.2.1 Environments373
9.2.2 Schemas374
9.2.3 Catalogs375
9.2.4 Clients and Servers in the SQL Environment375
9.2.5 Connections376
9.2.6 Sessions377
9.2.7 Modules378
9.3 The SQL/Host-Language Interface378
9.3.1 The Impedance Mismatch Problem380
9.3.2 Connecting SQL to the Host Language380
9.3.3 The DECLARE Section381
9.3.4 Using Shared Variables382
9.3.5 Single-Row Select Statements383
9.3.6 Cursors383
9.3.7 Modifications by Cursor386
9.3.8 Protecting Against Concurrent Updates387
9.3.9 Dynamic SQL388
9.3.10 Exercises for Section 9.3390
9.4 Stored Procedures391
9.4.1 Creating PSM Functions and Procedures391
9.4.2 Some Simple Statement Forms in PSM392
9.4.3 Branching Statements394
9.4.4 Queries in PSM395
9.4.5 Loops in PSM396
9.4.6 For-Loops398
9.4.7 Exceptions in PSM400
9.4.8 Using PSM Functions and Procedures402
9.4.9 Exercises for Section 9.4402
9.5 Using a Call-Level Interface404
9.5.1 Introduction to SQL/CLI405
9.5.2 Processing Statements407
9.5.3 Fetching Data From a Query Result408
9.5.4 Passing Parameters to Queries410
9.5.5 Exercises for Section 9.5412
9.6 JDBC412
9.6.1 Introduction to JDBC412
9.6.2 Creating Statements in JDBC413
9.6.3 Cursor Operations in JDBC415
9.6.4 Parameter Passing416
9.6.5 Exercises for Section 9.6416
9.7 PHP416
9.7.1 PHP Basics417
9.7.2 Arrays418
9.7.3 The PEAR DB Library419
9.7.4 Creating a Database Connection Using DB419
9.7.5 Executing SQL Statements419
9.7.6 Cursor Operations in PHP420
9.7.7 Dynamic SQL in PHP421
9.7.8 Exercises for Section 9.7422
9.8 Summary of Chapter 9422
9.9 References for Chapter 9423
10 Advanced Topics in Relational Databases425
10.1 Security and User Authorization in SQL425
10.1.1 Privileges426
10.1.2 Creating Privileges427
10.1.3 The Privilege-Checking Process428
10.1.4 Granting Privileges430
10.1.5 Grant Diagrams431
10.1.6 Revoking Privileges433
10.1.7 Exercises for Section 10.1436
10.2 Recursion in SQL437
10.2.1 Defining Recursive Relations in SQL437
10.2.2 Problematic Expressions in Recursive SQL440
10.2.3 Exercises for Section 10.2443
10.3 The Object-Relational Model445
10.3.1 From Relations to Object-Relations445
10.3.2 Nested Relations446
10.3.3 References447
10.3.4 Object-Oriented Versus Object-Relational449
10.3.5 Exercises for Section 10.3450
10.4 User-Defined Types in SQL451
10.4.1 Defining Types in SQL451
10.4.2 Method Declarations in UDT's452
10.4.3 Method Definitions453
10.4.4 Declaring Relations with a UDT454
10.4.5 References454
10.4.6 Creating Object ID's for Tables455
10.4.7 Exercises for Section 10.4457
10.5 Operations on Object-Relational Data457
10.5.1 Following References457
10.5.2 Accessing Components of Tuples with a UDT458
10.5.3 Generator and Mutator Functions460
10.5.4 Ordering Relationships on UDT's461
10.5.5 Exercises for Section 10.5463
10.6 On-Line Analytic Processing464
10.6.1 OLAP and Data Warehouses465
10.6.2 OLAP Applications465
10.6.3 A Multidimensional View of OLAP Data466
10.6.4 Star Schemas467
10.6.5 Slicing and Dicing469
10.6.6 Exercises for Section 10.6472
10.7 Data Cubes473
10.7.1 The Cube Operator473
10.7.2 The Cube Operator in SQL475
10.7.3 Exercises for Section 10.7477
10.8 Summary of Chapter 10478
10.9 References for Chapter 10480
Ⅲ Modeling and Programming for Semistructured Data481
11 The Semistructured-Data Model483
11.1 Semistructured Data483
11.1.1 Motivation for the Semistructured-Data Model483
11.1.2 Semistructured Data Representation484
11.1.3 Information Integration Via Semistructured Data486
11.1.4 Exercises for Section 11.1487
11.2 XML488
11.2.1 Semantic Tags488
11.2.2 XML With and Without a Schema489
11.2.3 Well-Formed XML489
11.2.4 Attributes490
11.2.5 Attributes That Connect Elements491
11.2.6 Namespaces493
11.2.7 XML and Databases493
11.2.8 Exercises for Section 11.2495
11.3 Document Type Definitions495
11.3.1 The Form of a DTD495
11.3.2 Using a DTD499
11.3.3 Attribute Lists499
11.3.4 Identifiers and References500
11.3.5 Exercises for Section 11.3502
11.4 XML Schema502
11.4.1 The Form of an XML Schema502
11.4.2 Elements503
11.4.3 Complex Types504
11.4.4 Attributes506
11.4.5 Restricted Simple Types507
11.4.6 Keys in XML Schema509
11.4.7 Foreign Keys in XML Schema510
11.4.8 Exercises for Section 11.4512
11.5 Summary of Chapter 11514
11.6 References for Chapter 11515
12 Programming Languages for XML517
12.1 XPath517
12.1.1 The XPath Data Model518
12.1.2 Document Nodes519
12.1.3 Path Expressions519
12.1.4 Relative Path Expressions521
12.1.5 Attributes in Path Expressions521
12.1.6 Axes521
12.1.7 Context of Expressions522
12.1.8 Wildcards523
12.1.9 Conditions in Path Expressions523
12.1.10 Exercises for Section 12.1526
12.2 XQuery528
12.2.1 XQuery Basics530
12.2.2 FLWR Expressions530
12.2.3 Replacement of Variables by Their Values534
12.2.4 Joins in XQuery536
12.2.5 XQuery Comparison Operators537
12.2.6 Elimination of Duplicates538
12.2.7 Quantification in XQuery539
12.2.8 Aggregations540
12.2.9 Branching in XQuery Expressions540
12.2.10 Ordering the Result of a Query541
12.2.11 Exercises for Section 12.2543
12.3 Extensible Stylesheet Language544
12.3.1 XSLT Basics544
12.3.2 Templates544
12.3.3 Obtaining Values From XML Data545
12.3.4 Recursive Use of Templates546
12.3.5 Iteration in XSLT549
12.3.6 Conditionals in XSLT551
12.3.7 Exercises for Section 12.3551
12.4 Summary of Chapter 12553
12.5 References for Chapter 12554
Ⅳ Database System Implementation555
13 Secondary Storage Management557
13.1 The Memory Hierarchy557
13.1.1 The Memory Hierarchy557
13.1.2 Transfer of Data Between Levels560
13.1.3 Volatile and Nonvolatile Storage560
13.1.4 Virtual Memory560
13.1.5 Exercises for Section 13.1561
13.2 Disks562
13.2.1 Mechanics of Disks562
13.2.2 The Disk Controller564
13.2.3 Disk Access Characteristics564
13.2.4 Exercises for Section 13.2567
13.3 Accelerating Access to Secondary Storage568
13.3.1 The I/O Model of Computation568
13.3.2 Organizing Data by Cylinders569
13.3.3 Using Multiple Disks570
13.3.4 Mirroring Disks571
13.3.5 Disk Scheduling and the Elevator Algorithm571
13.3.6 Prefetching and Large-Scale Buffering573
13.3.7 Exercises for Section 13.3573
13.4 Disk Failures575
13.4.1 Intermittent Failures576
13.4.2 Checksums576
13.4.3 Stable Storage577
13.4.4 Error-Handling Capabilities of Stable Storage578
13.4.5 Recovery from Disk Crashes578
13.4.6 Mirroring as a Redundancy Technique579
13.4.7 Parity Blocks580
13.4.8 An Improvement: RAID 5583
13.4.9 Coping With Multiple Disk Crashes584
13.4.10 Exercises for Section 13.4587
13.5 Arranging Data on Disk590
13.5.1 Fixed-Length Records590
13.5.2 Packing Fixed-Length Records into Blocks592
13.5.3 Exercises for Section 13.5593
13.6 Representing Block and Record Addresses593
13.6.1 Addresses in Client-Server Systems593
13.6.2 Logical and Structured Addresses595
13.6.3 Pointer Swizzling596
13.6.4 Returning Blocks to Disk600
13.6.5 Pinned Records and Blocks600
13.6.6 Exercises for Section 13.6602
13.7 Variable-Length Data and Records603
13.7.1 Records With Variable-Length Fields604
13.7.2 Records With Repeating Fields605
13.7.3 Variable-Format Records607
13.7.4 Records That Do Not Fit in a Block608
13.7.5 BLOBs608
13.7.6 Column Stores609
13.7.7 Exercises for Section 13.7610
13.8 Record Modifications612
13.8.1 Insertion612
13.8.2 Deletion614
13.8.3 Update615
13.8.4 Exercises for Section 13.8615
13.9 Summary of Chapter 13615
13.10 References for Chapter 13617
14 Index Structures619
14.1 Index-Structure Basics620
14.1.1 Sequential Files621
14.1.2 Dense Indexes621
14.1.3 Sparse Indexes622
14.1.4 Multiple Levels of Index623
14.1.5 Secondary Indexes624
14.1.6 Applications of Secondary Indexes625
14.1.7 Indirection in Secondary Indexes626
14.1.8 Document Retrieval and Inverted Indexes628
14.1.9 Exercises for Section 14.1631
14.2 B-Trees633
14.2.1 The Structure of B-trees634
14.2.2 Applications of B-trees637
14.2.3 Lookup in B-Trees639
14.2.4 Range Queries639
14.2.5 Insertion Into B-Trees640
14.2.6 Deletion From B-Trees642
14.2.7 Efficiency of B-Trees645
14.2.8 Exercises for Section 14.2646
14.3 Hash Tables648
14.3.1 Secondary-Storage Hash Tables649
14.3.2 Insertion Into a Hash Table649
14.3.3 Hash-Table Deletion650
14.3.4 Efficiency of Hash Table Indexes651
14.3.5 Extensible Hash Tables652
14.3.6 Insertion Into Extensible Hash Tables653
14.3.7 Linear Hash Tables655
14.3.8 Insertion Into Linear Hash Tables657
14.3.9 Exercises for Section 14.3659
14.4 Multidimensional Indexes661
14.4.1 Applications of Multidimensional Indexes661
14.4.2 Executing Range Queries Using Conventional Indexes663
14.4.3 Executing Nearest-Neighbor Queries Using Conventional Indexes664
14.4.4 Overview of Multidimensional Index Structures664
14.5 Hash Structures for Multidimensional Data665
14.5.1 Grid Files665
14.5.2 Lookup in a Grid File666
14.5.3 Insertion Into Grid Files667
14.5.4 Performance of Grid Files669
14.5.5 Partitioned Hash Functions671
14.5.6 Comparison of Grid Files and Partitioned Hashing673
14.5.7 Exercises for Section 14.5673
14.6 Tree Structures for Multidimensional Data675
14.6.1 Multiple-Key Indexes675
14.6.2 Performance of Multiple-Key Indexes676
14.6.3 kd-Trees677
14.6.4 Operations on kd-Trees679
14.6.5 Adapting kd-Trees to Secondary Storage681
14.6.6 Quad Trees681
14.6.7 R-Trees683
14.6.8 Operations on R-Trees684
14.6.9 Exercises for Section 14.6686
14.7 Bitmap Indexes688
14.7.1 Motivation for Bitmap Indexes689
14.7.2 Compressed Bitmaps691
14.7.3 Operating on Run-Length-Encoded Bit-Vectors693
14.7.4 Managing Bitmap Indexes693
14.7.5 Exercises for Section 14.7695
14.8 Summary of Chapter 14695
14.9 References for Chapter 14697
15 Query Execution701
15.1 Introduction to Physical-Query-Plan Operators703
15.1.1 Scanning Tables703
15.1.2 Sorting While Scanning Tables704
15.1.3 The Computation Model for Physical Operators704
15.1.4 Parameters for Measuring Costs705
15.1.5 I/O Cost for Scan Operators706
15.1.6 Iterators for Implementation of Physical Operators707
15.2 One-Pass Algorithms709
15.2.1 One-Pass Algorithms for Tuple-at-a-Time Operations711
15.2.2 One-Pass Algorithms for Unary, Full-Relation Operations712
15.2.3 One-Pass Algorithms for Binary Operations715
15.2.4 Exercises for Section 15.2718
15.3 Nested-Loop Joins718
15.3.1 Tuple-Based Nested-Loop Join719
15.3.2 An Iterator for Tuple-Based Nested-Loop Join719
15.3.3 Block-Based Nested-Loop Join Algorithm719
15.3.4 Analysis of Nested-Loop Join721
15.3.5 Summary of Algorithms so Far722
15.3.6 Exercises for Section 15.3722
15.4 Two-Pass Algorithms Based on Sorting723
15.4.1 Two-Phase, Multiway Merge-Sort723
15.4.2 Duplicate Elimination Using Sorting725
15.4.3 Grouping and Aggregation Using Sorting726
15.4.4 A Sort-Based Union Algorithm726
15.4.5 Sort-Based Intersection and Difference727
15.4.6 A Simple Sort-Based Join Algorithm728
15.4.7 Analysis of Simple Sort-Join729
15.4.8 A More Efficient Sort-Based Join729
15.4.9 Summary of Sort-Based Algorithms730
15.4.10 Exercises for Section 15.4730
15.5 Two-Pass Algorithms Based on Hashing732
15.5.1 Partitioning Relations by Hashing732
15.5.2 A Hash-Based Algorithm for Duplicate Elimination732
15.5.3 Hash-Based Grouping and Aggregation733
15.5.4 Hash-Based Union, Intersection, and Difference734
15.5.5 The Hash-Join Algorithm734
15.5.6 Saving Some Disk I/O's735
15.5.7 Summary of Hash-Based Algorithms737
15.5.8 Exercises for Section 15.5738
15.6 Index-Based Algorithms739
15.6.1 Clustering and Nonclustering Indexes739
15.6.2 Index-Based Selection740
15.6.3 Joining by Using an Index742
15.6.4 Joins Using a Sorted Index743
15.6.5 Exercises for Section 15.6745
15.7 Buffer Management746
15.7.1 Buffer Management Architecture746
15.7.2 Buffer Management Strategies747
15.7.3 The Relationship Between Physical Operator Selection and Buffer Management750
15.7.4 Exercises for Section 15.7751
15.8 Algorithms Using More Than Two Passes752
15.8.1 Multipass Sort-Based Algorithms752
15.8.2 Performance of Multipass, Sort-Based Algorithms753
15.8.3 Multipass Hash-Based Algorithms754
15.8.4 Performance of Multipass Hash-Based Algorithms754
15.8.5 Exercises for Section 15.8755
15.9 Summary of Chapter 15756
15.10 References for Chapter 15757
16 The Query Compiler759
16.1 Parsing and Preprocessing760
16.1.1 Syntax Analysis and Parse Trees760
16.1.2 A Grammar for a Simple Subset of SQL761
16.1.3 The Preprocessor764
16.1.4 Preprocessing Queries Involving Views765
16.1.5 Exercises for Section 16.1767
16.2 Algebraic Laws for Improving Query Plans768
16.2.1 Commutative and Associative Laws768
16.2.2 Laws Involving Selection770
16.2.3 Pushing Selections772
16.2.4 Laws Involving Projection774
16.2.5 Laws About Joins and Products776
16.2.6 Laws Involving Duplicate Elimination777
16.2.7 Laws Involving Grouping and Aggregation777
16.2.8 Exercises for Section 16.2780
16.3 From Parse Trees to Logical Query Plans781
16.3.1 Conversion to Relational Algebra782
16.3.2 Removing Subqueries From Conditions783
16.3.3 Improving the Logical Query Plan788
16.3.4 Grouping Associative/Commutative Operators790
16.3.5 Exercises for Section 16.3791
16.4 Estimating the Cost of Operations792
16.4.1 Estimating Sizes of Intermediate Relations793
16.4.2 Estimating the Size of a Projection794
16.4.3 Estimating the Size of a Selection794
16.4.4 Estimating the Size of a Join797
16.4.5 Natural Joins With Multiple Join Attributes799
16.4.6 Joins of Many Relations800
16.4.7 Estimating Sizes for Other Operations801
16.4.8 Exercises for Section 16.4802
16.5 Introduction to Cost-Based Plan Selection803
16.5.1 Obtaining Estimates for Size Parameters804
16.5.2 Computation of Statistics807
16.5.3 Heuristics for Reducing the Cost of Logical Query Plans808
16.5.4 Approaches to Enumerating Physical Plans810
16.5.5 Exercises for Section 16.5813
16.6 Choosing an Order for Joins814
16.6.1 Significance of Left and Right Join Arguments815
16.6.2 Join Trees815
16.6.3 Left-Deep Join Trees816
16.6.4 Dynamic Programming to Select a Join Order and Grouping819
16.6.5 Dynamic Programming With More Detailed Cost Functions823
16.6.6 A Greedy Algorithm for Selecting a Join Order824
16.6.7 Exercises for Section 16.6825
16.7 Completing the Physical-Query-Plan826
16.7.1 Choosing a Selection Method827
16.7.2 Choosing a Join Method829
16.7.3 Pipelining Versus Materialization830
16.7.4 Pipelining Unary Operations830
16.7.5 Pipelining Binary Operations830
16.7.6 Notation for Physical Query Plans834
16.7.7 Ordering of Physical Operations837
16.7.8 Exercises for Section 16.7838
16.8 Summary of Chapter 16839
16.9 References for Chapter 16841
17 Coping With System Failures843
17.1 Issues and Models for Resilient Operation843
17.1.1 Failure Modes844
17.1.2 More About Transactions845
17.1.3 Correct Execution of Transactions846
17.1.4 The Primitive Operations of Transactions848
17.1.5 Exercises for Section 17.1851
17.2 Undo Logging851
17.2.1 Log Records851
17.2.2 The Undo-Logging Rules853
17.2.3 Recovery Using Undo Logging855
17.2.4 Checkpointing857
17.2.5 Nonquiescent Checkpointing858
17.2.6 Exercises for Section 17.2862
17.3 Redo Logging863
17.3.1 The Redo-Logging Rule863
17.3.2 Recovery With Redo Logging864
17.3.3 Checkpointing a Redo Log866
17.3.4 Recovery With a Checkpointed Redo Log867
17.3.5 Exercises for Section 17.3868
17.4 Undo/Redo Logging869
17.4.1 The Undo/Redo Rules870
17.4.2 Recovery With Undo/Redo Logging870
17.4.3 Checkpointing an Undo/Redo Log872
17.4.4 Exercises for Section 17.4874
17.5 Protecting Against Media Failures875
17.5.1 The Archive875
17.5.2 Nonquiescent Archiving875
17.5.3 Recovery Using an Archive and Log878
17.5.4 Exercises for Section 17.5879
17.6 Summary of Chapter 17879
17.7 References for Chapter 17881
18 Concurrency Control883
18.1 Serial and Serializable Schedules884
18.1.1 Schedules884
18.1.2 Serial Schedules885
18.1.3 Serializable Schedules886
18.1.4 The Effect of Transaction Semantics887
18.1.5 A Notation for Transactions and Schedules889
18.1.6 Exercises for Section 18.1889
18.2 Conflict-Serializability890
18.2.1 Conflicts890
18.2.2 Precedence Graphs and a Test for Conflict-Serializability892
18.2.3 Why the Precedence-Graph Test Works894
18.2.4 Exercises for Section 18.2895
18.3 Enforcing Serializability by Locks897
18.3.1 Locks898
18.3.2 The Locking Scheduler900
18.3.3 Two-Phase Locking900
18.3.4 Why Two-Phase Locking Works901
18.3.5 Exercises for Section 18.3903
18.4 Locking Systems With Several Lock Modes905
18.4.1 Shared and Exclusive Locks905
18.4.2 Compatibility Matrices907
18.4.3 Upgrading Locks908
18.4.4 Update Locks909
18.4.5 Increment Locks911
18.4.6 Exercises for Section 18.4913
18.5 An Architecture for a Locking Scheduler915
18.5.1 A Scheduler That Inserts Lock Actions915
18.5.2 The Lock Table918
18.5.3 Exercises for Section 18.5921
18.6 Hierarchies of Database Elements921
18.6.1 Locks With Multiple Granularity921
18.6.2 Warning Locks922
18.6.3 Phantoms and Handling Insertions Correctly926
18.6.4 Exercises for Section 18.6927
18.7 The Tree Protocol927
18.7.1 Motivation for Tree-Based Locking927
18.7.2 Rules for Access to Tree-Structured Data928
18.7.3 Why the Tree Protocol Works929
18.7.4 Exercises for Section 18.7932
18.8 Concurrency Control by Timestamps933
18.8.1 Timestamps934
18.8.2 Physically Unrealizable Behaviors934
18.8.3 Problems With Dirty Data935
18.8.4 The Rules for Timestamp-Based Scheduling937
18.8.5 Multiversion Timestamps939
18.8.6 Timestamps Versus Locking941
18.8.7 Exercises for Section 18.8942
18.9 Concurrency Control by Validation942
18.9.1 Architecture of a Validation-Based Scheduler942
18.9.2 The Validation Rules943
18.9.3 Comparison of Three Concurrency-Control Mechanisms946
18.9.4 Exercises for Section 18.9948
18.10 Summary of Chapter 18948
18.11 References for Chapter 18950
19 More About Transaction Management953
19.1 Serializability and Recoverability953
19.1.1 The Dirty-Data Problem954
19.1.2 Cascading Rollback955
19.1.3 Recoverable Schedules956
19.1.4 Schedules That Avoid Cascading Rollback957
19.1.5 Managing Rollbacks Using Locking957
19.1.6 Group Commit959
19.1.7 Logical Logging960
19.1.8 Recovery From Logical Logs963
19.1.9 Exercises for Section 19.1965
19.2 Deadlocks966
19.2.1 Deadlock Detection by Timeout967
19.2.2 The Waits-For Graph967
19.2.3 Deadlock Prevention by Ordering Elements970
19.2.4 Detecting Deadlocks by Timestamps970
19.2.5 Comparison of Deadlock-Management Methods972
19.2.6 Exercises for Section 19.2974
19.3 Long-Duration Transactions975
19.3.1 Problems of Long Transactions976
19.3.2 Sagas978
19.3.3 Compensating Transactions979
19.3.4 Why Compensating Transactions Work980
19.3.5 Exercises for Section 19.3981
19.4 Summary of Chapter 19982
19.5 References for Chapter 19983
20 Parallel and Distributed Databases
20.1 Parallel Algorithms on Relations
20.1.1 Models of Parallelism
20.1.2 Tuple-at-a-Time Operations in Parallel
20.1.3 Parallel Algorithms for Full-Relation Operations
20.1.4 Performance of Parallel Algorithms
20.1.5 Exercises for Section 20.1
20.2 The Map-Reduce Parallelism Framework
20.2.1 The Storage Model
20.2.2 The Map Function
20.2.3 The Reduce Function
20.2.4 Exercises for Section 20.2
20.3 Distributed Databases
20.3.1 Distribution of Data
20.3.2 Distributed Transactions
20.3.3 Data Replication
20.3.4 Exercises for Section 20.3
20.4 Distributed Query Processing
20.4.1 The Distributed Join Problem
20.4.2 Semijoin Reductions
20.4.3 Joins of Many Relations
20.4.4 Acyclic Hypergraphs
20.4.5 Full Reducers for Acyclic Hypergraphs
20.4.6 Why the Full-Reducer Algorithm Works
20.4.7 Exercises for Section 20.4
20.5 Distributed Commit
20.5.1 Supporting Distributed Atomicity
20.5.2 Two-Phase Commit
20.5.3 Recovery of Distributed Transactions
20.5.4 Exercises for Section 20.5
20.6 Distributed Locking
20.6.1 Centralized Lock Systems
20.6.2 A Cost Model for Distributed Locking Algorithms
20.6.3 Locking Replicated Elements
20.6.4 Primary-Copy Locking
20.6.5 Global Locks From Local Locks
20.6.6 Exercises for Section 20.6
20.7 Peer-to-Peer Distributed Search
20.7.1 Peer-to-Peer Networks
20.7.2 The Distributed-Hashing Problem
20.7.3 Centralized Solutions for Distributed Hashing
20.7.4 Chord Circles
20.7.5 Links in Chord Circles
20.7.6 Search Using Finger Tables
20.7.7 Adding New Nodes
20.7.8 When a Peer Leaves the Network
20.7.9 When a Peer Fails
20.7.10 Exercises for Section 20.7
20.8 Summary of Chapter 20
20.9 References for Chapter 20
Ⅴ Other Issues in Management of Massive Data
21 Information Integration
21.1 Introduction to Information Integration
21.1.1 Why Information Integration?
21.1.2 The Heterogeneity Problem
21.2 Modes of Information Integration
21.2.1 Federated Database Systems
21.2.2 Data Warehouses
21.2.3 Mediators
21.2.4 Exercises for Section 21.2
21.3 Wrappers in Mediator-Based Systems
21.3.1 Templates for Query Patterns
21.3.2 Wrapper Generators
21.3.3 Filters
21.3.4 Other Operations at the Wrapper
21.3.5 Exercises for Section 21.3
21.4 Capability-Based Optimization
21.4.1 The Problem of Limited Source Capabilities
21.4.2 A Notation for Describing Source Capabilities
21.4.3 Capability-Based Query-Plan Selection
21.4.4 Adding Cost-Based Optimization
21.4.5 Exercises for Section 21.4
21.5 Optimizing Mediator Queries
21.5.1 Simplified Adornment Notation
21.5.2 Obtaining Answers for Subgoals
21.5.3 The Chain Algorithm
21.5.4 Incorporating Union Views at the Mediator
21.5.5 Exercises for Section 21.5
21.6 Local-as-View Mediators
21.6.1 Motivation for LAV Mediators
21.6.2 Terminology for LAV Mediation
21.6.3 Expanding Solutions
21.6.4 Containment of Conjunctive Queries
21.6.5 Why the Containment-Mapping Test Works
21.6.6 Finding Solutions to a Mediator Query
21.6.7 Why the LMSS Theorem Holds
21.6.8 Exercises for Section 21.6
21.7 Entity Resolution
21.7.1 Deciding Whether Records Represent a Common Entity
21.7.2 Merging Similar Records
21.7.3 Useful Properties of Similarity and Merge Functions
21.7.4 The R-Swoosh Algorithm for ICAR Records
21.7.5 Why R-Swoosh Works
21.7.6 Other Approaches to Entity Resolution
21.7.7 Exercises for Section 21.7
21.8 Summary of Chapter 21
21.9 References for Chapter 21
22 Data Mining
22.1 Frequent-Itemset Mining
22.1.1 The Market-Basket Model
22.1.2 Basic Definitions
22.1.3 Association Rules
22.1.4 The Computation Model for Frequent Itemsets
22.1.5 Exercises for Section 22.1
22.2 Algorithms for Finding Frequent Itemsets
22.2.1 The Distribution of Frequent Itemsets
22.2.2 The Naive Algorithm for Finding Frequent Itemsets
22.2.3 The A-Priori Algorithm
22.2.4 Implementation of the A-Priori Algorithm
22.2.5 Making Better Use of Main Memory
22.2.6 When to Use the PCY Algorithm
22.2.7 The Multistage Algorithm
22.2.8 Exercises for Section 22.2
22.3 Finding Similar Items
22.3.1 The Jaccard Measure of Similarity
22.3.2 Applications of Jaccard Similarity
22.3.3 Minhashing
22.3.4 Minhashing and Jaccard Distance
22.3.5 Why Minhashing Works
22.3.6 Implementing Minhashing
22.3.7 Exercises for Section 22.3
22.4 Locality-Sensitive Hashing
22.4.1 Entity Resolution as an Example of LSH
22.4.2 Locality-Sensitive Hashing of Signatures
22.4.3 Combining Minhashing and Locality-Sensitive Hashing
22.4.4 Exercises for Section 22.4
22.5 Clustering of Large-Scale Data
22.5.1 Applications of Clustering
22.5.2 Distance Measures
22.5.3 Agglomerative Clustering
22.5.4 k-Means Algorithms
22.5.5 k-Means for Large-Scale Data
22.5.6 Processing a Memory Load of Points
22.5.7 Exercises for Section 22.5
22.6 Summary of Chapter 22
22.7 References for Chapter 22
23 Database Systems and the Internet
23.1 The Architecture of a Search Engine
23.1.1 Components of a Search Engine
23.1.2 Web Crawlers
23.1.3 Query Processing in Search Engines
23.1.4 Ranking Pages
23.2 PageRank for Identifying Important Pages
23.2.1 The Intuition Behind PageRank
23.2.2 Recursive Formulation of PageRank—First Try
23.2.3 Spider Traps and Dead Ends
23.2.4 PageRank Accounting for Spider Traps and Dead Ends
23.2.5 Exercises for Section 23.2
23.3 Topic-Specific PageRank
23.3.1 Teleport Sets
23.3.2 Calculating A Topic-Specific PageRank
23.3.3 Link Spam
23.3.4 Topic-Specific PageRank and Link Spam
23.3.5 Exercises for Section 23.3
23.4 Data Streams
23.4.1 Data-Stream-Management Systems
23.4.2 Stream Applications
23.4.3 A Data-Stream Data Model
23.4.4 Converting Streams Into Relations
23.4.5 Converting Relations Into Streams
23.4.6 Exercises for Section 23.4
23.5 Data Mining of Streams
23.5.1 Motivation
23.5.2 Counting Bits
23.5.3 Counting the Number of Distinct Elements
23.5.4 Exercises for Section 23.5
23.6 Summary of Chapter 23
23.7 References for Chapter 23