Modern Database Management, Thirteenth Edition
By Jeffrey A. Hoffer, V. Ramesh and Heikki Topi
Contents:
Preface xxv
Part I The Context of Database Management 1
An Overview of Part I 1
Chapter 1 The Database Environment and Development Process 3
Learning Objectives 3
Data Matter! 4
Introduction 5
Basic Concepts and Definitions 6
Data 6
Data versus Information 7
Metadata 8
Traditional File Processing Systems 9
File Processing Systems at Pine Valley Furniture Company 9
Disadvantages of File Processing Systems 10
Program· Data Dependence 10
Duplication Of Data 1 0
Limited Data Sharing 10
Lengthy Development Times 10
Excessive Program Maintenance 11
The Database Approach 11
Data Models 11
Entities 11
Relationships 1 1
Relational Databases 12
Database Management Systems 13
Advantages of the Database Approach 13
Program· Data Independence 13
Planned Data Redundancy 14
Improved Data Consistency 14
Improved Data Sharing 14
Increased Productivity Of Application Development 14
Enforcement Of Standards 15
Improved Data Quality 15
Improved Data Accessibility And Responsiveness 15
Reduced Program Maintenance 16
Improved Decision Support 16
Cautions About Database Benefits 16
Costs And Risks Of The Database Approach 16
New, Specialized Personnel 16
Installation And Management Cost And Complexity 17
Conversion Costs 17
Need For Explicit Backup And Recovery 17
Organizational Conflict 17
Integrated Data Management Framework 17
Components of the Database Environment 18
The Database Development Process 20
Systems Development Life Cycle 21
Planning- Enterprise Modeling 21
Planning-Conceptual Data Modeling 21
Analysis-Conceptual Data Modeling 22
Design- Logical Database Design 23
Design- Physical Database Design And Definition 23
Implementation- Database Implementation 23
Maintenance-Database Maintenance 24
Alternative Information Systems Development Approaches 24
Three-Schema Architecture for Database Development 25
Managing the People Involved in Database Development 27
Evolution of Database Systems 27
1960s 29
1970s 29
1980s 29
1990s 30
2000 and Beyond 30
The Range of Database Applications 30
Personal Databases 31
Departmental Multi-Tiered Client/Server Databases 31
Enterprise Applications 32
Enterprise Systems 32
Data Warehouses 33
Data Lake 34
Developing a Database Application for Pine Valley Furniture
Company 35
Database Evolution at Pine Valley Furniture Company 36
Project Planning 36
Analyzing Database Requirements 37
Designing the Database 40
Using the Database 42
Administering the Database 43
Future of Databases at Pine Valley 43
Summary 44 • Key Terms 45 • Review Questions 45 •
Problems and Exercises 46 • Field Exercises 48 •
References 49 • Further Reading 49 •
Web Resources 50
CASE: Forondo Artist Management Excellence Inc. 51
Part II Database Analysis and Logical Design 53
An Overview of Part II 53 a Chapter 2 Modeling Data in the Organization 55
Learning Objectives 55
Introduction 55
The E-R Model: An Overview 58
Sample E-R Diagram 58
E-R Model Notation 60
Modeling the Rules of the Organization 61
Overview of Business Rules 62
The Business Rules Paradigm 62
Scope of Business Rules 63
Good Business Rules 63
Gathering Business Rules 64
Data Names And Definitions 64
Data N Ames 64
Data Definitions 65
Good Data Definitions 65
Modeling Entities and Attributes 67
Entities 67
Entity Type Versus Entity Instance 67
Entity Type Versus System Input, Output, Or User 67
Strong Versus Weak Entity Types 68
Naming And Defining Entity Types 69
Attributes 71
Required Versus Optional Attributes 7 1
Simple Versus Composite Attributes 72
Single Valued Versus Multivalued Attributes 72
Stored Versus Derived Attributes 73
Identifier Attribute 73
Naming And Defining Attributes 74
Modeling Relationships 76
Basic Concepts And Definitions In Relationships 77
Attributes On Relationships 78
Associative Entities 78
Degree Of A Relationship 80
Unary Relationship 81
Binary Relationship 82
Ternary Relationship 82
Attributes Or Entity? 83
Cardinality Constraints 85
Minimum Cardinality 85
Maximum Cardinality 86
Some Examples Of Relationships And Their Cardinalities 86
A Ternary Relationship 87
Modeling Time-Dependent Data 88
Modeling Multiple Relationships Between Entity Types 90
Naming And Defining Relationships 92
E-R Modeling Example: Pine Valley Furniture Company 93
Database Processing At Pine Valley Furniture 96
Showing Product Information 96
Showing Product Line Information 96
Showing Customer Order Status 97
Showing Product Sales 98
Summary 99 • Key Terms 100 • Review Questions 100 •
Problems And Exercises 101 • Field Exercises 111 •
References 112 • Further Reading 112 •
Web Resources 112
Case: Forondo Artist Management Excellence Inc. 113
Chapter 3 The Enhanced E-R Model 115
Learning Objectives 115
Introduction 115
Representing Super Types And Subtypes 116
Basic Concepts And Notation 117
An Example Of A Supertype /subtype Relationship 118
Attribute Inheritance 119
When To Use Supertype /subtype Relationships 119
Representing Specialization and Generalization 120
Generalization 120
Specialization 121
(Ombining Specialization And Generalization 122
Specifying Constraints In Super Type/Subtype Relationships 123
Specifying Completeness Constraints 123
Total Specialization Rule 123
Partia L Specialization Rule 123
Specifying Disjointness Constraints 124
D Isjoint Ru Le 124
Overlap Rule 12 5
Defining Subtype Discriminators 125
D Isjoint Subtypes 125
Overlapping Subtypes 126
Defining Super Type/Subtype Hierarchies 127
An Example Of A Supertypeisubtype Hierarchy 128
Summary Of Supertype/Subtype Hierarchies 128
Eer Modeling Example: Pine Valley Furniture Company 128
Entity Clustering 132
Packaged Data Models 135
A Revised Data Modeling Process With Packaged Data Models 137
Packaged Data Model Examples 139
Summary 144 • Key Terms 145 • Review Questions 145 •
Problems And Exercises 146 • Field Exercises 149 •
References 149 • Further Reading 150 • Web Resources 150
Case: Forondo Artist Management Excellence Inc. 151
Chapter 4 Logical Database Design And The Relational Model 153
Learning Objectives 153
Introduction 153
The Relational Data Model 154
Basic Definitions 154
Relational Data Structure 155
Relational Keys 155
Properties Of Relations 156
Removing Multivalued Attributes From Tables 156
Sample Database 157
Integrity Constraints 158
Domain Constraints 158
Entity Integrity 158
Referential Integrity 160
Creating Relational Tables 161
Well-Structured Relations 162
Transforming Eer Diagrams Into Relations 163
Step 1: Map Regular Entities 164
Composite Aitributes 164
Multivalued Aitributes 165
Step 2: Map Weak Entities 165
When To (Reate A Surrogate Key 166
Step 3: Map Binary Relationships 167
Map Binary One-To-Many Relationships 167
Map Binary Many-To-Many Relationships 168
Map Binary One-To-One Relationships 168
Step 4: Map Associative Entities 169
Identifier Not Assigned 169
Identifier Assigned 1 70
Step 5: Map Unary Relationships 171
Unary One-To-Many Relationships 171
Unary Many-To-Many Relationships 172
Step 6: Map Ternary (And N-Ary) Relationships 173
Step 7: Map Supertype/Subtype Relationships 174
Summary Of Eer-To-Relational Transformations 176
Introduction To Normalization 176
Steps In Normalization 177
Functional Dependencies And Keys 177
Determinants 179
Candidate Keys 179
Normalization Example: Pine Valley Furniture Company 180
Step 0: Represent The View In Tabular Form 180
Step 1: Convert To First Normal Form 181
Remove Repeating Groups 181
Select The Primary Key 182
Anomalies In 1 Nf 182
Step 2: Convert To Second Normal Form 183
Step 3: Convert To Third Normal Form 184
Removing Transitive Dependencies 184
Determinants and Normalization 185
Step 4: Further Normalization 185
Merging Relations 186
An Example 186
View Integration Problems 186
Synonyms 187
Homonyms 187
Transitive Dependencies 187
Supertype/Sustype Relationships 188
A Final Step for Defining Relational Keys 188
Summary 191 • Key Terms 191 • Review Questions 191 •
Problems and Exercises 792 • Field Exercises 201 •
References 202 • Further Reading 202 •
Web Resources 202
CASE: Forondo Artist Management Excellence Inc. 203
Part Ill Database Implementation and Use 205
An Overview of Part Ill 205 a Chapter 5 Introduction to SQL 207
Learning Objectives 207
Introduction 207
Origins of the SQL Standard 209
The SQL Environment 211
SQL Data Types 213
Defining A Database in SQL 216
Generating SQL Database Definitions 216
Creating Tables 217
Creating Data Integrity Controls 220
Changing Table Definitions 221
Removing Tables 221
Inserting, Updating, and Deleting Data 222
Batch Input 223
Deleting Database Contents 223
Updating Database Contents 224
Internal Schema Definition in RDBMSS 225
Creating Indexes 225
Processing Single Tables 226
Clauses of the SELECT Statement 226
Using Expressions 228
Using Functions 229
Using Wildcards 232
Using Comparison Operators 232
Using Null Values 233
Using Boolean Operators 233
Using Ranges for Qualification 236
Using Distinct Values 236
Using IN and NOT IN with Lists 238
Sorting Results: The ORDER BY Clause 239
Categorizing Results: The GROUP BY Clause 240
Qualifying Results by Categories: The HAVING Clause 241
Summary 243 • Key Terms 243 • Review Questions 243 •
Problems and Exercises 244 • Field Exercises 248 •
References 248 • Further Reading 249 •
Web Resources 249
CASE: Forondo Artist Management Excellence Inc. 250
Chapter 6 Advanced SQL 251
Learning Objectives 251
Introduction 251
Processing Multiple Tables 252
Equi-Join 253
Natural Join 254
Outer Join 255
Sample Join Involving Four Tables 257
Self-Join 258
Subqueries 260
Correlated Subqueries 265
Using Derived Tables 267
Combinings Queries 267
Conditional Expressions 269
More Complicated SQL Queries 270
Tips for Developing Queries 272
Guidelines f or Better Query Design 274
Using and Defining Views 275
Materialized Views 279
Triggers and Routines 279
Triggers 280
Routines and Other Programming Extensions 282
Example Routine in Oracl e’s PUSQL 284
Data Dictionary Facilities 285
Recent Enhancements and Extensions to SQL 287
Analytical and OLAP Functions 287
New Temporal Features in SQL 288
Other Enhancements 288
Summary 289 • Key Terms 290 • Review Questions 290 •
Problems and Exercises 291 • Field Exercises 294 •
References 294 • Further Reading 295 •
Web Resources 295
CASE: Forondo Artist Management Excellence Inc. 296
Chapter 7 Databases in Applications 297
Learning Objectives 297
Location, Location, Location! 297
Introduction 298
Client/Server Architectures 298
Databases in Three-Tier Applications 302
A Java Web Application 303
A Python Web Application 307
Key Considerations in Three-Tier Applications 313
Stored Procedures 313
Transactions 313
Database Connections 315
Key Benefits of Three-Tier Applications 31 S
Transaction Integrity 316
Controlling Concurrent Access 318
The Problem of Lost Updates 318
Serializability 319
Locking Mechanisms 319
Locking Level 319
Types Of Locks 320
Deadlock 321
Managing Deadlock 321
Versioning 322
Managing Data Security in an Application Context 324
Threats to Data Security 324
Establishing Client/Server Security 325
Server Security 326
Network Security 326
Application Security Issues In Three-Tier Client/Server
Environments 326
Data Privacy 327
Summary 329 • Key Terms 329 • Review Questions 329 •
Problems and Exercises 330 • Field Exercises 331 •
References 331 • Further Reading 331 •
Web Resources 331
CASE: Forondo Artist Management Excellence Inc. 332
Chapter 8 Physical Database Design and Database Infrastructure 333
Learning Objectives 333
Introduction 334
The Physical Database Design Process 335
Who Is Responsible for Physical Database Design? 335
Physical Database Design as a Basis for Regulatory Compliance 336
SOX and Databases 337
It (Hange Management 337
Logical Access To Data 337
It Operations 338
Data Volume and Usage Analysis 338
Designing Fields 340
Choosing Data Types 340
Coding Techniques 341
Controlling Data Integrity 342
Handling Missing Data 343
Denormalizing and Partitioning Data 343
Denormalization 343
Opportunities For And Types Of De Normalization 344
Denormalize With Caution 345
Partitioning 347
Designing Physical Database Files 348
File Organizations 350
Heap File Organization 350
Sequential File Organizations 350
Indexed File Organizations 352
Hashed File Organizations 353
Clustering Files 353
Designing Controls for Files 354
Using and Selecting Indexes 354
Creating a Unique Key Index 354
Creating a Secondary (Nonunique) Key Index 355
When to Use Indexes 355
Designing a Database for Optimal Query Performance 356
Parallel Query Processing 357
Overriding Automatic Query Optimization 358
Data Dictionaries and Repositories 358
Data Dictionary 3S9
Repositories 3S9
Database Software Data Security Features 361
Views 361
Integrity Controls 362
Authorization Rules 363
User-Defined Procedures 36S
Encryption 36S
Authentication Schemes 36S
Passwords 366
Strong Authentication 366
Database Backup And Recovery 367
Basic Recovery Facilities 367
Backup Facilities 367
Journalizing Facilities 368
Checkpoint Facility 368
Recovery Manager 369
Recovery and Restart Procedures 369
Disk Mirroring 369
Restore/Rerun 370
Backward Recovery 370
Forward Recovery 371
Types Of Database Failure 371
Aborted Transactions 372
Incorrect Data 372
System Failure 372
Database Destruction 372
Disaster Recovery 373
Cloud-Based Database Infrastructure 373
Cloud-Based Models for Providing Data Management
Services 373
Benefits and Downsides of Using Cloud-Based Data
Management Services 374
Summary 375 • Key Terms 376 • Review Questions 377 •
Problems and Exercises 378 • Field Exercises 382 •
References 383 • Further Reading 383 •
Web Resources 383
CASE: Forondo Artist Management Excellence Inc. 384
Part IV Advanced Database Topics 385
An Overview of Part IV 38S
Chapter 9 Data Warehousing and Data Integration 387
Learning Objectives 387
Introduction 387
Basic Concepts of Data Warehousing 390
A Brief History of Data Warehousing 390
The Need for Data Warehousing 390
Need For A Company-Wide View 390
Need To Separate Operational And Informational Systems 393
Data Warehouse Architectures 393
Independent Data Mart Data Warehousing Environment 394
Dependent Data Mart and Operational Data Store
Architecture: A Three-Level Approach 395
Logical Data Mart and Real-Time Data Warehouse
Architecture 397
Three-Layer Data Architecture 400
Role Of The Enterprise Data Model 400
Role Of Metadata 400
Some Characteristics of Data Warehouse Data 401
Status versus Event Data 401
Transient versus Periodic Data 402
An Example of Transient and Periodic Data 402
Transient Data 404
Periodic Data 404
Other Data Warehouse Changes 404
The Derived Data Layer 405
Characteristics of Derived Data 405
The Star Schema 406
Fact Tables And Dimension Tables 406
Example Star Schema 407
Surrogate Key 408
Grain Of The Fact Table 409
Duration Of The Database 41 0
Size Of The Fact Table 410
Modeling Date And Time 41 1
Variations Of The Star Schema 412
Multiple Fact Tables 412
Factless Fact Tables 413
Normalizing Dimension Tables 414
Multivalued Dimensions 414
Hierarchies 415
Slowly Changing Dimensions 417
Determining Dimensions And Facts 420
Data Integration: An Overview 422
General Approaches To Data Integration 422
Data Federation 423
Data Propagation 423
Data Integration For Data Warehousing: The Reconciled Data Layer 424
Characteristics Of Data After Etl 424
The Etl Process 425
Mapping And Metadata Management 425
Extract 426
Cleanse 427
Load And Index 429
Data Transformation 430
Data Transformation Functions 431
Record-Level Functions 431
Field-Level Functions 432
Data Warehouse Administration 434
The Future of Data Warehousing: Integration with Other Forms of Data Management and Analytics 434
Speed of Processing 435
Moving the Data Warehouse into the Cloud 435
Dealing with Unstructured Data 436
Summary 436 • Key Terms 437 • Review Questions 437 •
Problems and Exercises 438 • Field Exercises 442 •
References 442 • Further Reading 443 •
Web Resources 443
Chapter 10 Big Data Technologies 444
Learning Objectives 444
Introduction 444
Moving Beyond Transactional and Data Warehousing
Databases 446
Big Data 446
NoSQL 448
Classification Of Nosql Dbmss 450
Key-Value Stores 450
Document Stores 451
W Ide·Column Stores 451
Graph· Oriented Databases 451
Nosql Examples 451
Redis 451
Mongodb 452
Apache Cassandra 452
NEo4J 452
A NOSQL Example: MongoDB 452
Documents 452
Collections 454
Relationships 454
Querying MongoDB 455
Impact of NoSQL on Database Professionals 456
Hadoop 458
Components of Hadoop 459
THE HADOOP DISTRIBUTED FILE SYSTEM (HDFS) 459
MAPREDUCE 459
PIG 461
HIVE 461
HBASE 462
A Practical Introduction to Pig 462
Loading Data 462
Transforming Data 463
A Practical Introduction To Hive 465
Creating A Table 465
Loading Data Into The Table 465
PROCESSING THE DATA 466
Integrated Analytics and Data Science Platforms 466
HP HAVEN 466
TERADATA ASTER 467
IBM BIG DATA PLATFORM 469
Putting It All Together: Integrated Data Architecture 469
Summary 471 • Key Terms 471 • Review Questions 471 •
Problems and Exercises 472 • References 472 •
Further Reading 473 • Web Resources 473
Chapter 11 Analytics and Its Implications 474
Learning Objectives 474
Introduction 474
Analytics 475
Types of Analytics 475
Use of Descriptive Analytics 477
SQL OLAP QUERYING 478
OLAP TOOLS 480
DATA VISUALIZATION 482
BUSINESS PERFORMANCE MANAGEMENT AND DASHBOARDS 483
Use of Predictive Analytics 484
DATA MINING TOOLS 485
EXAMPLES OF PREDICTIVE ANALYTICS 486
Use of Prescriptive Analytics 487
Key User Tools for Analytics 488
ANALYTICAL AND OLAP FUNCTIONS 489
R 490
PYTHON 491
APACHE SPARK 492
Data Management Infrastructure for Analytics 493
Impact of Big Data and Analytics 495
Applications of Big Data and Analytics 495
BUSINESS 496
E· GOVERNMENT AND POLITICS 496
SCIENCE AND TECHNOLOGY 496
SMART HEALTH AND WELLBEING 497
SECURITY AND PUBLIC SAFETY 497
Implications of Big Data Analytics and Decision Making 497
PERSONAL PRIVACY VERSUS COLLECTIVE BENEFITS 498
OWNERSHIP AND ACCESS 498
QUALITY AND REUSE OF DATA AND ALGORITHMS 498
TRANSPARENCY AND VALIDATION 498
(HANGING NATURE OF WORK 499
DEMANDS FOR WORKFORCE CAPABILITIES AND EDUCATION 499
Summary 499 • Key Terms 500 • Review Questions 500 •
Problems and Exercises 500 • References 501 •
Further Reading 502
Chapter 12 Data and Database Administration with Focus on Data Quality 503
Learning Objectives 503
Introduction 503
Overview of Data and Database Administration 505
Data Administration 505
Database Administration 506
TRADITIONAL DATABASE ADMINISTRATION 506
TRENDS IN DATABASE ADMINISTRATION 508
Evolving Data Administration Roles 510
The Open Source Movement and Database Management 511
Data Governance 512
Managing Data Quality 513
Characteristics of Quality Data 514
EXTERNAL DATA SOURCES 51 5
REDUNDANT DATA STORAGE AND INCONSISTENT METADATA 516
DATA ENTRY PROBLEMS 516
LACK OF ORGANIZATIONAL COMMITMENT 516
Data Quality Improvement 516
GET THE BUSINESS BUY· IN 516
CONDUCT A DATA QUALITY AUDIT 517
ESTABLISH A DATA STEWARDSHIP PROGRAM 518
IMPROVE DATA CAPTURE PROCESSES 518
APPLY MODERN DATA MANAGEMENT PRINCIPLES AND TECHNOLOGY 519
APPLY TQM PRINCIPLES AND PRACTICES 519
Summary of Data Quality 519
Data Availability 520
Measures to Ensure Availability 521
HARDWARE FAILURES 521
LOSS OR CORRUPTION OF DATA 521
HUMAN ERROR 521
MAINTENANCE DOWNTIME 521
NETWORK· RELATED PROBLEMS 521
Master Data Management 521
Summary 523 • Key Terms 523 • Review Questions 524 •
Problems and Exercises 524 • Field Exercises 526 •
References 526 • Further Reading 527 •
Web Resources 527
Glossary of Acronyms 529
Glossary of Terms 531
Index 539