Methods of Multivariate Analysis, Third Edition
By Alvin C. Rencher and William F. Christensen
Contents:
Preface xvii
Acknowledgments xxi
1 Introduction 1
1.1 Why Multivariate Analysis? 1
1.2 Prerequisites 3
1.3 Objectives 3
1.4 Basic Types Of Data And Analysis 4
2 Matrix Algebra 7
2.1 Introduction 7
2.2 Notation And Basic Definitions 8
2.2.1 Matrices, Vectors, And Scalars 8
2.2.2 Equality Of Vectors And Matrices 9
2.2.3 Transpose And Symmetric Matrices 9
2.2.4 Special Matrices 10
2.3 Operations 11
2.3.1 Summation And Product Notation 11
2.3.2 Addition Of Matrices And Vectors 12
2.3.3 Multiplication Of Matrices And Vectors 13
2.4 Partitioned Matrices 22
2.5 Rank 23
2.6 Inverse 25
2.7 Positive Definite Matrices 26
2.8 Determinants 28
2.9 Trace 31
2.10 Orthogonal Vectors And Matrices 31
2.11 Eigenvalues And Eigenvectors 32
2.11.1 Definition 32
2.11.2 I + A And I – A 34
2.11.3 tr(A) And |A| 34
2.11.4 Positive Definite And Semidefinite Matrices 35
2.11.5 The Product AB 35
2.11.6 Symmetric Matrix 35
2.11.7 Spectral Decomposition 35
2.11.8 Square Root Matrix 36
2.11.9 Square And Inverse Matrices 36
2.11.10 Singular Value Decomposition 37
2.12 Kronecker And Vec Notation 37
Problems 39
3 Characterizing And Displaying Multivariate Data 47
3.1 Mean And Variance Of A Univariate Random Variable 47
3.2 Covariance And Correlation Of Bivariate Random Variables 49
3.2.1 Covariance 49
3.2.2 Correlation 53
3.3 Scatterplots Of Bivariate Samples 55
3.4 Graphical Displays For Multivariate Samples 56
3.5 Dynamic Graphics 58
3.6 Mean Vectors 63
3.7 Covariance Matrices 66
3.8 Correlation Matrices 69
3.9 Mean Vectors And Covariance Matrices For Subsets Of Variables 71
3.9.1 Two Subsets 71
3.9.2 Three Or More Subsets 73
3.10 Linear Combinations Of Variables 75
3.10.1 Sample Properties 75
3.10.2 Population Properties 81
3.11 Measures Of Overall Variability 81
3.12 Estimation Of Missing Values 82
3.13 Distance Between Vectors 84
Problems 85
4 The Multivariate Normal Distribution 91
4.1 Multivariate Normal Density Function 91
4.1.1 Univariate Normal Density 92
4.1.2 Multivariate Normal Density 92
4.1.3 Generalized Population Variance 93
4.1.4 Diversity Of Applications Of The Multivariate Normal 93
4.2 Properties Of Multivariate Normal Random Variables 94
4.3 Estimation In The Multivariate Normal 99
4.3.1 Maximum Likelihood Estimation 99
4.3.2 Distribution Of ȳ And S 100
4.4 Assessing Multivariate Normality 101
4.4.1 Investigating Univariate Normality 101
4.4.2 Investigating Multivariate Normality 106
4.5 Transformations To Normality 108
4.5.1 Univariate Transformations To Normality 109
4.5.2 Multivariate Transformations To Normality 110
4.6 Outliers 111
4.6.1 Outliers In Univariate Samples 112
4.6.2 Outliers In Multivariate Samples 113
Problems 117
5 Tests On One Or Two Mean Vectors 125
5.1 Multivariate Versus Univariate Tests 125
5.2 Tests On μ With Σ Known 126
5.2.1 Review Of Univariate Test For H0: μ = μ0 With σ Known 126
5.2.2 Multivariate Test For H0: μ = μ0 With Σ Known 127
5.3 Tests On μ When Σ Is Unknown 130
5.3.1 Review Of Univariate t-Test For H0: μ = μ0 With σ Unknown 130
5.3.2 Hotelling’s T²-Test For H0: μ = μ0 With Σ Unknown 131
5.4 Comparing Two Mean Vectors 134
5.4.1 Review Of Univariate Two-Sample t-Test 134
5.4.2 Multivariate Two-Sample T²-Test 135
5.4.3 Likelihood Ratio Tests 139
5.5 Tests On Individual Variables Conditional On Rejection Of H0 By The T²-Test 139
5.6 Computation Of T² 143
5.6.1 Obtaining T² From A MANOVA Program 143
5.6.2 Obtaining T² From Multiple Regression 144
5.7 Paired Observations Test 145
5.7.1 Univariate Case 145
5.7.2 Multivariate Case 147
5.8 Test For Additional Information 149
5.9 Profile Analysis 152
5.9.1 One-Sample Profile Analysis 152
5.9.2 Two-Sample Profile Analysis 154
Problems 161
6 Multivariate Analysis Of Variance 169
6.1 One-Way Models 169
6.1.1 Univariate One-Way Analysis Of Variance (ANOVA) 169
6.1.2 Multivariate One-Way Analysis Of Variance Model (MANOVA) 171
6.1.3 Wilks’ Test Statistic 174
6.1.4 Roy’s Test 178
6.1.5 Pillai And Lawley-Hotelling Tests 179
6.1.6 Unbalanced One-Way MANOVA 181
6.1.7 Summary Of The Four Tests And Relationship To T² 182
6.1.8 Measures Of Multivariate Association 186
6.2 Comparison Of The Four MANOVA Test Statistics 189
6.3 Contrasts 191
6.3.1 Univariate Contrasts 191
6.3.2 Multivariate Contrasts 192
6.4 Tests On Individual Variables Following Rejection Of H0 By The Overall MANOVA Test 195
6.5 Two-Way Classification 198
6.5.1 Review Of Univariate Two-Way ANOVA 198
6.5.2 Multivariate Two-Way MANOVA 201
6.6 Other Models 207
6.6.1 Higher-Order Fixed Effects 207
6.6.2 Mixed Models 208
6.7 Checking On The Assumptions 210
6.8 Profile Analysis 211
6.9 Repeated Measures Designs 215
6.9.1 Multivariate Versus Univariate Approach 215
6.9.2 One-Sample Repeated Measures Model 219
6.9.3 k-Sample Repeated Measures Model 222
6.9.4 Computation Of Repeated Measures Tests 224
6.9.5 Repeated Measures With Two Within-Subjects Factors And One Between-Subjects Factor 224
6.9.6 Repeated Measures With Two Within-Subjects Factors And Two Between-Subjects Factors 230
6.9.7 Additional Topics 232
6.10 Growth Curves 232
6.10.1 Growth Curve For One Sample 232
6.10.2 Growth Curves For Several Samples 239
6.10.3 Additional Topics 241
6.11 Tests On A Subvector 241
6.11.1 Test For Additional Information 241
6.11.2 Stepwise Selection Of Variables 243
Problems 244
7 Tests On Covariance Matrices 259
7.1 Introduction 259
7.2 Testing A Specified Pattern For Σ 259
7.2.1 Testing H0: Σ = Σ0 260
7.2.2 Testing Sphericity 261
7.2.3 Testing H0: Σ = σ²[(1 – ρ)I + ρJ] 263
7.3 Tests Comparing Covariance Matrices 265
7.3.1 Univariate Tests Of Equality Of Variances 265
7.3.2 Multivariate Tests Of Equality Of Covariance Matrices 266
7.4 Tests Of Independence 269
7.4.1 Independence Of Two Subvectors 269
7.4.2 Independence Of Several Subvectors 271
7.4.3 Test For Independence Of All Variables 275
Problems 276
8 Discriminant Analysis: Description Of Group Separation 281
8.1 Introduction 281
8.2 The Discriminant Function For Two Groups 282
8.3 Relationship Between Two-Group Discriminant Analysis And Multiple Regression 286
8.4 Discriminant Analysis For Several Groups 288
8.4.1 Discriminant Functions 288
8.4.2 A Measure Of Association For Discriminant Functions 292
8.5 Standardized Discriminant Functions 292
8.6 Tests Of Significance 294
8.6.1 Tests For The Two-Group Case 294
8.6.2 Tests For The Several-Group Case 295
8.7 Interpretation Of Discriminant Functions 298
8.7.1 Standardized Coefficients 298
8.7.2 Partial F-Values 299
8.7.3 Correlations Between Variables And Discriminant Functions 300
8.7.4 Rotation 301
8.8 Scatterplots 301
8.9 Stepwise Selection Of Variables 303
Problems 306
9 Classification Analysis: Allocation Of Observations To Groups 309
9.1 Introduction 309
9.2 Classification Into Two Groups 310
9.3 Classification Into Several Groups 314
9.3.1 Equal Population Covariance Matrices: Linear Classification Functions 315
9.3.2 Unequal Population Covariance Matrices: Quadratic Classification Functions 317
9.4 Estimating Misclassification Rates 318
9.5 Improved Estimates Of Error Rates 320
9.5.1 Partitioning The Sample 321
9.5.2 Holdout Method 322
9.6 Subset Selection 322
9.7 Nonparametric Procedures 326
9.7.1 Multinomial Data 326
9.7.2 Classification Based On Density Estimators 327
9.7.3 Nearest Neighbor Classification Rule 330
9.7.4 Classification Trees 331
Problems 336
10 Multivariate Regression 339
10.1 Introduction 339
10.2 Multiple Regression: Fixed X’s 340
10.2.1 Model For Fixed X’s 340
10.2.2 Least Squares Estimation In The Fixed-X Model 342
10.2.3 An Estimator For σ² 343
10.2.4 The Model Corrected For Means 344
10.2.5 Hypothesis Tests 346
10.2.6 R² In Fixed-X Regression 349
10.2.7 Subset Selection 350
10.3 Multiple Regression: Random X’s 354
10.4 Multivariate Multiple Regression: Estimation 354
10.4.1 The Multivariate Linear Model 354
10.4.2 Least Squares Estimation In The Multivariate Model 356
10.4.3 Properties Of Least Squares Estimator B 358
10.4.4 An Estimator For Σ 360
10.4.5 Model Corrected For Means 361
10.4.6 Estimation In The Seemingly Unrelated Regressions (SUR) Model 362
10.5 Multivariate Multiple Regression: Hypothesis Tests 364
10.5.1 Test Of Overall Regression 364
10.5.2 Test On A Subset Of The X’s 367
10.6 Multivariate Multiple Regression: Prediction 370
10.6.1 Confidence Interval For E(Y0) 370
10.6.2 Prediction Interval For A Future Observation Y0 371
10.7 Measures Of Association Between The Y’s And The X’s 372
10.8 Subset Selection 374
10.8.1 Stepwise Procedures 374
10.8.2 All Possible Subsets 377
10.9 Multivariate Regression: Random X’s 380
Problems 381
11 Canonical Correlation 385
11.1 Introduction 385
11.2 Canonical Correlations And Canonical Variates 385
11.3 Properties Of Canonical Correlations 390
11.4 Tests Of Significance 391
11.4.1 Tests Of No Relationship Between The Y’s And The X’s 391
11.4.2 Test Of Significance Of Succeeding Canonical Correlations After The First 393
11.5 Interpretation 395
11.5.1 Standardized Coefficients 396
11.5.2 Correlations Between Variables And Canonical Variates 397
11.5.3 Rotation 397
11.5.4 Redundancy Analysis 398
11.6 Relationships Of Canonical Correlation Analysis To Other Multivariate Techniques 398
11.6.1 Regression 398
11.6.2 MANOVA And Discriminant Analysis 400
Problems 402
12 Principal Component Analysis 405
12.1 Introduction 405
12.2 Geometric And Algebraic Bases Of Principal Components 406
12.2.1 Geometric Approach 406
12.2.2 Algebraic Approach 410
12.3 Principal Components And Perpendicular Regression 412
12.4 Plotting Of Principal Components 414
12.5 Principal Components From The Correlation Matrix 419
12.6 Deciding How Many Components To Retain 423
12.7 Information In The Last Few Principal Components 427
12.8 Interpretation Of Principal Components 427
12.8.1 Special Patterns In S Or R 427
12.8.2 Rotation 429
12.8.3 Correlations Between Variables And Principal Components 429
12.9 Selection Of Variables 430
Problems 432
13 Exploratory Factor Analysis 435
13.1 Introduction 435
13.2 Orthogonal Factor Model 437
13.2.1 Model Definition And Assumptions 437
13.2.2 Nonuniqueness Of Factor Loadings 441
13.3 Estimation Of Loadings And Communalities 442
13.3.1 Principal Component Method 443
13.3.2 Principal Factor Method 448
13.3.3 Iterated Principal Factor Method 450
13.3.4 Maximum Likelihood Method 452
13.4 Choosing The Number Of Factors, M 453
13.5 Rotation 457
13.5.1 Introduction 457
13.5.2 Orthogonal Rotation 458
13.5.3 Oblique Rotation 462
13.5.4 Interpretation 465
13.6 Factor Scores 466
13.7 Validity Of The Factor Analysis Model 470
13.8 Relationship Of Factor Analysis To Principal Component Analysis 475
Problems 476
14 Confirmatory Factor Analysis 479
14.1 Introduction 479
14.2 Model Specification And Identification 480
14.2.1 Confirmatory Factor Analysis Model 480
14.2.2 Identified Models 482
14.3 Parameter Estimation And Model Assessment 487
14.3.1 Maximum Likelihood Estimation 487
14.3.2 Least Squares Estimation 488
14.3.3 Model Assessment 489
14.4 Inference For Model Parameters 492
14.5 Factor Scores 495
Problems 496
15 Cluster Analysis 501
15.1 Introduction 501
15.2 Measures Of Similarity Or Dissimilarity 502
15.3 Hierarchical Clustering 505
15.3.1 Introduction 505
15.3.2 Single Linkage (Nearest Neighbor) 506
15.3.3 Complete Linkage (Farthest Neighbor) 508
15.3.4 Average Linkage 511
15.3.5 Centroid 514
15.3.6 Median 514
15.3.7 Ward’s Method 517
15.3.8 Flexible Beta Method 520
15.3.9 Properties Of Hierarchical Methods 521
15.3.10 Divisive Methods 529
15.4 Nonhierarchical Methods 531
15.4.1 Partitioning 532
15.4.2 Other Methods 540
15.5 Choosing The Number Of Clusters 544
15.6 Cluster Validity 546
15.7 Clustering Variables 547
Problems 548
16 Graphical Procedures 555
16.1 Multidimensional Scaling 555
16.1.1 Introduction 555
16.1.2 Metric Multidimensional Scaling 556
16.1.3 Nonmetric Multidimensional Scaling 560
16.2 Correspondence Analysis 565
16.2.1 Introduction 565
16.2.2 Row And Column Profiles 566
16.2.3 Testing Independence 570
16.2.4 Coordinates For Plotting Row And Column Profiles 572
16.2.5 Multiple Correspondence Analysis 576
16.3 Biplots 580
16.3.1 Introduction 580
16.3.2 Principal Component Plots 581
16.3.3 Singular Value Decomposition Plots 583
16.3.4 Coordinates 583
16.3.5 Other Methods 585
Problems 588
Appendix A: Tables 597
Appendix B: Answers And Hints To Problems 637
Appendix C: Data Sets And SAS Files 727
References 728
Index 745