Multivariate Statistics: Exercises and Solutions, 2nd Edition PDF by Wolfgang Karl Härdle and Zdeněk Hlávka

Contents:

Part I Descriptive Techniques
1 Comparison of Batches

Part II Multivariate Random Variables
2 A Short Excursion into Matrix Algebra
3 Moving to Higher Dimensions
4 Multivariate Distributions
5 Theory of the Multinormal
6 Theory of Estimation
7 Hypothesis Testing

Part III Multivariate Techniques
8 Regression Models
9 Variable Selection
10 Decomposition of Data Matrices by Factors
11 Principal Component Analysis
12 Factor Analysis
13 Cluster Analysis
14 Discriminant Analysis
15 Correspondence Analysis
16 Canonical Correlation Analysis
17 Multidimensional Scaling
18 Conjoint Measurement Analysis
19 Applications in Finance
20 Highly Interactive, Computationally Intensive Techniques

A Data Sets
A.1 Athletic Records Data
A.2 Bank Notes Data
A.3 Bankruptcy Data
A.4 Car Data
A.5 Car Marks
A.6 Classic Blue Pullover Data
A.7 Fertilizer Data
A.8 French Baccalauréat Frequencies
A.9 French Food Data
A.10 Geopol Data
A.11 German Annual Population Data
A.12 Journals Data
A.13 NYSE Returns Data
A.14 Plasma Data
A.15 Time Budget Data
A.16 Unemployment Data
A.17 U.S. Companies Data
A.18 U.S. Crime Data
A.19 U.S. Health Data
A.20 Vocabulary Data
A.21 WAIS Data

References

Index

Preface to the Second Edition

I have always had an idea that I would have made a highly efficient criminal. This is the chance of my lifetime in that direction. See here! This is a first-class, up-to-date burgling kit, with nickel-plated Jimmy, diamond-tipped glass-cutter, adaptable keys, and every modern improvement which the march of civilization demands.

Sherlock Holmes in “The Adventure of Charles Augustus Milverton”

Statistical science has seen new paradigms and more complex and richer data sets. These include data on human genomics, social networks, huge climate and weather data, and, of course, high frequency financial and economic data. The statistical community has reacted to these challenges by developing modern mathematical tools and by advancing computational techniques, e.g., through fresher Quantlets and better hardware and software platforms. As a consequence, the book Härdle, W. and Simar, L. (2015) Applied Multivariate Statistical Analysis, 4th ed., Springer Verlag, had to be adjusted and partly beefed up with more easily accessible tools and figures. An extra chapter on regression models with variable selection was introduced and dimension reduction methods were discussed.

These new elements had to be reflected in the exercises and solutions book as well. All figures have now been completely redesigned in the freely available software R (R Core Team, 2013), which implements the classical statistical interactive language S (Becker, Chambers, & Wilks, 1988; Chambers & Hastie, 1992). The R code for the classical multivariate analysis in Chaps. 11–17 is mostly based on the library MASS (Venables & Ripley, 2002). Throughout the book, some examples are implemented directly in the R programming language, but we have also used functions from the R libraries aplpack (Wolf, 2012), ca (Nenadic & Greenacre, 2007), car (Fox & Weisberg, 2011), depth (Genest, Masse, & Plante, 2012), dr (Weisberg, 2002), glmnet (Friedman, Hastie, & Tibshirani, 2010), hexbin (Carr, Lewin-Koh, & Maechler, 2011), kernlab (Karatzoglou, Smola, Hornik, & Zeileis, 2004), KernSmooth (Wand, 2012), lasso2 (Lokhorst, Venables, Turlach, & Maechler, 2013), locpol (Cabrera, 2012), MASS (Venables & Ripley, 2002), mvpart (Therneau, Atkinson, Ripley, Oksanen, & De'ath, 2012), quadprog (Turlach & Weingessel, 2011), scatterplot3d (Ligges & Mächler, 2003), stats (R Core Team, 2013), tseries (Trapletti & Hornik, 2012), and zoo (Zeileis & Grothendieck, 2005). All data sets and computer code (quantlets) in R and MATLAB may be downloaded via the quantlet download center, www.quantlet.org, or the Springer web page. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi (Swayne, Lang, Buja, & Cook, 2003; Lang, Swayne, Wickham, & Lawrence, 2012).

As the number of available R libraries and functions steadily increases, one should always consult the multivariate task view at http://www.r-project.org before starting any new analysis. As before, analogues of all quantlets in the MATLAB language are also available at the quantlet download center.

The set of exercises was extended and all quantlets have been revised and optimized. Such a project would not be possible without the help of numerous colleagues and students. We also gratefully acknowledge the support of our cooperation via the Erasmus program and through the Faculty of Mathematics and Physics at Charles University in Prague and C.A.S.E., the Centre for Applied Statistics and Economics at Humboldt-Universität zu Berlin.

We thank the following students who contributed some of the R codes used in the second edition: Alena Babiaková, Dana Chromíková, Petra Černayová, Tomáš Hovorka, Kristýna Ivanková, Monika Jakubcová, Lucia Jarešová, Barbora Lebdušková, Tomáš Marada, Michaela Maršálková, Jaroslav Pazdera, Jakub Pečánka, Jakub Petrásek, Radka Picková, Kristýna Sionová, Ondřej Šedivý, and Ivana Žohová. We thank Awdesch Melzer, who carefully reviewed all R codes and pointed out several errors that escaped our attention in the first edition of this book.

We also acknowledge the support of the Deutsche Forschungsgemeinschaft through CRC 649 “Economic Risk” and IRTG 1792 “High Dimensional Non Stationary Time Series Analysis”.

Wolfgang K. Härdle, Berlin, Germany
Zdeněk Hlávka, Prague, Czech Republic
May 2015
