Chapter 1
Basic Concepts for Multivariate Statistics

1.1  Introduction   1
1.2  Population Versus Sample   2
1.3  Elementary Tools for Understanding Multivariate Data   3
1.4  Data Reduction, Description, and Estimation   6
1.5  Concepts from Matrix Algebra   7
1.6  Multivariate Normal Distribution   21
1.7  Concluding Remarks   23
               1.1 Introduction
Data are information. Most crucial scientific, sociological, political, economic, and business decisions are made based on data analysis. Often data are available in abundance, but by themselves they are of little help unless they are summarized and an appropriate interpretation of the summary quantities is made. However, such a summary and corresponding interpretation can rarely be made just by looking at the raw data. A careful scientific scrutiny and analysis of these data can usually provide an enormous amount of valuable information. Often such an analysis may not be obtained just by computing simple averages. Admittedly, the more complex the data and their structure, the more involved the data analysis.
The complexity in a data set may exist for a variety of reasons. For example, the data set may contain too many observations that stand out and whose presence in the data cannot be justified by any simple explanation. Such observations are often viewed as influential observations or outliers. Deciding which observation is or is not an influential one is a difficult problem. For a brief review of some graphical and formal approaches to this problem, see Khattree and Naik (1999). A good, detailed discussion of these topics can be found in Belsley, Kuh, and Welsch (1980), Belsley (1991), Cook and Weisberg (1982), and Chatterjee and Hadi (1988).
Another situation in which a simple analysis based on averages alone may not suffice occurs when the data on some of the variables are correlated or when there is a trend present in the data. Such a situation often arises when data are collected over time. For example, when the data are collected on a single patient or a group of patients under a given treatment, we are rarely interested in knowing the average response over time. What we are interested in is observing any changes in the values, that is, in observing any patterns or trends.
2    Multivariate Data Reduction and Discrimination with SAS Software

Many times, data are collected on a number of units, and on each unit not just one, but many variables are measured. For example, in a psychological experiment, many tests are used, and each individual is subjected to all these tests. Since these are measurements on the same unit (an individual), these measurements (or variables) are correlated and, while summarizing the data on all these variables, this set of correlations (or some equivalent quantity) should be an integral part of this summary. Further, when many variables exist, in order to obtain more definite and more easily comprehensible information, this correlation summary (and its structure) should be subjected to further analysis. There are many other possible ways in which a data set can be quite complex for analysis.
However, it is the last situation that is of interest to us in this book. Specifically, we may have n individual units, and on each unit we have observed (the same) p different characteristics (variables), say x_1, x_2, ..., x_p. Then these data can be presented as an n by p matrix

$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots &        &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}.
$$
Of course, the measurements in the ith row, namely x_{i1}, ..., x_{ip}, which are the measurements on the same unit, are correlated. If we arrange them in a column vector x_i defined as

$$
\mathbf{x}_i = \begin{pmatrix} x_{i1} \\ \vdots \\ x_{ip} \end{pmatrix},
$$

then x_i can be viewed as a multivariate observation. Thus, the n rows of matrix X correspond to n multivariate observations (written as rows within this matrix), and the measurements within each x_i are usually correlated. There may or may not be a correlation between x_1, ..., x_n. Usually, x_1, ..., x_n are assumed to be uncorrelated (or statistically independent, as a stronger assumption), but this may not always be so. For example, if x_i, i = 1, ..., n, contains measurements on the height and weight of the ith brother in a family with n brothers, then it is reasonable to assume that some kind of correlation may exist between the rows of X as well.
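The book's own examples use SAS software; purely as an illustration of this data layout, here is a minimal NumPy sketch with hypothetical height and weight measurements, where each row of X is one multivariate observation x_i:

```python
import numpy as np

# Hypothetical data: n = 4 units (individuals), p = 2 variables
# (height in cm, weight in kg). Each row x_i is one multivariate
# observation; the two entries within a row are typically correlated.
X = np.array([
    [170.0, 65.0],
    [182.0, 80.0],
    [165.0, 58.0],
    [175.0, 72.0],
])

n, p = X.shape        # n by p data matrix
x_1 = X[0]            # the first multivariate observation (row 1 of X)
print(n, p)           # 4 2
```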
                                       For much of what is considered in this book, we will not concern ourselves with the
                                     scenario in which rows of the data matrix X are also correlated. In other words, when rows
                                     of X constitute a sample, such a sample will be assumed to be statistically independent.
                                     However, before we elaborate on this, we should briefly comment on sampling issues.
               1.2 Population Versus Sample
As we pointed out, the rows in the n by p data matrix X are viewed as multivariate observations on n units. If the set of these n units constitutes the entire (finite) set of all possible units, then we have data available on the entire reference population. An example of such a situation is the data collected on all cities in the United States that have a population of 1,000,000 or more, and on three variables, namely, cost of living, average annual salary, and the quality of health care facilities. Since each U.S. city that meets the definition is included, any summary of these data will be the true summary of the population.
However, more often than not, the data are obtained through a survey in which, on each of the units, all p characteristics are measured. Such a situation represents a multivariate sample. A sample (adequately or poorly) represents the underlying population from which it is taken. As the population is now represented through only a few units taken from it, any summary derived from the sample merely approximates the true population summary, in the sense that we hope it will generally be close to the true summary, although no assurance of an exact match between the two can be given.
How can we measure and ensure that the summary from a sample is a good representative of the population summary? To quantify it, some kind of index based on probabilistic ideas seems appropriate. That requires one to build some kind of probabilistic structure over these units. This is done by artificially and intentionally introducing the probabilistic structure into the sampling scheme. Of course, since we want to ensure that the sample is a good representative of the population, the probabilistic structure should be such that it treats all the population units in an equally fair way. Thus, we require that the sampling is done in such a way that each unit of the (finite or infinite) population has an equal chance of being included in the sample. This requirement can be met by simple random sampling, with or without replacement. It may be pointed out that in the case of a finite population and sampling without replacement, observations are not independent, although the strength of dependence diminishes as the sample size increases.
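The two sampling schemes just described are easy to sketch with NumPy's random generator (the finite population of 100 labeled units here is hypothetical, and the choice of NumPy rather than SAS is mine):

```python
import numpy as np

rng = np.random.default_rng(0)
population = np.arange(100)   # a finite population of 100 labeled units

# Simple random sampling WITHOUT replacement: no unit can be drawn twice,
# so the successive draws are not independent of one another.
srs_wor = rng.choice(population, size=10, replace=False)

# Simple random sampling WITH replacement: every draw is made from the
# full population, so the draws are independent.
srs_wr = rng.choice(population, size=10, replace=True)

print(len(set(srs_wor.tolist())))   # 10 -- all sampled units are distinct
```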
                                                       Although a probabilistic structure is introduced over different units through random
                                                   sampling, the same cannot be done for the p different measurements, as there is neither a
                                                   reference population nor do all p measurements (such as weight, height, etc.) necessarily
                                                   represent the same thing. However, there is possibly some inherent dependence between
                                                   these measurements, and this dependence is often assumed and modeled as some joint
                                                   probability distribution. Thus, we view each row of X as a multivariate observation from
                                                   some p-dimensional population that is represented by some p-dimensional multivariate
                                                   distribution. Thus, the rows of X often represent a random sample from a p-dimensional
                                                   population. In much multivariate analysis work, this population is assumed to be infinite
and quite frequently it is assumed to have a multivariate normal distribution. We will briefly discuss the multivariate normal distribution and its properties in Section 1.6.
                     1.3 Elementary Tools for Understanding Multivariate Data
To understand a large data set on several mutually dependent variables, we must somehow summarize it. For univariate data, when there is only one variable under consideration, the data are usually summarized by the (population or sample) mean, variance, skewness, and kurtosis. These are the basic quantities used for data description. For multivariate data, their counterparts are defined in a similar way. However, the description is greatly simplified if matrix notation is used. Some of the matrix terminology used here is defined later in Section 1.5.
Let x be the p by 1 random vector corresponding to the multivariate population under consideration. If we let

$$
\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_p \end{pmatrix},
$$

then each x_i is a random variable, and we assume that x_1, ..., x_p are possibly dependent.
With E(·) representing the mathematical expectation (interpreted as the long-run average), let µ_i = E(x_i), and let σ_ii = var(x_i) be the population variance of x_i. Further, let the population covariance between x_i and x_j be σ_ij = cov(x_i, x_j). Then we define the population mean vector E(x) as the vector of term-by-term expectations. That is,

$$
E(\mathbf{x}) = \begin{pmatrix} E(x_1) \\ \vdots \\ E(x_p) \end{pmatrix}
= \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_p \end{pmatrix}
= \boldsymbol{\mu} \ \text{(say)}.
$$
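For a sample arranged as an n by p data matrix, the term-by-term expectation is naturally estimated by the vector of column means. A minimal NumPy sketch, reusing the hypothetical height/weight data from before (the book itself works in SAS):

```python
import numpy as np

# Hypothetical sample: rows are observations, columns are variables.
X = np.array([
    [170.0, 65.0],
    [182.0, 80.0],
    [165.0, 58.0],
    [175.0, 72.0],
])

# Term-by-term expectation: average each variable (column) separately,
# giving the sample mean vector, the estimate of E(x).
mean_vector = X.mean(axis=0)   # equals [173.0, 68.75] for these data
print(mean_vector)
```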
Additionally, the concept of population variance is generalized to the matrix with all the population variances and covariances placed appropriately within a variance-covariance matrix. Specifically, if we denote the variance-covariance matrix of x by D(x), then

$$
D(\mathbf{x}) = \begin{pmatrix}
\operatorname{var}(x_1)      & \operatorname{cov}(x_1,x_2) & \cdots & \operatorname{cov}(x_1,x_p) \\
\operatorname{cov}(x_2,x_1)  & \operatorname{var}(x_2)     & \cdots & \operatorname{cov}(x_2,x_p) \\
\vdots                       &                             &        & \vdots \\
\operatorname{cov}(x_p,x_1)  & \operatorname{cov}(x_p,x_2) & \cdots & \operatorname{var}(x_p)
\end{pmatrix}
= \begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots      &             &        & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{pmatrix}
= (\sigma_{ij}) = \Sigma \ \text{(say)}.
$$
That is, with the understanding that cov(x_i, x_i) = var(x_i) = σ_ii, the term cov(x_i, x_j) appears as the (i, j)th entry in matrix Σ. Thus, the variance of the ith variable appears at the ith diagonal place, and all covariances are appropriately placed at the nondiagonal places. Since cov(x_i, x_j) = cov(x_j, x_i), we have σ_ij = σ_ji for all i, j. Thus, the matrix D(x) = Σ is symmetric. The other alternative notations for D(x) are cov(x) and var(x), and it is often also referred to as the dispersion matrix, the variance-covariance matrix, or simply the covariance matrix. We will use the three terms interchangeably.

The quantity tr(Σ) (read as the trace of Σ), that is, $\sum_{i=1}^{p} \sigma_{ii}$, is called the total variance, and |Σ| (the determinant of Σ) is referred to as the generalized variance. The two are often taken as overall measures of the variability of the random vector x. However, sometimes their use can be misleading. Specifically, the total variance tr(Σ) completely ignores the nondiagonal terms of Σ that represent the covariances. At the same time, two very different matrices may yield the same value of the generalized variance.
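These two summary measures are straightforward to compute from a sample covariance matrix; a short NumPy sketch on simulated data (an illustration only — the book's own computations are carried out in SAS):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))   # hypothetical sample: n = 200, p = 3

# Sample variance-covariance matrix; rowvar=False says rows of X
# are observations and columns are variables.
S = np.cov(X, rowvar=False)

total_variance = np.trace(S)              # tr(S): sum of diagonal variances
generalized_variance = np.linalg.det(S)   # |S|: the generalized variance

# S is symmetric, since sigma_ij = sigma_ji.
print(np.allclose(S, S.T))   # True
```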
As there exists dependence between x_1, ..., x_p, it is also meaningful to at least measure the degree of linear dependence. It is often measured using the correlations. Specifically, let

$$
\rho_{ij} = \frac{\operatorname{cov}(x_i, x_j)}{\sqrt{\operatorname{var}(x_i)\operatorname{var}(x_j)}}
= \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}}
$$

be Pearson's population correlation coefficient between x_i and x_j. Then we define the population correlation matrix as

$$
\boldsymbol{\rho} = (\rho_{ij}) = \begin{pmatrix}
\rho_{11} & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{21} & \rho_{22} & \cdots & \rho_{2p} \\
\vdots    &           &        & \vdots \\
\rho_{p1} & \rho_{p2} & \cdots & \rho_{pp}
\end{pmatrix}
= \begin{pmatrix}
1         & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{21} & 1         & \cdots & \rho_{2p} \\
\vdots    &           &        & \vdots \\
\rho_{p1} & \rho_{p2} & \cdots & 1
\end{pmatrix}.
$$
As was the case for Σ, ρ is also symmetric. Further, ρ can be expressed in terms of Σ as

$$
\boldsymbol{\rho} = [\operatorname{diag}(\Sigma)]^{-1/2}\,\Sigma\,[\operatorname{diag}(\Sigma)]^{-1/2},
$$

where diag(Σ) is the diagonal matrix obtained by retaining the diagonal elements of Σ and by replacing all the nondiagonal elements by zero. Further, the square root of a matrix A, denoted by A^{1/2}, is a matrix satisfying A = A^{1/2} A^{1/2}; it is defined in Section 1.5. Also, A^{-1/2} represents the inverse of the matrix A^{1/2}.
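Because diag(Σ) is diagonal, its inverse square root is simply the diagonal matrix of 1/√σ_ii, so the relation between ρ and Σ can be checked numerically on a sample covariance matrix. A NumPy sketch (illustration only, on simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))     # hypothetical sample: n = 500, p = 3
S = np.cov(X, rowvar=False)       # sample covariance matrix

# rho = [diag(S)]^(-1/2)  S  [diag(S)]^(-1/2):
# diag(S) is diagonal, so its inverse square root is just 1/sqrt(s_ii).
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(S)))
rho = D_inv_sqrt @ S @ D_inv_sqrt

# Agrees with NumPy's direct correlation computation, and has unit diagonal.
print(np.allclose(rho, np.corrcoef(X, rowvar=False)))   # True
print(np.allclose(np.diag(rho), 1.0))                   # True
```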
It may be mentioned that the variance-covariance and the correlation matrices are always nonnegative definite (see Section 1.5 for a discussion). For most of the discussion in this book, however, these matrices will be assumed to be positive definite. In view of this assumption, these matrices will also admit their respective inverses.
How do we generalize (and measure) the skewness and kurtosis for a multivariate population? Mardia (1970) defines these measures as

$$
\text{multivariate skewness:}\quad
\beta_{1,p} = E\!\left[\,\{(\mathbf{x} - \boldsymbol{\mu})'\,\Sigma^{-1}\,(\mathbf{y} - \boldsymbol{\mu})\}^{3}\,\right],
$$

where x and y are independent and identically distributed random vectors with mean µ and variance-covariance matrix Σ.
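Mardia's population skewness has a standard sample analogue b_{1,p}, obtained by replacing µ and Σ with the sample mean and sample covariance and averaging over all pairs of observations. A rough NumPy sketch (my own illustration, not the book's SAS code; the use of the biased sample covariance here follows common practice but is an assumption on my part):

```python
import numpy as np

def mardia_skewness(X):
    """Sample version of Mardia's multivariate skewness b_{1,p}: the
    average over all pairs (i, j) of the cubed Mahalanobis cross-product
    (x_i - xbar)' S^{-1} (x_j - xbar)."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)  # biased (ML) covariance, assumed
    centered = X - xbar
    G = centered @ np.linalg.inv(S) @ centered.T  # n x n cross-product matrix
    return (G ** 3).sum() / n**2

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 2))
b1p = mardia_skewness(X)
# For multivariate normal data, b_{1,p} should be close to 0.
print(b1p)
```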