                          Munich Personal RePEc Archive
        A note on matrix differentiation
        Kowal, Pawel
        December 2006
        Online at https://mpra.ub.uni-muenchen.de/3917/
        MPRA Paper No. 3917, posted 09 Jul 2007 UTC
                   A note on matrix differentiation
                              Paweł Kowal
                              July 9, 2007
                               Abstract
This paper presents a set of rules for matrix differentiation with respect to a vector of parameters, using the flattened representation of derivatives, i.e. in the form of a matrix. We also introduce a new set of Kronecker tensor products of matrices. Finally, we consider the problem of differentiating the matrix determinant, trace and inverse.
JEL classification: C00
Keywords: matrix differentiation, generalized Kronecker products
1 Introduction
Derivatives of matrices with respect to a vector of parameters can be expressed as a concatenation of derivatives with respect to scalar parameters. However, such a representation of derivatives is very inconvenient in some applications, e.g. if higher order derivatives are considered, and it is not even applicable if matrix functions (like the determinant or the inverse) are present. For example, finding an explicit derivative of det(∂X/∂θ) would be quite a complicated task. Such problems arise naturally in many applications, e.g. in the maximum likelihood approach to estimating model parameters.
The same problems emerge in the case of a tensor representation of derivatives. In this case, additional effort is also required to find the flattened representation of the resulting tensors, which is required since running numerical computations efficiently is possible only in the case of two-dimensional data structures.
In this paper we derive formulas for differentiating matrices with respect to a vector of parameters when one requires the flattened form of the resulting derivatives, i.e. a representation of derivatives in the form of matrices. To do this we introduce a new set of Kronecker matrix products as well as a generalized matrix transposition. Then first order and higher order derivatives of functions that are compositions of primitive functions using elementary matrix operations, like summation, multiplication, transposition and the Kronecker product, can be expressed in closed form in terms of the primitive matrix functions and their derivatives, using these elementary operations, the generalized Kronecker products and the generalized transpositions.
We also consider more general functions containing matrix functions (inverse, trace and determinant). By defining a generalized trace function we are able to express derivatives of such functions in closed form.
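2 Matrix differentiation rules
Let us consider smooth functions $\Omega \ni \theta \mapsto X(\theta) \in \mathbb{R}^{m \times n}$ and $\Omega \ni \theta \mapsto Y(\theta) \in \mathbb{R}^{p \times q}$, where $\Omega \subset \mathbb{R}^{k}$ is an open set. The functions $X$ and $Y$ associate an $m \times n$ and a $p \times q$ matrix with a given vector of parameters $\theta = \mathrm{col}(\theta_1, \theta_2, \ldots, \theta_k)$. Let the differential of the function $X$ with respect to $\theta$ be defined as
\[
\frac{\partial X}{\partial \theta} = \begin{bmatrix} \dfrac{\partial X}{\partial \theta_1} & \dfrac{\partial X}{\partial \theta_2} & \cdots & \dfrac{\partial X}{\partial \theta_k} \end{bmatrix}
\]
for $\partial X / \partial \theta_i \in \mathbb{R}^{m \times n}$, $i = 1, 2, \ldots, k$.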
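The flattened differential defined above can be illustrated numerically. The following is a minimal sketch, not part of the paper: the helper `flat_derivative`, the test function `X` and the point `theta0` are hypothetical names chosen for illustration, and the partial derivatives are approximated by central finite differences.

```python
import numpy as np

def flat_derivative(f, theta, eps=1e-6):
    """Approximate [dX/dtheta_1 ... dX/dtheta_k] by central differences."""
    theta = np.asarray(theta, dtype=float)
    blocks = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        blocks.append((f(theta + step) - f(theta - step)) / (2 * eps))
    # Concatenate the k blocks of size m x n side by side: result is m x (n*k).
    return np.hstack(blocks)

def X(theta):
    # Hypothetical 2 x 2 matrix function of theta in R^3.
    t1, t2, t3 = theta
    return np.array([[t1 * t2, np.sin(t3)],
                     [t2 ** 2, t1 + t3]])

theta0 = np.array([0.5, -1.0, 2.0])
dX = flat_derivative(X, theta0)   # shape (2, 6): three 2 x 2 blocks
```

Each consecutive group of `n` columns of `dX` holds the derivative with respect to one parameter, so the result stays a two-dimensional array regardless of `k`.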
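Proposition 2.1. The following equations hold
1. $\dfrac{\partial}{\partial \theta}(\alpha X) = \alpha \dfrac{\partial X}{\partial \theta}$
2. $\dfrac{\partial}{\partial \theta}(X + Y) = \dfrac{\partial X}{\partial \theta} + \dfrac{\partial Y}{\partial \theta}$
3. $\dfrac{\partial}{\partial \theta}(X \times Y) = \dfrac{\partial X}{\partial \theta} \times (I_k \otimes Y) + X \times \dfrac{\partial Y}{\partial \theta}$
where $\alpha \in \mathbb{R}$ and $I_k$ is the $k \times k$ identity matrix, assuming that the differentials exist and the matrix dimensions coincide.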
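Proof. The first two cases are obvious. We have
\begin{align*}
\frac{\partial}{\partial \theta}(X \times Y)
&= \begin{bmatrix} \dfrac{\partial X}{\partial \theta_1} \times Y + X \times \dfrac{\partial Y}{\partial \theta_1} & \cdots & \dfrac{\partial X}{\partial \theta_k} \times Y + X \times \dfrac{\partial Y}{\partial \theta_k} \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{\partial X}{\partial \theta_1} & \cdots & \dfrac{\partial X}{\partial \theta_k} \end{bmatrix} \times \begin{bmatrix} Y & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & Y \end{bmatrix} + X \times \begin{bmatrix} \dfrac{\partial Y}{\partial \theta_1} & \cdots & \dfrac{\partial Y}{\partial \theta_k} \end{bmatrix} \\
&= \frac{\partial X}{\partial \theta} \times (I_k \otimes Y) + X \times \frac{\partial Y}{\partial \theta}
\end{align*}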
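The product rule of Proposition 2.1 can be checked numerically. The sketch below is illustrative only and makes several assumptions: the test functions are affine in θ (built from hypothetical random arrays `A` and `B`), so their flattened differentials are known exactly, and the left-hand side is approximated by central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
k, m, n, q = 3, 2, 4, 5

# Hypothetical affine test functions: X(theta) = A[0] + sum_i theta_i * A[i].
A = rng.standard_normal((k + 1, m, n))
B = rng.standard_normal((k + 1, n, q))

def X(theta):
    return A[0] + np.tensordot(theta, A[1:], axes=1)

def Y(theta):
    return B[0] + np.tensordot(theta, B[1:], axes=1)

def flat_derivative(f, theta, eps=1e-5):
    """Flattened differential [df/dtheta_1 ... df/dtheta_k], central differences."""
    blocks = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        blocks.append((f(theta + step) - f(theta - step)) / (2 * eps))
    return np.hstack(blocks)

theta0 = rng.standard_normal(k)
dX = np.hstack(list(A[1:]))   # exact flattened differential of X, m x (n*k)
dY = np.hstack(list(B[1:]))   # exact flattened differential of Y, n x (q*k)

# Left-hand side: differentiate the product directly.
lhs = flat_derivative(lambda t: X(t) @ Y(t), theta0)
# Right-hand side: dX (I_k kron Y) + X dY, as in rule 3.
rhs = dX @ np.kron(np.eye(k), Y(theta0)) + X(theta0) @ dY

rule_holds = np.allclose(lhs, rhs, atol=1e-6)
```

The block-diagonal matrix from the proof is exactly `np.kron(np.eye(k), Y)`, which is why the whole rule reduces to two ordinary matrix products.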
Differentiating a matrix transposition is a little more complicated. Let us define a generalized matrix transposition.
Definition 2.2. Let $X = [X_1, X_2, \ldots, X_n]$, where each $X_i \in \mathbb{R}^{p \times q}$, $i = 1, 2, \ldots, n$, be a partition of a $p \times nq$ dimensional matrix $X$. Then
\[
T_n(X) := \begin{bmatrix} X_1', X_2', \ldots, X_n' \end{bmatrix}
\]
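Definition 2.2 translates directly into a few lines of array code. This is a minimal sketch assuming equal-width column blocks; the function name `gen_transpose` is a hypothetical choice, not notation from the paper.

```python
import numpy as np

def gen_transpose(X, n):
    """T_n(X): split X into n equal column blocks and transpose each block."""
    if X.shape[1] % n != 0:
        raise ValueError("number of columns must be divisible by n")
    return np.hstack([block.T for block in np.hsplit(X, n)])

# Example with p = 2 rows and n = 3 blocks of width q = 3.
X = np.arange(18).reshape(2, 9)
T = gen_transpose(X, 3)          # shape (q, n*p) = (3, 6)
```

For `n = 1` the function reduces to the ordinary transpose, and applying it twice with the same `n` recovers the original matrix, because each transposed block is transposed back.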
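Proposition 2.3. The following equations hold
1. $\dfrac{\partial}{\partial \theta}(X') = T_k\!\left(\dfrac{\partial X}{\partial \theta}\right)$
2. $\dfrac{\partial}{\partial \theta}(T_n(X)) = T_{k \times n}\!\left(\dfrac{\partial X}{\partial \theta}\right)$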
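Proof. The first condition is a special case of the second condition for $n = 1$. We have
\begin{align*}
\frac{\partial}{\partial \theta}(T_n(X))
&= \begin{bmatrix} T_n\!\left(\dfrac{\partial X}{\partial \theta_1}\right) & \cdots & T_n\!\left(\dfrac{\partial X}{\partial \theta_k}\right) \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{\partial X_1'}{\partial \theta_1}, \ldots, \dfrac{\partial X_n'}{\partial \theta_1} & \cdots & \dfrac{\partial X_1'}{\partial \theta_k}, \ldots, \dfrac{\partial X_n'}{\partial \theta_k} \end{bmatrix}
= T_{k \times n}\!\left(\frac{\partial X}{\partial \theta}\right)
\end{align*}
since
\[
\frac{\partial X}{\partial \theta} = \begin{bmatrix} \dfrac{\partial X_1}{\partial \theta_1}, \ldots, \dfrac{\partial X_n}{\partial \theta_1} & \cdots & \dfrac{\partial X_1}{\partial \theta_k}, \ldots, \dfrac{\partial X_n}{\partial \theta_k} \end{bmatrix}
\]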
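Both parts of Proposition 2.3 can be verified numerically in the same spirit. The sketch below assumes a hypothetical affine test function `X` built from a random array `A`, reads the index $k \times n$ as the product $kn$, and uses central finite differences for the left-hand sides; `gen_transpose` and `flat_derivative` are illustrative helper names.

```python
import numpy as np

def gen_transpose(X, n):
    """T_n(X): transpose each of the n equal column blocks of X."""
    return np.hstack([block.T for block in np.hsplit(X, n)])

def flat_derivative(f, theta, eps=1e-5):
    """Flattened differential [df/dtheta_1 ... df/dtheta_k], central differences."""
    blocks = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        blocks.append((f(theta + step) - f(theta - step)) / (2 * eps))
    return np.hstack(blocks)

rng = np.random.default_rng(1)
k, n, p, q = 3, 2, 4, 5

# Hypothetical affine test function with values in R^{p x (n*q)}.
A = rng.standard_normal((k + 1, p, n * q))

def X(theta):
    return A[0] + np.tensordot(theta, A[1:], axes=1)

theta0 = rng.standard_normal(k)
dX = np.hstack(list(A[1:]))      # exact flattened differential, p x (k*n*q)

# Rule 1: d(X')/dtheta = T_k(dX/dtheta).
rule1 = np.allclose(flat_derivative(lambda t: X(t).T, theta0),
                    gen_transpose(dX, k), atol=1e-6)

# Rule 2: d(T_n(X))/dtheta = T_{k*n}(dX/dtheta).
rule2 = np.allclose(flat_derivative(lambda t: gen_transpose(X(t), n), theta0),
                    gen_transpose(dX, k * n), atol=1e-6)
```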
Let us now turn to differentiating tensor products of matrices. Let $X$ and $Y$ be any matrices, where $X \in \mathbb{R}^{p \times q}$ is a matrix with elements $x_{ij} \in \mathbb{R}$ for $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, q$. The Kronecker product $X \otimes Y$ is defined as
\[
X \otimes Y := \begin{bmatrix} x_{11} Y & \cdots & x_{1q} Y \\ \vdots & \ddots & \vdots \\ x_{p1} Y & \cdots & x_{pq} Y \end{bmatrix}
\]
As in the case of differentiating a matrix transposition, we need to introduce a generalized Kronecker product.
Definition 2.4. Let $X = [X_1, X_2, \ldots, X_m]$, where each $X_i \in \mathbb{R}^{p \times q}$, $i = 1, 2, \ldots, m$, be a partition of a $p \times mq$ dimensional matrix $X$. Let $Y = [Y_1, Y_2, \ldots, Y_n]$, where each $Y_i \in \mathbb{R}^{r \times s}$, $i = 1, 2, \ldots, n$, be a partition of an $r \times ns$ dimensional matrix $Y$. Then
\begin{align*}
X \otimes^{1}_{n} Y &:= [X \otimes Y_1, \ldots, X \otimes Y_n] \\
X \otimes^{m}_{n} Y &:= [X_1 \otimes^{1}_{n} Y, \ldots, X_m \otimes^{1}_{n} Y] \\
X \otimes^{1, m_2, \ldots, m_s}_{n_1, n_2, \ldots, n_s} Y &:= [X \otimes^{m_2, \ldots, m_s}_{n_2, \ldots, n_s} Y_1, \ldots, X \otimes^{m_2, \ldots, m_s}_{n_2, \ldots, n_s} Y_{n_1}] \\
X \otimes^{m_1, m_2, \ldots, m_s}_{n_1, n_2, \ldots, n_s} Y &:= [X_1 \otimes^{1, m_2, \ldots, m_s}_{n_1, n_2, \ldots, n_s} Y, \ldots, X_{m_1} \otimes^{1, m_2, \ldots, m_s}_{n_1, n_2, \ldots, n_s} Y]
\end{align*}
assuming that appropriate matrix partitions exist.
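The recursive structure of Definition 2.4 can be sketched in code. This is one possible reading of the definition, not the author's implementation: it assumes equal-width column partitions at every level, treats the product with empty index lists as the ordinary Kronecker product, and uses the hypothetical name `gen_kron`.

```python
import numpy as np

def gen_kron(X, Y, ms, ns):
    """Generalized Kronecker product with upper indices ms and lower indices ns.

    At each level X is split into ms[0] column blocks and Y into ns[0]
    column blocks; the blocks are combined with X-blocks in the outer
    position and Y-blocks in the inner position, then the remaining
    indices are applied recursively.
    """
    if len(ms) != len(ns):
        raise ValueError("ms and ns must have the same length")
    if not ms:
        return np.kron(X, Y)          # no indices left: ordinary product
    return np.hstack([
        gen_kron(Xi, Yj, ms[1:], ns[1:])
        for Xi in np.hsplit(X, ms[0])  # outer loop over blocks of X
        for Yj in np.hsplit(Y, ns[0])  # inner loop over blocks of Y
    ])

X = np.array([[1, 2]])
Y = np.array([[1, 10]])

plain = np.kron(X, Y)                 # [[1, 10, 2, 20]]
one_n = gen_kron(X, Y, (1,), (2,))    # [X kron Y_1, X kron Y_2] = [[1, 2, 10, 20]]
m_n = gen_kron(X, Y, (2,), (2,))      # [X_1 kron^1_2 Y, X_2 kron^1_2 Y] = [[1, 10, 2, 20]]
```

Under this reading the generalized products contain the same columns as the ordinary Kronecker product, only in a different order determined by the partitions.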