A Metagrammar for Vietnamese LTAG

Lê Hồng Phương, LORIA/INRIA Lorraine, Nancy, France, lehong@loria.fr
Nguyễn Thị Minh Huyền, Hanoi University of Science, Hanoi, Vietnam, huyenntm@vnu.edu.vn
Azim Roussanaly, LORIA/INRIA Lorraine, Nancy, France, azim@loria.fr
Abstract

We present in this paper an initial investigation into the use of a metagrammar for explicitly sharing abstract grammatical specifications for the Vietnamese language. We first introduce the essential syntactic mechanisms of the Vietnamese language. We then show that the basic subcategorization frames of Vietnamese can be compactly represented by classes using the XMG formalism (eXtensible MetaGrammar). Finally, we report on the implementation of the first metagrammar producing verbal elementary trees recognizing basic Vietnamese sentences.

1 Introduction

Metagrammars (MG) have recently emerged as a means to develop wide-coverage LTAG for well-studied languages like English, French and Italian (Candito, 1999; Kinyon, 2003). MGs help avoid redundancy and reduce the effort of grammar development by making use of common properties of LTAG elementary trees.

We present in this paper an initial investigation into the use of a metagrammar for explicitly sharing abstract grammatical specifications for the Vietnamese language. We use the eXtensible MetaGrammar (XMG) tool, which was developed by Crabbé (Crabbé, 2005; Parmentier and L. Roux, 2005), to compile a TAG for Vietnamese. The resulting grammar is called vnMG and is freely available online (footnote 1).

Footnote 1: http://www.loria.fr/~lehong/tools/vnMG.php

Only in recent years have Vietnamese researchers begun to be involved in the domain of natural language processing in general and in the task of parsing Vietnamese in particular. No work on formalizing Vietnamese grammar had been reported before (Nguyen et al., 2004). In (Lê et al., 2006), basic declarative structures and complement clauses of Vietnamese sentences were modeled using about thirty elementary trees, representing as many subcategorization frames. We show in this paper that these basic subcategorization frames can be compactly represented by classes in the XMG formalism.

We first introduce the essential syntactic mechanisms of the Vietnamese language. We then show that the basic subcategorization frames of Vietnamese can be compactly represented by classes using the XMG formalism. We then report on the implementation of the first metagrammar producing verbal elementary trees recognizing basic Vietnamese sentences, before concluding.

2 Vietnamese Subcategorizations

As in other isolating languages, the most important source of syntactic information in Vietnamese is word order. The basic word order is Subject – Verb – Object. A verb is always placed after the subject in both predicative and question forms. In a noun phrase, the main noun precedes the adjectives and the genitive follows the governing noun. The other syntactic means are function words, reduplication, and, in the case of spoken language, prosody (Nguyễn et al., 2006).

From the point of view of functional grammar, the syntactic structure of Vietnamese follows a topic-oriented structure. It belongs to the topic-prominent languages as described by (Li and Thompson, 1976). In those languages, topics are
                         Proceedings of The Ninth International Workshop on Tree Adjoining Grammars and Related Formalisms
                                                       Tübingen, Germany. June 6-8, 2008.
coded in the surface structure and they tend to control co-referentiality. The topic-oriented “double subject” construction is a basic sentence type. For example, “Cậu ấy khoẻ mạnh, là sinh viên y khoa / He strong, be student medicine”, which means “He is strong; he is a medical student”. In Vietnamese, passive voice and cleft subject sentences are rare or non-existent.

In general, Vietnamese predicates may be classified into three types depending on whether they need a copula connecting them with their subjects in the declarative and negative forms (Nguyễn, 2004). Complex predicates can be constructed to form coordinated predicative structures starting from these basic types of predicates. We briefly present these three types of Vietnamese predicates in the following subsections.

2.1 First Type Predicates

The first type predicates are predicates which connect directly to their subjects without the need of a copula in both the declarative and negative forms. For example:

• Declarative form: Tôi đọc sách. / I am reading books.

• Negative form: Tôi không đọc sách. / I am not reading books.

These predicates are realized by verbal phrases or adjectival phrases. The fact that an adjective can be a predicate is a specificity of Vietnamese in comparison with predicates of occidental languages. In English or French, for instance, only verbal phrases can be predicates; adjectives in these languages always signify properties of subjects and they always follow the verb “to be” in English or “être” in French.

2.2 Second Type Predicates

The second type predicates are predicates which are connected to their subjects by the copula “là” in the declarative form and by the copulas “không là”, “không phải”, or “không phải là” in the negative form. Predicates of this type are rather rich. They can be:

• Nouns or noun phrases: Tôi là sinh viên. / I am a student.

• Verbs, adjectives, verbal phrases or adjectival phrases: Van xin là yếu đuối. / Begging is feeble. Học cũng là làm việc. / To study is to work.

2.3 Third Type Predicates

The third type predicates are predicates which connect directly to their subjects in the declarative form; in the negative form, however, they are connected to their subjects by a copula. Predicates of this type are usually:

• A clause: Nó vẫn tên là Quþt. / His name is still Quþt.

• A composition of a numeral and a noun: Lê này mười ngàn đồng. / This pear costs ten thousand dongs.

• A composition of a preposition and a noun: Lúa này của chị Hoa. / This is the rice of Ms. Hoa.

• An expression: Thằng ấy đầu bò đầu bướu lắm. / That guy is very stubborn.

2.4 Subcategorizations

In the first LTAG grammar for Vietnamese, presented in (Lê et al., 2006), each subcategorization is represented by the same structure of elementary trees associated with a given predicate. We consider the subject to be subcategorized in the same way as the other arguments. Verbs thus anchor elementary trees composed of a node for the subject and one or more nodes for each of their essential complements.

We follow the de facto standard in TAG, in which each subcategorization is represented by a family of elementary trees. We define families of verbal elementary trees in Table 1.

We present in the next section a metagrammar that generates this set of elementary trees.

3 A Metagrammar for Verbal Trees

The subcategorizations of elementary trees describe only “canonical” constructions of predicative elements, without taking into account relative or question structures. For the purpose of this investigation, we restrict ourselves at this first stage to developing only the verb spines and argument realizations shown in the subcategorizations presented in the previous section.

We have developed an XMG metagrammar that consists of 11 classes (or tree fragments). The
metagrammar is currently able to produce the same set of elementary trees described in Table 1, including intransitive, transitive, and ditransitive families with and/or without optional complements. As an illustration, the declarative transitive structure in Figure 1 can be defined by combining a canonical subject fragment with an active verb and a canonical object fragment.

[Figure: the three tree fragments being combined with "+" (subject, active verb, object), each rooted in S; node labels recoverable from the original are N↓, PredP and V.]

This combination is conveniently expressed by a statement in the XMG language:

    TransitiveVerb = Subject ∧ ActiveVerb ∧ Object

                 S
            /         \
         N0↓          PredP
          |          /     \
         tôi       V⋄      N1↓
                    |       |
                   đọc     sách

Figure 1: Declarative transitive structure αn0Vn1

Subcategorization                               Family             Example
Intransitive                                    N0 V               ngủ / to sleep
With a nominal complement                       N0 V N1            đọc / to read
With a clausal complement                       N0 V S1            tin / to believe
With modal complement                           N0 V0 V1           mong / to wish
Ditransitive                                    N0 V N1 N2         cho / to give
Ditransitive with a preposition                 N0 V N1 O N2       vay / to borrow
Ditransitive with a verbal complement           N0 V0 N1 V1        lãnh đạo / to lead
Ditransitive with an adjectival complement      N0 V N1 A          làm / to make
Movement verbs with a nominal complement        N0 V0 V1 N1        ra / to go out
Movement verbs with an adjectival complement    N0 V0 A V1         trở nên / to become
Movement ditransitive                           N0 V0 N1 V1 N2     chuyển / to transfer

Table 1: Subcategorizations of Vietnamese verbs

4 Conclusion and Future Work

This paper presents an initial investigation into the use of the XMG formalism for developing a first metagrammar producing an LTAG for Vietnamese which recognizes basic verbal constructions. We have shown that the essential subcategorization frames of Vietnamese predicates can be effectively encoded by means of XMG classes while retaining basic properties of the realized verbal trees. This confirms that various syntactic phenomena of Vietnamese can be covered in a Vietnamese MG.

The first evaluation of the MG for Vietnamese is promising, but the lexical coverage has to be improved further. Moreover, the grammar coverage needs to be revised by refining the constraints that rule out ungrammatical syntactic constructions. Although there are not many tree fragments in the current metagrammar, we find that the current MG over-generates some undesired structures. The MG will also be extended to deal with constructions not yet covered, such as adjectival and noun phrase constructions. We also intend to generate a test suite to document the grammars and perform realistic evaluations.

There is existing work on the development of metagrammars for less frequently studied languages like Korean and Yiddish and their relations to a German grammar (Kinyon, 2006). The authors showed that cross-linguistic generalizations, for example the verb-second phenomenon, can be incorporated into a multilingual MG. We think that a comparison of the Vietnamese MG with this work would be useful. In particular, a study of the relative position of verbs and arguments in Vietnamese, related to this work, would be beneficial.
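As a rough illustration of the fragment combination used in Section 3 (TransitiveVerb = Subject ∧ ActiveVerb ∧ Object), the following Python sketch merges three tree fragments by unifying same-labelled nodes from the root downwards. It is not XMG code and does not reproduce the XMG compiler. The `Node` class, the `conjoin` and `bracket` helpers, and the merge-by-label rule are assumptions made for illustration only, and the shapes of the three fragments are guessed from the tree in Figure 1.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree-fragment node: a category label plus ordered children."""
    label: str
    children: list[Node] = field(default_factory=list)


def conjoin(left: Node, right: Node) -> Node:
    """Merge two fragments whose roots carry the same label.

    Hypothetical merge rule: a child of `right` is unified with the first
    same-labelled child of `left`; otherwise it is appended on the right.
    """
    if left.label != right.label:
        raise ValueError(f"cannot unify {left.label} with {right.label}")
    merged = [Node(c.label, list(c.children)) for c in left.children]
    for child in right.children:
        for i, existing in enumerate(merged):
            if existing.label == child.label:
                merged[i] = conjoin(existing, child)
                break
        else:
            merged.append(child)
    return Node(left.label, merged)


def bracket(node: Node) -> str:
    """Render a tree in bracketed form, e.g. S(N0↓ PredP(V⋄))."""
    if not node.children:
        return node.label
    return f"{node.label}({' '.join(bracket(c) for c in node.children)})"


# Assumed fragment shapes (↓ marks a substitution node, ⋄ the anchor).
subject = Node("S", [Node("N0↓"), Node("PredP")])
active_verb = Node("S", [Node("PredP", [Node("V⋄")])])
obj = Node("S", [Node("PredP", [Node("V⋄"), Node("N1↓")])])

transitive_verb = conjoin(conjoin(subject, active_verb), obj)
print(bracket(transitive_verb))  # S(N0↓ PredP(V⋄ N1↓))
```

The result has the same shape as the tree in Figure 1 before lexical insertion. Merging by label is only workable here because no fragment contains two sibling nodes with the same label; a family such as N0 V N1 N2 would need explicit node identities rather than label matching.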
References

Marie-Hélène Candito. 1999. Représentation modulaire et paramétrable de grammaires électroniques lexicalisées : application au français et à l’italien. Doctoral Dissertation, Université Paris 7.

Benoit Crabbé. 2005. Représentation informatique de grammaires fortement lexicalisées. Doctoral Dissertation, Université Nancy 2.

Nguyễn Thị Minh Huyền, Laurent Romary, Mathias Rossignol and Vũ Xuân Lương. 2006. A Lexicon for Vietnamese Language Processing. Language Resources and Evaluation, Vol. 40, No. 3–4.

Alexandra Kinyon and Owen Rambow. 2003. Using the MetaGrammar to generate cross-language and cross-framework annotated test-suites. In Proc. LINC-EACL, Budapest.

Alexandra Kinyon and Carlos A. Prolo. 2002. A Classification of Grammar Development Strategies. Proceedings of the Workshop on Grammar Engineering, Taipei, Taiwan.

Alexandra Kinyon, Owen Rambow, Tatjana Scheffler, SinWon Yoon and Aravind K. Joshi. 2006. The Metagrammar Goes Multilingual: A Cross-Linguistic Look at the V2-Phenomenon. Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms, Sydney, Australia.

Lê Hồng Phương, Nguyễn Thị Minh Huyền, Laurent Romary, Azim Roussanaly. 2006. A Lexicalized Tree-Adjoining Grammar for Vietnamese. Proceedings of LREC 2006, Genoa, Italy.

Thanh Bon Nguyen, Thi Minh Huyen Nguyen, Laurent Romary, Xuan Luong Vu. 2004. Developing Tools and Building Linguistic Resources for Vietnamese Morpho-Syntactic Processing. Proceedings of LREC 2004, Lisbon, Portugal.

Charles N. Li and Sandra A. Thompson. 1976. Subject and topic: a new typology of language. In Charles N. Li (ed.), Subject and Topic. London/New York: Academic Press, pp. 457-489.

Yannick Parmentier and Joseph L. Roux. 2005. XMG: a Multi-formalism Metagrammar Framework. Proceedings of the Tenth ESSLLI Student Session.

Nguyễn Minh Thuyết and Nguyễn Văn Hiệp. 2004. Thành phần câu tiếng Việt. NXB Giáo dục, Hà Nội, Vietnam.