                                   Language-oriented Sentiment Analysis based on the Grammar 
                                                    Structure and Improved Self-attention Network 
Hien D. Nguyen (1,2,*,a), Tai Huynh (3,4,*), Suong N. Hoang (4,b), Vuong T. Pham (5) and Ivan Zelinka (6,c)
                                            1Faculty of Computer Science, University of Information Technology, Ho Chi Minh City, Vietnam 
                                                                     2Vietnam National University, Ho Chi Minh City, Vietnam 
                                                                      3Ton Duc Thang University, Ho Chi Minh City, Vietnam 
                                                                                          4Kyanon Digital, Vietnam 
                                                    5Faculty of Information Technology, Sai Gon University, Ho Chi Minh City, Vietnam 
                                                                    6Technical University of Ostrava (VŠB-TU), Czech Republic 
                                                                                              ivan.zelinka@vsb.cz 
                                                                      * Equal contribution by Hien D. Nguyen and Tai Huynh 
                           Keywords:           Sentiment Analysis, Sentiment Classification, Vietnamese, Self-attention, Transformer, Natural Language 
                                               Processing. 
Abstract:           In business, sentiment analysis helps brands understand the sentiment of their customers:
                    what people are saying, how they are saying it, and what they mean. Many methods for
                    sentiment analysis exist; however, they are not effective when applied to the Vietnamese
                    language. In this paper, a method for Vietnamese sentiment analysis is studied that combines
                    the structure of the Vietnamese language with a natural language processing technique,
                    self-attention with the Transformer architecture. Based on an analysis of the structure of a
                    sentence, the Transformer is used to process the word positions to determine the meaning of
                    that sentence. The experimental results show that our method is more effective for Vietnamese
                    sentiment analysis than other methods. Its accuracy and F-measure are more than 91%, and its
                    results are suitable for practical use in business intelligence.
a    https://orcid.org/0000-0002-8527-0602
b    https://orcid.org/0000-0002-3354-013X
c    https://orcid.org/0000-0002-3858-7340

Nguyen, H., Huynh, T., Hoang, S., Pham, V. and Zelinka, I.
Language-oriented Sentiment Analysis based on the Grammar Structure and Improved Self-attention Network.
DOI: 10.5220/0009358803390346
In Proceedings of the 15th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2020), pages 339-346
ISBN: 978-989-758-421-3
Copyright © 2020 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

1 INTRODUCTION

Sentiment analysis (SA) is one of the subfields of Computational Linguistics and Natural Language Processing (NLP) (Gamal et al., 2019). In business intelligence, sentiment analysis helps brands understand the sentiment of their customers (Rokade and Kumari, 2019): what people are saying, how they are saying it, and what they mean. Customer sentiment can be found in tweets, comments, reviews, or other places where people mention the brands.

In the current era, social networks are a popular platform for communication and interaction (Beigi, 2016). Many people find new information on social networks, which makes them an important data source. SA is also used to detect influencers on social networks for influencer marketing (Huynh et al., 2019).

Vietnamese is an isolating language (Nguyen et al., 2006). The meaning of a sentence depends on how its predicates are organized (Clark, 1974). In other words, the information about word positions contributes to both the sentence meaning and the grammatical meaning. The analysis of a Vietnamese sentence therefore has to take its grammar structure into account.

Some machine learning-based approaches have been studied to analyse the sentiment of a Vietnamese sentence.

CountVectorizer (Irfan et al., 2015) and Term Frequency–Inverse Document Frequency (Tf-idf) (Aggarwal, 2011) are used for word representations. However, they cannot capture the positions of words in a sentence, so their results are not accurate. Support
Vector Machine (Joachims, 1998) and Naïve Bayes (Irfan et al., 2015) are used as classifiers. However, those methods do not take the structure of a sentence into account, so their results are not suitable in practice.

In (Krouska et al., 2017, Troussas et al., 2016), the authors present five well-known learning-based classifiers (Naïve Bayes, Support Vector Machine, k-Nearest Neighbor, Logistic Regression and C4.5) and a lexicon-based approach (SentiStrength) to analyse sentiment on Twitter. However, these studies only consider English.

Besides, some types of recurrent neural networks (RNNs), such as long short-term memory (LSTM) (Hochreiter, 1997, Cheng et al., 2016), Bi-Directional LSTM (biLSTM) (Schuster and Paliwal, 1997) or gated recurrent unit (GRU) (Chung et al., 2014), are very complex and take a long time to solve the problem of sentiment analysis on Vietnamese.

Sentiment analysis for Vietnamese was researched in (Nguyen et al., 2014). That study investigated the task with respect to both the Support Vector Machine (SVM) model and linguistic features, using an annotated corpus for sentiment classification extracted from hotel reviews in Vietnamese. However, this method is not designed around the grammar structure, so the sentiment of some sentences cannot be determined accurately.

Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations (Zhou et al., 2018). The Transformer (Vaswani et al., 2017) is a transduction model based on self-attention that computes representations of its input and output without using sequence-aligned RNNs or convolution. In (Hoang et al., 2019), the authors study sentiment analysis of product reviews in Vietnamese by using self-attention neural networks. However, that study does not consider the structure of Vietnamese sentences in the analysis, so its results are not accurate and do not meet practical requirements.

In this paper, a method for Vietnamese sentiment analysis is proposed. This method determines whether the sentiment of a sentence is positive, negative or neutral. The structures of a Vietnamese sentence are studied. Based on those structures, the meaning of the sentence is analysed by using the self-attention neural network architecture Transformer. Besides, a Squeeze-and-Excitation layer (Hu et al., 2018) is also used to recalibrate features in the process. The sentences are then analysed to determine whether they are positive, negative or neutral.

The experimental results show that our method is more effective than others in Vietnamese sentiment analysis. Its accuracy and F-measure are more than 91% and its results are suitable to apply in practice for business intelligence.

The next section presents some techniques of the Transformer. Section 3 presents the method for Vietnamese sentiment analysis. That method applies an improved self-attention architecture with the Transformer to the structure of Vietnamese sentences to determine their meaning. Section 4 describes the experimental results. The last section concludes the main results of this paper.

2 SELF-ATTENTION NETWORK

Scaled Dot-Product Attention: Let s_{i-1} be a query vector q, and let h_j be duplicated, with one copy as the key vector k_j and the other as the value vector v_j (in current NLP work, the key and value vectors are frequently the same, therefore h_j can be considered as k_j or v_j).

    c = \sum_{j=1}^{n} a_j v_j                                              (1)

where a_j is the attention weight and e_j is the score of the pair (q, k_j):

    a_j = \frac{\exp(e_j)}{\sum_{k=1}^{n} \exp(e_k)},   e_j = \frac{q \cdot k_j^{T}}{\sqrt{d_{model}}}   (1 \le j \le n)      (2)

d_{model} is the dimension of the input vectors or of the k vector (q, k and v have the same dimension as the input embedding vector).

Self-attention is a mechanism that applies Scaled Dot-Product Attention from every token of the sentence to all the others.

For every token in the sentence, three vectors (Query, Key and Value) are created by using a linear feed-forward layer as a transformation, then the attention mechanism is applied to get the context matrix. However, processing the tokens one at a time is very slow, so we consider three matrices Q, K, V:
• Q is a matrix containing all the query vectors, Q = [q_1, q_2, ..., q_n], where q_i is a query vector.
• K is a matrix containing all the key vectors, K = [k_1, k_2, ..., k_n], where k_i is a key vector.
• V is a matrix containing all the value vectors, V = [v_1, v_2, ..., v_n], where v_i is a value vector.
Thus, attention for all tokens is computed at once in the matrix form of Eq. (3).
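As an illustration only, Eqs. (1) and (2) for a single query can be sketched in plain Python. The function name scaled_dot_product_attention and the toy vectors are assumptions made for this example and do not come from the paper. Subtracting the maximum score before exponentiation is a common numerical-stability step that leaves the weights a_j unchanged.

```python
import math


def scaled_dot_product_attention(q, keys, values):
    """Return the context vector c and the weights a_j for one query q."""
    d_model = len(q)
    # Eq. (2), right part: e_j = q . k_j^T / sqrt(d_model)
    scores = [
        sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_model)
        for k in keys
    ]
    # Eq. (2), left part: a_j = exp(e_j) / sum_k exp(e_k)
    top = max(scores)  # shift for numerical stability; the ratio is unchanged
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Eq. (1): c = sum_j a_j v_j
    dim_v = len(values[0])
    context = [
        sum(a * v[i] for a, v in zip(weights, values)) for i in range(dim_v)
    ]
    return context, weights


# Toy input (hypothetical): keys and values are the same vectors h_j.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q = [1.0, 0.0]
context, weights = scaled_dot_product_attention(q, h, h)
```

In this toy input the first and third keys have the same dot product with q, so they receive equal weights, and the second key receives a smaller one.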
    Attention(Q, K, V) = softmax\left( \frac{Q K^{T}}{\sqrt{d_{model}}} \right) V                      (3)

Multi-head Attention performs the attention h times with (Q, K, V) matrices of dimension d_{model}/h. Each head is one application of Attention. For each head, the (Q, K, V) matrices are projected, with projections unique to that head, to dimension d_{model}/h. The self-attention mechanism is then performed to yield an output of the same dimension d_{model}/h. Finally, the outputs of the h heads are concatenated and a linear projection layer is applied once again. The formula for this process is as follows:

    MultiHead(Q, K, V) = Concat(head_1, head_2, ..., head_h) . W^{O}
    where head_i = Attention(Q . W_i^{Q}, K . W_i^{K}, V . W_i^{V})                                     (4)

3 METHOD FOR VIETNAMESE SENTIMENT ANALYSIS

In this section, a method for analysing the sentiment of a Vietnamese sentence is proposed. The sentences are analysed to determine whether they are positive, negative or neutral.

Firstly, the structures of a Vietnamese sentence are studied. Because the scope of this study is evaluation comments on a product on social networks, two kinds of declarative sentence are considered: positive and negative sentences.

Secondly, based on those structures, the meaning of the sentence is analysed by using the self-attention neural network architecture Transformer. Because the meaning of a Vietnamese sentence depends on the [...]

[...] indicates the speaker’s desire to influence future events. In the problem of sentiment analysis, we only need to determine whether a sentence is positive, negative or neutral; thus, in the scope of this paper, we only consider the declarative sentence type.

The structure of a single declarative sentence in Vietnamese is shown in Fig. 1:

Figure 1: Structure of a single declarative sentence in Vietnamese.

Definition 1: Kinds of the structure of a positive sentence
A single positive declarative sentence in Vietnamese has the foundation structure:
    [...] = [...]
It is classified as in Table 1.

Table 1: Kinds of the structure of a positive sentence.

    Kinds          Variants
    P is :         “là”
    =
    P is :
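Looking back at Section 2, the matrix form of Eq. (3) and the multi-head combination of Eq. (4) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (matmul, softmax_rows, attention, multi_head, random_matrix), the random untrained projection weights and the toy sizes (3 tokens, d_model = 4, h = 2) are all hypothetical. The scores are scaled by the square root of the key-vector dimension, following the paper's remark that d_model is the dimension of the input vectors or of the k vector.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Apply softmax independently to each row of m."""
    out = []
    for row in m:
        top = max(row)  # shift for numerical stability
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        out.append([x / total for x in exps])
    return out


def attention(q, k, v):
    """Eq. (3): softmax(Q K^T / sqrt(d)) V, with d the key-vector dimension."""
    d = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a trained projection matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def multi_head(x, h, rng):
    """Eq. (4) for self-attention, i.e. Q = K = V = x."""
    d_model = len(x[0])
    d_head = d_model // h
    heads = []
    for _ in range(h):
        # Each head has its own projections down to dimension d_model / h.
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        heads.append(attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)))
    # Concatenate the h head outputs token by token, then project with W^O.
    concat = [sum((head[t] for head in heads), []) for t in range(len(x))]
    w_o = random_matrix(h * d_head, d_model, rng)
    return matmul(concat, w_o)


rng = random.Random(0)
tokens = random_matrix(3, 4, rng)  # 3 tokens with d_model = 4
output = multi_head(tokens, h=2, rng=rng)
```

The output keeps one row per token and d_model columns, so a block like this could be stacked or followed by further layers; in a trained model the random matrices would be replaced by learned parameters.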