A Segmentation-free Approach to Recognise Printed Sinhala Script

H. L. Premaratne
University of Colombo School of Computing, Sri Lanka
Lalith.Premaratne@ide.hh.se   hlp@mail.cmb.ac.lk

J. Bigun
School of Information Science, Computer and Electrical Engineering
Halmstad University, S-301 18 Halmstad, Sweden
Josef.Bigun@ide.hh.se

Abstract
The majority of character recognition algorithms, such as those using ANNs, need the script to be segmented prior to recognition. In contrast to Western scripts, Brahmi-descended South Asian scripts such as Sinhala contain modifier symbols, which make segmentation a difficult task that needs to be addressed as a separate issue. Further, the change of shape of the basic character (by violating modification rules) in the modification process makes some modified Sinhala characters impossible to segment. The proposed method, which uses Linear Symmetry to examine the correlation between characters in the script and the testing alphabet, recognises characters directly within the image of the script. A similar method is used to resolve confusing characters. Experiments show highly favourable results not only for the basic characters of the alphabet but also for the modifier symbols. A novel but simple method using Linear Symmetry for skew correction is also proposed.
Key Words: Linear Symmetry, Recognition, Segmentation, Skew Correction

1.  INTRODUCTION
1.1  Alphabet and the Modification Process
The Sinhala script, used by over 80% of the 18.4 million population of Sri Lanka, is descended from the ancient Brahmi script and has evolved independently over many centuries. The Sinhala language is unique to Sri Lanka, and the Sinhala characters, which are generally round in shape, differ from those of all the other Brahmi-descended scripts in South Asia. The Sinhala alphabet consists of 18 vowels, 41 consonants and 17 modifier symbols. A vowel may appear only as the first character of a word, and a consonant is modified using one or more of the modifier symbols to produce the required vocal sound. The total number of different modifications from the entire alphabet, including the basic characters, is nearly 400. Although each character possesses a distinct characteristic shape that distinguishes it from the others, some characters closely resemble one or more of the other characters in appearance. Some examples are given in Figure 1.

Modification of a character is carried out by simply adding one or more modifier symbols before, after, above or below the character without affecting its general shape. However, in most printed scripts this rule is violated for a specific subset of 10 characters of the alphabet, to give a better appearance (Figure 2). Also, in some modifications, the joint between the character and the modifier symbol is smoothed to make the modified character appear as a single symbol.

1.2  Characteristics of the Script
A single line of script is organised in three horizontal layers. The middle layer, which contributes approximately 50% of the total line height, mainly includes fifteen (15) basic characters and nine (9) modifier symbols. Twenty-two (22) other basic characters occupy the middle layer and the upper layer, with approximately 75% and 25% of the total height of each character in each layer respectively. The middle and the lower layers include the remaining eight (8) characters, with approximately 75% and 25% of the total height of each character in each layer respectively. Four (4) modifiers occupy the upper layer, while the remaining five (5) modifiers are assigned to the lower layer. The upper and the lower layers are of equal height, each having 25% of the total line height (Figure 4).

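The layer proportions given in Section 1.2 can be turned into pixel rows with a few lines of Python. This is only an illustrative sketch: the function name `layer_bounds` and its arguments are hypothetical, and it assumes that the top row and the height of a text line are already known.

```python
def layer_bounds(line_top, line_height):
    """Split a text line into the three layers described in Section 1.2.

    The upper and lower layers each take about 25% of the line height and
    the middle layer about 50%. Returns half-open (start, end) row ranges.
    """
    upper_end = line_top + round(0.25 * line_height)
    middle_end = line_top + round(0.75 * line_height)
    return {
        "upper": (line_top, upper_end),
        "middle": (upper_end, middle_end),
        "lower": (middle_end, line_top + line_height),
    }
```

For a line that starts at row 100 and is 40 pixels high, the two inner boundaries fall at rows 110 and 130.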
1.3 The OCR Technology and Recent Developments
Optical Character Recognition (OCR) is the process of converting typed or printed documents into machine-readable code. The original typed or printed document is scanned to form an image file, which is the input to the OCR software system. This image is a picture represented as light intensities on a rectangular grid of points, which do not yet identify individual characters. The OCR system in turn recognises each character or symbol in the image file and makes it available in a suitable text editor, where it can be edited or modified.

Most OCR systems use Artificial Neural Networks (ANNs) as the major tool. In addition to the features identified in a rectangular grid that encloses a single character, other features of the character, such as curvature features and transition counts, are also used. In the case of handwriting recognition, some common approaches are ANNs, mathematical morphology, shape analysis and hidden Markov models (HMMs). Each of these approaches has its own strengths and weaknesses. Researchers have achieved a significant improvement in performance by combining two or more of these methods. The majority of alphabets contain confusing characters that closely resemble each other. Resolving this problem, especially in the case of handwriting recognition, is a critical issue.

Research on South and South-East Asian scripts lags behind that on European scripts for various reasons. The main reason is the complexity of the scripts. In Asian alphabets, the number of characters is high and the generation of a vocal sound by modifying a character using modifier symbols is complex. Extensive research has been done on a few scripts used by very large populations. Some of this research has been initiated in developed countries due to the high exposure to such research.

At present, OCR software for languages such as Sindhi, Bengali and Thai is available as commercial products. Research on the Devanagari and Tamil scripts has achieved tremendous progress. To the best of our knowledge, little or no research has been done on the recognition of printed Sinhala script.

2. RECOGNITION PROCESS
2.1 Theory
The theory used in the recognition process is the orientation field tensor, which has been used effectively in many applications over the past few years. A local neighbourhood with ideal local orientation is characterised by the fact that the gray value changes in only one direction. In the orthogonal direction it is constant. Since the gray values are constant along lines, local orientation is also denoted as linear symmetry [1]. The linear symmetry is also represented in the form of a vector. Since the orientation of a simple neighbourhood is defined only modulo 180 degrees, unlike the direction of a gradient, which is cyclic over 360 degrees, the representation of the linear symmetry needs the doubling of the angle of orientation. The vector that represents the linear symmetry is composed of two quantities. One is the orientation angle and the other is the certainty measure.

2.1.1 Mathematical representation
The local orientation is determined using the following three steps [1].
  i.   Select a local neighbourhood from the image using a window function.
  ii.  Fourier transform the windowed image.
  iii. Determine the local orientation by fitting a straight line to the spectral density distribution.

When fitting a straight line, the sum of the squares of the distances of the data points from the line is minimised.
Since the minimisation of di is the same as the maximisation of SI, equation (2) is obtained.

The orientation is obtained as the eigenvector belonging to the largest eigenvalue of J. J can be rotated so that it is diagonalised. The rotation matrix is in fact the eigenvector matrix given in equation (1).

Comparison of the diagonal elements on both sides of equation (1) gives

$$ \lambda_1 + \lambda_2 = J_{xx} + J_{yy} $$

$$ \lambda_1 - \lambda_2 = (J_{xx} - J_{yy})\cos 2\varphi + 2 J_{xy}\sin 2\varphi = (J_{xx} - J_{yy},\; 2J_{xy}) \begin{pmatrix} \cos 2\varphi \\ \sin 2\varphi \end{pmatrix} $$

$$ (J_{xx} - J_{yy},\; 2J_{xy}) \begin{pmatrix} \cos 2\varphi \\ \sin 2\varphi \end{pmatrix} = \left\langle I_{20}, \begin{pmatrix} \cos 2\varphi \\ \sin 2\varphi \end{pmatrix} \right\rangle = \left\langle I_{20}, \frac{I_{20}}{|I_{20}|} \right\rangle = |I_{20}| $$

$$ \therefore\; \lambda_1 - \lambda_2 = |I_{20}| $$

Define

$$ \nabla f = \frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y} $$

then

$$ I_{20} = (\nabla f)^2 = \left(\frac{\partial f}{\partial x}\right)^2 - \left(\frac{\partial f}{\partial y}\right)^2 + 2i\,\frac{\partial f}{\partial x}\,\frac{\partial f}{\partial y} = \left[ (\omega_x + i\omega_y)^2 (\omega_x - i\omega_y)^0 |F|^2 \right] = (\lambda_1 - \lambda_2)\exp(2i\varphi) $$

$$ I_{11} = \left[ (\omega_x + i\omega_y)^1 (\omega_x - i\omega_y)^1 |F|^2 \right] = \left[ (\omega_x^2 + \omega_y^2) |F|^2 \right] = \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \lambda_1 + \lambda_2 $$

The argument of I20 is twice the inclination angle of the fitted orientation, provided linear symmetry exists, and I11 represents the sum of the best and the worst total errors.

The Linear Symmetry algorithm that extracts the tensor is characterised by the fact that it delivers a dense orientation field along with certainties. Where there is high confidence in the existence of an orientation, the linear orientation represents the least change of gray values in one direction and the maximal change in the orthogonal direction. Hence a Linear Symmetry Tensor for an image is constructed by averaging the orientation of the local neighbourhood for each pixel of the image.

2.1.2 Implementation
The LS Tensor for an image is built as explained in the following steps.

Four 1-D derivative filters, dx (Gaussian kernel), dy (= -dx'), gx (Gaussian kernel) and gy (= gx'), are generated.

The two derivative convolutions dxf (= convolution(gy, convolution(dx, Image))) and dyf (= convolution(gx, convolution(dy, Image))) of the original image with respect to x and y are constructed using the above pairs of filters.

The LS Tensor (complex) is then given by
LS = (dxf + j*dyf)^2, where j = √(-1)

The correlation between the character being tested and the image is calculated using the formula
absolute(convolution(conjugate(LS Tensor of Character), LS Tensor of Image)).

2.2 Determination of Skew Angle
Almost all recognition algorithms need the text lines in the input image to be horizontal. Therefore, any skew associated with the input image needs correction prior to recognition. Experiments show that the recognition algorithm proposed in this thesis tolerates a skew of +10 to -10. The accuracy of recognition deteriorates considerably with increasing skew. Therefore a robust method for skew correction needs to be incorporated.

Careful observation of a line of Sinhala script shows that the boundary between the upper and the middle layers and the boundary between the middle and the lower layers (fig. 8) possess the highest amount of energy in the horizontal direction. The horizontal projection of a sample script clearly agrees with this concept. This is because every character in the alphabet touches at least one of these boundaries. Therefore, tracing the appearance of one of these boundaries in a skewed script could be used to determine the skew angle. Although any straightforward method to detect a boundary line could have been used, a more appropriate method using the Linear Symmetry (LS) tensor has been proposed.

The Linear Symmetry tensor [1], which gives information for each pixel of the image on how it is organised with respect to the orientation within a local neighbourhood, can effectively be used to determine the orientation of the script. In general, the orientation angle of the resultant vector of all the vectors representing the LS for each pixel of the image provides a near approximation to the skew angle. In order to improve the accuracy, the interference to the final result from the following components should be eliminated.

  i.   Edges of the image.
  ii.  Background of the image, which consists of pixels having random orientations of low confidence.
  iii. Other pixels (within the text area) having orientations of low confidence.

The results obtained for the LS tensor derived in section 3.3.2 yield the skew angle within +10 to -10 accuracy, which is well within the required accuracy for the recognition algorithm.

2.3 Recognition Procedure

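The filtering on which the recognition procedure relies (Sections 2.1.2 and 2.2) can be sketched in Python with NumPy and SciPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the filter width `sigma`, and the `border` and `rel_threshold` values used to discard image edges and low-confidence pixels are all illustrative choices. The correlation follows the formula of Section 2.1.2 literally, as a convolution with the conjugated character tensor, and the sign of the estimated skew depends on the image axis convention.

```python
import numpy as np
from scipy.ndimage import convolve1d
from scipy.signal import fftconvolve


def gaussian_pair(sigma=1.0):
    """1-D Gaussian g and its derivative d (sigma is an assumed value)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    d = -x / sigma ** 2 * g
    return g, d


def ls_tensor(image, sigma=1.0):
    """Complex LS image (dxf + j*dyf)**2 built from separable 1-D filters."""
    g, d = gaussian_pair(sigma)
    img = np.asarray(image, dtype=float)
    # Differentiate along one axis and smooth along the other.
    dxf = convolve1d(convolve1d(img, d, axis=1), g, axis=0)
    dyf = convolve1d(convolve1d(img, d, axis=0), g, axis=1)
    return (dxf + 1j * dyf) ** 2


def skew_angle(ls, border=4, rel_threshold=0.1):
    """Skew estimate (radians) from the resultant of confident LS vectors.

    border and rel_threshold are illustrative values, not from the paper.
    """
    inner = ls[border:-border, border:-border]            # drop image edges
    mag = np.abs(inner)
    confident = inner[mag > rel_threshold * mag.max()]    # drop low confidence
    # Horizontal structures have vertical gradients, so their LS angle is pi.
    # Negating the resultant maps an unskewed text line to angle 0.
    return 0.5 * np.angle(-confident.sum())


def correlation_map(ls_image, ls_char):
    """abs(convolution(conjugate(LS of character), LS of image))."""
    return np.abs(fftconvolve(ls_image, np.conj(ls_char), mode="same"))
```

A typical call would compute `ls_tensor` once for the page and once for each character template, then pass both to `correlation_map`; high values in the result mark candidate positions of that character.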
2.3.1 Testing Database.
The recognition process is based on examining the correlation of the characters in the script with each character of the alphabet through a filtering operation. The testing alphabet, which consists of all the characters (including the modifier symbols), is built by extracting characters from an LS tensor. Each character in the testing alphabet is filtered (one at a time) through the LS tensor of the script in order to identify its occurrences in the entire script. The plot of correlation at each pixel (Fig. 10) shows that each occurrence of the character being tested gives a strong correlation. A suitable threshold that separates the required character from the rest of the characters in the script is then determined. This procedure is conducted for every character of the alphabet. During this process, it has been observed that 35 characters, amounting to 60% of the alphabet, separate from all the other characters with a clear threshold (Fig. 10(a)), while the remaining 40% are confused with one or more characters of similar shape (Fig. 10(b)). Eight (8) such confusing groups have been identified.

Once all the confusing groups are identified, another level of filtering is carried out to separate each character within a confusing group. The secondary level of filtering examines the correlation of a distinct segment from one character with all the members in the group (Fig. 11). A suitable (secondary) threshold that separates each character from the rest is then determined. A further level of filtering is carried out if the confusion still occurs.

The structure of the testing database is as follows.
Character Identifier
LS Tensor of character
Primary Threshold
Flag to indicate confusing status
Secondary Threshold (for confusing characters)
Tertiary Threshold (for confusing characters)

2.3.2 Recognition.
The image is initially pre-processed to remove the background noise. The image is then scaled (if necessary) to match the average height of a character to that of the testing alphabet.

Recognition of a script is performed by filtering the LS tensor of each character of the testing alphabet with the LS tensor of the script. In each filtering cycle, all the occurrences of the character being tested are identified. If the testing character is a confusing one, the secondary level of filtering is carried out in order to determine the acceptance or rejection of the identified character. A tertiary level of filtering is carried out similarly.

It has been observed that, in addition to the highest value of correlation, usually produced at the centre of the character, a few more relatively high values are also produced at the neighbouring pixels. This is because the template of the testing character still nearly coincides with the character at the pixels neighbouring its centre. This would result in recognising the same character in the image more than once. Therefore, once the filtering has been performed, non-maxima in a small neighbourhood (e.g. 3x3) are suppressed in order to eliminate multiple acceptances of the same character.

The recognition algorithm is as follows:

    Input image
    Input database-of-characters    /* Alphabet */
    Pre-process image
    Perform Horizontal-projection
    Extract Line-data
    Construct LS-tensor
    Read character
    While not-end-of-alphabet do
        Filter character with the LS Tensor    /* Primary Filtering */
        Suppress non-maximums
        While not-end-of-image do
            Segment occurrences above threshold
            If confusing-character
                Determine relative threshold
                Perform secondary-filtering
                /* and tertiary-filtering if necessary */
            End-If
            Store image-coordinates of each occurrence
        End-While    /* not-end-of-image */
        Update output array    /* with ASCII Value, row, column no. */
        Read character
    End-While    /* not-end-of-alphabet */
    Sort output on Column No. within the Row No.

Since a character is identified directly within the image of the script, the need to segment individual characters does not arise. Symbols such as the comma, full stop and question mark are also recognised with the same accuracy.
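The thresholding, non-maximum suppression and final sorting steps of the recognition algorithm listed above can be sketched as follows. This is an illustrative sketch only: it assumes the correlation maps have already been computed for each text line and each character, it omits the secondary and tertiary filtering of confusing characters, and the names `local_maxima` and `recognise` are hypothetical.

```python
def local_maxima(corr, threshold, radius=1):
    """Return (row, col) positions whose value is at least `threshold` and
    strictly greater than every neighbour in a (2*radius+1)^2 window.
    radius=1 gives the 3x3 neighbourhood; exact ties are dropped (a
    simplification of this sketch)."""
    h, w = len(corr), len(corr[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = corr[r][c]
            if v < threshold:
                continue
            neighbours = [
                corr[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks


def recognise(lines, thresholds):
    """lines: one dict per text line, mapping a character identifier to its
    correlation map (assumed precomputed). thresholds: primary threshold
    per character identifier. Returns (line no, column, identifier)
    records sorted on column within line."""
    output = []
    for line_no, corr_maps in enumerate(lines):
        for char_id, corr in corr_maps.items():
            for _, col in local_maxima(corr, thresholds[char_id]):
                output.append((line_no, col, char_id))
    output.sort(key=lambda rec: (rec[0], rec[1]))
    return output


# Toy correlation maps for one text line and two characters.
corr_a = [[0.1, 0.2, 0.1, 0.0, 0.0],
          [0.2, 0.9, 0.8, 0.0, 0.0],
          [0.1, 0.2, 0.1, 0.0, 0.7]]
corr_b = [[0.0, 0.0, 0.0, 0.6, 0.0],
          [0.0, 0.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0, 0.0]]
records = recognise([{"a": corr_a, "b": corr_b}], {"a": 0.5, "b": 0.5})
```

In the toy data the value 0.8 next to the 0.9 peak is above the threshold but is suppressed, so that occurrence of character "a" is reported once rather than twice.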