Rajan Transform Based Spectral Analysis of Handwritten Characters

Homomorphic transforms are better suited for pattern recognition or classification. In general, homomorphic maps are not invertible and hence they are known as transformations, so they do not fall under the category of mathematical transforms. But if the inverse of a transformation is obtained using an algorithm or a semi-decision procedure, the transformation could be called a transform in the loose sense. Nonlinear homomorphic operators are not meant for analysis and synthesis, but they are used for classification. In this context, efforts were made to search for a homomorphic map which could be examined for character recognition. One such nonlinear homomorphic map has been identified as the Rajan Transform. This paper provides details of this transform and its working principle in the recognition of handwritten characters.


Introduction
Handwritten Character Recognition (HCR) is an important and highly challenging task in the field of pattern recognition, with a large number of practical applications. It has been an intensive field of research since the early days of computer science because handwriting offers a natural mode of interaction between humans and computers. Character recognition is the process of identifying characters in a document image. In other words, character recognition is a method of recognizing characters in an input image and converting them into ASCII or any other equivalent machine-editable form (Kai Ding et al 2007). It contributes greatly to the advancement of computerization and to improving the cybernetics of the man-machine interface. There has always been worldwide interest in the development of handwritten character recognition techniques and applications. Tremendous advances in computational intelligence and algorithms have provided the latest tools for the development of modern character recognition systems that cater to the needs of various sectors of people. Based on the mode of input, handwritten character recognition methods are grouped into two classes: (i) online and (ii) offline. In offline handwritten character recognition, the input is made available in the form of an image obtained through a scanner or a digital camera, whereas in online character recognition, the coordinate information of strokes is made available together with timing information.
In order to filter frequency content and manage entropy in a digital image, one uses discrete transforms, in the two dimensional sense, like 'Discrete Fourier Transform', 'Discrete Cosine Transform', 'Discrete Wavelet Transform', 'Walsh-Hadamard Transform', 'Haar Transform' and 'Hough Transform', to name a few. All these are basically isomorphic functions and so they are used for signal analysis but not for pattern recognition. Alternatively, 'Rajan Transform (RT)' exhibits both isomorphic and homomorphic functional properties and so it could be used both for signal analysis and pattern classification. Rajan Transform is structurally similar to Hadamard Transform, but it functionally differs from the latter in the sense that it generates phasor information during different stages of computation. It works as an isomorphic function with the phasor information and as a homomorphic function without it. This homomorphic property is made use of for the purpose of pattern classification.
Rajan Transform (RT) was introduced in the year 1997 by E. G. Rajan on the lines of 'Hadamard Transform' (Rajan, E. G., 1997). Rajan Transform (RT) is a coding morphism by which a number sequence (integer, rational, real or complex) of length equal to any power of two is transformed into a highly correlated number sequence of the same length. It is a homomorphism that maps a number sequence, its graphical inverse and their cyclic and dyadic permutations, to a set consisting of a unique number sequence ensuring the invariance property under such permutations. This invariance property is also true for the permutation class of the dual sequence of the number sequence under consideration. A number sequence and its dual are like an object and its mirror image [1].
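The invariance property stated above can be checked numerically. The following is a minimal sketch, assuming that the homomorphic (phase-free) forward transform is computed by recursively splitting the sequence into halves and forming element-wise sums and absolute differences; the function name `rajan_transform` and the test sequence are illustrative and are not taken from the source.

```python
def rajan_transform(x):
    """Phase-free forward transform (assumed form): recursive
    sum / absolute-difference butterflies on a power-of-two length list."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [abs(x[i] - x[i + half]) for i in range(half)]
    return rajan_transform(sums) + rajan_transform(diffs)


x = [3, 8, 5, 6, 0, 2, 9, 6]
spectrum = rajan_transform(x)

# Cyclic shifts, the graphical inverse (reversal) and dyadic shifts of x.
cyclic = [x[-k:] + x[:-k] for k in range(1, len(x))]
graphical_inverse = x[::-1]
dyadic = [[x[i ^ d] for i in range(len(x))] for d in range(1, len(x))]

for permuted in cyclic + [graphical_inverse] + dyadic:
    assert rajan_transform(permuted) == spectrum

print(spectrum)  # [39, 5, 13, 9, 13, 1, 7, 5]
```

Every permuted sequence yields the same spectrum, which is what allows the spectrum to serve as a single representative of the whole permutation class.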

Literature Review
Nowadays character recognition has gained a lot of attention in the field of pattern recognition due to its application in various fields. Optical Character Recognition (OCR) and Handwritten Character Recognition (HCR) are two apparently different but related domains. While an OCR system is most suitable for applications like multiple-choice examinations and printed postal address resolution, HCR has wider applications like cheque processing in banks and handwritten postal address resolution. In the coming days, character recognition systems might serve as a key factor in creating a paperless environment by digitizing and processing existing paper documents.
'Image Processing and Pattern Recognition' plays a significant role in handwritten character recognition. Two major approaches are generally considered in image processing for pattern recognition purposes: (i) spatial domain characterization and (ii) spectral domain characterization. Almost all handwritten character or word recognition techniques developed so far fall under these two categories.
Spatial domain characterization mainly deals with various spatially distributed image features and their structural relationships for classification purposes. Neighborhood processing and mathematical morphological processing of digital images are the two basic paradigms in which a number of feature extraction techniques and algorithms have been developed for pattern recognition purposes. Pixel-wise value updating numerical operations, like adding a value $x$ to each pixel value $A_{i,j}$ present at the $i$-th row and $j$-th column of a digital image or subtracting $x$ from each pixel value $A_{i,j}$, are called 'monadic' operations. Pixel-wise value updating numerical operations, like adding a pixel value $A_{i,j}$ of one digital image to the corresponding pixel value $B_{i,j}$ of another digital image or subtracting one from the other, are called 'dyadic' operations. Pixel value updating operations based on an algebraic or logical formula that involves the pixel values present in a specific neighborhood are called 'cellular' or 'neighborhood' operations. Treating two images $A$ and $B$ as sets $\{\langle i, j, A_{i,j}\rangle\}$ and $\{\langle m, n, B_{m,n}\rangle\}$, where $\langle i, j\rangle$ and $\langle m, n\rangle$ are the coordinate pairs of the pixel positions and $A_{i,j}$ and $B_{m,n}$ are the pixel values of the two images respectively, and carrying out the set theoretic operations of 'union' and 'intersection' among them are called the morphological operations of 'dilation' and 'erosion' respectively.
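The distinction between monadic, dyadic and neighborhood operations can be illustrated with a short sketch. The function names and the choice of a 3×3 maximum as the neighborhood formula are assumptions made only for illustration; images are represented as plain lists of rows.

```python
def monadic_add(image, x):
    """Monadic operation: add the constant x to every pixel."""
    return [[pixel + x for pixel in row] for row in image]


def dyadic_add(a, b):
    """Dyadic operation: add corresponding pixels of two equal-size images."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def neighborhood_max(image):
    """Neighborhood operation: replace each pixel by the maximum of its
    3x3 neighborhood (clipped at the image border)."""
    rows, cols = len(image), len(image[0])
    result = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            window = [image[m][n]
                      for m in range(max(0, i - 1), min(rows, i + 2))
                      for n in range(max(0, j - 1), min(cols, j + 2))]
            out_row.append(max(window))
        result.append(out_row)
    return result


A = [[1, 2], [3, 4]]
B = [[4, 3], [2, 1]]
print(monadic_add(A, 10))   # [[11, 12], [13, 14]]
print(dyadic_add(A, B))     # [[5, 5], [5, 5]]
print(neighborhood_max(A))  # [[4, 4], [4, 4]]
```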
Alternatively, spectral domain characterization deals with frequency content, that is, the energy distribution in an image under consideration, and with spectral correlations for classification purposes. For example, an edge implies high spatial frequency and magnitude implies energy. A two dimensional high pass filter highlights the edges whereas a two dimensional low pass filter smooths them. The amount of energy content is called the 'entropy' associated with an image. The lower the energy content, the smaller the entropy value. In this parlance, a two dimensional high pass filter maximizes entropy whereas a two dimensional low pass filter reduces it. In order to get the frequency content in a digital image for analysis purposes, one uses discrete transforms, in the two dimensional sense, like 'Discrete Fourier Transform', 'Discrete Cosine Transform', 'Discrete Wavelet Transform', 'Hadamard Transform', 'Haar Transform' and 'Hough Transform', to name a few. All these transforms are essentially isomorphic functions and that is precisely why they are extensively used for signal analysis and only sporadically used for pattern classification.
In this context, 'Rajan Transform (RT)' has been tried as a part of this research for pattern classification because RT exhibits both isomorphic and homomorphic functional properties. Though RT is structurally similar to Hadamard Transform, it functionally differs from the latter in the sense that it generates phasor information during different stages of computation. It works as an isomorphic function with the phasor information and as a homomorphic function without it. This homomorphic property is made use of for the purpose of pattern classification. Many researchers have been trying to develop algorithms and techniques in the framework of these two paradigms of spatial domain and spectral domain characterization with the idea of recognizing handwritten characters or words of different languages. Recently Rajbala et al (2012) have discussed various types of feature extraction methods like statistical, structural and global transformation techniques for pattern classification [2]. Statistical methods make use of the statistical distribution of pixels in an image while finding out variations in writing styles. Structural methods make use of feature values that are obtained from the structural and geometrical properties of a character while forming its structural feature space. Global transformation techniques make use of the spectral domain characterization of a character. That is, discrete transforms like the Discrete Fourier Transform are applied to the outline of a character image, and a selected set of coefficients suffices to reconstruct that outline. These coefficients form the feature vector of that particular character.
Pathan et al (2012) have proposed an offline approach for recognizing handwritten isolated Urdu characters. Urdu characters contain one or more segments, of which one component is known as 'primary' (large continuous strokes) and the rest as 'secondary' (small strokes or dots). Moment Invariants (MI) features are used in their method to recognize the characters.
Amritha Sampath et al (2012) have stated that feature extraction could be carried out using either low level or high level features. Low level features include width, height, curliness, aspect ratio etc., of a character [3]. These alone cannot be used to distinguish one character from another in the character set of a language. So, there are various other high level features which include number and position of loops, straight lines, headlines, curves etc.

Hadamard Transform and Rajan Transform
As stated earlier, any mathematical (integral or discrete) transform is developed based on the fact that a function (analog or discrete) is represented in terms of basis functions which are orthogonal to each other. The term 'orthogonality' refers to 'perpendicularity' as a result of the inner product in the domain of vector algebra, that is, two vectors are perpendicular (orthogonal) to each other when their dot product is 0. The same term refers to the result of a binary operation between two entities which does not yield any third entity. The trigonometric functions cosine and sine are called basis functions because they are orthogonal functions due to the phase difference of 90° between them. Along the same lines, one can visualize the operations of addition and subtraction as complementing binary operations which introduce, in a loose sense, a phase difference between the results of their operations with reference to the pivot. For example, let us consider the number 2 and call it the pivot. If the number 3 is added to it, the result is +5. This operation introduces a positive phase with respect to the pivot. If the number 3 is subtracted from it, the result is -1. This operation introduces a negative phase with respect to the pivot. Figure 1 demonstrates this simple concept. Hadamard Transform is based on this simple concept (Helger Lipmaa 2002) [10]. This transform is named after the French mathematician Jacques Hadamard, the German mathematician Hans Rademacher, and the American mathematician Joseph Walsh, since all three of them were involved in the formulation of the basics behind it. It is essentially a $2^m \times 2^m$ matrix called the Hadamard matrix, which performs an orthogonal, symmetric, involutional and linear operation on x(n), a sequence of $2^m$ real numbers, and transforms it into another sequence X(k) of $2^m$ real numbers. The signal flowgraph for this computation is shown below in Figure 2.
This signal flowgraph is similar to that of the Decimation-in-Frequency (DIF) FFT algorithm [8,9]. The question that arises here is whether the fast Walsh-Hadamard transform (FWHT) could be used for pattern recognition or not. The answer is a simple 'NO' because the Hadamard transform is an isomorphic map and hence it cannot be effectively used for classification purposes. It is in this connection that the following argument evolved. Orthogonal representations of periodic or finite energy functions involve basis functions, and the complementary pair of binary operations among them decides the dynamics of a transform. In most fast algorithms of discrete transforms which work on number sequences, addition and subtraction form the pair of operations. It is to be noted that in such cases phase information is invariably used in the transformation so that the inverse transformation is ensured and isomorphism established. In the absence of phase information, the forward transformation is realized but not the inverse. Let us consider the case of FWHT where the phase information is omitted. The forward transformation is obtained by replacing the operation of subtraction by the operation of difference ~, that is, the magnitude of the difference with its sign discarded (Rajan 1997) [4]. The signal flow graph for this modified scheme is shown below in Figure 4, and it can be seen to be similar to that of FWHT but with the pair of operations addition + and difference ~. For the same x(n) = [1 0 1 0 0 1 1 0], X(k) due to FWHT is [4 2 0 -2 0 2 0 2] whereas X(k) due to the above modified scheme is [4 2 2 0 2 0 2 0]. For this modified scheme, inverse transformation has been tried on the lines of the inverse Hadamard transform, but it did not work. Hence the modified scheme as such is noninvertible, and the corresponding signal flow graph given below in Figure 5 is self-evident. The question that arises at this point is whether it is possible to develop a procedure to get the inverse of the modified scheme.
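The two results quoted above can be reproduced with a short sketch. It assumes a DIF-style flowgraph in which each stage combines the first and second halves of the sequence, and it takes the difference operation ~ to be the absolute difference; the function names are illustrative only.

```python
def butterfly_transform(x, second_op):
    """DIF-style recursion: the upper branch carries sums, the lower branch
    carries second_op(first_half, second_half); each branch is transformed again."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    upper = [x[i] + x[i + half] for i in range(half)]
    lower = [second_op(x[i], x[i + half]) for i in range(half)]
    return butterfly_transform(upper, second_op) + butterfly_transform(lower, second_op)


def fwht(x):
    """Addition / subtraction pair: sign (phase) information is kept."""
    return butterfly_transform(x, lambda a, b: a - b)


def modified_scheme(x):
    """Addition / difference pair: sign (phase) information is discarded."""
    return butterfly_transform(x, lambda a, b: abs(a - b))


x = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(x))             # [4, 2, 0, -2, 0, 2, 0, 2]
print(modified_scheme(x))  # [4, 2, 2, 0, 2, 0, 2, 0]

# A cyclically shifted input gives a different FWHT but the same modified
# spectrum, so the modified map is many-to-one and cannot be inverted as such.
shifted = x[1:] + x[:1]
print(fwht(shifted) == fwht(x))                        # False
print(modified_scheme(shifted) == modified_scheme(x))  # True
```

The last two lines illustrate why an inverse on the lines of the inverse Hadamard transform fails: distinct inputs collapse onto one output once the signs are dropped.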
It was in this context that a novel pair of algorithms was proposed which could be used either as a homomorphic map, in the absence of phase information, or as an isomorphic map, in the presence of phase information (Rajan 1997) [4]. The pair was called Rajan Transform (RT). A formal definition of forward Rajan Transform is given below.

Two Dimensional Rajan Transform
The definition and various results of the one-dimensional case discussed just above can be extended and applied to the two dimensional case. There are two approaches to the implementation of the two-dimensional Rajan Transform: (i) the row-column approach and (ii) the column-row approach. In general, the 2-D Rajan Transform obtained using the first approach will not be equal to that obtained using the second approach. Let us consider the matrix $H_4$. Applying RT to $H_4$, first row-wise and then column-wise, gives the resultant matrix $X_{r,c}(k_1,k_2)$.
Applying RT to the same matrix $H_4$, first column-wise and then row-wise, gives the resultant matrix $X_{c,r}(k_1,k_2)$.
So, $[X_{r,c}(k_1,k_2)] = [X_{c,r}(k_1,k_2)]$ for symmetric matrices. Let us consider the two-dimensional array A as the input pattern with the shape of T. One can verify that the Rajan Transforms of the arrays C and B are the same as that of A. From this it is evident that Rajan Transform is permutation invariant and so it could be used as a pattern recognition procedure. This is the motivating factor behind choosing the homomorphic Rajan Transform (RT) for character recognition and classification.
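The permutation invariance of the two-dimensional transform can be checked with a small sketch of the row-column approach. It assumes the phase-free one-dimensional transform built from sums and absolute differences, and it assumes that the permuted patterns are cyclic shifts and a mirror image of the input; the 8×8 T-shaped array and all function names are hypothetical.

```python
def rt_1d(x):
    """Phase-free 1-D transform (assumed form): sums and absolute differences."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [abs(x[i] - x[i + half]) for i in range(half)]
    return rt_1d(sums) + rt_1d(diffs)


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def rt_2d_row_column(matrix):
    """Row-column approach: transform every row, then every column."""
    after_rows = [rt_1d(row) for row in matrix]
    after_columns = [rt_1d(column) for column in transpose(after_rows)]
    return transpose(after_columns)


def cyclic_shift(matrix, down, right):
    """Shift the pattern cyclically by `down` rows and `right` columns."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[(r - down) % rows][(c - right) % cols] for c in range(cols)]
            for r in range(rows)]


# Hypothetical T-shaped input pattern on an 8x8 grid.
T_PATTERN = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

reference = rt_2d_row_column(T_PATTERN)
shifted = cyclic_shift(T_PATTERN, down=2, right=1)
mirrored = [row[::-1] for row in T_PATTERN]

print(rt_2d_row_column(shifted) == reference)   # True
print(rt_2d_row_column(mirrored) == reference)  # True
```

Because each row and each column is processed by a transform that is itself invariant under cyclic shifts and reversal, the shifted and mirrored patterns map to the same 2-D spectrum as the original.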

Rajan Transform Based Spectral Analysis of Handwritten Characters
A written document consists of characters from the alphabet of a language. In the scanned image of a handwritten document each character is treated as an object (digital image). The ability to classify machine printed characters in an automated or a semi-automated manner has potential applications in numerous fields. Creating an algorithm with a one hundred percent correct recognition rate is impossible. Robustness of an algorithm is all about its ability to recognize morphologically distorted characters. Human understanding of certain characters may sometimes turn out to be erroneous. For example, a distorted number '5' may be mistaken for the character 'S'. But it is unreasonable to say that someone has mistaken '5' for the character 'M'. Algorithms make such mistakes because they mostly operate on different sets of features than humans do. Humans understand characters and symbols by their syntactic features and the relations between them, whereas algorithms work on the quantificational measures of various geometric and other spatial features or on the abstract notions of frequencies and phase relationships imposed by various transforms. Many such methods do work efficiently, but they make the computer understand letters through what is known as 'machine vision' or 'machine intelligence'. The handwritten character classification efficiency depends not only on the type of algorithm that is used but also on the training character set used for reference and the quality of the scanned image under test. In this context, an effort has been made to use Rajan Transform for handwritten character recognition and classification. This involves three steps: (i) mapping a character of interest on a fixed wire frame consisting of an 8×8 array of cells, (ii) creating a (0-1)-matrix corresponding to the character of interest and (iii) applying the two dimensional Rajan Transform (2D-RT) to the (0-1)-matrix following either the row-column approach or the column-row approach exclusively.
The 2D-RT spectrum is used either as an input to the training algorithm or as an input to the testing algorithm for matching and classification. These steps are explained here. The character in the scanned document is treated as an image and projected on an 8×8 array of cells as shown in Figure 6. The shape and size of the image are not of concern here since any image could be projected on an 8×8 array of cells of relevant size. Care is taken that the image is fully projected. The projection algorithm involves image resizing and sampling by the 8×8 cell array [2,3]. The projected character image is sampled and the (0-1)-matrix is then obtained, which is also called the 'characteristic matrix' of the projected character. One can obtain the characteristic matrix of a character image as such or of the processed (thinned) image. Figure 7 shows the fixed font character 'A', the scanned character, its thinned version and the characteristic matrices of all the three images. Figure 8 shows various geometric permutations of the handwritten character.
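The projection and sampling step can be sketched as follows. This is only an illustration under stated assumptions: the input is a binarized image given as a list of rows with 1 for ink, and a cell of the wire frame is set to 1 when at least one ink pixel falls inside it. The source does not specify the sampling rule, and the function name `characteristic_matrix` is hypothetical.

```python
def characteristic_matrix(image, grid=8):
    """Project a binary image onto a grid x grid wire frame.

    Assumed rule: a cell becomes 1 if any foreground pixel maps into it.
    """
    height = len(image)
    width = len(image[0])
    matrix = [[0] * grid for _ in range(grid)]
    for r in range(height):
        for c in range(width):
            if image[r][c]:
                # Integer scaling maps pixel (r, c) to its enclosing cell.
                matrix[r * grid // height][c * grid // width] = 1
    return matrix


# A 16x16 test image containing a two-pixel-wide vertical stroke.
stroke = [[1 if c in (4, 5) else 0 for c in range(16)] for _ in range(16)]
cells = characteristic_matrix(stroke)
for row in cells:
    print(row)  # each row is [0, 0, 1, 0, 0, 0, 0, 0]
```

The resulting (0-1)-matrix has a power-of-two side length, so it can be passed directly to a two dimensional transform that requires rows and columns of length 8.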

Conclusion
An empirical study of the 2-D array A(m,n) and its 2-D RT $A(k_1,k_2)$ has revealed that the 2-D RT spectra of the letter A and of its permuted and intentionally distorted forms agree with minimum errors. An extensive study has been undertaken to examine the robustness of the 2D-RT algorithm on images of handwritten characters of various writers.