Speech compression based on wavelet and contourlet transformation

Speech compression is important in many applications. In this research we compress speech using the wavelet transform and the contourlet transform. The speech signal, which is usually one-dimensional, is first arranged into a two-dimensional array so that the contourlet transform can be applied to it. The wavelet transform is then applied, followed by the contourlet transform on the high-frequency wavelet coefficients. After the speech has been transmitted or stored, it is decompressed by applying the inverse of these transformations. Performance was evaluated using SNR, PSNR, NRMSE and correlation (Corr.), and the results are very good.


1-Introduction
Although the Wavelet Transform (WT) is known to be a powerful tool in many signal and image processing applications, such as compression, noise removal, image edge enhancement, and feature extraction, wavelets are not optimal for capturing the two-dimensional singularities found in images, which are often required in many segmentation and compression applications [1]. In particular, natural images consist of edges that are smooth curves, which cannot be captured efficiently by the wavelet transform. Therefore, new transforms are required for image signals.
The Contourlet Transform (CT) is one of the new geometrical image transforms, which can efficiently represent images containing contours and textures. This transform uses a structure similar to that of curvelets, that is, a stage of subband decomposition followed by a directional transform. In the contourlet transform, a Laplacian Pyramid (LP) is employed for the first stage, while Directional Filter Banks (DFBs) are used in the angular decomposition stage (Figures (1-a) and (1-b)). The contourlet scheme first applies a multiscale transform, followed by a local directional transform that gathers the nearby basis functions at the same scale into linear structures. Consider the wavelet transform of a 2-D piecewise smooth function with a smooth discontinuity curve (Figure 2(a)). Wavelets are blind to the smoothness of this curve, and it is easy to see that there are $O(2^j)$ significant wavelet coefficients at the scale $2^{-j}$. Comparing the wavelet scheme with the contourlet scheme (Figure 2(b)) shows the improved representation of edge contours by the latter. The improvement can be attributed to the grouping of nearby wavelet coefficients, which are locally correlated due to the smoothness of the contours. Therefore, we can obtain a sparse expansion [2]. In essence, a wavelet-like transform is applied for edge detection, followed by a local directional transform for contour segment detection. The overall result is an image expansion using basic elements like contour segments, which are therefore named contourlets, and the process is called the contourlet transform (CT) [3].

The Fourth Scientific Conference of the College of Computer Science & Mathematics
In this research the wavelet transform is used, which yields the traditional three high-pass bands corresponding to the LH, HL and HH bands. We apply Directional Filter Banks (DFB) with the same number of directions to each band in a given level (j). Starting from the finest level J of the wavelet transform, we decrease the number of directions at every other dyadic scale as we proceed through the coarser levels (j < J). In this way the intended decomposition is achieved.

2-Wavelet transform
The fundamental idea behind wavelets is to analyze according to scale. The wavelet analysis procedure is to adopt a wavelet prototype function, called an analyzing wavelet or mother wavelet. Any signal can then be represented by translated and scaled versions of the mother wavelet [4]. The wavelets can be translated in time in addition to being compressed and widened [5]. Speech compression is the technology of converting human speech into an efficiently encoded representation that can later be decoded to produce a close approximation of the original signal. In this work, speech signals are compressed using Discrete Wavelet Transform (DWT) techniques. Wavelet analysis is the breaking up of a signal into a set of scaled and translated versions of an original (or mother) wavelet. Taking the wavelet transform of a signal decomposes the original signal into wavelet coefficients at different scales and positions. These coefficients represent the signal in the wavelet domain, and all data operations can be performed using just the corresponding wavelet coefficients. Wavelet transforms are broadly divided into three classes: continuous, discrete and multiresolution-based [4].

2-1-Discrete wavelet transforms
It is computationally impossible to analyze a signal using all wavelet coefficients, so one may wonder whether it is sufficient to pick a discrete subset of the upper half plane to be able to reconstruct a signal from the corresponding wavelet coefficients. One such system is the affine system for some real parameters $a > 1$, $b > 0$. The corresponding discrete subset of the half plane consists of all the points $(a^m, n a^m b)$ with integers $m, n \in \mathbb{Z}$. The corresponding baby wavelets are now given as [6, 8]

$$\psi_{m,n}(t) = a^{-m/2}\,\psi(a^{-m}t - nb).$$

A sufficient condition for the reconstruction of any signal $x$ of finite energy by the formula

$$x(t) = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} \langle x, \psi_{m,n} \rangle\, \psi_{m,n}(t)$$

is that the functions $\{\psi_{m,n} : m, n \in \mathbb{Z}\}$ form a tight frame of $L^2(\mathbb{R})$ [6, 7].

2-2-Multiresolution discrete wavelet transforms
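In any discretized wavelet transform, there are only a finite number of wavelet coefficients for each bounded rectangular region in the upper half plane. Still, each coefficient requires the evaluation of an integral. To avoid this numerical complexity, one needs one auxiliary function, the father wavelet $\varphi \in L^2(\mathbb{R})$. Further, one has to restrict $a$ to be an integer. A typical choice is $a = 2$ and $b = 1$ [6, 9]. From the mother and father wavelets one constructs the subspaces

$$V_m = \operatorname{span}(\varphi_{m,n} : n \in \mathbb{Z}), \quad \text{where } \varphi_{m,n}(t) = 2^{-m/2}\,\varphi(2^{-m}t - n),$$

and

$$W_m = \operatorname{span}(\psi_{m,n} : n \in \mathbb{Z}), \quad \text{where } \psi_{m,n}(t) = 2^{-m/2}\,\psi(2^{-m}t - n)$$

[6, 7, 8, 9].
From these one requires that the sequence

$$\{0\} \subset \dots \subset V_1 \subset V_0 \subset V_{-1} \subset \dots \subset L^2(\mathbb{R})$$

forms a multiresolution analysis of $L^2(\mathbb{R})$ and that the subspaces $\dots, W_1, W_0, W_{-1}, \dots$ are the orthogonal "differences" of the above sequence, that is, $W_m$ is the orthogonal complement of $V_m$ inside the subspace $V_{m-1}$. In analogy to the sampling theorem one may conclude that the space $V_m$ with sampling distance $2^m$ more or less covers the frequency baseband from 0 to $2^{-m-1}$. As orthogonal complement, $W_m$ roughly covers the band $[2^{-m-1}, 2^{-m}]$ [8, 9, 10, 11]. From those inclusions and orthogonality relations follows the existence of sequences $h = \{h_n\}$ and $g = \{g_n\}$ that satisfy the identities

$$g_n = \langle \varphi_{0,0}, \varphi_{-1,n} \rangle \quad \text{and} \quad \varphi(t) = \sqrt{2} \sum_{n} g_n\, \varphi(2t - n),$$

and

$$h_n = \langle \psi_{0,0}, \varphi_{-1,n} \rangle \quad \text{and} \quad \psi(t) = \sqrt{2} \sum_{n} h_n\, \varphi(2t - n).$$

The second identity of the first pair is a refinement equation for the father wavelet $\varphi$. Both pairs of identities form the basis for the algorithm of the fast wavelet transform [6, 7, 8, 9, 10].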
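As a minimal sketch of one level of the fast wavelet transform mentioned above, the following Python code uses the Haar filter pair. The choice of the Haar wavelet and all function names are assumptions made for illustration only; the paper does not state which wavelet was used.

```python
import math

# Haar filter taps (an assumed example): LOW plays the role of the scaling
# (father wavelet) sequence, HIGH the role of the wavelet (mother) sequence.
LOW = (1 / math.sqrt(2), 1 / math.sqrt(2))
HIGH = (1 / math.sqrt(2), -1 / math.sqrt(2))


def haar_analysis(x):
    """One analysis level: filter and downsample by 2 -> (approx, detail)."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    half = len(x) // 2
    approx = [LOW[0] * x[2 * k] + LOW[1] * x[2 * k + 1] for k in range(half)]
    detail = [HIGH[0] * x[2 * k] + HIGH[1] * x[2 * k + 1] for k in range(half)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Inverse of haar_analysis: rebuild each pair of samples."""
    x = []
    for a, d in zip(approx, detail):
        x.append(LOW[0] * a + HIGH[0] * d)
        x.append(LOW[1] * a + HIGH[1] * d)
    return x


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_analysis(signal)
restored = haar_synthesis(approx, detail)
```

Applying `haar_analysis` again to `approx` gives the next coarser level, which is how the multiresolution decomposition is iterated.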

3-Contourlet transform
The contourlet transform consists of two major stages: the subband decomposition and the directional transform [3].

3-1-Pyramid frames
One way to obtain a multiscale decomposition is to use the Laplacian pyramid (LP). The LP decomposition at each level generates a downsampled low-pass version of the original and the difference between the original and the prediction, resulting in a band-pass image. Figure (4) depicts this decomposition process, where H and G are called the (low-pass) analysis and synthesis filters, respectively, and M is the sampling matrix. The process can be iterated on the coarse (downsampled) signal (low frequencies). Note that in multidimensional filter banks, sampling is represented by sampling matrices; for example, downsampling $x[n]$ by $M$ yields $x_d[n] = x[Mn]$, where $M$ is an integer matrix [12].
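The LP step described above can be sketched in one dimension as follows. This is only an illustration under stated assumptions: pair averaging stands in for the analysis filter H, sample repetition for the synthesis filter G, and the sampling matrix M reduces to the scalar 2; none of these choices is taken from the paper.

```python
def lp_analysis(x):
    """One Laplacian-pyramid level: returns (coarse, residual)."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    # Analysis: low-pass (pair average, assumed H) then downsample by 2.
    coarse = [(x[2 * k] + x[2 * k + 1]) / 2 for k in range(len(x) // 2)]
    # Prediction: upsample by 2 with sample repetition (assumed G).
    prediction = [coarse[n // 2] for n in range(len(x))]
    # Band-pass output: difference between the original and the prediction.
    residual = [xn - pn for xn, pn in zip(x, prediction)]
    return coarse, residual


def lp_synthesis(coarse, residual):
    """Rebuild the signal by adding the residual back to the prediction."""
    prediction = [coarse[n // 2] for n in range(len(residual))]
    return [pn + dn for pn, dn in zip(prediction, residual)]


x = [1.0, 3.0, 2.0, 6.0]
coarse, residual = lp_analysis(x)
restored = lp_synthesis(coarse, residual)
```

Note that the coarse signal and the residual together hold more samples than the input, so the pyramid is an oversampled representation; iterating `lp_analysis` on `coarse` gives the next level.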

3-2-Iterated directional filter banks
Bamberger and Smith constructed a 2-D Directional Filter Bank (DFB) that can be maximally decimated while achieving perfect reconstruction. The DFB is efficiently implemented via an $l$-level binary tree decomposition that leads to $2^l$ subbands with wedge-shaped frequency partitioning, as in Figure (5). The original construction of the DFB involves modulating the input image and using quincunx filter banks with diamond-shaped filters [13]. A new construction for the DFB has been proposed that avoids modulating the input image and has a simpler rule for expanding the decomposition tree [3, 12].
This simplified DFB is intuitively constructed from two building blocks. The first building block is a two-channel quincunx filter bank with fan filters (Figure (6)) that divides a 2-D spectrum into two directions: horizontal and vertical. The second building block of the DFB is a shearing operator. Figure (7) shows an application of a shearing operator, by which a −45° direction edge becomes a vertical edge. By adding a shearing operator and its inverse ("unshearing") before and after, respectively, the two-channel filter bank in Figure (6), we obtain a different directional frequency partition while maintaining perfect reconstruction. Thus, the key in the DFB is to use an appropriate combination of shearing operators together with the two-direction partition of quincunx filter banks at each node in a binary tree-structured filter bank, to obtain the desired 2-D spectrum division as shown in Figure (5).
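The effect of the shearing operator described above can be illustrated with a small sketch. It assumes a circular row shift as the discrete shear (row r is shifted by r columns), which is a simplification chosen for illustration and not necessarily the resampling used in the DFB of [3, 12]; the function name is hypothetical.

```python
def shear_rows(image, direction=1):
    """Discrete shear: circularly shift row r by direction * r columns."""
    width = len(image[0])
    return [
        [row[(c + direction * r) % width] for c in range(width)]
        for r, row in enumerate(image)
    ]


size = 5
# A diagonal edge: one marked pixel per row, moving one column per row.
diagonal = [[1 if c == r else 0 for c in range(size)] for r in range(size)]

sheared = shear_rows(diagonal, direction=1)      # the edge becomes vertical
unsheared = shear_rows(sheared, direction=-1)    # "unshearing" undoes it
```

After shearing, every marked pixel lies in the same column, so a filter bank that only separates horizontal from vertical directions can isolate an edge that was originally diagonal, and the inverse shear restores the original sample positions exactly.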


4-Performance Measures
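A number of quantitative parameters can be used to evaluate the performance of the wavelet-based speech coder, in terms of both reconstructed signal quality after decoding and compression. The following parameters are compared [10]:
1-Signal to Noise Ratio (SNR):

$$\mathrm{SNR} = 10 \log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right),$$

where $\sigma_x^2$ is the mean square of the speech signal and $\sigma_e^2$ is the mean square difference between the original and reconstructed signals.
2-Peak Signal to Noise Ratio (PSNR):

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{N X^2}{\lVert x - r \rVert^2}\right),$$

where $N$ is the length of the reconstructed signal, $X$ is the maximum absolute value of the signal $x$ (so that $X^2$ is its maximum squared value), and $\lVert x - r \rVert^2$ is the energy of the difference between the original and reconstructed signals.
3-Normalized Root Mean Square Error (NRMSE):

$$\mathrm{NRMSE} = \sqrt{\frac{\sum_n \left(x(n) - r(n)\right)^2}{\sum_n \left(x(n) - \mu_x\right)^2}},$$

where $x(n)$ is the speech signal, $r(n)$ is the reconstructed signal, and $\mu_x$ is the mean of the speech signal.
4-Correlation between the original signal and the signal recovered after compression.
5-Compression ratio: the ratio of the size of the original signal to the size of the compressed signal.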
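The measures used in this paper (SNR, PSNR, NRMSE, correlation and compression ratio) can be sketched in plain Python as follows. The function names are hypothetical, and the use of the Pearson coefficient for the correlation measure is an assumption, since the paper does not give its formula.

```python
import math


def mean(values):
    return sum(values) / len(values)


def snr_db(x, r):
    """Mean square of the signal over mean square error, in dB."""
    signal_power = mean([v * v for v in x])
    noise_power = mean([(a - b) ** 2 for a, b in zip(x, r)])
    return 10 * math.log10(signal_power / noise_power)  # undefined if r == x


def psnr_db(x, r):
    """N times the peak squared value over the error energy, in dB."""
    peak_sq = max(abs(v) for v in x) ** 2
    err_energy = sum((a - b) ** 2 for a, b in zip(x, r))
    return 10 * math.log10(len(r) * peak_sq / err_energy)


def nrmse(x, r):
    """Error energy normalized by the energy of x around its mean."""
    mu = mean(x)
    num = sum((a - b) ** 2 for a, b in zip(x, r))
    den = sum((a - mu) ** 2 for a in x)
    return math.sqrt(num / den)


def correlation(x, r):
    """Pearson correlation coefficient (assumed definition)."""
    mx, mr = mean(x), mean(r)
    cov = sum((a - mx) * (b - mr) for a, b in zip(x, r))
    var_x = sum((a - mx) ** 2 for a in x)
    var_r = sum((b - mr) ** 2 for b in r)
    return cov / math.sqrt(var_x * var_r)


def compression_ratio(original_size, compressed_size):
    return original_size / compressed_size


original = [1.0, -2.0, 3.0, -4.0]
reconstructed = [1.0, -2.0, 3.0, -3.0]
scores = {
    "SNR": snr_db(original, reconstructed),
    "PSNR": psnr_db(original, reconstructed),
    "NRMSE": nrmse(original, reconstructed),
    "Corr": correlation(original, reconstructed),
}
```

The logarithmic measures are undefined when the reconstruction is exact (zero error), so a practical implementation would need to handle that case separately.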

5-Proposed algorithm
The proposed algorithm is shown in Figure (9). The speech signal is first converted from 1-D to 2-D, and the wavelet transform is applied to it. The contourlet transform is then applied to the high-frequency wavelet coefficients, and the high-frequency contourlet coefficients are set to zero to achieve compression. To decompress the signal, the inverse transformations are applied, after which the result can be listened to. The correlation factor between the original and retrieved signals shows that they are close to each other, reaching about ±0.90 (Table (1)); the SNR and the compression ratio also gave good results, as seen in Table (1). The proposed algorithm can be used to build a small model for practical application.
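The overall flow of the proposed algorithm can be sketched as follows. This is a simplified illustration under clearly stated assumptions, not the authors' implementation: a one-level 2-D Haar transform stands in for the wavelet stage, the contourlet (directional) stage is omitted, and zeroing a whole high-pass subband stands in for zeroing all of its contourlet coefficients. The row width, subband naming and function names are hypothetical.

```python
def to_2d(signal, width):
    """Arrange a 1-D signal row by row into a 2-D array, zero-padding the end."""
    if width % 2:
        raise ValueError("width must be even")
    padded = list(signal) + [0.0] * (-len(signal) % (2 * width))
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def haar2d(block):
    """One-level 2-D Haar transform -> (LL, LH, HL, HH); naming is assumed."""
    rows, cols = len(block) // 2, len(block[0]) // 2
    ll, lh, hl, hh = ([[0.0] * cols for _ in range(rows)] for _ in range(4))
    for i in range(rows):
        for j in range(cols):
            a, b = block[2 * i][2 * j], block[2 * i][2 * j + 1]
            c, d = block[2 * i + 1][2 * j], block[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2
            lh[i][j] = (a + b - c - d) / 2
            hl[i][j] = (a - b + c - d) / 2
            hh[i][j] = (a - b - c + d) / 2
    return ll, lh, hl, hh


def inverse_haar2d(ll, lh, hl, hh):
    """Exact inverse of haar2d."""
    rows, cols = len(ll), len(ll[0])
    block = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, v, h, d = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            block[2 * i][2 * j] = (s + v + h + d) / 2
            block[2 * i][2 * j + 1] = (s + v - h - d) / 2
            block[2 * i + 1][2 * j] = (s - v + h - d) / 2
            block[2 * i + 1][2 * j + 1] = (s - v - h + d) / 2
    return block


def zeros_like(band):
    return [[0.0] * len(band[0]) for _ in band]


def compress_decompress(signal, width=4):
    """1-D -> 2-D, transform, discard the high bands, invert, 2-D -> 1-D."""
    ll, lh, hl, hh = haar2d(to_2d(signal, width))
    rebuilt = inverse_haar2d(ll, zeros_like(lh), zeros_like(hl), zeros_like(hh))
    flat = [value for row in rebuilt for value in row]
    return flat[:len(signal)]


speech = [0.0, 0.2, 0.5, 0.4, 0.1, -0.3, -0.5, -0.2, 0.0, 0.3]
retrieved = compress_decompress(speech, width=4)
```

In this simplified form only the LL subband survives, so each 2-by-2 block of the 2-D array is replaced by its average; the quality of `retrieved` could then be scored with the measures from the performance section.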

Figure (9) Work scheme

6-Practical application
The proposed algorithm was applied to different speech signals, and its performance was measured by evaluating several factors. The correlation factor, Normalized Root Mean Square Error (NRMSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR) between the original signal and the retrieved signal after compression are very good when the contourlet transform is applied to the HL, LH and HH bands and the contourlet coefficients are set to zero, as shown in Table (1) and Figure (10). The compression ratio between the original and reconstructed signals equals 1:2.2857, as shown in Figure (11).
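As a quick worked check of the reported figure, and assuming the compression ratio is read as original size divided by compressed size, the value 2.2857 can be turned into a relative size. The symbols $N_{\text{original}}$ and $N_{\text{compressed}}$ are introduced here only for illustration.

```latex
\mathrm{CR} = \frac{N_{\text{original}}}{N_{\text{compressed}}} = 2.2857
\quad\Longrightarrow\quad
\frac{N_{\text{compressed}}}{N_{\text{original}}} = \frac{1}{2.2857} \approx 0.4375
```

Under that reading the compressed signal occupies about 43.75% of the original size, a saving of about 56.25%.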

Figure (2) Illustration showing how points are captured in the wavelet and contourlet schemes.
Figure (3) illustrates a schematic plot of the wavelet-based contourlet transform (WBCT) using 3 wavelet levels and L = 3 directional levels.

Figure (3) Plot of the WBCT using 3 wavelet levels. The directional decomposition is overlaid on the wavelet subbands.

Figure (5) Directional filter bank. Frequency partitioning where $l = 3$ and there are $2^3 = 8$ real wedge-shaped frequency bands. Subbands 0-3 correspond to the mostly horizontal directions, while subbands 4-7 correspond to the mostly vertical directions.

Figure (6) Two-dimensional spectrum partition using quincunx filter banks with fan filters. The black regions represent the ideal frequency supports of each filter. Q is a quincunx sampling matrix.

Figure (7) Example of a shearing operation that is used like a rotation operation for DFB decomposition. (a) The "cameraman" image. (b) The "cameraman" image after a shearing operation.

Figure (8) Block diagram of the contourlet transform with two levels of multiscale decomposition. Gray regions represent the ideal pass band support of the component filters. Left: The iterated form. Right: The equivalent parallel form.
