Lecture notes Digital Image Processing

BellaLloyd
Published Date:12-07-2017
LECTURE NOTES

SUBJECT: DIGITAL IMAGE AND SPEECH PROCESSING
SUBJECT CODE: ECS-702, BRANCH: EL&TCE

SYLLABUS

Module I (12 Hours) Digital Image
1. Different stages of Image Processing & Analysis Scheme. Components of Image Processing System, Multiprocessor Interconnections.
2. A Review of various Mathematical Transforms.
3. Image Formation: Geometric Model, Photometric Model.
4. Image Digitization: A review of Sampling and Quantization processes. A digital image.

Module II (12 Hours) Image Processing
5. Image Enhancement: Contrast Intensification, Smoothing, Image Sharpening.
6. Restoration: Minimum Mean-Square Error Restoration by Homomorphic Filtering.
7. Image Compression: Schematic diagram of Data Compression Procedure, Lossless compression coding.
8. Multivalued Image Processing, Multispectral Image Processing, Processing of color images.

Module III (8 Hours) Digital Speech Processing
1. The Fundamentals of Digital Speech Processing. A Review of Discrete-Time Signals & Systems, the Z-transform, the DFT, Fundamentals of Digital Filters, FIR Systems, IIR Systems.
2. Time-Domain Methods for Speech Processing. Time-Dependent Processing of Speech, Short-Time Energy and Average Magnitude, Short-Time Average Zero-Crossing Rate.
3. Digital Representation of Speech Waveform: Sampling speech signals, statistical model, instantaneous quantization, instantaneous companding, quantization for optimum SNR, adaptive quantization, feed-forward and feedback adaptations.

Module IV (8 Hours) Linear Predictive Coding of Speech
Block diagram of Simplified Model for Speech Production. Basic Principles of Linear Predictive Analysis: the Autocorrelation Method. The Prediction Error Signal. Digital Speech Processing for Man-Machine Communication by voice. Speaker Recognition Systems: Speaker Verification and Speaker Identification Systems.

MODULE-1 DIGITAL IMAGE

INTRODUCTION
Digital image processing deals with developing a digital system that performs operations on a digital image.
An image is nothing more than a two-dimensional signal. It is defined by the mathematical function f(x,y), where x and y are the horizontal and vertical coordinates, and the amplitude of f at any pair of coordinates (x,y) is called the intensity or gray level of the image at that point. When x, y and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to the processing of digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements or pixels.

Motivation and Perspective
Digital image processing (DIP) deals with the manipulation of digital images through a digital computer. It is a subfield of signals and systems but focuses particularly on images. DIP focuses on developing a computer system that is able to perform processing on an image. The input of that system is a digital image; the system processes that image using efficient algorithms and gives an image as output. The most common example is Adobe Photoshop, one of the most widely used applications for processing digital images.

Applications
Some of the major fields in which digital image processing is widely used are:
1. Gamma-ray imaging: nuclear medicine and astronomical observations.
2. X-ray imaging: X-rays of the body.
3. Ultraviolet band: lithography, industrial inspection, microscopy, lasers.
4. Visible and infrared bands: remote sensing.
5. Microwave band: radar imaging.

Components of Image Processing System
i) Image sensors
With reference to sensing, two elements are required to acquire digital images. The first is a physical device that is sensitive to the energy radiated by the object we wish to image. The second, called a digitizer, is a device that converts the output of the physical sensing device into digital form.
ii) Specialized image processing hardware
It consists of the digitizer just mentioned, plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic operations (such as addition and subtraction) and logical operations in parallel on entire images.

iii) Computer
It is a general-purpose computer and can range from a PC to a supercomputer depending on the application. In dedicated applications, specially designed computers are sometimes used to achieve a required level of performance.

iv) Software
It consists of specialized modules that perform specific tasks. A well-designed package also includes the capability for the user to write code that, as a minimum, utilizes the specialized modules. More sophisticated software packages allow the integration of these modules.

v) Mass storage
This capability is a must in image processing applications. An image of size 1024 x 1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if the image is not compressed. Storage for image processing applications falls into three principal categories:
i) Short-term storage for use during processing
ii) On-line storage for relatively fast retrieval
iii) Archival storage, such as magnetic tapes and disks

vi) Image displays
Image displays in use today are mainly color TV monitors. These monitors are driven by the outputs of image and graphics display cards that are an integral part of the computer system.

vii) Hardcopy devices
The devices for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units and digital units such as optical and CD-ROM disks. Film provides the highest possible resolution, but paper is the obvious medium of choice for written applications.

viii) Networking
It is almost a default function in any computer system in use today. Because of the large amount of data inherent in image processing applications, the key consideration in image transmission is bandwidth.
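
The storage figure quoted under mass storage (item v) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original notes; the function name image_storage_bytes is hypothetical, and it assumes an uncompressed image with no file header or row padding, with one megabyte taken as 2**20 bytes.

```python
def image_storage_bytes(rows: int, cols: int, bits_per_pixel: int) -> int:
    """Bytes needed for an uncompressed image (assumes no header or padding)."""
    total_bits = rows * cols * bits_per_pixel
    # Round up to a whole number of bytes.
    return (total_bits + 7) // 8


# The example from the notes: 1024 x 1024 pixels, 8 bits per pixel.
size = image_storage_bytes(1024, 1024, 8)
print(size)                # 1048576 bytes
print(size / (1024 ** 2))  # 1.0 megabyte, taking 1 MB = 2**20 bytes
```

The same function shows how quickly storage grows: doubling both dimensions multiplies the requirement by four.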
Elements of Visual Perception

Structure of the Human Eye
The eye is nearly a sphere, with an average diameter of approximately 20 mm. The eye is enclosed by three membranes:

a) The cornea and sclera: the cornea is a tough, transparent tissue that covers the anterior surface of the eye. The rest of the optic globe is covered by the sclera.

b) The choroid: it contains a network of blood vessels that serve as the major source of nutrition to the eye. It helps to reduce the amount of extraneous light entering the eye. It has two parts:
(1) Iris diaphragm: it contracts or expands to control the amount of light that enters the eye.
(2) Ciliary body.

c) The retina: it is the innermost membrane of the eye. When the eye is properly focused, light from an object outside the eye is imaged on the retina. Light receptors are distributed over the surface of the retina. The two major classes of receptors are:
1) Cones: there are about 6 to 7 million of them. They are located primarily in the central portion of the retina, called the fovea, and are highly sensitive to color. Humans can resolve fine details with these cones because each one is connected to its own nerve end. Cone vision is called photopic or bright-light vision.
2) Rods: these are far more numerous, from 75 to 150 million, and are distributed over the entire retinal surface. The large area of distribution and the fact that several rods are connected to a single nerve end give a general, overall picture of the field of view. They are not involved in color vision and are sensitive to low levels of illumination. Rod vision is called scotopic or dim-light vision.
The region of the retina where receptors are absent is called the blind spot.

Image Formation in the Eye
The major difference between the lens of the eye and an ordinary optical lens is that the former is flexible. The shape of the lens of the eye is controlled by tension in the fibers of the ciliary body.
To focus on distant objects, the controlling muscles cause the lens to be relatively flattened; to focus on objects near the eye, they allow the lens to become thicker. The distance between the center of the lens and the retina is called the focal length, and it varies from 17 mm to 14 mm as the refractive power of the lens increases from its minimum to its maximum. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power. When the eye focuses on a nearby object, the lens is most strongly refractive. The retinal image is focused primarily on the area of the fovea. Perception then takes place by the relative excitation of light receptors, which transform radiant energy into electrical impulses that are ultimately decoded by the brain.

Brightness Adaptation and Discrimination
Digital images are displayed as a discrete set of intensities. The range of light intensity levels to which the human visual system can adapt is enormous, on the order of $10^{10}$ from the scotopic threshold to the glare limit. Experimental evidence indicates that subjective brightness is a logarithmic function of the light intensity incident on the eye. The curve represents the range of intensities to which the visual system can adapt. But the visual system cannot operate over such a dynamic range simultaneously. Rather, it accomplishes this large variation by changes in its overall sensitivity, a phenomenon called brightness adaptation. For any given set of conditions, the current sensitivity level of the visual system is called the brightness adaptation level, $B_a$ in the curve. The small intersecting curve represents the range of subjective brightness that the eye can perceive when adapted to this level. It is restricted at level $B_b$, at and below which all stimuli are perceived as indistinguishable blacks. The upper portion of the curve is not actually restricted; much higher intensities would simply raise the adaptation level higher than $B_a$.

The ability of the eye to discriminate between changes in light intensity at any specific adaptation level is also of considerable interest. Take a flat, uniformly illuminated area large enough to occupy the entire field of view of the subject. It may be a diffuser, such as opaque glass, that is illuminated from behind by a light source whose intensity $I$ can be varied. To this field is added an increment of illumination $\Delta I$ in the form of a short-duration flash that appears as a circle in the center of the uniformly illuminated field. If $\Delta I$ is not bright enough, the subject cannot see any perceivable change. As $\Delta I$ gets stronger, the subject may indicate a perceived change. $\Delta I_c$ is the increment of illumination discernible 50% of the time with background illumination $I$. The quantity $\Delta I_c / I$ is called the Weber ratio. A small value means that a small percentage change in intensity is discernible, representing good brightness discrimination. A large value of the Weber ratio means that a large percentage change in intensity is required, representing poor brightness discrimination.

Optical Illusion
In an optical illusion the eye fills in non-existing information or wrongly perceives geometrical properties of objects.

Fundamental Steps in Digital Image Processing
There are two categories of steps involved in image processing:
1. Methods whose inputs and outputs are images.
2. Methods whose inputs may be images but whose outputs are attributes extracted from those images.

[Figure: Fundamental Steps in DIP. Blocks: Image Acquisition, Image Enhancement, Image Restoration, Color Image Processing, Wavelets & Multiresolution Processing, Compression, Morphological Processing, Segmentation, Representation and Description, Object Recognition, and a Knowledge Base.]

i) Image Acquisition
It could be as simple as being given an image that is already in digital form. Generally the image acquisition stage involves preprocessing such as scaling.

ii) Image Enhancement
It is among the simplest and most appealing areas of digital image processing.
The idea behind enhancement is to bring out details that are obscured or simply to highlight certain features of interest in an image. Image enhancement is a very subjective area of image processing.

iii) Image Restoration
It also deals with improving the appearance of an image, but it is an objective approach, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result.

iv) Color Image Processing
It is an area that has been gaining importance because of the use of digital images over the internet. Color image processing deals basically with color models and their implementation in image processing applications.

v) Wavelets and Multiresolution Processing
These are the foundation for representing images in various degrees of resolution.

vi) Compression
It deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it over a network. It has two major approaches:
a) Lossless compression
b) Lossy compression

vii) Morphological Processing
It deals with tools for extracting image components that are useful in the representation and description of the shape and boundary of objects. It is mainly used in automated inspection applications.

viii) Representation and Description
It always follows the output of the segmentation step, which is raw pixel data constituting either the boundary of a region or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary.

ix) Recognition
It is the process that assigns a label to an object based on its descriptors. It is the last step of image processing and uses artificial intelligence software.

Knowledge Base
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge base.
This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with a change-detection application.

A Simple Image Model
An image is denoted by a two-dimensional function of the form f(x,y). The value or amplitude of f at spatial coordinates (x,y) is a positive scalar quantity whose physical meaning is determined by the source of the image. When an image is generated by a physical process, its values are proportional to the energy radiated by a physical source. As a consequence, f(x,y) must be nonzero and finite; that is,

$0 < f(x,y) < \infty$

The function f(x,y) may be characterized by two components:
· The amount of source illumination incident on the scene being viewed.
· The amount of that illumination reflected back by the objects in the scene.
These are called the illumination and reflectance components and are denoted by i(x,y) and r(x,y) respectively. The functions combine as a product to form f(x,y):

$f(x,y) = i(x,y)\, r(x,y)$

We call the intensity of a monochrome image at any coordinate (x,y) the gray level (l) of the image at that point:

$l = f(x,y), \qquad L_{min} \le l \le L_{max}$

$L_{min}$ must be positive and $L_{max}$ must be finite, with

$L_{min} = i_{min}\, r_{min}$
$L_{max} = i_{max}\, r_{max}$

The interval $[L_{min}, L_{max}]$ is called the gray scale. Common practice is to shift this interval numerically to the interval $[0, L-1]$, where l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values are shades of gray varying from black to white.

Image Digitization
To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes: sampling and quantization.
An image may be continuous with respect to the x and y coordinates and also in amplitude. To convert it into digital form we have to sample the function in both coordinates and in amplitude.
· Digitizing the coordinate values is called sampling.
· Digitizing the amplitude values is called quantization.

Consider a continuous image along the line segment AB. To sample this function, we take equally spaced samples along line AB. The location of each sample is given by a vertical tick mark in the bottom part of the figure. The samples are shown as small squares superimposed on the function; the set of these discrete locations gives the sampled function. In order to form a digital image, the gray level values must also be converted (quantized) into discrete quantities. So we divide the gray level scale into eight discrete levels ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.

Digital Image Definition
A digital image f[m,n] described in a 2D discrete space is derived from an analog image f(x,y) in a 2D continuous space through a sampling process that is frequently referred to as digitization. Some basic definitions associated with the digital image follow. The 2D continuous image f(x,y) is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates [m,n], with m = 0, 1, 2, ..., M-1 and n = 0, 1, 2, ..., N-1, is f[m,n]. In fact, in most cases f(x,y) is actually a function of many variables, including depth (d), color (µ) and time (t).
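
The sampling and quantization procedure described above can be sketched in a few lines. This is an illustrative sketch only, not the notes' own method: the names quantize, digitize and ramp are hypothetical, the continuous image is assumed to be a function on the unit square with values in [0, 1], and each sample is assigned to the nearest of eight equally spaced gray levels.

```python
import math


def quantize(value, levels=8, v_min=0.0, v_max=1.0):
    """Map a continuous amplitude to the nearest of `levels` equally spaced
    gray levels and return its index (0 = black, levels - 1 = white)."""
    clipped = min(max(value, v_min), v_max)
    scaled = (clipped - v_min) / (v_max - v_min) * (levels - 1)
    return math.floor(scaled + 0.5)  # round to the nearest level


def digitize(f, rows, cols, levels=8):
    """Sample f(x, y) on an equally spaced rows x cols grid over the unit
    square, line by line from the top, and quantize every sample."""
    image = []
    for r in range(rows):
        y = r / (rows - 1) if rows > 1 else 0.0
        line = []
        for c in range(cols):
            x = c / (cols - 1) if cols > 1 else 0.0
            line.append(quantize(f(x, y), levels))
        image.append(line)
    return image


def ramp(x, y):
    """Assumed continuous test image: brightness rises from left to right."""
    return x


digital = digitize(ramp, rows=2, cols=8)
for row in digital:
    print(row)  # each row is [0, 1, 2, 3, 4, 5, 6, 7]
```

Choosing fewer columns coarsens the sampling, while choosing fewer levels coarsens the quantization; the two are independent choices.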
There are three types of computerized processes in the processing of images:
1) Low-level processes: these involve primitive operations such as image preprocessing to reduce noise, contrast enhancement and image sharpening. These processes are characterized by the fact that both inputs and outputs are images.
2) Mid-level processes: these involve tasks such as segmentation of an image into objects, description of those objects to reduce them to a form suitable for computer processing, and classification of individual objects. The inputs to these processes are generally images, but the outputs are attributes extracted from images.
3) High-level processes: these involve "making sense" of an ensemble of recognized objects, as in image analysis, and performing the cognitive functions normally associated with vision.

Representing Digital Images
The result of sampling and quantization is a matrix of real numbers. Assume that an image f(x,y) is sampled so that the resulting digital image has M rows and N columns. The values of the coordinates (x,y) now become discrete quantities; thus the coordinates at the origin are (x,y) = (0,0). The next coordinate values along the first row are (x,y) = (0,1), which signifies the second sample along the first row. It does not mean that these are the actual values of the physical coordinates when the image was sampled. The complete M x N digital image can be written in matrix form:

$$f(x,y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}$$

Each element on the right side of this equation is called an image element, picture element, pixel or pel. The matrix can also be written in conventional matrix notation, with elements $a_{i,j} = f(x=i, y=j)$.

The sampling process may be viewed as partitioning the x-y plane into a grid, with the coordinates of the center of each grid cell being a pair of elements from the Cartesian product $Z^2$, which is the set of all ordered pairs of elements $(z_i, z_j)$ with $z_i$ and $z_j$ being integers from Z. Hence f(x,y) is a digital image if (x,y) are integers from $Z^2$ and f is a function that assigns a gray level (that is, a real number from the set of real numbers R) to each distinct pair of coordinates (x,y). This functional assignment is the quantization process.
If the gray levels are also integers, Z replaces R, and a digital image becomes a 2D function whose coordinates and amplitude values are integers. Due to processing, storage and hardware considerations, the number of gray levels L typically is an integer power of 2:

$L = 2^k$

Then the number b of bits required to store a digital image is

$b = M \times N \times k$

When M = N, the equation becomes

$b = N^2 k$

When an image can have $2^k$ gray levels, it is referred to as a "k-bit image". An image with 256 possible gray levels is called an "8-bit image" (because $256 = 2^8$).

Spatial and Gray Level Resolution
Spatial resolution is the smallest discernible detail in an image. Suppose a chart is constructed with vertical lines of width W, with the spaces between them also having width W, so that a line pair consists of one such line and its adjacent space. The width of a line pair is then 2W, and there are 1/(2W) line pairs per unit distance. Resolution is simply the largest number of discernible line pairs per unit distance.

Gray level resolution refers to the smallest discernible change in gray level. Measuring discernible changes in gray level is a highly subjective process. Reducing the number of bits k while keeping the spatial resolution constant creates the problem of false contouring. It is caused by the use of an insufficient number of gray levels in the smooth areas of a digital image. It is called so because the ridges resemble topographic contours in a map. It is generally quite visible in images displayed using 16 or fewer uniformly spaced gray levels.

Iso-Preference Curves
To see the effect of varying N and k simultaneously, three pictures were taken having low, medium and high levels of detail. Different images were generated by varying N and k, and observers were then asked to rank the results according to their subjective quality. The results were summarized in the form of iso-preference curves in the N-k plane.
The iso-preference curves tend to shift right and upward, but their shapes differ in each of the three image categories, as shown in the figure. A shift up and to the right in the curves simply means larger values for N and k, which implies better picture quality. The results show that iso-preference curves tend to become more vertical as the detail in the image increases. This suggests that for images with a large amount of detail only a few gray levels may be needed. For a fixed value of N, the perceived quality for this type of image is nearly independent of the number of gray levels used.

Pixel Relationships

Neighbors of a Pixel
A pixel p at coordinates (x,y) has four horizontal and vertical neighbors, whose coordinates are given by

(x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels is called the 4-neighbors of p and is denoted by $N_4(p)$. Each of these pixels is at unit distance from (x,y), and some of the neighbors of p lie outside the digital image if (x,y) is on the border of the image.

The four diagonal neighbors of p have coordinates

(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by $N_D(p)$. These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by $N_8(p)$.

Adjacency
Let V be the set of gray level values used to define adjacency. In a binary image, V = {1} if we are referring to adjacency of pixels with value 1.
Three types of adjacency occur:
· 4-adjacency: two pixels p and q with values from V are 4-adjacent if q is in the set $N_4(p)$.
· 8-adjacency: two pixels p and q with values from V are 8-adjacent if q is in the set $N_8(p)$.
· m-adjacency: two pixels p and q with values from V are m-adjacent if
  - q is in $N_4(p)$, or
  - q is in $N_D(p)$ and the set $N_4(p) \cap N_4(q)$ has no pixels whose values are from V.

Distance Measures
For pixels p, q and z with coordinates (x,y), (s,t) and (v,w) respectively, D is a distance function or metric if
a) $D(p,q) \ge 0$, with $D(p,q) = 0$ iff p = q,
b) $D(p,q) = D(q,p)$, and
c) $D(p,z) \le D(p,q) + D(q,z)$.

The Euclidean distance between p and q is defined as

$D_e(p,q) = \sqrt{(x-s)^2 + (y-t)^2}$

The $D_4$ (city-block) distance between p and q is defined as

$D_4(p,q) = |x-s| + |y-t|$

UNIT-2 IMAGE ENHANCEMENT IN SPATIAL DOMAIN

Introduction
The principal objective of enhancement is to process an image so that the result is more suitable than the original image for a specific application. Image enhancement approaches fall into two broad categories:
· Spatial domain methods
· Frequency domain methods

The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of the pixels in an image. Spatial domain processes are denoted by the expression

$g(x,y) = T[f(x,y)]$

where
f(x,y) is the input image,
T is an operator on f, defined over some neighborhood of (x,y), and
g(x,y) is the processed image.

The neighborhood of a point (x,y) can be defined by using a square or rectangular subimage area centered at (x,y). The center of the subimage is moved from pixel to pixel, starting at the top left corner. The operator T is applied at each location (x,y) to find the output g at that location. The process utilizes only the pixels in the area of the image spanned by the neighborhood.

Basic Gray Level Transformation Functions
The simplest form of T occurs when the neighborhood is of size 1 x 1.
In this case g depends only on the value of f at (x,y), and T becomes a gray level transformation function of the form

$s = T(r)$

where
r denotes the gray level of f(x,y), and
s denotes the gray level of g(x,y) at any point (x,y).

Because enhancement at any point in an image depends only on the gray level at that point, techniques in this category are referred to as point processing.

Point Processing
Contrast stretching: it produces an image of higher contrast than the original. The operation is performed by darkening the levels below m and brightening the levels above m in the original image. In this technique the values of r below m are compressed by the transformation function into a narrow range of s towards black. The opposite effect takes place for values of r above m.

Thresholding function: it is a limiting case in which T(r) produces a two-level (binary) image. The values below m are mapped to black and those above m are mapped to white.

Basic Gray Level Transformations
There are basically three kinds of gray level transformation functions: the negative, the log and the power-law transformations. These are the simplest image enhancement techniques.

Image negative: the negative of an image with gray levels in the range $[0, L-1]$ is obtained by using the negative transformation, whose expression is

$s = L - 1 - r$

Reversing the intensity levels of an image in this manner produces the equivalent of a photographic negative. This type of processing is particularly suited for enhancing white or gray details embedded in dark regions of an image, especially when the black areas are dominant in size.

Log transformations: the general form of the log transformation is

$s = c \log(1 + r)$

where c is a constant and $r \ge 0$. This transformation maps a narrow range of low gray level values in the input image into a wider range of output gray levels. The opposite is true for higher values of input levels. We would use this transformation to expand the values of dark pixels in an image while compressing the higher-level values.
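
The negative and log transformations above can be sketched as simple point operations. This is a minimal illustration under stated assumptions, not part of the original notes: the function names are hypothetical, the image is assumed to be 8-bit (L = 256), the logarithm is natural, and the constant c is chosen so that the brightest input level L-1 maps to L-1 (the notes leave c unspecified).

```python
import math

L = 256  # assumed number of gray levels (8-bit image)


def negative(r, levels=L):
    """Image negative: s = L - 1 - r."""
    return (levels - 1) - r


def log_transform(r, levels=L):
    """Log transformation: s = c * log(1 + r), for r >= 0.
    Assumption: c is scaled so that r = L - 1 maps to s = L - 1."""
    c = (levels - 1) / math.log(levels)
    return round(c * math.log(1 + r))


def apply_point_op(image, op):
    """Apply a point operation s = T(r) to every pixel of a 2D list."""
    return [[op(r) for r in row] for row in image]


image = [[0, 10, 31], [128, 224, 255]]
print(apply_point_op(image, negative))       # [[255, 245, 224], [127, 31, 0]]
print(apply_point_op(image, log_transform))  # dark values spread out, bright values squeezed
```

With this choice of c, the 32 darkest input levels occupy well over half of the output range, while the 32 brightest are squeezed into a handful of output levels, which is the expansion and compression behavior described in the text.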
The inverse log transformation has the opposite effect. The log transformation has the important characteristic that it compresses the dynamic range of images with large variations in pixel values, for example a Fourier spectrum.

Power-law transformations: the power-law transformation has the basic form

$s = c\, r^{\gamma}$

where c and $\gamma$ are positive constants. Power-law curves with fractional values of $\gamma$ map a narrow range of dark input values into a wider range of output values, with the opposite being true for higher values of input gray levels. We get a family of curves by varying $\gamma$. A variety of devices used for image capture, printing and display respond according to a power law. The process used to correct this power-law response is called gamma correction. For example, CRT devices have an intensity-to-voltage response that is a power function. Gamma correction is important if displaying an image accurately on a computer screen is of concern. Images that are not corrected properly can look either bleached out or too dark. Reproducing colors accurately also uses this concept of gamma correction, which is becoming more important due to the use of images over the internet. The power-law transformation is also useful for general-purpose contrast manipulation: to make an image darker we use $\gamma > 1$, and to make it lighter we use $\gamma < 1$.

Piecewise-linear transformation functions: the principal advantage of piecewise-linear functions is that they can be arbitrarily complex. Their disadvantage is that their specification requires considerably more user input.

Contrast stretching: it is the simplest piecewise-linear transformation function. Low-contrast images can result from various causes, such as lack of illumination, problems in the imaging sensor or a wrong setting of the lens aperture during image acquisition. The idea behind contrast stretching is to increase the dynamic range of gray levels in the image being processed.
The locations of the points $(r_1, s_1)$ and $(r_2, s_2)$ control the shape of the transformation curve:
a) If $r_1 = s_1$ and $r_2 = s_2$, the transformation is a linear function that produces no change in gray levels.
b) If $r_1 = r_2$, $s_1 = 0$ and $s_2 = L-1$, the transformation becomes a thresholding function that creates a binary image.
c) Intermediate values of $(r_1, s_1)$ and $(r_2, s_2)$ produce various degrees of spread in the gray levels of the output image, thus affecting its contrast.
Generally $r_1 \le r_2$ and $s_1 \le s_2$, so that the function is single valued and monotonically increasing.

Gray level slicing: highlighting a specific range of gray levels in an image is often desirable, for example when enhancing features such as masses of water in satellite images or enhancing flaws in X-ray images. There are two ways of doing this:
(1) One method is to display a high value for all gray levels in the range of interest and a low value for all other gray levels.
(2) The second method is to brighten the desired range of gray levels but preserve the background and gray level tonalities in the image.

Bit plane slicing: sometimes it is important to highlight the contribution made to the total image appearance by specific bits. Suppose that each pixel is represented by 8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit plane 0 for the least significant bit to bit plane 7 for the most significant bit. In terms of 8-bit bytes, plane 0 contains all the lowest-order bits in the image and plane 7 contains all the highest-order bits. The higher-order bits contain the majority of the visually significant data; the other bit planes contribute to more subtle details in the image. Separating a digital image into its bit planes is useful for analyzing the relative importance