Lecture 2

Image Formation

Digital Image

Basic pipeline: scene element + illumination (source) are the input to the imaging system, which projects them onto the image plane.

Digital camera: replaces film with a sensor array; each cell in the array is a light-sensitive diode that converts photons to electrons.

Sample the 2D space on a regular grid and quantize each sample. The image is thus represented as a matrix of integer values.
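The two steps can be sketched in NumPy; a minimal example using a made-up continuous scene function (the scene, grid size, and 8-bit range are all illustrative assumptions):

```python
import numpy as np

# A hypothetical continuous scene: brightness as a function of (x, y), in [0, 1].
def scene(x, y):
    return 0.5 + 0.5 * np.sin(x) * np.cos(y)

# Sample the 2D space on a regular 8 x 8 grid.
xs, ys = np.meshgrid(np.linspace(0, np.pi, 8), np.linspace(0, np.pi, 8))
samples = scene(xs, ys)                        # continuous values

# Quantize each sample to an 8-bit integer (0..255): the image is now a
# matrix of integer values.
image = np.round(samples * 255).astype(np.uint8)
```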

Digital Image representations

Each sensor records amount of light coming in.

Color Images

Each sensor cell has a filter layer that passes red, green, or blue light, resulting in a mosaic pattern. The full RGB value at each cell is estimated from neighboring values (demosaicing).

RGB space.

Digital color image:

Image Filtering

Compute a function of the local neighborhood at each pixel in the image.

- Function specified by a "filter" or mask saying how to combine values from neighbors.

Uses:

  • Enhance an image (denoise, resize, increase contrast, etc.)

  • Extract information (texture, edges, interest points, etc.)

  • Detect patterns (template matching)

Noise reduction

Multiple images of same static scene will not be identical.

Common types of noise:

  • Salt and pepper noise: random occurrences of black and white pixels.

  • Impulse noise: random occurrences of white pixels.

  • Gaussian noise: variations in intensity drawn from a Gaussian normal distribution.

Gaussian Noise

$f(x, y) = \hat{f}(x,y) + \eta(x,y)$, where $\hat{f}(x,y)$ is the ideal image and $\eta(x,y)$ is the noise process.

Gaussian noise: $\eta(x,y) \sim \mathcal{N}(\mu, \sigma)$.

Here, $\sigma$ controls how strong the noise is. $\mu$ is often 0, i.e. the noise has zero mean. $(x, y)$ is just the pixel coordinate.
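A minimal NumPy sketch of this additive noise model, using a flat gray image as the ideal $\hat{f}$ (the image size, gray level, and $\sigma$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
ideal = np.full((64, 64), 128.0)       # the ideal image \hat{f}: flat gray
sigma = 10.0                           # noise strength
# eta(x, y) ~ N(0, sigma): zero-mean Gaussian noise at every pixel
noise = rng.normal(loc=0.0, scale=sigma, size=ideal.shape)
noisy = ideal + noise                  # f = \hat{f} + eta
```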

How to reduce noise?

Attempt 1: Replace each pixel with an average of all values in its neighborhood.

Assumptions:

  • Expect pixels to be like their neighbors.

  • Noise processes to be independent from pixel to pixel.

Correlation filtering

Say the averaging window size is $(2k+1) \times (2k+1)$:

$g(i, j) = \frac{1}{(2k+1)^2} \sum_{u=-k}^k \sum_{v=-k}^k f(i+u, j+v)$

Here, the first factor assigns a uniform weight to each pixel, and the double sum loops over all pixels in the neighborhood around image pixel f[i, j].
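The uniform-weight average can be sketched directly in NumPy; a minimal implementation of the double sum, assuming edge-replicated padding at the borders (the padding choice is an assumption, not part of the formula):

```python
import numpy as np

def box_average(f, k):
    """Replace each pixel by the mean over its (2k+1) x (2k+1) neighborhood.

    Direct translation of the double sum; border pixels reuse edge values.
    """
    padded = np.pad(f, k, mode="edge")
    g = np.zeros_like(f, dtype=float)
    for u in range(-k, k + 1):
        for v in range(-k, k + 1):
            # Add the shifted copy f(i+u, j+v) for every (i, j) at once.
            g += padded[k + u : k + u + f.shape[0],
                        k + v : k + v + f.shape[1]]
    return g / (2 * k + 1) ** 2
```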

Now we generalize to allow different weights depending on the neighboring pixel's relative position:

$g(i, j) = \sum_{u=-k}^k \sum_{v=-k}^k h(u,v) f(i+u, j+v)$

where $h(u,v)$ holds the non-uniform weights. This is cross-correlation, denoted $g = h \otimes f$.

Filtering: replace each pixel with a linear combination of its neighbors, the filter "kernel" or "mask" h[u, v] is the prescription for the weights.

Averaging Filter

box filter

Smoothing by averaging.

Methods for handling pixels near the image boundary:

  • clip filter (black)

  • wrap around

  • copy edge

  • reflect across edge
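These four strategies correspond to NumPy's padding modes; a small 1D sketch (the mode names are NumPy's, not the lecture's):

```python
import numpy as np

row = np.array([1, 2, 3, 4])

clip    = np.pad(row, 2, mode="constant")  # clip filter: pad with zeros (black)
wrap    = np.pad(row, 2, mode="wrap")      # wrap around to the opposite edge
copy    = np.pad(row, 2, mode="edge")      # copy the edge value outward
reflect = np.pad(row, 2, mode="reflect")   # mirror across the edge
```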

Gaussian Filter

Used if we want the nearest neighboring pixels to have the most influence on the output.

The kernel is a discrete approximation of a 2D Gaussian function: $h(u,v) = \frac{1}{2\pi \sigma^2} e^{-\frac{u^2 + v^2}{2\sigma^2}}$

This removes high-frequency components from the image (low-pass filter).

Parameters: the size of the kernel, and the variance, which determines the extent of smoothing.
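Such a kernel can be sampled directly from the Gaussian; a minimal NumPy sketch (normalizing the kernel to sum to 1 makes the constant factor in front irrelevant):

```python
import numpy as np

def gaussian_kernel(k, sigma):
    """Build a (2k+1) x (2k+1) kernel sampled from a 2D Gaussian."""
    u, v = np.meshgrid(np.arange(-k, k + 1), np.arange(-k, k + 1))
    h = np.exp(-(u**2 + v**2) / (2 * sigma**2))
    # Normalize so the weights sum to 1: constant regions stay unchanged.
    return h / h.sum()
```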

Smoothing filter properties:

  • Values positive

  • Sum to 1 -> constant regions same as input

  • Amount of smoothing proportional to mask size

  • Remove "high-frequency" components, "low-pass" filter.

Convolution

Flip the filter in both dimensions (bottom to top, right to left), then apply cross-correlation:

$g(i, j) = \sum_{u = -k}^k \sum_{v = -k}^k h(u,v) f(i-u, j-v)$

$g = h * f$ is the notation for the convolution operator.
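The flip-then-correlate recipe can be sketched in NumPy; a minimal implementation assuming an odd, square kernel and zero padding at the borders:

```python
import numpy as np

def cross_correlate(f, h):
    """g(i, j) = sum_{u,v} h(u, v) f(i+u, j+v), with zero padding."""
    k = h.shape[0] // 2
    padded = np.pad(f, k, mode="constant")
    g = np.zeros_like(f, dtype=float)
    for u in range(-k, k + 1):
        for v in range(-k, k + 1):
            g += h[u + k, v + k] * padded[k + u : k + u + f.shape[0],
                                          k + v : k + v + f.shape[1]]
    return g

def convolve(f, h):
    """Convolution: flip h in both dimensions, then cross-correlate."""
    return cross_correlate(f, h[::-1, ::-1])
```

Convolving a unit impulse with an asymmetric kernel reproduces the kernel (the identity property), whereas cross-correlation reproduces it flipped, which is the whole difference between the two operations.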

Properties:

  • Shift invariant - operator behaves the same everywhere

  • Superposition: $h * (f_1 + f_2) = (h * f_1) + (h * f_2)$

  • Commutative: $f * g = g * f$

  • Associative: $(f * g) * h = f * (g * h)$

  • Distributes over addition: $f * (g + h) = (f * g) + (f * h)$

  • Scalars factor out: $kf * g = f * kg = k(f * g)$

  • Identity: for the unit impulse $e = [\dots, 0, 0, 1, 0, 0, \dots]$, we have $f * e = f$

Median Filter

A non-linear filter. It introduces no new pixel values and removes spikes, so it is good for impulse and salt & pepper noise. It is also edge preserving.
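A direct (slow but clear) NumPy sketch of a median filter, assuming edge padding at the borders:

```python
import numpy as np

def median_filter(f, k=1):
    """Replace each pixel with the median of its (2k+1) x (2k+1) neighborhood."""
    padded = np.pad(f, k, mode="edge")
    g = np.empty_like(f, dtype=float)
    H, W = f.shape
    for i in range(H):
        for j in range(W):
            g[i, j] = np.median(padded[i : i + 2 * k + 1, j : j + 2 * k + 1])
    return g
```

On a flat region with one "salt" spike the spike vanishes, while a step edge passes through unchanged, illustrating the edge-preserving property.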

Edge Detection

Map the image from a 2D array of pixels to a set of curves, line segments, or contours. Look for strong gradients, then post-process.

An edge is a place of rapid change in the image intensity function.

For a 2D image $f(x,y)$, the partial derivative is:

$\frac{\partial f(x,y)}{\partial x} = \lim_{\epsilon \rightarrow 0} \frac{f(x+\epsilon, y) - f(x,y)}{\epsilon}$

For discrete data, we approximate using finite differences:

$\frac{\partial f(x,y)}{\partial x} \approx \frac{f(x+1, y) - f(x,y)}{1}$
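A tiny NumPy sketch of this finite-difference approximation, assuming x indexes columns (a convention choice); on a step edge, the difference is large exactly at the step:

```python
import numpy as np

f = np.array([[0., 0., 10., 10.],
              [0., 0., 10., 10.],
              [0., 0., 10., 10.]])     # image with a vertical step edge

# Forward difference in x: f(x+1, y) - f(x, y), taken along the columns.
dfdx = f[:, 1:] - f[:, :-1]
```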

Filters:

filters for edge detection

Image Gradient

$\nabla f = [\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}]$

It points in the direction of most rapid change in intensity.

The gradient direction (orientation of the edge normal) is given by $\theta = \tan^{-1}(\frac{\partial f}{\partial y} / \frac{\partial f}{\partial x})$.

Edge strength is given by the magnitude of the gradient: $\| \nabla f \| = \sqrt{(\frac{\partial f}{\partial x})^2 + (\frac{\partial f}{\partial y})^2}$
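A minimal NumPy sketch computing gradient magnitude and orientation for a toy linear ramp; `np.gradient` returns per-axis differences, and treating axis 1 as x is a convention assumption:

```python
import numpy as np

# Toy image: intensity rises 2 units per column (x) and 1 unit per row (y).
rows, cols = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")
f = 2.0 * cols + 1.0 * rows

dfdy, dfdx = np.gradient(f)               # derivatives along axis 0 and axis 1
magnitude = np.sqrt(dfdx**2 + dfdy**2)    # edge strength ||grad f||
theta = np.arctan2(dfdy, dfdx)            # orientation of the edge normal
```

`np.arctan2` is used instead of a plain arctangent of the ratio so the sign of each component disambiguates the quadrant.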

Effect of noise

Derivative filters respond strongly to noise: image noise makes pixels look very different from their neighbors, and generally, the larger the noise, the stronger the response.

Solution: smooth first.

Edge: look for peaks in $\frac{\partial}{\partial x} (h * f)$.

Derivative theorem of convolution:

$\frac{\partial}{\partial x} (h * f) = (\frac{\partial}{\partial x} h) * f$
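The theorem (a consequence of associativity) can be checked numerically in 1D; a sketch using full convolutions, where the smoothing kernel and finite-difference kernel are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=50)                       # an arbitrary 1D "signal"
h = np.array([1., 4., 6., 4., 1.]) / 16.0     # a small smoothing kernel
d = np.array([1., -1.])                       # finite-difference (derivative) kernel

# Differentiating the smoothed signal ...
lhs = np.convolve(np.convolve(f, h), d)
# ... equals convolving the signal with the differentiated kernel.
rhs = np.convolve(np.convolve(h, d), f)
```

In practice this means the smoothing and derivative steps can be folded into one filter, saving a pass over the image.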

Effect of $\sigma$ of the smoothing Gaussian: larger values result in larger-scale edges being detected; smaller values detect finer features.
