Lecture 3
Seam Carving
Content-aware resizing.
Intuition: preserve the most interesting content (remove pixels with low gradient energy). To reduce or increase the size in one dimension, remove irregularly shaped "seams". (Optimal solution via DP.)
Energy(f) = (∂f/∂x)² + (∂f/∂y)²
We want to remove seams where they won't be noticeable, so we measure energy as the gradient magnitude.
Choose the seam as the minimum total-energy path across the image, subject to 8-connectedness.
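A minimal sketch of this energy function, assuming a grayscale image stored as a 2D NumPy array (np.gradient approximates the partial derivatives with finite differences):

```python
import numpy as np

def energy_map(img):
    """Gradient-magnitude energy: Energy(f) = (df/dx)^2 + (df/dy)^2."""
    dy, dx = np.gradient(img.astype(float))  # partials along rows (y) and columns (x)
    return dx ** 2 + dy ** 2
```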
Algorithm:
Let a vertical seam s consist of h positions that form an 8-connected path.
Let the cost of a seam be Cost(s) = Σ_{i=1}^h Energy(f(s_i))
The optimal seam minimizes the cost: s* = argmin_s Cost(s)
Compute efficiently with DP.
Identify the min-cost seam (image height h, width w) by filling a cumulative cost matrix M row by row (a greedy row-by-row choice is not optimal; DP accounts for all paths):
For each entry (i, j):
M(i, j) = Energy(i, j) + min(M(i−1, j−1), M(i−1, j), M(i−1, j+1))
The min value in the last row of M marks the end of the minimal connected vertical seam. Backtrack upward, at each row selecting the min of the 3 entries above in M.
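A minimal DP sketch under the definitions above (energy is the 2D array from the energy_map sketch; the loops are kept explicit for clarity, not speed):

```python
def find_vertical_seam(energy):
    """Return the seam's column index for each row, top to bottom."""
    h, w = energy.shape
    M = energy.copy()
    # Forward pass: each entry adds the min of the 3 entries above it.
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 1, w - 1)
            M[i, j] += M[i - 1, lo:hi + 1].min()
    # Backtrack from the min entry in the last row.
    seam = [int(M[-1].argmin())]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 1, w - 1)
        seam.append(lo + int(M[i, lo:hi + 1].argmin()))
    seam.reverse()
    return seam
```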
Can also insert seams to increase the size of the image in either dimension (duplicate the optimal seam and average it with its neighbors).
Note: M accumulates minimum gradient energy; high-gradient regions (edges and other important areas) have high energy, so the chosen seam avoids them.
2D Transformations
Fit the parameters of a transformation to a set of matching feature pairs (the alignment problem).
Parametric Warping
p=(x,y) -> T -> p′=(x′,y′)
Transformation T is a coordinate-changing machine.
p′=T(p)
T is global: it is the same for any point p and can be described by just a few numbers (parameters).
Matrix representation of T:
p′=Mp
Scaling
Scaling a coordinate means multiplying each of its components by a scalar.
Uniform scaling means this scalar is the same for all components.
Non-uniform scaling: different scalars per component.
E.g.:
x′ = ax, y′ = by
Matrix:
[x′]   [a 0] [x]
[y′] = [0 b] [y]
Only linear 2D transformations can be represented by a 2×2 matrix.
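A quick numeric check of the 2×2 scaling matrix, with example scalars a = 2 and b = 0.5:

```python
import numpy as np

S = np.array([[2.0, 0.0],   # a = 2: stretch x
              [0.0, 0.5]])  # b = 0.5: shrink y
print(S @ np.array([3.0, 4.0]))  # -> [6. 2.]
```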
Homogeneous Coordinates
A convenient representation: translation becomes a matrix multiplication.
(x, y) ⟹ (x, y, 1): convert to homogeneous coordinates.
(x, y, w) ⟹ (x/w, y/w): convert from homogeneous coordinates.
How do we represent 2D translation as a 3×3 matrix using homogeneous coordinates?
x′ = x + tx
y′ = y + ty
Using the rightmost column:
Translation =
[1 0 tx]
[0 1 ty]
[0 0  1]

[x′]   [1 0 tx] [x]   [x + tx]
[y′] = [0 1 ty] [y] = [y + ty]
[1 ]   [0 0  1] [1]   [  1   ]
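A minimal NumPy sketch of this (the function name and sample point are illustrative):

```python
import numpy as np

def translate(points, tx, ty):
    """Translate (N, 2) points via a 3x3 homogeneous matrix."""
    T = np.array([[1, 0, tx],
                  [0, 1, ty],
                  [0, 0, 1]], dtype=float)
    ph = np.hstack([points, np.ones((len(points), 1))])  # append w = 1
    out = ph @ T.T                    # p' = T p, applied row-wise
    return out[:, :2] / out[:, 2:3]   # drop back to (x, y)

print(translate(np.array([[2.0, 3.0]]), tx=5, ty=-1))  # -> [[7. 2.]]
```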

Affine Transformations - combinations of linear transformations and translations.
E.g., parallel lines remain parallel.
[x′]   [a b c] [x]
[y′] = [d e f] [y]
[w′]   [0 0 1] [w]
Projective transformations: combinations of affine transformations and projective warps. Parallel lines do not necessarily remain parallel.
[x′]   [a b c] [x]
[y′] = [d e f] [y]
[w′]   [g h i] [w]
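A sketch of applying a projective transform to a point; the matrix values below are arbitrary placeholders, and dividing by w′ is the step that distinguishes projective from affine:

```python
import numpy as np

def apply_projective(H, x, y):
    """Map (x, y) through a 3x3 projective matrix, then dehomogenize."""
    xp, yp, wp = H @ np.array([x, y, 1.0])
    return xp / wp, yp / wp  # divide by w'

# With a nonzero bottom row (g, h), w' varies per point.
H = np.array([[1.0,   0.2, 5.0],
              [0.0,   1.0, 2.0],
              [0.001, 0.0, 1.0]])
print(apply_projective(H, 10.0, 20.0))
```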
Other definitions:
Mosaic: obtain a wider-angle view by combining multiple images.
Image warping: re-project from one image plane to another (e.g., image plane in front -> image plane below), as in image rectification.
Deep Learning Fundamentals
Linear Classifier
NN
Parametric approach: f(x, W) = Wx + b

Hard if the data is not linearly separable.
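A minimal sketch of this score function, with made-up dimensions (10 classes, 3072 = 32×32×3 flattened pixels):

```python
import numpy as np

def scores(x, W, b):
    """Linear classifier: f(x, W) = Wx + b, one score per class."""
    return W @ x + b

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3072))  # 10 classes x 3072 input dims
b = np.zeros(10)
x = rng.normal(size=3072)        # one flattened image
print(scores(x, W, b).shape)     # -> (10,)
```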
Now we need a loss function and optimization.
Loss Function
Given a dataset of examples {(x_i, y_i)}, i = 1…N, where x_i is an image and y_i is its label.
Loss over the dataset: L = (1/N) Σ_i L_i(f(x_i, W), y_i)
Multiclass SVM loss: the score of the correct class should be higher than that of any other class (by some margin).
Let the score vector be s = f(x_i, W).
SVM loss: L_i = Σ_{j≠y_i} { 0 if s_{y_i} ≥ s_j + 1; s_j − s_{y_i} + 1 otherwise }
Can be simplified to: L_i = Σ_{j≠y_i} max(0, s_j − s_{y_i} + 1)
Over the full dataset: L = (1/N) Σ_{i=1}^N L_i
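A minimal per-example sketch of this loss (s is a 1D score vector, y the index of the correct class):

```python
import numpy as np

def svm_loss(s, y):
    """Multiclass SVM loss: sum of max(0, s_j - s_y + 1) over j != y."""
    margins = np.maximum(0, s - s[y] + 1)
    margins[y] = 0  # the j = y term is excluded from the sum
    return margins.sum()

print(svm_loss(np.array([3.2, 5.1, -1.7]), y=0))  # -> 2.9
```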
Regularization
Prevent the model from fitting the training data too well (i.e., overfitting).
L(W) = (1/N) Σ_i L_i(f(x_i, W), y_i) + λR(W)
λ: regularization strength (a hyperparameter)
L2: R(W) = Σ_k Σ_l W_{k,l}²
L1: R(W) = Σ_k Σ_l |W_{k,l}|
Elastic Net: R(W) = Σ_k Σ_l (β W_{k,l}² + |W_{k,l}|)
Other regularizers: Dropout, Batch Normalization, Stochastic Depth, fractional pooling, etc.
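A sketch of the full regularized objective with an L2 penalty, reusing the svm_loss sketch above (names are illustrative):

```python
import numpy as np

def total_loss(X, Y, W, b, lam):
    """L(W) = (1/N) sum_i L_i(f(x_i, W), y_i) + lam * sum(W^2)."""
    data_loss = np.mean([svm_loss(W @ x + b, y) for x, y in zip(X, Y)])
    return data_loss + lam * np.sum(W ** 2)
```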