Computer Vision Lecture 1b
This lecture continues the week 1 material with:
- interpolation recap
- 1D and 2D interpolation
- extrapolation
- image representation in NumPy/OpenCV
- domain iterators
- histograms
- point operators
- histogram-based contrast improvement
Administrative note
- Lecturer shown on the slides: Dimitris Tzionas
- Contact shown on the slides: d.tzionas@uva.nl and e.a.veltmeijer@uva.nl
- A note on the first slide says to please CC your TA
Recap: storing continuous images
The lecture starts from the question:
- how can we store a continuous image?
Answer:
- discretize the spatial domain by sampling
- discretize the image range by quantization
Important implementation notes from the slides:
- 2^8 = 256 values can be stored in uint8
- for processing, switch to float
- for storing or displaying, switch back to uint8
- watch out for overflow
- evaluation order matters
Example normalization formula from the slides:
g(x, y) = \frac{f(x, y) - \min f}{\max f - \min f}
OpenCV note:
- OpenCV uses BGR, not RGB
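A minimal NumPy sketch of why the slides recommend switching to float for processing (the array names here are mine, not from the slides):

```python
import numpy as np

# Two uint8 "images" (one pixel each, for clarity).
a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)

# uint8 arithmetic wraps around: (200 + 100) % 256 = 44, not 300.
wrapped = a + b

# Safer pipeline from the slides: process in float, store back in uint8.
avg = ((a.astype(np.float64) + b.astype(np.float64)) / 2).astype(np.uint8)

# Evaluation order matters: averaging in uint8 first overflows,
# so (a + b) // 2 computes 44 // 2 = 22 instead of 150.
bad_avg = (a + b) // 2
```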
Discretization and interpolation
Discrete pixels are samples of a continuous image:
F(i, j) = f(i \Delta, j \Delta)
with the slides using unit sample spacing (\Delta = 1) for the simple case.
The interpolation problem is:
- start from discrete samples F
- infer a continuous approximation \hat{f}
- use \hat{f} to estimate values between sample points
1D interpolation
Nearest neighbor
Idea:
- copy the nearest sample
Formula from the lecture:
\hat{f}(x) = F(\mathrm{round}(x))
Properties:
- uses 1 nearby sample
- produces a staircase function
- defined everywhere
- not continuous
- not differentiable
Linear interpolation
For x \in [k, k+1], mix the left and right samples:
\hat{f}(x) = (k + 1 - x) F(k) + (x - k) F(k+1)
Interpretation:
- the coefficients are steering or mixing weights
- closer sample gets more weight
Properties:
- uses 2 samples
- piecewise linear
- continuous
- not differentiable at the sample locations
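The linear formula can be sketched directly in NumPy; `linear_interp` is a hypothetical helper, not code from the slides:

```python
import numpy as np

def linear_interp(F, x):
    """Linearly interpolate discrete samples F at a continuous position x.

    Implements f_hat(x) = (k + 1 - x) * F[k] + (x - k) * F[k + 1]
    for x in [k, k + 1].
    """
    k = int(np.floor(x))
    k = min(max(k, 0), len(F) - 2)   # clamp so F[k + 1] stays in range
    return (k + 1 - x) * F[k] + (x - k) * F[k + 1]

F = np.array([0.0, 10.0, 20.0])
```

Halfway between the first two samples this gives `linear_interp(F, 0.5) == 5.0`: the closer sample gets the larger mixing weight, exactly as the lecture describes.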
Cubic interpolation
For the segment x \in [k, k+1], the lecture uses a cubic polynomial:
\hat{f}(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0
with constraints from four samples:
F(k-1), F(k), F(k+1), F(k+2)
So cubic interpolation:
- uses 4 samples
- is smooth
- is continuous
- is differentiable
The lecture also warns that higher-order polynomials can overfit and fluctuate too much between sample points, so in practice you should prefer the smallest order that works well.
Interpolation cheat sheet
Nearest neighbor
- samples used: 1
- shape: staircase
- continuous: no
- differentiable: no
Linear
- samples used: 2
- shape: piecewise linear
- continuous: yes
- differentiable: not at sample points
Cubic
- samples used: 4
- shape: smooth
- continuous: yes
- differentiable: yes
2D interpolation
Bilinear interpolation
For a point (x, y) inside a grid cell, bilinear interpolation uses the four corner samples:
F(i, j), F(i+1, j), F(i, j+1), F(i+1, j+1)
Formula from the lecture, with fractional offsets a = x - i and b = y - j:
\hat{f}(x, y) = (1-a)(1-b) F(i,j) + a(1-b) F(i+1,j) + (1-a)b F(i,j+1) + ab F(i+1,j+1)
Interpretation:
- linear interpolation in x
- then linear interpolation in y
Generalization mentioned on the slides:
- nearest neighbor in d dimensions uses 1 sample
- bilinear in 2D uses 2^2 = 4 samples
- trilinear in 3D uses 2^3 = 8 samples
- bicubic in 2D uses 4^2 = 16 samples
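A small sketch of bilinear interpolation following the lecture's two-step view (linear in x, then in y); `bilinear_interp` is my naming, and it uses the NumPy convention F[y, x] discussed later in the lecture:

```python
import numpy as np

def bilinear_interp(F, x, y):
    """Bilinear interpolation of grid F at continuous (x, y).

    Follows the NumPy convention F[y, x]: y indexes rows, x indexes columns.
    """
    j = min(int(np.floor(x)), F.shape[1] - 2)   # left column of the cell
    i = min(int(np.floor(y)), F.shape[0] - 2)   # top row of the cell
    a, b = x - j, y - i                         # fractional offsets in the cell
    return ((1 - a) * (1 - b) * F[i, j]
            + a * (1 - b) * F[i, j + 1]
            + (1 - a) * b * F[i + 1, j]
            + a * b * F[i + 1, j + 1])

F = np.array([[0.0, 1.0],
              [2.0, 3.0]])
```

At the cell center all four corners get weight 1/4, so `bilinear_interp(F, 0.5, 0.5)` returns the mean of the corner values, 1.5.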
Interpolation vs extrapolation
Interpolation:
- estimate values inside the image grid
Extrapolation:
- estimate values outside the image grid
Extrapolation strategies mentioned in the lecture:
- closest point
- mirror
- wrap / tiling
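The three extrapolation strategies map directly onto NumPy's `np.pad` border modes, which is one concrete way to try them out (a sketch on a 1D signal; the same modes work in 2D):

```python
import numpy as np

F = np.array([1, 2, 3, 4])

# Closest point: repeat the edge sample.
edge = np.pad(F, 2, mode='edge')        # [1 1 1 2 3 4 4 4]

# Mirror: reflect the signal at the border.
mirror = np.pad(F, 2, mode='symmetric') # [2 1 1 2 3 4 4 3]

# Wrap / tiling: treat the signal as periodic.
tiled = np.pad(F, 2, mode='wrap')       # [3 4 1 2 3 4 1 2]
```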
Image representation in Python and OpenCV
The slides emphasize the NumPy/OpenCV indexing convention:
- F[y, x], equivalently F[i, j]
For color images:
- tensors have shape H x W x 3
- channels are stored as BGR in OpenCV
- an optional fourth channel can represent transparency
Domain iterators
The lecture defines domain iterators as a way to visit every pixel index tuple.
Example from the slides:
```python
def domainIterator(size):
    for i in range(size[0]):
        for j in range(size[1]):
            yield (i, j)
```

And a pointwise negation example:

```python
import numpy as np

def negateImage(image):
    result = np.empty(image.shape)
    for p in domainIterator(image.shape):
        result[p] = 1 - image[p]
    return result
```

Main idea:
- work with tuples like (i, j)
- use the same iterator pattern for whole images or sub-images
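The explicit loop spells out the point operator one pixel at a time; in NumPy the same negation is normally written vectorized (a sketch, assuming a float image with values in [0, 1]):

```python
import numpy as np

def negate_image_vectorized(image):
    # Same point operator as the loop version, applied to all pixels at once.
    return 1.0 - image

img = np.array([[0.0, 0.25],
                [0.5, 1.0]])
```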
Histograms
Univariate histogram
A histogram summarizes an image by counting how many pixel values fall into each bin.
Notation from the lecture: h(b) is the number of pixels whose value falls in bin b.
Important points:
- choosing bin size is not trivial
- too many bins gives many empty bins and noisy detail
- too few bins loses distribution structure
- the slides mention Sturges' rule:
k = 1 + \log_2 n
where:
- k is the number of bins
- n is the number of pixels
Histogram vs cumulative histogram
Histogram:
- counts frequencies per bin
Cumulative histogram:
- counts how many pixels have value less than or equal to a threshold
The cumulative version corresponds to a cumulative distribution function.
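Both histograms can be computed with NumPy in two lines; the toy pixel values below are mine, chosen to keep the counts easy to check:

```python
import numpy as np

values = np.array([0, 1, 1, 2, 2, 2, 3], dtype=np.uint8)

# Histogram: frequency per bin (4 bins covering values 0..3).
hist, _ = np.histogram(values, bins=4, range=(0, 4))   # [1, 2, 3, 1]

# Cumulative histogram: number of pixels with value <= each bin's upper edge.
cum = np.cumsum(hist)                                  # [1, 3, 6, 7]
```

The last entry of the cumulative histogram equals the total pixel count, which is why dividing by n turns it into a cumulative distribution function.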
Multivariate histogram
For color images, a single 1D histogram loses channel correlations.
Two options mentioned:
- three separate 1D histograms, one per channel
- one 3D histogram over the joint color space
The lecture recommends the 3D histogram as the better representation if you care about color correlations.
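A joint 3D color histogram can be built with `np.histogramdd`; the four BGR color samples below are hypothetical, chosen so two fall in a "red" cell and two in a "blue" cell:

```python
import numpy as np

# Hypothetical color samples (rows are BGR triplets, as in OpenCV).
pixels = np.array([[0, 0, 255],    # red
                   [0, 0, 250],    # red
                   [255, 0, 0],    # blue
                   [250, 5, 0]],   # blue
                  dtype=float)

# Joint 3D histogram, 2 bins per channel: keeps channel correlations,
# unlike three separate per-channel histograms.
hist3d, _ = np.histogramdd(pixels, bins=(2, 2, 2), range=((0, 256),) * 3)
```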
Point operators
Point operators process each pixel independently, using only values from the same location.
General form from the lecture:
g = O(f)
and for every location:
g(x, y) = O(f(x, y))
Examples from the slides:
- negation: g(x, y) = 1 - f(x, y)
- add a scalar: g(x, y) = f(x, y) + c
- logarithmic transform: g(x, y) = \log(1 + f(x, y))
- pointwise addition of two images: h(x, y) = f(x, y) + g(x, y)
Image arithmetic
For two images on the same grid:
h(x, y) = f(x, y) + g(x, y)
The slides also give a weighted sum example:
h(x, y) = \alpha f(x, y) + \beta g(x, y)
Alpha blending
Weighted average of two images:
h(x, y) = (1 - \alpha) f(x, y) + \alpha g(x, y)
Meaning:
- \alpha = 0 gives only f
- \alpha = 1 gives only g
- intermediate values mix both
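As a one-line NumPy sketch (function name mine, assuming float images on the same grid):

```python
import numpy as np

def alpha_blend(f, g, alpha):
    # Pixelwise weighted average: h = (1 - alpha) * f + alpha * g.
    return (1 - alpha) * f + alpha * g

f = np.zeros((2, 2))
g = np.ones((2, 2))
blended = alpha_blend(f, g, 0.25)   # every pixel becomes 0.25
```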
Unsharp masking
If g is a blurred version of f, the lecture defines:
h = f + (f - g)
Interpretation:
- f - g extracts detail
- adding it back sharpens the image
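A 1D sketch of the idea, using a simple moving-average blur of my own choosing (the lecture does not specify the blur); the sharpened signal overshoots on both sides of the step edge, which is exactly the added detail f - g:

```python
import numpy as np

def box_blur_1d(f, k=3):
    # Hypothetical blur: moving average with edge padding.
    padded = np.pad(f, k // 2, mode='edge')
    kernel = np.ones(k) / k
    return np.convolve(padded, kernel, mode='valid')

def unsharp_mask(f):
    g = box_blur_1d(f)       # blurred version of f
    return f + (f - g)       # add the detail back

f = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])   # a step edge
sharpened = unsharp_mask(f)
```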
Thresholding and binarization
Thresholding converts a grayscale image to a binary image:
g(x, y) = 1 if f(x, y) > t, and 0 otherwise
So each pixel becomes:
- 1 if above the threshold
- 0 otherwise
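In NumPy this is a single comparison (function name mine):

```python
import numpy as np

def threshold(f, t):
    # Binary image: 1 where f > t, 0 elsewhere.
    return (f > t).astype(np.uint8)

f = np.array([[10, 200],
              [90, 130]], dtype=np.uint8)
binary = threshold(f, 128)   # [[0, 1], [0, 1]]
```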
Histogram issues
The lecture points out common contrast problems:
- image values only occupy a narrow middle range
- large bright values may be missing, making the image too dark
- low dark values may be missing, making the image too bright
- changing illumination can strongly change the histogram
Histogram contrast stretching
Goal:
- use the full luminance range
Formula:
g(x, y) = \frac{f(x, y) - \min f}{\max f - \min f}
Interpretation:
- subtract the minimum
- divide by the old range
- map the image to [0, 1]
This improves contrast when the input occupies only a narrow luminance interval.
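The stretching formula, as a minimal NumPy sketch (assuming the image is not constant, so the range is nonzero):

```python
import numpy as np

def stretch_contrast(f):
    # Map f linearly onto [0, 1]: subtract the minimum, divide by the old range.
    f = f.astype(np.float64)
    fmin, fmax = f.min(), f.max()
    return (f - fmin) / (fmax - fmin)

# Values occupy only a narrow interval...
f = np.array([100, 110, 120, 130], dtype=np.uint8)
stretched = stretch_contrast(f)   # ...and now span [0, 1]
```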
Histogram equalization
Goal:
- transform the image so luminance values are distributed as evenly as possible
The lecture motivates this using the cumulative histogram:
C(u) = \sum_{v \le u} h(v)
Then:
- visit each pixel in f
- read its luminance value u
- use u as a lookup key in the normalized cumulative histogram C(u) / n
- write the resulting value into the output image g
Core idea:
- cumulative histogram becomes a lookup table
- percentile ordering is preserved
- the output uses the available intensity range more effectively
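The lookup-table view can be sketched for a uint8 image (function name and toy data mine; OpenCV's `cv2.equalizeHist` implements the same idea):

```python
import numpy as np

def equalize(f):
    """Histogram equalization via the normalized cumulative histogram."""
    hist, _ = np.histogram(f, bins=256, range=(0, 256))
    cdf = np.cumsum(hist) / f.size          # normalized cumulative histogram
    lut = (cdf * 255).astype(np.uint8)      # cumulative histogram as lookup table
    return lut[f]                           # apply the table per pixel

# Narrow input range 100..102 gets spread over the intensity range,
# while the percentile ordering of the pixels is preserved.
f = np.array([[100, 100],
              [101, 102]], dtype=np.uint8)
g = equalize(f)
```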
Histogram thresholding
The lecture gives a histogram-based segmentation idea:
- separate foreground from background using a threshold
Manual option:
- choose a threshold by looking at a bimodal histogram
Automatic option:
IsoData thresholding
Definitions from the slides:
- m_L(t): mean of values <= t
- m_H(t): mean of values > t
Then update:
t \leftarrow \frac{m_L(t) + m_H(t)}{2}
and iterate until convergence.
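The iteration can be sketched directly from these definitions (function name, starting threshold, and convergence tolerance are my choices, not from the slides):

```python
import numpy as np

def isodata_threshold(values, t0=128.0, max_iter=100):
    """IsoData: iterate t = (m_L(t) + m_H(t)) / 2 until t stops moving."""
    t = float(t0)
    for _ in range(max_iter):
        low = values[values <= t]    # values at or below the threshold
        high = values[values > t]    # values above the threshold
        if low.size == 0 or high.size == 0:
            break
        t_new = (low.mean() + high.mean()) / 2
        if abs(t_new - t) < 0.5:     # converged (assumed tolerance)
            t = t_new
            break
        t = t_new
    return t

# Bimodal data: two clusters at 50 and 200; the threshold settles between them.
values = np.concatenate([np.full(50, 50.0), np.full(50, 200.0)])
t = isodata_threshold(values)
```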
Main takeaways
- interpolation reconstructs values between samples
- nearest, linear, and cubic interpolation trade off smoothness and sample use
- bilinear interpolation is the 2D extension of linear interpolation
- histograms summarize intensity distributions
- point operators transform pixel values independently
- histogram stretching and equalization improve contrast in different ways
- thresholding uses pixel values or histogram structure to segment an image