Peng Et Al.: Deep Learning and Practice 1
Chapter 1
Introduction
Spring 2018
CS/NCTU
Machine Learning
• Acquiring knowledge by extracting patterns from raw data
• Example: To predict a person’s wellness t from their MRI scan x by
learning patterns from the medical records {x, t} of some population
– x: MRI scan
– φ(x): data representation of MRI scan
– y ∈ (0, 1): model prediction with parameter w,
$y = f_w(\phi(x)) \triangleq \sigma(w^T \phi(x))$, where $\sigma(s) = \frac{1}{1 + e^{-s}}$
– t ∈ {0, 1}: ground-truth result associated with input x
– Cost: some distance between y and t (e.g. $\|y - t\|_2^2$), which is to
be minimized w.r.t. w over the {x, t} pairs
• Essentially, we want to find a function $f_w(\phi(x))$ to approximate t(x)
• In the present example, $f_w(\phi(x))$ can be interpreted as the
probability $p(t = 1 \mid x; w)$
• The setting here is termed supervised learning as the ground-truth
result t is given for each x
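The supervised setup above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the toy {x, t} pairs are made up, `phi` is a hypothetical representation (raw feature plus a constant bias term), and plain gradient descent with the learning rate and step count shown is just one possible way to minimize the squared-error cost w.r.t. w.

```python
import math

def sigmoid(s):
    """Logistic function sigma(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

def phi(x):
    """Hypothetical representation: the raw features plus a constant bias term."""
    return list(x) + [1.0]

def predict(w, x):
    """Model prediction y = f_w(phi(x)) = sigma(w^T phi(x))."""
    return sigmoid(sum(wi * fi for wi, fi in zip(w, phi(x))))

def cost(w, data):
    """Mean squared error between predictions y and ground-truth results t."""
    return sum((predict(w, x) - t) ** 2 for x, t in data) / len(data)

def fit(data, lr=0.5, steps=2000):
    """Plain gradient descent on the squared-error cost w.r.t. w (an assumed optimizer)."""
    w = [0.0] * len(phi(data[0][0]))
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in data:
            y = predict(w, x)
            # d/dw (y - t)^2 = 2 (y - t) * y (1 - y) * phi(x)
            g = 2.0 * (y - t) * y * (1.0 - y)
            for i, fi in enumerate(phi(x)):
                grad[i] += g * fi / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Toy, made-up {x, t} pairs: t = 1 when the single feature is positive.
data = [((-2.0,), 0), ((-1.0,), 0), ((-0.5,), 0),
        ((0.5,), 1), ((1.0,), 1), ((2.0,), 1)]
w = fit(data)
```

After fitting, `predict(w, x)` returns a value in (0, 1) that can be read as an estimate of p(t = 1|x; w), and `cost(w, data)` should be lower than the cost at the all-zero starting point.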
[Figure: left panel with axes $x_1$ and $x_2$; right panel with horizontal axis $\phi_1(x) = \sqrt{x_1^2 + x_2^2}$]
Deep Learning
• A machine learning approach whose data representation is based on
building up a hierarchy of concepts, with each concept defined through
its relation to simpler concepts
• Using the previous example, this amounts to learning a function of the
following form
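$f_{w,\theta_n,\theta_{n-1},\cdots,\theta_1}(x) = \sigma\big(w^T \underbrace{\phi_{\theta_n}(\phi_{\theta_{n-1}}(\phi_{\theta_{n-2}}(\cdots \phi_{\theta_1}(x))))}_{\text{Hierarchy of concepts/features}}\big)$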
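The nested form above can be illustrated as a composition of parameterized feature maps. The sketch below is only an assumption-laden example: the slides do not specify what each map looks like, so `make_layer` uses an affine map followed by tanh, and the parameter values are hand-picked for illustration rather than learned.

```python
import math
from functools import reduce

def sigmoid(s):
    """Logistic function sigma(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

def make_layer(theta):
    """Return phi_theta: an affine map followed by tanh (the form is an assumption)."""
    weights, biases = theta
    def phi(h):
        return [math.tanh(sum(wij * hj for wij, hj in zip(row, h)) + b)
                for row, b in zip(weights, biases)]
    return phi

def f(w, thetas, x):
    """Compute sigma(w^T phi_theta_n(... phi_theta_1(x))).

    thetas is ordered theta_1, ..., theta_n, so theta_1 is applied first.
    """
    h = reduce(lambda h, theta: make_layer(theta)(h), thetas, x)
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)))

# Hypothetical, hand-picked parameters for a 2 -> 3 -> 2 hierarchy.
theta_1 = ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.1, -0.1])
theta_2 = ([[1.0, 0.0, -1.0], [0.5, -0.5, 0.5]], [0.0, 0.0])
w = [1.0, -1.0]
x = [0.3, -0.7]
y = f(w, [theta_1, theta_2], x)
```

Each layer consumes the output of the previous one, so later features are defined in terms of simpler ones; in practice w and all θ would be learned jointly by minimizing the cost rather than fixed by hand.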
[Figure: first-level representation $\phi_{\theta_1}(x)$]