Lecture 1 - Intro
CE4172 - Tiny ML
Course Overview
What is TinyML about?
Deployment of Machine Learning models on tiny devices
• low-cost microcontrollers with memory in the kilobyte (kB) range and
integrated sensors
• bare-metal systems – no operating system support
• very low power consumption
In this course
• Arduino Nano 33 BLE Sense board
• Cortex M4 microcontroller with several sensors
• 256 kB of RAM
ML vs TinyML
Machine Learning
• detects complex events that rule-based systems struggle to identify
• usually needs a large amount of data to train
• runs in the cloud (data centres) on high-performance computers with GPUs
and accelerators, with large memory and storage
TinyML
• runs on very low-cost, ‘low’ computing-horsepower microcontrollers
• inference only on the device – training is not done on the microcontroller itself
• small amount of memory (< 1 MB, down to < 100 kB)
• no network connectivity to the cloud needed
• no transfer of data – saves power
• no transfer of data – data privacy
• no transfer of data – very short latency, fast response
• runs on a very energy-efficient ‘computer’
• target: mW power with a battery of tens to thousands of mAh (lasts for years)
• can be deployed anywhere without any maintenance
What does the study of TinyML involve?
• Embedded system development – Hardware and software
• Understanding of Machine Learning architectures
• Learning the relevant techniques and tools
In order to deploy on low-cost, low-power devices and systems to
• perform on-device analytics for a variety of sensing modalities in the
physical world (e.g. phones, cars, buildings, industrial machines, home
appliances, medical devices, human bodies)
• Audio
• Vision (image and video)
• Motion
• Chemical
Development of low-cost low-power smart devices
IoT + AI ≡ AIoT
Some (Possible) Use-case Examples
• Wake detection in everyday devices – by voice, by motion, by vision
• Health monitoring using wearable devices (heartbeat, blood pressure,
body weight, etc.)
• Detection and identification of human presence and condition
• Monitoring premises security
• Monitoring industrial machinery for maintenance to avoid downtime
• Detection of defective parts in factory production
• Monitoring stock on shelves in supermarkets/department stores
• Smart meters to reduce energy consumption
• Monitoring the number of customers waiting in a queue to improve
customer service
• Monitoring of environment and endangered animals
All perform
• without connecting to the network – fast response and data privacy
• by machine self-learning – without the need for domain knowledge
Aside: Domain Knowledge vs Machine Learning
Heuristic - a function that ranks alternatives in search algorithms at each
branching step based on available information to decide which branch to
follow (Wikipedia)
• typically needs domain knowledge from experience, or provided by an
expert/specialist – e.g., a doctor
An example:
During the 2012 ImageNet computer image recognition competition
• Alex Krizhevsky implemented a deep learning algorithm
(a convolutional neural network, trained on GPUs)
• It was the first time a machine learning based algorithm beat
handcrafted software written by computer vision experts.
What we will do in this course
• Basic introduction of ML and Deep Learning (Neural Network)
• Building a “Hello World” equivalent of TinyML
• Training the model
• Converting the model to TensorFlow Lite
• Implementing an application with the model
• Deploying to Microcontrollers
• Case Studies
• Optimization Techniques
Lab exercises:
Hands-on exercises to practice the above on the embedded device
Course project (Open ended):
To design and implement a TinyML application using the embedded device
What we will use
Device
Arduino Nano 33 BLE Sense
Programming Language
Python, C++
Course plan and Assessment
Weekly Lecture
• Tuesday 11:30 am – 1:30 pm
• Pre-recorded videos (if needed)
Assessment:
• Exam (1-hr) – 30%
• Lab Quizzes (two) – 20%
• Course project – 50%
Project Assessment Rubric
Assessment will be based on demonstration/presentation, a (short) report and
code submission
1. Features (complexity and novelty) – 10%
2. Model Training process - 5% (Bonus mark)
3. Inference performance - 15%
4. Optimization techniques used in the implementation – 10%
• Code size minimization
• Real time performance
• Power efficiency optimization
5. Demo/Presentation and report – 15%
https://www.tinyml.org/
A bit about
Artificial Intelligence
AI History and its Renaissance
ARTIFICIAL INTELLIGENCE: Machines that are capable of performing tasks that typically require
human intelligence
Powerful computers:
widely available, e.g. cloud computing and GPUs
Big Data:
availability of large amounts of data due to the internet and smart mobile
phones
AI Implementations – Deep Learning
Input layer
Hidden layer(s)
• consist of learnable parameters (the neurons)
• the ‘algorithm’ that can learn and improve by itself
Output layer
Deep Neural Network for Deep Learning
Deep Neural Network
• input layer
• multiple hidden layers
• output layer
• much more sophisticated algorithms can be learnt
Aside: AlexNet
CNN for Image Recognition
[Figure: CNN architecture – convolution and pooling layers]
E.g. Face Detection Training and Inference
Aside: The Future of Deep Learning
The future of deep learning
(According to its pioneers and Turing Award winners
Yoshua Bengio, Yann LeCun and Geoffrey Hinton,
Communications of the ACM, July 2021, Vol. 64 No. 7, pages 58–65)
1. Supervised learning requires too much labelled data
2. Reinforcement learning requires far too many trials.
3. Current systems are not as robust to changes in distribution as humans, who
can quickly adapt to such changes with very few examples.
4. Current deep learning is most successful at perception tasks and generally what
are called system 1 tasks (Image classification, Object Detection and Face
recognition)
Using deep learning for system 2 tasks that require a deliberate sequence of
steps (e.g., to code a program) is an exciting area that is still in its infancy.
Microcontrollers
Microcontrollers are single-chip computers that are highly integrated
with peripherals and (nowadays) various sensors
• used in embedded systems for specific functions
E.g. cars, mobile phones, notebooks, webcams
Many are based on 8-bit processors but increasingly
come with 32-bit processors (ARM Cortex-M family)
Main features
• small physical size and low cost
• power efficiency
• but resource constrained
• small amount of memory (< 1 MB)
• ‘low’ computing horsepower
• usually no OS support (no MMU)
Microcontroller Market
Microcontroller for IoT
Internet of Things (IoT) devices are based on microcontrollers
• incorporating sensors to sense the surrounding environment
• with network connectivity
Examples:
A modern jet engine is filled with thousands of sensors collecting
and transmitting data back to make sure it is operating efficiently.
Deep Learning Workflow
Training and Building a Deep Learning based Model
The following are typical steps involved in developing a model based on
machine learning (deep learning)
1. Decide on a goal
2. Select, collect and label a dataset
3. Design a model architecture
4. Train the model using the dataset
5. Convert the model
6. Run the inference
7. Evaluate and troubleshoot
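The steps above can be sketched end-to-end on a toy version of the machine-failure example used in the following slides. Everything here is hypothetical (synthetic data, a single-neuron model standing in for a real deep network, and no conversion step), just to make the workflow concrete:

```python
import math
import random

random.seed(0)

# Step 1 (goal): classify machine state as normal (0) or abnormal (1).

# Step 2 (dataset): synthetic, labelled (temperature, vibration) readings.
def make_sample(abnormal):
    temp = random.gauss(90, 5) if abnormal else random.gauss(60, 5)       # deg C
    vib = random.gauss(2.0, 0.3) if abnormal else random.gauss(0.5, 0.3)  # g
    return (temp / 100, vib / 3), abnormal  # normalized features, label

data = [make_sample(i % 2) for i in range(200)]

# Step 3 (architecture): a single neuron with a sigmoid activation.
w, b = [0.0, 0.0], 0.0

def predict(x):
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1 / (1 + math.exp(-z))

# Step 4 (training): stochastic gradient descent on the log-loss.
lr = 0.01
for epoch in range(50):
    for x, y in data:
        err = predict(x) - y  # gradient of the log-loss w.r.t. z
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err

# Steps 6-7 (inference and evaluation); step 5 (conversion) is skipped here.
accuracy = sum((predict(x) > 0.5) == y for x, y in data) / len(data)
```

A real project would replace the hand-rolled neuron with a TensorFlow model and add the TensorFlow Lite conversion step before deployment.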
Decide on a goal
We need to first decide what we want to predict
• to enable us to decide
• what dataset to collect
• which model architecture
Example:
Predicting whether a factory machine is about to break down
• a classification problem with two possible outcomes
• Normal
• Abnormal
Selecting a Dataset
What data to collect?
• only select data that are relevant
• ignore obviously irrelevant data
Example:
Predicting whether a factory machine is about to break down
• Temperature of the machine – yes
• Vibration of the machine – Yes
• Noise emitted by the machine – Yes (but may be related to vibration)
• Speed of operation - ?
• Duration from last service – ?
• Ambient noise in the factory – No
Usually needs domain expertise and some experiments to decide
• also consider the ease of collecting the data
Collecting Data
After deciding on the dataset to collect
• how much data to collect?
• rule of thumb - the more the better.
Example:
Predicting whether a factory machine is about to break down
• collect data that represent the full range of conditions
• data correspond to different possible modes of failures(?)
• from summer to winter
• collect as a set of time series data
• vibration level every 10 seconds
• temperature every minute
• rate of production once every two minutes
Labelling Data
After collecting the dataset
• label the data as normal and abnormal
• assume ‘just before’ failure = abnormal
Example:
Predicting whether a factory machine is about to break down
Source: “TinyML” by Pete Warden & Daniel Situnayake
Which Model Architecture to use
Many possibilities, depending on
• type of problem
• type of data available (e.g. time series)
• data transformation (e.g. in frequency domain)
• personal experience (i.e. domain knowledge)
Also consider the resource constraint of the device to deploy the model
• memory size limitation
• processor capability
• microcontroller special features (e.g. hardware accelerator)
Aside: Tensor
A tensor is a list (array) that contains either numbers, or other tensors
Examples:
A vector is a 1-D tensor, shape (5,):
[1 2 3 4 5]
A matrix is a 2-D tensor, shape (3, 4):
[[1 2 3 4]
 [5 6 7 8]
 [9 8 7 6]]
Higher-dimension tensors, e.g. shape (2, 3, x):
tensors nested inside tensors
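The same shapes can be checked directly with NumPy (a small illustrative sketch, not part of the course material):

```python
import numpy as np

vector = np.array([1, 2, 3, 4, 5])           # 1-D tensor
matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 8, 7, 6]])            # 2-D tensor
higher = np.zeros((2, 3, 4))                 # 3-D tensor

print(vector.shape)  # (5,)
print(matrix.shape)  # (3, 4)
print(higher.shape)  # (2, 3, 4)
```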
Features and Window
Feature – a particular type of information that the model accepts for
training
• a 1-D tensor with 3 entries may represent 3 features
• images may be represented with higher-dimension tensors
Normalization
• change the values to lie within the range 0 to 1
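A common way to do this is min–max scaling; a minimal sketch (the temperature values are hypothetical):

```python
def normalize(values):
    # Min-max scaling: map the smallest value to 0 and the largest to 1.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

temps = [55.0, 60.0, 75.0, 95.0]   # e.g. machine temperatures in deg C
print(normalize(temps))            # [0.0, 0.125, 0.5, 1.0]
```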
Source: “TinyML” by by Pete Warden & Daniel Situnayake
Training the Model
The process by which the model learns
• to produce the correct output for a given set of inputs
A 2-layer Deep Learning Network
[Figure: a 2-layer network – input, layer 1, layer 2, output, with weights and biases]
Neuron Structure
A neuron produces a single output through a linear transformation
• weighted sum of the inputs
• plus a constant value called Bias
But linear transformations alone can only solve simple linear problems
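The weighted-sum-plus-bias computation of a single neuron can be sketched in a few lines (the input and weight values are arbitrary examples):

```python
def neuron(inputs, weights, bias):
    # A neuron's linear transformation: weighted sum of inputs plus a bias.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

out = neuron([1.0, 2.0, 3.0], [0.5, -0.25, 0.1], 0.2)
print(out)  # 0.5*1 - 0.25*2 + 0.1*3 + 0.2 = 0.5
```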
Activation Functions for Neuron
Add a non-linear function to the neuron's output
• to help the network learn complex patterns
• Example: Rectified Linear Unit (ReLU)
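ReLU is simple enough to define in one line: it passes positive values through unchanged and zeroes out negative ones.

```python
def relu(x):
    # Rectified Linear Unit: max(0, x).
    return max(0.0, x)

print(relu(-2.3))  # 0.0
print(relu(1.7))   # 1.7
```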
Various terms used in Model Training
At the beginning
• weights are assigned random values
• biases typically start with a value of 0
During training
• data are fed into the model in batches
• batch size – the number of samples processed before the parameters are updated
Epochs – the number of complete passes the learning algorithm makes through
the entire training dataset.
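The relationship between dataset size, batch size and epochs can be made concrete with some hypothetical numbers:

```python
import math

num_samples = 1000   # size of the training dataset (hypothetical)
batch_size = 32      # samples processed per parameter update
epochs = 10          # complete passes through the dataset

# Parameter updates per epoch (the last batch may be smaller).
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 32
print(total_updates)    # 320
```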
ML Training and Inference
[Figure: training loop – a forward pass produces predictions, the error is computed, and a backward pass updates the parameters]
How many Epochs should we use?
We stop when the model’s performance stops improving
Loss
• how far the model is from predicting the expected output
Accuracy
• the percentage of predictions the model gets right
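Both metrics are straightforward to compute; a minimal sketch using mean squared error as an example loss (other losses, such as cross-entropy, are more common for classification):

```python
def accuracy(predictions, labels):
    # Fraction of predictions that match the expected labels.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def mse_loss(outputs, targets):
    # Mean squared error between model outputs and expected targets.
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
print(mse_loss([0.5, 1.0], [0.0, 1.0]))      # 0.125
```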
Source: “TinyML” by Pete Warden & Daniel Situnayake
Underfitting and Overfitting
Two of the most common reasons a model fails to perform well
• overfitting and underfitting
Underfit
• the model has not yet been able to learn to make good predictions
• typically the architecture is too small (i.e. too simple a model) to capture
the complexity of the system
• try increasing the complexity of the model architecture
Overfit
• the model can predict perfectly on its training data
• but is not able to perform well on data it has not previously seen
• can be due to memorization, or ‘shortcuts’ present in the training data
• e.g. recognising text embedded in the training data
• try to get a larger and more varied dataset
Overfitting
Overfitting pattern: the loss on the validation data increases during training
Source: “TinyML” by Pete Warden & Daniel Situnayake
Training, Validation and Testing
The dataset is split into 3 groups during training to check the performance of
the model and guard against overfitting
1. Training Data (e.g. 60%)
2. Validation Data (e.g. 20%)
3. Testing Data (e.g. 20%)
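The 60/20/20 split can be done by slicing a shuffled dataset; a minimal sketch with a stand-in dataset:

```python
data = list(range(100))  # stand-in for a shuffled dataset of 100 samples

n = len(data)
train = data[: int(0.6 * n)]              # 60% training data
val = data[int(0.6 * n): int(0.8 * n)]    # 20% validation data
test = data[int(0.8 * n):]                # 20% testing data

print(len(train), len(val), len(test))  # 60 20 20
```

In practice the data should be shuffled before splitting so each group covers the full range of conditions.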
Deployment - Running Inference
The model can now be integrated into the application program
• using the TensorFlow Lite for Microcontrollers library
• in the C++ programming language
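Before integration, the trained model is converted to the TensorFlow Lite flat-buffer format. A minimal sketch, where the tiny `Dense` model is just a hypothetical stand-in for a trained model:

```python
import tensorflow as tf

# Stand-in for a trained Keras model (hypothetical architecture).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Convert the model to the TensorFlow Lite format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # e.g. weight quantization
tflite_model = converter.convert()

# Save the flat buffer to disk.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

On a microcontroller the resulting `.tflite` file is typically embedded in the C++ application as a byte array (e.g. generated with `xxd -i`) and loaded by the TensorFlow Lite for Microcontrollers interpreter.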
Source: “TinyML” by Pete Warden & Daniel Situnayake
Field Run and Evaluation
Does it work as expected?