
Pdf&rendition 1

The document discusses various applications of computer vision including facial recognition, face filters, image search, retail, inventory management, self-driving cars, and medical imaging. It also covers computer vision tasks such as image classification, object detection, and instance segmentation. Basic concepts discussed include pixels, resolution, grayscale and RGB images, and image features.

Uploaded by

pragunagarwalx

4/17/24, 10:52 AM about:blank 4/17/24, 10:52 AM about:blank

Computer Vision
Introduction

Computer Vision is the domain of Artificial Intelligence that enables machines to see through images or other visual data, and to process and analyse them with algorithms and methods in order to understand real-world phenomena.

Applications of Computer Vision - Facial Recognition

With the advent of smart cities and smart homes, Computer Vision plays a vital role in making the home smarter. Security, the most important application, uses Computer Vision for facial recognition, either to recognise guests or to maintain a log of visitors. Facial recognition is also used in schools for attendance systems that identify students by their faces.

Applications of Computer Vision - Face Filters

Modern apps like Instagram and Snapchat have many features based on computer vision, and face filters are one of them. Through the camera, the algorithm identifies the facial dynamics of the person and applies the selected filter.

Applications of Computer Vision - Google’s Search by Image

Most searches on Google’s search engine use textual data, but it also has an interesting feature that returns search results for an image. This uses Computer Vision: it analyses various features of the input image, compares them against a database of images and returns the matching results.

Applications of Computer Vision - Retail

Retail is one of the fastest-growing fields and uses Computer Vision to make the user experience more fruitful. Retailers can use Computer Vision techniques to track customers’ movements through stores, analyse navigational routes and detect walking patterns.

Applications of Computer Vision - Inventory Management

Through security camera image analysis, a Computer Vision algorithm can generate a very accurate estimate of the items available in the store. It can also analyse the use of shelf space to identify suboptimal configurations and suggest better item placement.

Applications of Computer Vision - Self Driving Cars

Computer Vision is the fundamental technology behind developing autonomous vehicles. Most leading car manufacturers in the world are reaping the benefits of investing in artificial intelligence for developing on-road versions of hands-free technology. This involves identifying objects, working out navigational routes and, at the same time, monitoring the environment.

Applications of Computer Vision - Medical Imaging

For the last few decades, computer-supported medical imaging applications have been a trustworthy help for physicians. They not only create and analyse images, but also act as an assistant and help doctors with their interpretation. Such applications are used to read and convert 2D scan images into interactive 3D models that enable medical professionals to gain a detailed understanding of a patient’s health condition.

Applications of Computer Vision - Google Translate App

To read signs in a foreign language, all you need to do is point your phone’s camera at the words and let the Google Translate app tell you what they mean in your preferred language almost instantly. It uses optical character recognition to read the image and augmented reality to overlay an accurate translation, which makes it a convenient tool built on Computer Vision.

Computer Vision Tasks: The various applications of Computer Vision are based on a certain number of tasks that are performed to get information from the input image. This information can be used directly for prediction or can form the base for further analysis. The tasks used in a computer vision application are:

Classification: The image classification problem is the task of assigning an input image one label from a fixed set of categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of practical applications.

Classification + Localisation: This task involves identifying what object is present in the image and, at the same time, identifying where in the image that object is located. It is used only for single objects.

Object Detection: Object detection is the process of finding instances of real-world objects such as faces, bicycles, and buildings in images or videos. Object detection algorithms typically use extracted features and learning algorithms to recognize instances of an object category. It is commonly used in applications such as image retrieval and automated vehicle parking systems.

Instance Segmentation: Instance segmentation is the process of detecting instances of the objects, giving them a category and then giving each pixel a label on the basis of that category. A segmentation algorithm takes an image as input and outputs a collection of regions (or segments).

Basics of Pixels: The word “pixel” means a picture element. Every photograph, in digital form, is made up of pixels. Pixels are the smallest units of information that make up a picture. Usually round or square, they are typically arranged in a 2-dimensional grid.

Resolution: The number of pixels in an image is sometimes called the resolution. When the term is used to describe pixel count, one convention is to express resolution as the width by the height, for example, a monitor resolution of 1280×1024. This means there are 1280 pixels from one side to the other, and 1024 from top to bottom.

Another convention is to express the number of pixels as a single number, like a 5 megapixel camera (a megapixel is a million pixels). This means the pixels along the width multiplied by the pixels along the height of the image taken by the camera equals 5 million pixels. In the case of our 1280×1024 monitor, the resolution could also be expressed as 1280 × 1024 = 1,310,720 pixels, or about 1.31 megapixels.

Pixel value: Each of the pixels that represent an image stored inside a computer has a pixel value that describes how bright that pixel is, and/or what colour it should be. The most common pixel format is the byte image, where this number is stored as an 8-bit integer giving a range of possible values from 0 to 255. Typically, zero is taken to be no colour or black, and 255 is taken to be full colour or white.

Grayscale Images: Grayscale images are images that have a range of shades of gray without apparent colour. The darkest possible shade is black, which is the total absence of colour, or a pixel value of zero. The lightest possible shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented by equal brightness levels of the three primary colours. A grayscale image stores each pixel in 1 byte, as a single plane forming a 2D array of pixels. The size of a grayscale image is defined as the Height x Width of that image.

RGB Images: All the images that we see around us are coloured images. These images are made up of three primary colours: Red, Green and Blue. All the colours that are present can be made by combining different intensities of red, green and blue.

Image Features: In computer vision and image processing, a feature is a piece of information that is relevant for solving the computational task related to a certain application. Features may be specific structures in the image such as points, edges or objects.

In image processing, we can get a lot of features from the image. A feature can be a blob, an edge or a corner. These features help us to perform various tasks and then get the analysis done on the basis of the application. Now the question that arises is: which of these are good features to use? As you saw in the previous activity, corners are easy to find because each one can be found only at a particular location in the image, whereas an edge is spread along a line and looks the same all along it. This tells us that corners are good features to extract from an image, followed by edges.

OpenCV, or Open Source Computer Vision Library, is a tool that helps a computer extract these features from images. It is used for all kinds of image and video processing and analysis, and it can process images and videos to identify objects, faces, or even handwriting.

Convolution: Different filters applied to an image change its pixel values evenly throughout the image. These effects are commonly created with the process of convolution and the convolution operator. As we change the values of these pixels, the image changes. This process of changing pixel values is the base of image editing.

We all use image editing software like Photoshop, as well as apps like Instagram and Snapchat, which apply filters to an image to enhance its quality.

Convolution: Convolution is a simple mathematical operation that is fundamental to many common image-processing operators. Convolution provides a way of "multiplying together" two arrays of numbers, generally of different sizes but of the same dimensionality, to produce a third array of numbers of the same dimensionality. An (image) convolution is simply an element-wise multiplication of a region of the image array with another array called the kernel, followed by a sum.

What is a Kernel?

A Kernel is a matrix which is slid across the image and multiplied with the input such that the output is enhanced in a certain desirable manner. Each kernel has different values for the different kinds of effects that we want to apply to an image.

Convolution

i. Convolution is a common tool used for image editing.
ii. It is an element-wise multiplication of an image and a kernel, followed by a sum, to get the desired output.
iii. In computer vision applications, it is used in a Convolutional Neural Network (CNN) to extract image features.

What is a Convolutional Neural Network?

A Convolutional Neural Network (CNN) is a Deep Learning algorithm that can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image and be able to differentiate one from the other.

A convolutional neural network consists of the following layers:

1) Convolution Layer
2) Rectified Linear Unit (ReLU)
3) Pooling Layer
4) Fully Connected Layer

Convolution Layer

It is the first layer of a CNN. The objective of the convolution operation is to extract features, such as edges, from the input image. A CNN need not be limited to only one Convolution Layer. Conventionally, the first Convolution Layer is responsible for capturing the low-level features such as edges, colour, gradient orientation, etc. With added layers, the architecture adapts to the high-level features as well, giving us a network that has a wholesome understanding of the images in the dataset.

Rectified Linear Unit Function

The next layer in the Convolutional Neural Network is the Rectified Linear Unit function, or ReLU layer. After we get the feature map, it is passed on to the ReLU layer. This layer simply replaces all the negative numbers in the feature map with zero and lets the positive numbers stay as they are.

Pooling Layer

Similar to the Convolution Layer, the Pooling Layer is responsible for reducing the spatial size of the convolved feature while still retaining the important features. There are two types of pooling which can be performed on an image.

i. Max Pooling: Max Pooling returns the maximum value from the portion of the image covered by the Kernel.
ii. Average Pooling: Average Pooling returns the average of all the values from the portion of the image covered by the Kernel.

The pooling layer is an important layer in the CNN as it performs a series of tasks, which are as follows:

i. Makes the image smaller and more manageable.
ii. Makes the image more resistant to small transformations, distortions and translations in the input image.

Fully Connected Layer

The final layer in the CNN is the Fully Connected Layer (FCP). The objective of a fully connected layer is to take the results of the convolution/pooling process and use them to classify the image into a label.
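The Resolution, Pixel value, Grayscale Images and RGB Images paragraphs can be illustrated with a few lines of plain Python. This is only a minimal sketch: the function names and the tiny sample images are made up for illustration, and averaging the three channels to obtain a gray value is an assumed simplification, not a method stated in this document.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


def megapixels(width, height):
    """Pixel count expressed in millions of pixels."""
    return pixel_count(width, height) / 1_000_000


def rgb_to_gray(pixel):
    """Assumed conversion: integer average of the red, green and blue intensities."""
    r, g, b = pixel
    return (r + g + b) // 3


# The 1280x1024 monitor from the Resolution paragraph.
monitor_pixels = pixel_count(1280, 1024)
monitor_megapixels = round(megapixels(1280, 1024), 2)

# A grayscale image: a single 2D plane, one byte (0-255) per pixel.
gray_image = [
    [0, 128, 255],
    [255, 128, 0],
]
gray_height, gray_width = len(gray_image), len(gray_image[0])
gray_size_bytes = gray_height * gray_width  # Height x Width, 1 byte per pixel

# An RGB image: every pixel holds three intensities (red, green, blue).
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
gray_from_rgb = [[rgb_to_gray(p) for p in row] for row in rgb_image]

print(monitor_pixels, monitor_megapixels)  # 1310720 1.31
print(gray_size_bytes)                     # 6
print(gray_from_rgb)                       # [[85, 85], [85, 128]]
```

The last RGB pixel, (128, 128, 128), already has equal red, green and blue levels, so it is a shade of gray and keeps its value after conversion.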
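The four CNN layers described in this document (convolution, ReLU, pooling, fully connected) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not a trained network: the 6×6 image, the vertical-edge kernel, the weights, the biases and the two labels are all hypothetical values chosen so the arithmetic is easy to follow. The kernel is applied without flipping, which matches the "element-wise multiplication followed by a sum" description used here.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position multiply element-wise and sum."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative values with zero; keep positive values unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]


def pool(feature_map, size=2, mode="max"):
    """Reduce each non-overlapping size x size window to its maximum or its average."""
    combine = max if mode == "max" else (lambda vals: sum(vals) / len(vals))
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(combine(window))
        pooled.append(row)
    return pooled


def fully_connected(features, weights, biases):
    """Flatten the pooled features and compute one weighted sum (score) per label."""
    flat = [v for row in features for v in row]
    return [sum(w * x for w, x in zip(ws, flat)) + b for ws, b in zip(weights, biases)]


# Hypothetical 6x6 grayscale image: a bright band, a dark band, a bright column.
image = [[10, 10, 10, 0, 0, 10] for _ in range(6)]
# Hypothetical kernel that responds to vertical edges.
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

feature_map = convolve2d(image, kernel)        # 4x4, each row [0, 30, 30, -30]
activated = relu(feature_map)                  # each row [0, 30, 30, 0]
max_pooled = pool(activated, mode="max")       # [[30, 30], [30, 30]]
avg_pooled = pool(activated, mode="average")   # [[15.0, 15.0], [15.0, 15.0]]

# Hypothetical weights and biases for two made-up labels.
labels = ["edge", "no edge"]
weights = [[1, 1, 1, 1], [-1, -1, -1, -1]]
biases = [0, 50]
scores = fully_connected(max_pooled, weights, biases)  # [120, -70]
predicted = labels[scores.index(max(scores))]
print(predicted)  # edge
```

In a real CNN the kernel values and the fully connected weights are learned from data rather than written by hand; the sketch only shows how data flows from one layer to the next and how max pooling and average pooling differ on the same input.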
