
Machine Learning Mini-Project Report

MTCS - 204(P) Fashion MNIST Problem

Group:
Saurav Rai (17558)
Saichand A V R P (17552)
Akhilesh Pandey (17551)

AIM :
Variations of neural network models for image classification: Fashion-MNIST.

Experimental Procedure :
Platforms Used :
1. (Saurav Rai): Python 3 (TensorFlow backend).
2. (Saichand): IPython (Anaconda 3), Keras, Python (TensorFlow backend).
3. (Akhilesh Pandey): Python 3, PyTorch.

Dataset Description :
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set
of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale
image associated with a label from 10 classes. An example of how the data looks:

Figure reproduced from Zalando's GitHub page [1].


Each class takes three rows in the visualization above. A 7-step conversion process
is used to generate the Fashion-MNIST dataset [2]. Fashion-MNIST poses a more challenging
classification task than the simple MNIST digits data.
[1] https://github.com/zalandoresearch/fashion-mnist; [2] arXiv:1708.07747v2 [cs.LG], 15 Sep 2017.
Models Used : ( Work done by Saichand (17552))
| S.No. | Classifier | Activation | Optimizer |
| 1 | 3 conv layers with max-pooling and 2 FC layers, 241,546 parameters (Keras) | ReLU, softmax | Adam |
| 1 | 3 conv layers with max-pooling and 2 FC layers, 241,546 parameters (Keras) | ReLU, tanh | Adagrad |
| 1 | 3 conv layers with max-pooling and 2 FC layers, 241,546 parameters (Keras) | Tanh, softmax | Adamax |
| 1 | 3 conv layers with max-pooling and 2 FC layers, 241,546 parameters (Keras) | Sigmoid, ReLU | Adam |
| 2 | MLP with one hidden layer (256) (TensorFlow API) | ReLU, sigmoid | Adagrad |
| 2 | MLP with one hidden layer (256) (TensorFlow API) | Sigmoid, ReLU | Adam |
| 2 | MLP with one hidden layer (256) (TensorFlow API) | Sigmoid, ReLU | Gradient Descent |
| 3 | Logistic Regression (Python) | Softmax | LBFGS |
| 3 | K-class Logistic Regression | Softmax | liblinear |

We have used 3 distinct models, with varied activation functions and optimization
techniques. The 3 models used are:
√ Convolutional Neural Network (CNN): The network has 3 convolution layers with
different activation functions, each followed by a max-pooling phase with pool_size
(2, 2), which gives translation invariance. A dropout phase is used after every max-pool
phase for regularization. The network uses around 2.5 lakh (250,000) parameters in total:
about 1.5 lakh parameters belong to the first fully connected layer and 1,290 to the
second fully connected layer. (A minimal sketch of this architecture is given after this list.)
√ Multi Layer Perceptron (MLP): The model is a simple multi-layer perceptron with one
hidden layer. The input layer has 784 nodes, the hidden layer has 256, and the output
layer has as many nodes as there are labels (10). Weights are initialized using the
random_normal function of the TensorFlow package, and softmax_cross_entropy_with_logits
from TensorFlow's neural network module is used as the cost function.
√ Logistic Regression (LR): The model is implemented using the linear_model module of
scikit-learn, which provides LogisticRegression. We use the 'lbfgs' solver as the
optimizer for the model and the softmax function for predicting probabilities. We also
implemented the k-class LR model with multi_class='ovr', which fits a binary problem
for each label. The default optimizer for this model is 'liblinear', an open-source
library for large-scale linear classification.
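
As a rough illustration, here is a minimal Keras sketch of a CNN along these lines (3 convolution layers with max-pooling and dropout, followed by 2 fully connected layers). The filter counts (32/64/128 with 3x3 kernels and 'same' padding) happen to reproduce the 241,546-parameter count quoted above, but the dropout rates and other details are assumptions rather than the exact configuration used.

# Hedged sketch of the 3-conv-layer CNN described above; dropout rates
# and kernel sizes are assumed values, not the report's exact settings.
from tensorflow.keras import layers, models

def build_cnn(activation="relu", optimizer="adam", num_classes=10):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation=activation, padding="same",
                      input_shape=(28, 28, 1)),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), activation=activation, padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(128, (3, 3), activation=activation, padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation=activation),          # first FC layer (~1.5 lakh params)
        layers.Dense(num_classes, activation="softmax"),   # second FC layer (1,290 params)
    ])
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",  # integer labels assumed
                  metrics=["accuracy"])
    return model

# Example: the Tanh + Adamax variant trained with batch size 256
# model = build_cnn(activation="tanh", optimizer="adamax")
# model.fit(x_train, y_train, batch_size=256, epochs=10, validation_split=0.2)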

Experiments Conducted:
❖ At first, we approached the Fashion-MNIST image classification problem with simple
Logistic Regression (LR), using the pixel values stored as features in .csv files for
the train and test data. The fixed train set (60,000 examples) and test set (10,000
examples) provided by the creators of Fashion-MNIST (Zalando) are used. The accuracies
observed for this model are not satisfactory. We observe that plain LR works slightly
better than k-class LR, which uses the OvR (one-versus-rest) policy: LR gives only
85.19% test accuracy and KLR (k-class LR) gives 84.55%. (A minimal sketch of this setup
follows below.)
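
As a rough sketch of this baseline (assuming the standard Fashion-MNIST CSV layout with the label in the first column and the 784 pixel values in the remaining columns; file names are illustrative):

# Hedged sketch of the LR / k-class LR baselines; file names are illustrative.
import pandas as pd
from sklearn.linear_model import LogisticRegression

train = pd.read_csv("fashion-mnist_train.csv")
test = pd.read_csv("fashion-mnist_test.csv")
X_train, y_train = train.iloc[:, 1:] / 255.0, train.iloc[:, 0]
X_test, y_test = test.iloc[:, 1:] / 255.0, test.iloc[:, 0]

# Multinomial (softmax) LR with the lbfgs solver
lr = LogisticRegression(solver="lbfgs", multi_class="multinomial", max_iter=1000)
lr.fit(X_train, y_train)
print("LR test accuracy:", lr.score(X_test, y_test))

# One-vs-rest (k-class) LR with the default liblinear solver
klr = LogisticRegression(solver="liblinear", multi_class="ovr")
klr.fit(X_train, y_train)
print("KLR test accuracy:", klr.score(X_test, y_test))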

❖ Next, we approached the problem with a simple Multi-Layer Perceptron model. We varied
the optimizers as well as the activation functions at the different layers, noted the
accuracies for all the models, and report the best few that stood out in the analysis.
The test accuracy went up to 87.01% for the Adagrad optimizer with a learning rate of
0.01. This is not very impressive: training takes a fair amount of time (about half an
hour) for roughly a 3 percent improvement over LR. In the figure, by accuracy we mean
the test accuracy. We also tried plain Gradient Descent; unable to oscillate out of a
poor region, it converged to a local optimum and the test accuracy was only 35%. (A
minimal sketch of this MLP follows below.)
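
A minimal sketch of this MLP in the TensorFlow 1.x-style API used for these experiments (layer sizes and the cost function follow the description above; the exact initialisation arguments and learning rate are assumptions):

# Hedged TF 1.x-style sketch of the one-hidden-layer MLP (784 -> 256 -> 10).
import tensorflow as tf  # assumes the TensorFlow 1.x API

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])   # one-hot labels

# Weights initialised with random_normal, as described above
w1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
w2 = tf.Variable(tf.random_normal([256, 10]))
b2 = tf.Variable(tf.random_normal([10]))

hidden = tf.nn.relu(tf.matmul(x, w1) + b1)   # ReLU / sigmoid variants were compared
logits = tf.matmul(hidden, w2) + b2

cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
train_op = tf.train.AdagradOptimizer(learning_rate=0.01).minimize(cost)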

❖ Then we used a CNN in Keras with the TensorFlow backend, which showed a good
improvement in accuracy. The maximum test accuracy we reached was 93.4%, with the
Adamax optimizer. We consider the CNN-based model to have performed better both on
accuracy and on running time: training took only about 15 minutes, compared to roughly
double that for the MLP and even more for the LR models.
The comparison of all three model families shows that CNN-based models outperform the
others, owing to their multi-layer processing. The hyperparameters of the models have
also been tuned to see the accuracy improve, for both the MLP and CNN models.

The figures 256 and 512 in the CNN_adamax model correspond to the batch size. ROC
curves for the models have been plotted and the training and test accuracies are noted
down. In the figure above, the polynomial trend line shows the shift in the test accuracy.

Observations:

The observations on the experimental models are as follows:


The training and test accuracies for most of the models are close; in fact, for the MLP
and LR models, the test accuracy is lower than the training accuracy. This indicates
that the models perform better on the data they were trained on than on the test data,
i.e. there is some generalization error. Also, the training accuracies all lie in the
range of 85-95%, so the models are not overfitting the training data.
For the different models we have plotted ROC curves, given after the comparison of
accuracies. The ROC curves also show that the area under the curve is maximum for the
CNN-based model with the Adam optimizer.

ROC and Accuracy/Loss Curves for CNN Models :

Activation/optimizer configurations for which ROC curves and accuracy/loss plots were produced:
• ReLU, softmax with Adam
• ReLU, tanh with Adagrad
• ReLU, tanh with Adagrad (lr = 0.01)
• Tanh, softmax with Adamax (batch size 256)
• Tanh, softmax with Adamax (batch size 512)
• Sigmoid, ReLU with Adam

Inferences:
From our analysis, we infer that CNN boosts the accuracy with less computation time
than the other classifiers, because of its multi-layer processing. The validation phase
of the CNN models also helps in boosting accuracy, since we split the training data into
80 percent training and 20 percent validation. The area under the curve for CNN with the
Adam and Adamax optimizers stands out at around 93 percent.
We have not experimented thoroughly with variations of batch_size for the convolutional
neural networks, and not much tuning was done for the MLP either; we could increase the
number of hidden layers and improve the accuracies. The changes in the learning rates for
CNN and MLP gave some intuitive understanding: if the learning rate is low (0.001),
training takes more time and learns very slowly; if it is too high (0.1, 0.2, ...),
training halts much earlier, but the drastic updates lead to more bias and a reduction
in accuracy can be observed. Hence, we found that a good learning rate for CNN with
Adagrad and Adam is 0.01. But, strangely, for CNN with the Adamax optimizer, a learning
rate of 0.2 gave 93 percent test accuracy.

Conclusion:
In this mini-project, we performed image classification on the Fashion-MNIST dataset.
We used various classifiers such as CNN, MLP and LR, studied their accuracies and
concluded that, with our minimal experiments, CNN performs best. The work can be
extended to improve performance by tuning the batch_size for the CNN, the number of
hidden layers for the MLP, and by trying a better solver for Logistic Regression.
Models used: ( Work done by Akhilesh Pandey (17551))
CNN

| Sl. No. | Model | Activation | Pool | Optimizer | Batch norm (yes/no) | Batch size | Iters | Acc |
| 1 | 2 CL, 1 FC | ReLU | Max | Adam | No | 100 | 7000 | 83.42 |
| 2 | 2 CL, 1 FC | ReLU | Avg | Adam | Yes | 256 | 4000 | 88.45 |
| 3 | 2 CL, 1 FC | ReLU | Avg | Adam | Yes | 256 | 5500 | 88.62 |
| 4 | 2 CL, 1 FC | ReLU | Avg | Adam | Yes | 256 | 4500 | 89.32 |
| 5 | 2 CL, 1 FC | ReLU | Avg | RMSprop | Yes | 300 | 4000 | 88.35 |
| 6 | 2 CL, 1 FC | ReLU | Max | RMSprop | No | 100 | 7000 | 83.56 |
| 7 | 2 CL, 1 FC | ReLU | Avg | RMSprop | Yes | 300 | 8000 | 88.31 |
| 8 | 2 CL, 1 FC | ReLU | Avg | RMSprop | Yes | 100 | 3000 | 87.21 |
| 9 | 2 CL, 1 FC | ReLU | Avg | Adamax | Yes | 100 | 3000 | 85.22 |
| 10 | 2 CL, 1 FC | ReLU | Avg | Adamax | Yes | 512 | 4000 | 89.07 |

(CL = convolution layer, FC = fully connected layer.)

Experiment Conducted and Observations Made:

We want to build a model that will classify each image into one of the ten classes of
clothing, shoes, etc.
I preferred a CNN with 2 convolution layers and one fully connected layer. The activation
function I have used is ReLU, defined as ReLU(x) = x if x >= 0 and ReLU(x) = 0 otherwise.
I used two CNN models with slightly different architectures: one without batch
normalization and the other with batch normalization.
Both models were run while varying hyperparameters such as batch_size, number of
iterations, and optimizer. For each optimizer, the model was run with varying batch_size
and number of iterations. (A minimal sketch of the batch-normalized variant is given below.)
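
A minimal PyTorch sketch of the batch-normalized variant (2 convolution layers with average pooling and one fully connected layer); the channel counts and kernel sizes are assumptions, not the exact values used:

# Hedged PyTorch sketch of the 2-conv-layer CNN with batch normalization.
# Channel counts and kernel sizes are assumed values.
import torch
import torch.nn as nn

class FashionCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.AvgPool2d(2),                           # average-pooling variant
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.AvgPool2d(2),
        )
        self.fc = nn.Linear(32 * 7 * 7, num_classes)   # single fully connected layer

    def forward(self, x):
        x = self.features(x)
        return self.fc(x.flatten(1))

model = FashionCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())       # Adam / RMSprop / Adamax were compared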

In the graphs given below, yes/no refers to whether batch normalization was used or not.

For example, we can observe the improvement in accuracy for the model using the Adam
optimizer with a batch_size of 100 running through 5.5K iterations. From the figure
shown below, we can observe that the accuracy initially improves faster than it does in
the later iterations.
In this experiment we tried various optimizers: Adam, Adamax and RMSprop. Observations
regarding each of these optimizers are given below.

ADAM Optimizer:

From the chart given above we can observe that as the batch size increases, the accuracy
also increases: as the batch size goes from 100 to 256, the accuracy rises from 83 to 88.
However, this is not the case with the number of iterations. We see from the figure that
even though the number of iterations increased from 4K to 5.5K, there is very little
increase in accuracy, and when the iterations increase from 4.5K to 5.5K there is
actually a decrease in accuracy.
RMSprop Optimizer:

From the chart given below we can observe that as the batch size increases, the accuracy
also increases: as the batch size goes from 100 to 300, the accuracy rises from 83 to 88.
However, this is not the case with the number of iterations. We see from the figure that,
keeping the batch_size constant, even though the number of iterations increased from 4K
to 8K, there is a slight decrease in accuracy.

Adamax Optimizer:

From the chart given below we can observe that as the batch size increases, the accuracy
also increases: as the batch size goes from 100 to 512, the accuracy rises from 85 to 89.
The same holds for the number of iterations: when the number of iterations increases
from 3K to 4K, there is a large increase in accuracy.
From the above graphs a few things can be observed:
❖ Batch normalization gives better results compared to no batch normalization.
❖ As the batch size increases, the accuracy also increases.
❖ RMSprop does a better job in terms of accuracy.

ROC Plots for Different Models:

Adam:
As we can see from the ROC curve, the area under the curve for the Adam optimizer is
only 89 percent, which indicates that the true positive rate is not up to the mark,
reducing the area.
Adamax:
For this model, we observe from the graph that the area under the curve has increased
compared to the previous one, as the true positive rate increased from around 79 percent
to 81 percent.

RMSprop:
For this final model, we conclude that it has the maximum area under the curve, with a
true positive rate of 84 percent.

Conclusion:
From the above experiments and observations we can conclude the following:
❖ Accuracy depends on batch_size: when batch_size increases, the accuracy also increases.
❖ With an increase in the number of iterations the accuracy tends to oscillate, so a
proper choice of the number of iterations is important.
❖ Apart from this, when the batch_size is increased the model naturally takes more time
to train.
IMPLEMENTATION OF SELF NORMALIZING NEURAL NETWORK ON FASHION MNIST

( Work done by Saurav Rai (17558))

ABOUT SNN

• Feed-forward networks (FNNs) that perform well are typically shallow and therefore
cannot exploit many levels of abstract representations. Self-normalizing neural
networks (SNNs) enable high-level abstract representations.
• While batch normalization requires explicit normalization, the neuron activations of
SNNs automatically converge towards zero mean and unit variance.
• The activation function of SNNs is the "scaled exponential linear unit" (SELU), which
induces self-normalizing properties.
• Activations close to zero mean and unit variance that are propagated through many
network layers will converge towards zero mean and unit variance, even in the presence
of noise and perturbations.
• This convergence property of SNNs allows one to (1) train deep networks with many
layers, (2) employ strong regularization, and (3) make learning highly robust.
• I have implemented the SNN in Python 3 using TensorFlow.

3. MODELS USED :

MODEL :

Multilayer perceptron with 4 hidden layers, SELU activation function, Adagrad optimizer.

COST FUNCTION :

• Cross-entropy function

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))

OPTIMIZER :

• AdagradOptimizer

tf.train.AdagradOptimizer(learning_rate=learning_rate).minimize(cost)
ACTIVATION FUNCTION :

The SELU activation function is defined as

selu(x) = λ · x                  if x > 0
        = λ · (α · exp(x) − α)   if x ≤ 0

Here α and λ are obtained by solving the fixed-point equations (μ, ν) = g(μ, ν).
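
For reference, SELU can be written directly as a small function; the constants below are the standard (approximate) values of α and λ from the SNN paper:

# Minimal NumPy sketch of the SELU activation; ALPHA and LAMBDA are the
# standard (approximate) fixed-point constants from the SNN paper.
import numpy as np

ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    x = np.asarray(x, dtype=np.float64)
    return LAMBDA * np.where(x > 0, x, ALPHA * np.exp(x) - ALPHA)

TensorFlow versions 1.4 and later also ship this activation as tf.nn.selu.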

5. OBSERVATIONS :

Parameters

Learning_rate = 0.05

training_epochs = 10

batch_size = 100

display_step = 1

Network parameters

n_hidden_1 = 784  # number of features in the 1st hidden layer

n_hidden_2 = 784  # number of features in the 2nd hidden layer

n_input = 784     # FMNIST data input (image shape 28 * 28)

n_classes = 10    # FMNIST total classes (labels 0-9)

Layer configurations tested: Layer 1: SELU (or ReLU) with SELU dropout; Layer 2: SELU with SELU dropout.

• THE FOLLOWING ARE THE OBSERVATIONS FROM SEVERAL TESTS.
| Sl. No. | Model used | No. of epochs | Optimizer | Batch size | Accuracy % | Time (secs) |
| 1 | Multilayer Perceptron (SELU) | 10 | Adagrad Optimizer | 50 | 89.59 | 139.51 |
| 2 | Multilayer Perceptron (SELU) | 20 | Adagrad Optimizer | 100 | 89.61 | 122.17 |
| 3 | Multilayer Perceptron (SELU) | 10 | Adadelta Optimizer | 50 | 87.49 | 150.63 |
| 4 | Multilayer Perceptron (SELU) | 20 | Adadelta Optimizer | 100 | 88.58 | 303.89 |
| 5 | Multilayer Perceptron (RELU) | 10 | Adagrad Optimizer | 50 | 81.21 | 141.21 |
| 6 | Multilayer Perceptron (SELU) | 20 | Proximal Adagrad Optimizer | 50 | 89.56 | 345.56 |
| 7 | Multilayer Perceptron (SELU) | 20 | Proximal Adagrad Optimizer | 50 | 90.33 | 561.68 |
| 8 | Logistic | 100 | GDM | - | 84.12 | 135.21 |
| 9 | K-logistic | 100 | GDM | - | 81.23 | 160.12 |
| 10 | Softmax | 100 | GDM | - | 83.12 | 124.45 |
| 11 | Softmax | 200 | GDM | - | 85.11 | 245.12 |

Observation :

1. The Highest Accuracy :

• The optimizer chosen: Proximal Adagrad Optimizer, which allows the learning rate to
adapt based on the parameters.

• The batch size was taken as 50.

• The number of epochs was taken as 20.

• The activation function is SELU, which induces self-normalizing properties.

• The overall time taken was 561.68 seconds.

• The accuracy obtained is 90.33%, which is the highest accuracy among the various
combinations tried for the self-normalizing neural network.

2. Using Logistic and K-Logistic regression :

• A logistic hypothesis model was implemented for this dataset, using 100 iterations
and the Gradient Descent optimizer.

• Time taken: 135.21 seconds. Accuracy obtained: 84.12%.

• A k-logistic hypothesis model was also implemented for this dataset, using 100
iterations and the Gradient Descent optimizer. The time taken is 160.12 seconds and the
accuracy obtained is 81.23%.

3. Using Softmax regression :

• Implemented using a softmax hypothesis model for the same dataset.

• Uses the Gradient Descent optimizer with 100 and 200 iterations.

• The accuracies obtained are 83.12% and 85.11%, respectively. (A minimal sketch
follows below.)
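
A minimal TF 1.x-style sketch of such a softmax-regression model, assuming one-hot labels; the learning rate value is an assumption:

# Hedged TF 1.x-style sketch of softmax regression trained with gradient descent.
import tensorflow as tf  # assumes the TensorFlow 1.x API

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])   # one-hot labels

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)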

4. Using Multilayer Perceptron :

• Implemented using a multilayer perceptron model with the SELU activation function for
the same dataset.

• Trained with different numbers of iterations and batch sizes, with the Adagrad and
Adadelta optimizers.

• The average accuracy obtained is around 87%.

5. Using the ROC Curve :

• After training, the network showed 94.4% accuracy. As the classes do not contain an
equal amount of data, we can also look at the ROC curve.

• I plotted the accuracy of the network against the threshold, so the x and y axes have
two meanings here: for the blue curve, the x and y axes are the false positive rate and
true positive rate, respectively; for the green curve, the x axis is the threshold and
the y axis is the accuracy of the network.

• The blue points indicate the threshold/accuracy (FPR/TPR) when the threshold is chosen
as T1 = 0.5. I also marked the point that gives the highest accuracy, namely 90.33%
(red). The grey point is the one obtained by minimizing the distance between the ROC
curve and the point (0, 1); it corresponds to an accuracy of 81.21%. (A sketch of how
such a plot can be produced is given below.)
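
A hedged sketch of how such a plot can be produced with scikit-learn and matplotlib, assuming a binary setting where y_true holds 0/1 labels and scores holds the network's predicted positive-class probabilities (both names are illustrative):

# Hedged sketch of the combined ROC / accuracy-vs-threshold plot described above.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

def plot_roc_and_accuracy(y_true, scores):
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    # Accuracy at each threshold (green curve: threshold on x, accuracy on y)
    acc = [np.mean((scores >= t) == y_true) for t in thresholds]

    plt.plot(fpr, tpr, "b-", label="ROC (FPR vs TPR)")
    plt.plot(thresholds, acc, "g-", label="accuracy vs threshold")
    plt.scatter([0.5], [np.mean((scores >= 0.5) == y_true)],
                color="blue", label="threshold T1 = 0.5")
    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.legend()
    plt.show()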
6. INFERENCE :

Reason for high accuracy :

• The activation function chosen is SELU, which induces self-normalizing properties by
itself, so the weight difference stays near 0.

• The Proximal Adagrad Optimizer allows the learning rate to adapt based on the
parameters. It performs larger updates for infrequent parameters and smaller updates
for frequent ones; because of this it is well suited for our FMNIST dataset. Another
advantage is that it essentially eliminates the need to tune the learning rate.

Reason for low accuracy :

• The activation function ReLU does not have self-normalizing properties by itself, so
it does not give a good accuracy for our model.

7. CONCLUSIONS :

• The goal of SNNs is to create neural networks in which, if the input of any layer is
normally distributed, the output will automatically also be normally distributed.

• This is remarkable because normalizing the output of layers is known to be a very
efficient way to improve the performance of neural networks, but the current ways to do
it (e.g. BatchNorm) basically involve weird hacks, while in SNNs the normalization is an
intrinsic part of the mathematics of the network.

• The implementation of SNNs is relatively simple.

• SNNs keep their output normalized even after several iterations of training, which is
otherwise quite complicated to achieve.

• Future work will be to implement the model with more hidden layers and with other
datasets, noisier data, or data such as strings that require segmentation.

SAIRAM
