SVM LAB.7
Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression
tasks. It works by finding the optimal hyperplane that best separates different classes in a dataset while
maximizing the margin between the nearest data points (support vectors).
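The idea above can be sketched in a few lines of scikit-learn (a minimal illustration on toy data, not the lab's lung cancer dataset, which is used in the full implementation later in this sheet):

```python
# Minimal sketch: fitting a linear SVC on a toy 2-class dataset.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters -> two (nearly) linearly separable classes
X, y = make_blobs(n_samples=100, centers=2, random_state=42)

clf = SVC(kernel="linear")  # linear kernel: a hyperplane in the input space
clf.fit(X, y)

# The support vectors are the training points closest to the hyperplane
print(clf.support_vectors_.shape)
print(clf.score(X, y))
```

Only the support vectors determine the fitted hyperplane; removing any other training point would leave the decision boundary unchanged.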
Where is it used?
1. Medical Diagnosis
● Cancer Detection: SVM is used in classifying medical images (e.g., detecting breast cancer in mammograms).
● Disease Prediction: Helps in diagnosing diseases based on patient data.
2. Financial Sector
Applications Development Laboratory (CS33002), Spring 202
Advantages of SVM?
Disadvantages of SVM?
1. Hyperplane and Decision Boundary – A hyperplane is the optimal boundary that separates different classes.
2. Support Vectors – These are the closest data points to the hyperplane that influence its position.
3. Margin Maximization – SVM aims to maximize the distance (margin) between the hyperplane and support vectors for better generalization.
4. Lagrange Multipliers & Dual Formulation – Used to transform the optimization problem into a solvable form.
5. Kernel Trick – A method to transform non-linearly separable data into a higher-dimensional space for better separation.
Types of Kernels: linear, polynomial, RBF (radial basis function), and sigmoid.
6. Soft Margin vs. Hard Margin – Soft margin allows some misclassification, while hard margin strictly separates classes.
7. Regularization Parameter (C) – Controls the trade-off between margin maximization and classification accuracy.
8. Convex Optimization – A mathematical technique to ensure the best hyperplane is found.
9. Hinge Loss Function – A loss function used in SVM that penalizes misclassified points.
10. Quadratic Programming – A mathematical optimization method used to solve SVM’s constrained optimization problem.
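The hinge loss from point 9 can be computed directly (a small numerical sketch; the labels and decision values below are made up for illustration):

```python
import numpy as np

# Hinge loss for labels y in {-1, +1} and decision-function values f(x):
#   L(y, f(x)) = max(0, 1 - y * f(x))
# Loss is 0 when a point is correctly classified with margin >= 1,
# and grows linearly as the point moves to the wrong side.
def hinge_loss(y, fx):
    return np.maximum(0, 1 - y * fx)

y  = np.array([1, 1, -1, -1])
fx = np.array([2.0, 0.5, -3.0, 0.2])  # hypothetical decision values

print(hinge_loss(y, fx))  # -> [0.  0.5 0.  1.2]
```

Note that the second point is correctly classified (y·f(x) = 0.5 > 0) yet still incurs a loss, because it lies inside the margin.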
Implementation of SVM using Python (stepwise, with comments, using the same lung cancer data)
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.inspection import DecisionBoundaryDisplay
# Load dataset from CSV
file_path = "/content/LungNum.csv" # Replace with your actual file path
df = pd.read_csv(file_path)
df = df.drop(columns='Patient_ID')
# Select features and target (modify column names as needed)
X = df.iloc[:, :-1].values # Assuming all but the last column are features
y = df.iloc[:, -1].values # Assuming the last column is the target
# Normalize features
scaler = StandardScaler()
X = scaler.fit_transform(X)
# Split dataset (optional, for training and testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
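With the split in place, a quick sanity check is to train an SVC on the full feature set and score it on the held-out 20%. The sketch below uses synthetic stand-in data so it runs without the CSV file; with the lab's data, reuse `X_train`, `X_test`, `y_train`, `y_test` from above instead (the RBF kernel and C=1.0 here are assumed defaults, not prescribed by the sheet):

```python
# Sketch: training and evaluating an SVC on a train/test split.
# Synthetic stand-in data replaces the machine-specific CSV path.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = SVC(kernel="rbf", C=1.0)  # C trades margin width vs. training accuracy
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)
print(f"Test accuracy: {acc:.2f}")
```

Scoring on the held-out split, rather than the training data, gives an honest estimate of how the model generalizes.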
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.inspection import DecisionBoundaryDisplay
import matplotlib.pyplot as plt
# Reduce the training features to 2 principal components so the
# decision boundary can be drawn in a 2D plot
pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train)
# Train a fresh SVM on the 2D projection
svm_2d = SVC()
svm_2d.fit(X_train_pca, y_train)
# Plot the decision regions of the 2D model
DecisionBoundaryDisplay.from_estimator(
svm_2d, X_train_pca, response_method="predict", cmap=plt.cm.Spectral,
alpha=0.8
)
plt.scatter(X_train_pca[:, 0], X_train_pca[:, 1], c=y_train, edgecolors="k")
plt.title("Decision Boundary of SVM (Reduced to 2D)")
plt.show()
OUTPUT: