
Lab Assignment

Name: K. Sashank Chandra
Reg. No.: 23BCE7092
Slot: L41+L42
Faculty: Swanth Boppudi

1. Decision Tree for Weather Dataset:


Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# Load the weather dataset
filename = "weather.csv"  # Ensure this path is correct
df = pd.read_csv(filename)
print(df)

# Remove the 'Day' feature if present
df = df.drop(columns=['Day'], errors='ignore')

# Display the first few rows of the dataset
print(df.head())

# Encode categorical features using LabelEncoder
label_encoders = {}
for column in df.columns:
    if df[column].dtype == 'object':  # Apply encoding only to categorical columns
        le = LabelEncoder()
        df[column] = le.fit_transform(df[column])
        label_encoders[column] = le

print("----------------------------After fit and transform------------------------------------------")
print(df)

# Define features and target
X = df.iloc[:, :-1]  # All columns except the last as features
y = df.iloc[:, -1]   # Last column as target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the decision tree classifier using the entropy criterion
model = DecisionTreeClassifier(criterion='entropy', random_state=42)
model.fit(X_train, y_train)

# Visualize the decision tree
plt.figure(figsize=(10, 6))
plot_tree(
    model,
    feature_names=X.columns,
    class_names=label_encoders[df.columns[-1]].classes_ if df.columns[-1] in label_encoders else None,
    filled=True,
    rounded=True,
    fontsize=10
)
plt.title("Simple ID3 Decision Tree for Weather Dataset")
plt.show()

Output:
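Beyond the plot, the tree can also be scored on the held-out test split; a minimal sketch, reusing the model and split from the code above:

from sklearn.metrics import accuracy_score

# Evaluate the tree on the unseen test split
y_pred = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))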
2. Linear Regression for Crop Production Dataset:

Code:
# Import required libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Load dataset
data = pd.read_csv('India_Crop_Production (1).csv')

# Display basic info
print(data.head())
print(data.info())

# Handle missing values (example: drop rows with missing values)
data = data.dropna()

# Remove rows where 'Production' holds the placeholder value '='
data = data[data['Production'] != '=']

# Verify the rows are removed
print(data[data['Production'] == '='])

# Because of placeholders like '=', 'Production' is read as text; convert it to numeric
data['Production'] = pd.to_numeric(data['Production'], errors='coerce')
data = data.dropna(subset=['Production'])

# Encode categorical features
categorical_cols = ['State_Name', 'District_Name', 'Crop', 'Season']
label_encoders = {}
for col in categorical_cols:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le

# Define features and target variable
X = data[['Area', 'Season', 'Crop', 'Crop_Year']]  # Example features
y = data['Production']

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate the model
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Absolute Error: {mae}")
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
Output:
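The fitted model's parameters can also be inspected to see how each scaled feature contributes to the prediction; a small sketch, assuming the feature order used above:

# One coefficient per scaled feature, in the same order as the feature list above
for name, coef in zip(['Area', 'Season', 'Crop', 'Crop_Year'], model.coef_):
    print(f"{name}: {coef:.3f}")
print(f"Intercept: {model.intercept_:.3f}")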
3. Logistic Regression for Study Hours Dataset:

Code:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Read the dataset using pandas (replace 'study_hours.csv' with your actual file path)
data = pd.read_csv('study_hours.csv')
print(data)

# Assuming the target column is 'status' and all other columns are features
X = data.drop(columns=['status'])
y = data['status']  # Target variable

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=20)

# Initialize the Logistic Regression model
model = LogisticRegression()

# Train the model
model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)

# Print results
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)

Output:
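The trained classifier can also be applied to a new input; a minimal sketch, assuming the dataset's feature column is named 'hours' (an assumed name, to be adjusted to the actual column in study_hours.csv):

# Hypothetical new sample: a student who studied 5 hours ('hours' is an assumed column name)
new_student = pd.DataFrame({'hours': [5]})
print("Predicted status:", model.predict(new_student)[0])
print("Class probabilities:", model.predict_proba(new_student)[0])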
4. Random Forest Classifier for Titanic Dataset:

Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import LabelEncoder

# Load the Titanic dataset
file_path = 'Titanic-Dataset.csv'  # Replace with your Titanic dataset file path
data = pd.read_csv(file_path)

# Display the first few rows of the dataset
print("Dataset Preview:")
print(data.head())

# Drop columns not relevant for the model
data = data.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1, errors='ignore')

# Fill missing values
data['Age'].fillna(data['Age'].median(), inplace=True)
data['Embarked'].fillna(data['Embarked'].mode()[0], inplace=True)

# Encode categorical features
categorical_cols = ['Sex', 'Embarked']
label_encoders = {}
for col in categorical_cols:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le

# Define features and target variable
X = data.drop(['Survived'], axis=1)
y = data['Survived']

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Random Forest Classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the model
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

# Display results
print("\nModel Evaluation:")
print(f"Accuracy: {accuracy:.2f}")
print("\nConfusion Matrix:")
print(conf_matrix)
print("\nClassification Report:")
print(class_report)
Output:
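Random forests also expose impurity-based feature importances, which indicate which passenger attributes the model relies on most; a small sketch using the fitted model and X from the code above:

# Rank features by the impurity-based importances computed during training
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
print("\nFeature Importances:")
print(importances)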
5. K-Means Clustering for Student Marks Dataset:

Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Load dataset from CSV file
df = pd.read_csv('clustering.csv')  # Ensure the file exists

# Selecting relevant features
marks = df[['Subject1', 'Subject2']].values

# Standardizing the data
scaler = StandardScaler()
marks_scaled = scaler.fit_transform(marks)

# Applying K-Means Clustering
k = 2  # Number of clusters
kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
df['Cluster'] = kmeans.fit_predict(marks_scaled)

# Get centroids
centroids = kmeans.cluster_centers_

# Assign cluster names based on performance
cluster_names = {0: 'High Performers', 1: 'Low Performers'}  # Modify as needed
df['Cluster Name'] = df['Cluster'].map(cluster_names)

# Save clustered data to CSV
df.to_csv('student_marks_clustered.csv', index=False)

# Performance Metrics
inertia = kmeans.inertia_  # SSE
silhouette_avg = silhouette_score(marks_scaled, df['Cluster'])
db_index = davies_bouldin_score(marks_scaled, df['Cluster'])

print(f"Inertia (SSE): {inertia:.2f}")
print(f"Silhouette Score: {silhouette_avg:.2f}")
print(f"Davies-Bouldin Index: {db_index:.2f}")

# Display cluster-wise information
print("\nCluster Information:")
print(df.groupby('Cluster Name')[['Subject1', 'Subject2']].mean())

# Plot the clusters
plt.figure(figsize=(8, 6))
plt.scatter(marks_scaled[:, 0], marks_scaled[:, 1], c=df['Cluster'], cmap='viridis',
            marker='o', edgecolors='k', label='Students')
plt.scatter(centroids[:, 0], centroids[:, 1], s=200, c='red', marker='X', label='Centroids')
plt.xlabel('Subject 1 (Scaled)')
plt.ylabel('Subject 2 (Scaled)')
plt.title('K-Means Clustering of Student Marks')
plt.legend()
plt.show()

Output:
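The number of clusters is fixed at k = 2 above; a common way to justify the choice is the elbow method, which sweeps k and plots the inertia. A minimal sketch, reusing marks_scaled from the code above:

# Elbow method: inertia drops as k grows; the "elbow" where the drop levels off suggests a good k
inertias = []
k_values = range(1, 8)
for k_try in k_values:
    km = KMeans(n_clusters=k_try, random_state=42, n_init=10)
    km.fit(marks_scaled)
    inertias.append(km.inertia_)

plt.figure(figsize=(6, 4))
plt.plot(list(k_values), inertias, marker='o')
plt.xlabel('Number of clusters k')
plt.ylabel('Inertia (SSE)')
plt.title('Elbow Method for Choosing k')
plt.show()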
