
Machine Learning Algorithms Cheat Sheet

1. Linear Regression

**Overview**: Linear Regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables.

**Key Hyperparameters**:

- `fit_intercept`: Whether to calculate the intercept for the model. Default is `True`.

- `normalize`: Deprecated in scikit-learn 1.0 and removed in 1.2; standardize features with `StandardScaler` in a pipeline instead (see the sketch after the example code).

**Example Code**:

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization (the normalize argument was removed in scikit-learn 1.2)
lr = LinearRegression(fit_intercept=True)

# Model fitting
lr.fit(X_train, y_train)

# Predictions
y_pred = lr.predict(X_test)

# Evaluation
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
```
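
Since `normalize` is gone from recent scikit-learn releases, a minimal sketch of the recommended replacement, standardizing features with `StandardScaler` inside a pipeline (reusing the `X_train`, `X_test`, and `y_train` from the example above):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# Standardize features, then fit the regression, as a single estimator
pipe = make_pipeline(StandardScaler(), LinearRegression())
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
```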

2. Logistic Regression

**Overview**: Logistic Regression is used for binary classification problems. It models the probability of a binary outcome using a logistic function.
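
Concretely, the logistic (sigmoid) function maps any real-valued score to a probability in (0, 1); a minimal NumPy illustration:

```python
import numpy as np

def sigmoid(z):
    # Squashes the linear score z into a probability between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))  # 0.5, the decision boundary
```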

**Key Hyperparameters**:

- `penalty`: Specifies the norm of the regularization penalty (`'l1'`, `'l2'`, `'elasticnet'`, or `None`; the string `'none'` was removed in scikit-learn 1.4).

- `C`: Inverse of regularization strength; smaller values specify stronger regularization.

- `solver`: Algorithm to use in the optimization problem (`'newton-cg'`, `'lbfgs'`, `'liblinear'`, `'sag'`, `'saga'`).

**Example Code**:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization
log_reg = LogisticRegression(penalty='l2', C=1.0, solver='lbfgs', max_iter=1000)

# Model fitting
log_reg.fit(X_train, y_train)

# Predictions
y_pred = log_reg.predict(X_test)

# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
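
Because the model estimates probabilities, `predict_proba` is often more informative than hard labels; a short follow-up using the fitted `log_reg` from above:

```python
# Column 1 holds the probability of the positive class for each test sample
proba = log_reg.predict_proba(X_test)[:, 1]
print(proba[:5])
```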

3. Decision Tree

**Overview**: Decision Tree is a non-parametric supervised learning method used for classification and regression.

**Key Hyperparameters**:

- `criterion`: The function to measure the quality of a split (`'gini'` for Gini impurity, `'entropy'` for information gain).

- `max_depth`: The maximum depth of the tree.

- `min_samples_split`: The minimum number of samples required to split an internal node.

- `min_samples_leaf`: The minimum number of samples required to be at a leaf node.

**Example Code**:

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization
dt = DecisionTreeClassifier(criterion='gini', max_depth=None, min_samples_split=2,
                            min_samples_leaf=1)

# Model fitting
dt.fit(X_train, y_train)

# Predictions
y_pred = dt.predict(X_test)

# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
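
A key advantage of decision trees is interpretability; a minimal sketch of printing the learned rules for the fitted `dt` from above with `export_text` (the feature names here are hypothetical placeholders):

```python
from sklearn.tree import export_text

# Dump the tree's decision rules as indented plain text
rules = export_text(dt, feature_names=['feature_0', 'feature_1'])
print(rules)
```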

4. Random Forest

**Overview**: Random Forest is an ensemble method that combines multiple decision trees to improve classification or regression results.

**Key Hyperparameters**:

- `n_estimators`: The number of trees in the forest.

- `criterion`: The function to measure the quality of a split (`'gini'`, `'entropy'`).

- `max_depth`: The maximum depth of the tree.

- `min_samples_split`: The minimum number of samples required to split an internal node.

- `min_samples_leaf`: The minimum number of samples required to be at a leaf node.

**Example Code**:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization
rf = RandomForestClassifier(n_estimators=100, criterion='gini', max_depth=None,
                            min_samples_split=2, min_samples_leaf=1)

# Model fitting
rf.fit(X_train, y_train)

# Predictions
y_pred = rf.predict(X_test)

# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
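
A fitted forest also exposes impurity-based feature importances, handy for a quick sanity check; a short follow-up on the fitted `rf` from above:

```python
import numpy as np

# One importance score per feature; the scores sum to 1
importances = rf.feature_importances_
for idx in np.argsort(importances)[::-1]:
    print(f'feature {idx}: {importances[idx]:.3f}')
```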

5. AdaBoost

**Overview**: AdaBoost is an ensemble method that combines multiple weak classifiers to create a strong classifier.

**Key Hyperparameters**:

- `n_estimators`: The maximum number of estimators at which boosting is terminated.

- `learning_rate`: Weight applied to each classifier at each boosting iteration.

- `estimator`: The base estimator from which the boosted ensemble is built (e.g., `DecisionTreeClassifier`). Called `base_estimator` before scikit-learn 1.2; the old name was removed in 1.4.

**Example Code**:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization (base_estimator was renamed to estimator in scikit-learn 1.2)
ada = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1), n_estimators=50,
                         learning_rate=1.0)

# Model fitting
ada.fit(X_train, y_train)

# Predictions
y_pred = ada.predict(X_test)

# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
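
Boosting builds the ensemble one weak learner at a time, which `staged_predict` lets you observe; a minimal sketch tracking test accuracy per boosting round on the fitted `ada` from above:

```python
from sklearn.metrics import accuracy_score

# Accuracy after each boosting iteration
for i, y_stage in enumerate(ada.staged_predict(X_test), start=1):
    print(f'round {i}: {accuracy_score(y_test, y_stage):.3f}')
```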

6. K-Nearest Neighbors (KNN)

**Overview**: KNN is a non-parametric method used for classification and regression by finding the k most similar instances in the training data.

**Key Hyperparameters**:

- `n_neighbors`: Number of neighbors to use.

- `weights`: Weight function used in prediction (`'uniform'`, `'distance'`).

- `algorithm`: Algorithm used to compute the nearest neighbors (`'auto'`, `'ball_tree'`, `'kd_tree'`, `'brute'`).

**Example Code**:

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example data (replace with your own feature matrix X and target vector y)
X = ...
y = ...

# Split data into train and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model initialization
knn = KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto')

# Model fitting
knn.fit(X_train, y_train)

# Predictions
y_pred = knn.predict(X_test)

# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
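
Because KNN relies on distances, features on larger scales can dominate the neighbor search; a minimal sketch of the common remedy, scaling inside a pipeline (reusing the split data from above):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Standardize features so each contributes comparably to the distance metric
knn_pipe = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn_pipe.fit(X_train, y_train)
print(knn_pipe.score(X_test, y_test))
```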
