# Machine Learning II

In Machine Learning I, we looked at how our models were built and how well they predicted. This part continues from there.

**Least Squares Method**

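Least squares is the method behind the standard linear regression solvers in libraries such as scikit-learn and SciPy. For example, scikit-learn's `LinearRegression` fits its coefficients with a least squares solver rather than with gradient descent; gradient-based training is offered separately, for instance in `SGDRegressor`.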

The Normal Equations method gives us an analytical (closed-form) solution: the coefficients that minimize the sum of squared errors are computed directly, without iterating. It is a standard regression technique for expressing the relationship between two interdependent quantities as an equation that fits the observed data as closely as possible.
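As a minimal sketch of the analytical solution, the snippet below solves the normal equations for the one-feature case y = b + w·x using only the standard library. The function name `fit_normal_equations` and the toy data are assumptions made for illustration; libraries solve the general multi-feature case with matrix routines instead.

```python
def fit_normal_equations(xs, ys):
    """Fit y = b + w*x by solving the 2x2 normal equations in closed form."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Normal equations for a single feature:
    #   n  * b + sx  * w = sy
    #   sx * b + sxx * w = sxy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    w = (n * sxy - sx * sy) / det
    b = (sy - w * sx) / n
    return b, w


# Toy data generated from y = 1 + 2x (illustrative, not from the article).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

b, w = fit_normal_equations(xs, ys)
print(b, w)  # b is 1.0 and w is 2.0 for this noise-free data
```

The coefficients come out in a single step; there is no learning rate and no iteration count to choose.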

**Gradient Descent**

In the Gradient Descent method, the parameters are updated repeatedly: at each step the current values are moved by a small amount, set by the learning rate, in the direction that reduces the error, and the change in the error is monitored. This gives us a numerical (iterative) optimization solution rather than an analytical one. Gradient descent did not originate with machine learning; it is an older method that was already established in statistics and optimization.
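The update rule can be sketched for the same one-feature model y = b + w·x. This is a plain batch gradient descent on the MSE, written under assumptions: the names `gradient_descent` and `mse`, the starting point (0, 0), the default learning rate, and the toy data are all illustrative choices, not something the article prescribes.

```python
def mse(xs, ys, b, w):
    """Mean squared error of the line y = b + w*x on the data."""
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Minimize the MSE by repeatedly stepping against its gradient."""
    n = len(xs)
    b, w = 0.0, 0.0          # assumed starting point
    history = []             # MSE after each update, for monitoring
    for _ in range(n_iterations):
        residuals = [b + w * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to b and w.
        grad_b = 2.0 / n * sum(residuals)
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        # Move the old values a small step in the downhill direction.
        b -= learning_rate * grad_b
        w -= learning_rate * grad_w
        history.append(mse(xs, ys, b, w))
    return b, w, history


# Toy data generated from y = 1 + 2x (illustrative).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

b, w, history = gradient_descent(xs, ys)
print(b, w)                      # close to 1 and 2
print(history[0], history[-1])   # the error shrinks as the iterations proceed
```

The `history` list is what "monitoring the change of the error" looks like in practice: it can be plotted or inspected to decide whether the run has converged.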

**Least Squares Method vs Gradient Descent**

The purpose of both methods is the same: to find the coefficients/weights (b, w).

**Least Squares Method:**

- The error (MSE) is determined as soon as the coefficients are found; there is no iteration.

- Methods such as feature engineering and data preparation can be used to reduce the error.

- Inverting the matrix becomes expensive on very large data sets, especially when there are many features.

- This is the method behind linear regression in widely used libraries (scikit-learn, SciPy, etc.).

**Gradient Descent:**

- The error (MSE) changes with each iteration.

- The developer has to intervene in the process and set the hyperparameters (number of iterations, learning rate).
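To make the hyperparameter point concrete, the sketch below runs the same batch gradient descent three times on toy data and reports only the final MSE. The helper name `run_gradient_descent`, the data, and the specific values tried are assumptions chosen for illustration; which learning rates are workable depends on the data.

```python
def run_gradient_descent(xs, ys, learning_rate, n_iterations):
    """Return the final MSE after batch gradient descent on y = b + w*x."""
    n = len(xs)
    b, w = 0.0, 0.0
    for _ in range(n_iterations):
        residuals = [b + w * x - y for x, y in zip(xs, ys)]
        b -= learning_rate * 2.0 / n * sum(residuals)
        w -= learning_rate * 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / n


# Toy data generated from y = 1 + 2x (illustrative).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

# Too few iterations: the error is still far from its minimum.
mse_too_few = run_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=5)
# Enough iterations with a workable learning rate: the error approaches zero.
mse_converged = run_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=2000)
# Learning rate too large for this data: the updates overshoot and the error grows.
mse_diverged = run_gradient_descent(xs, ys, learning_rate=0.1, n_iterations=50)

print(mse_too_few, mse_converged, mse_diverged)
```

With least squares none of these choices exist, which is the trade-off the two lists above describe.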

**Linear vs Logistic Regression**

Linear Regression and Logistic Regression are two well-known machine learning algorithms. Both are supervised learning techniques, so both use labeled data sets to make predictions. The main difference is what they are used for: Linear Regression solves regression problems (predicting a continuous value), whereas Logistic Regression solves classification problems (predicting a class). Both algorithms are described below, followed by a table of their differences.
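A small sketch of how the two outputs differ: both models start from the same linear score b + w·x, but logistic regression passes it through the sigmoid function to get a probability and then thresholds it into a class. The weights, the 0.5 threshold, and the "hours studied" inputs are hypothetical values picked for illustration, not fitted from data.

```python
import math


def linear_predict(x, b, w):
    """Linear regression output: an unbounded continuous value."""
    return b + w * x


def logistic_predict_proba(x, b, w):
    """Logistic regression output: the linear score squashed into (0, 1)."""
    z = b + w * x
    return 1.0 / (1.0 + math.exp(-z))


def logistic_predict(x, b, w, threshold=0.5):
    """Turn the probability into a class label (0 or 1)."""
    return 1 if logistic_predict_proba(x, b, w) >= threshold else 0


# Hypothetical weights and inputs, chosen only to show the shapes of the outputs.
b, w = -4.0, 1.0
hours = [1.0, 4.0, 7.0]

scores = [linear_predict(x, b, w) for x in hours]          # continuous values
probs = [logistic_predict_proba(x, b, w) for x in hours]   # values between 0 and 1
labels = [logistic_predict(x, b, w) for x in hours]        # class labels

print(scores)
print(probs)
print(labels)
```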