I want to reuse the logic of linear regression on binary classification problems. However, y is a probability confined to [0, 1], so I first need to transform it from [0, 1] to (-inf, +inf).
To achieve this, logistic regression converts the probability value to log odds:
$$ f(p)=\log\left(\frac{p}{1-p}\right) $$
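A minimal sketch of this transform in Python (the function name `log_odds` is my own):

```python
import math

def log_odds(p):
    """Map a probability p in (0, 1) to the real line via log odds."""
    return math.log(p / (1 - p))

# Probabilities near 0 map to large negative values, near 1 to large positive ones.
print(log_odds(0.5))   # 0.0
print(log_odds(0.9))   # ≈ 2.197
print(log_odds(0.1))   # ≈ -2.197
```

Note that p = 0 and p = 1 themselves are not in the domain; they correspond to the limits -inf and +inf.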
Side note: there are two crucial properties of log odds: they map probabilities in (0, 1) onto the whole real line (-inf, +inf), and they are symmetric about p = 0.5, i.e. f(1 - p) = -f(p).
To briefly recall linear regression: the coefficients (the betas) are slopes which indicate the change of the dependent variable y per unit change of the independent variable x.
In logistic regression, the coefficients are interpreted the same way as in linear regression, except that the y axis now shows log odds values (the graph on the right). In this case, a coefficient indicates the change of log(odds) per unit change of x.
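To make the interpretation concrete, here is a small sketch with hypothetical coefficients b0 and b1 (chosen for illustration, not fitted to any data): a one-unit increase in x always shifts the log odds by exactly b1, while the corresponding change in probability depends on where you start.

```python
import math

# Hypothetical coefficients of a fitted model: log(odds) = b0 + b1 * x
b0, b1 = -1.5, 0.8

def log_odds_at(x):
    return b0 + b1 * x

def prob_at(x):
    # Invert the log-odds transform (the logistic / sigmoid function).
    return 1 / (1 + math.exp(-log_odds_at(x)))

# One unit increase in x changes the log odds by exactly b1 ...
print(log_odds_at(3.0) - log_odds_at(2.0))  # ≈ 0.8
# ... but the change in probability is not constant.
print(prob_at(3.0) - prob_at(2.0))
```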


In logistic regression, we try to find the best-fit line that maximizes the log likelihood. The formula is shown below:
$$ LL=\sum_{i}\left[y_i\ln(\hat{y}_i)+(1-y_i)\ln(1-\hat{y}_i)\right] $$
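This formula translates directly into code. The sketch below (function name and example predictions are my own) shows that predictions closer to the true labels give a higher, i.e. less negative, log likelihood:

```python
import math

def log_likelihood(y_true, y_pred):
    """Sum of y*ln(p) + (1-y)*ln(1-p) over all samples."""
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p)
               for y, p in zip(y_true, y_pred))

y_true    = [1,   0,   1,   1]
confident = [0.9, 0.1, 0.8, 0.9]  # predictions close to the true labels
uncertain = [0.6, 0.4, 0.5, 0.6]  # predictions hovering near 0.5

print(log_likelihood(y_true, confident))  # less negative (better fit)
print(log_likelihood(y_true, uncertain))  # more negative (worse fit)
```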
Say we tried plain linear regression on the transformed data. Since the training labels are exactly 0 or 1, their log odds are either -inf or +inf, so we can't compute the MSE. Therefore, what we do instead is:
Loop forever:
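The text breaks off at the loop, so the update rule is not specified; one common way to carry out such an iterative maximization is gradient ascent on the log likelihood. A minimal sketch on toy 1-D data (the data, learning rate, and stopping rule are all assumptions of mine):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Toy 1-D training data: labels are 0 or 1, so log odds can't be fit by least squares.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0,   0,   0,   1,   1,   1]

b0, b1 = 0.0, 0.0  # candidate line in log-odds space
lr = 0.1           # step size (assumed, not from the original)

for _ in range(5000):  # in practice, stop when the log likelihood stops improving
    # Gradient of the log likelihood with respect to b0 and b1.
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
    b0 += lr * g0
    b1 += lr * g1

# The fitted line separates the two classes in probability space.
print(sigmoid(b0 + b1 * 1.0))  # well below 0.5
print(sigmoid(b0 + b1 * 3.5))  # well above 0.5
```

Each pass re-projects the data onto the current candidate line, measures how well it fits via the likelihood, and nudges the line toward a better fit, which matches the iterative procedure the text is describing.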