Before looking at the coefficient of determination, it helps to understand why machine learning models are evaluated with error metrics.
In data science, a model's performance should be measured before the model is relied on for predictions. That evaluation uses specific metrics, and one of them is the coefficient of determination.
Coefficient of Determination:
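The coefficient of determination, often known as the R2 score, is used to evaluate the performance of a regression model, most commonly a linear regression model.
It is the proportion of the variation in the dependent (output) variable that can be predicted from the independent (input) variable(s). It indicates how well the model reproduces the observed results, based on the share of the total variation in those results that the model explains.
For a least-squares linear regression evaluated on the data it was fitted to, R square lies in the range [0, 1]. For arbitrary predictions it can be negative, which means the model does worse than always predicting the mean of the actual values.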
Formula
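R2 = 1 - (SSres / SStot)
where:
SSres: the residual sum of squares, i.e. the sum of the squared differences between the actual values and the model's predictions.
SStot: the total sum of squares, i.e. the sum of the squared differences between the actual values and their mean.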
Note: The higher the R square value, the better the model fits the data. A value of 1 means the predictions match the actual values exactly.
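The formula can be evaluated directly. The following is a minimal sketch in plain Python; the function name r_squared is chosen here for illustration and does not come from any library, and the sample values are the ones used in the examples later in this article.

```python
def r_squared(actual, predicted):
    """Return 1 - SSres / SStot for two equal-length sequences (illustrative helper)."""
    mean_actual = sum(actual) / len(actual)
    # SSres: squared differences between actual and predicted values
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # SStot: squared differences between actual values and their mean
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [5, 1, 7, 2, 4]
predicted = [4, 1.5, 2.8, 3.7, 4.9]
print(round(r_squared(actual, predicted), 4))  # 0.0092
```

Here SSres is 22.59 and SStot is 22.8, so the predictions explain less than 1% of the variation in the actual values.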
R2 With Numpy
Example:
Approach:
- Import numpy module using the import keyword.
- Give the list of actual values as static input and store it in a variable.
- Give the list of predicted values as static input and store it in another variable.
- Pass the actual and predicted lists as arguments to the corrcoef() function to get the correlation matrix.
- Index the matrix at [0, 1] to get the value of R, also known as the coefficient of correlation.
- Store it in another variable.
- Calculate the value of R**2 (R square) and store it in another variable. This squared correlation equals the coefficient of determination only when the predicted values come from a least-squares linear fit to the actual values; for other predictions the two can differ.
- Print the value of R square.
- Exit the program.
Below is the implementation:
# Import numpy module using the import keyword
import numpy
# Give the list of actual values as static input and store it in a variable.
actul_vals = [5, 1, 7, 2, 4]
# Give the list of predicted values as static input and store it in another
# variable.
predctd_vals = [4, 1.5, 2.8, 3.7, 4.9]
# Pass the given actual, predicted lists as the arguments to the corrcoef()
# function to get the Correlation Matrix
correltn_matrx = numpy.corrcoef(actul_vals, predctd_vals)
# Index the matrix at [0, 1] to get the value of R, also known
# as the Coefficient of Correlation.
rsltcorretn = correltn_matrx[0, 1]
# Calculate the value of R**2(R square) and store it in another variable.
Rsqure = rsltcorretn**2
# Print the value of R square
print("The result value of R square = ",Rsqure)
Output:
The result value of R square = 0.09902230080299725
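The squared correlation coefficient coincides with 1 - SSres / SStot only when the predictions come from a least-squares straight-line fit (with an intercept) to the same data. The predicted values above do not, which is why this result differs from the value r2_score() reports for the same lists further below. A minimal sketch of the special case follows, written in plain Python so the arithmetic stays visible. The x values and all variable names are made up for illustration.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]            # hypothetical input values
y = [5, 1, 7, 2, 4]            # actual values from the example above

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)   # this is SStot
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares line: y = slope * x + intercept
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
fitted = [slope * xi + intercept for xi in x]

# Squared Pearson correlation between actual and fitted values
mean_f = sum(fitted) / n
s_ff = sum((fi - mean_f) ** 2 for fi in fitted)
s_yf = sum((yi - mean_y) * (fi - mean_f) for yi, fi in zip(y, fitted))
r_squared_from_corr = (s_yf / sqrt(s_yy * s_ff)) ** 2

# 1 - SSres / SStot for the same fitted values
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared_from_formula = 1 - ss_res / s_yy

print(abs(r_squared_from_corr - r_squared_from_formula) < 1e-9)  # True
```

For predictions that come from anywhere else, such as a different model or a hand-written list, the formula with SSres and SStot is the safer way to compute R square.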
R2 With Sklearn Library
The sklearn.metrics module of the scikit-learn library provides an r2_score() function for calculating the coefficient of determination.
Example
Approach:
- Import r2_score function from sklearn.metrics module using the import keyword.
- Give the list of actual values as static input and store it in a variable.
- Give the list of predicted values as static input and store it in another variable.
- Pass the actual and predicted lists as arguments to the r2_score() function to get the value of the coefficient of determination (R square).
- Print the value of R square (coefficient of determination).
- Exit the program.
Below is the implementation:
# Import r2_score function from sklearn.metrics module using the import keyword.
from sklearn.metrics import r2_score
# Give the list of actual values as static input and store it in a variable.
actul_vals = [5, 1, 7, 2, 4]
# Give the list of predicted values as static input and store it in another
# variable.
predctd_vals = [4, 1.5, 2.8, 3.7, 4.9]
# Pass the given actual, predicted lists as the arguments to the r2_score()
# function to get the value of the coefficient of determination(R square).
Rsqure = r2_score(actul_vals, predctd_vals)
# Print the value of R square(coefficient of determination)
print("The result value of R square = ",Rsqure)
Output:
The result value of R square = 0.009210526315789336
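To put a score like this in context, it can help to look at a few reference points. The sketch below assumes scikit-learn is installed; the "worse" prediction list is invented purely for illustration.

```python
from sklearn.metrics import r2_score

actual = [5, 1, 7, 2, 4]
mean_actual = sum(actual) / len(actual)

# Predictions identical to the actual values
perfect = r2_score(actual, actual)
# Always predicting the mean of the actual values
baseline = r2_score(actual, [mean_actual] * len(actual))
# Hypothetical predictions that are worse than the mean baseline
worse = r2_score(actual, [1, 7, 1, 7, 1])

print(perfect)      # 1.0
print(baseline)     # 0.0, up to floating-point rounding
print(worse < 0)    # True
```

A score close to 0, like the one in the output above, therefore means the predictions are barely better than always predicting the mean.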