Normalization

Normalization is the process of adjusting the scale of numerical variables in a dataset so they fall within a common range. This makes comparisons more accurate and can improve the performance of machine learning models.

Example

Imagine a dataset with information about houses, including square footage, which ranges from 500 to 5,000, and the number of rooms, which ranges from 1 to 10. Because these variables are on very different scales, a machine learning model, especially one based on distances or gradient descent, may treat the larger-magnitude variable as more important simply because of its range, placing too much weight on square footage and producing biased or inaccurate predictions. By normalizing the variables, for example by scaling both to a range between 0 and 1, the model can assess each variable more fairly and make better predictions.
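The 0-to-1 scaling described above is commonly called min-max normalization. A minimal sketch in Python, using small hypothetical samples for the two house variables (the data values are illustrative, not from the text):

```python
# Min-max normalization: scale each value to [0, 1] via
# (x - min) / (max - min). Sample data is hypothetical.
sqft = [500, 1200, 3000, 5000]   # square footage, range 500 to 5,000
rooms = [1, 3, 6, 10]            # number of rooms, range 1 to 10

def min_max_normalize(values):
    """Rescale a list of numbers so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

sqft_norm = min_max_normalize(sqft)    # [0.0, ..., 1.0]
rooms_norm = min_max_normalize(rooms)  # [0.0, ..., 1.0]

# Both variables now share the same 0-to-1 range, so neither
# dominates a model purely because of its original scale.
print(sqft_norm)
print(rooms_norm)
```

After this transformation, a 5,000-square-foot house and a 10-room house both map to 1.0 on their respective scales, so the model sees the two features on equal footing.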