Dummy variables
Dummy variables are binary (0 or 1) numeric variables used to represent categorical data. Each one indicates whether an observation belongs to a particular category, which converts qualitative attributes into a format that machine learning models and statistical analyses can work with.
Example
Consider a dataset of employee information that includes an employment status field with the values 'full-time', 'part-time', and 'contract'. To use this field in a machine learning model or statistical analysis that requires numeric input, it can be encoded as dummy variables. With three categories, two binary variables are sufficient: 'Is_full_time' and 'Is_part_time'. A full-time employee is represented as [1, 0], a part-time employee as [0, 1], and a contract employee as [0, 0]. Here 'contract' serves as the reference category, implied when both variables are 0. A third variable would be redundant, and in a regression model with an intercept it would introduce perfect multicollinearity. This encoding lets the model or analysis use the categorical information without imposing an artificial order on the categories.
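The encoding in this example can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `encode_status` and the constants `REFERENCE` and `DUMMY_COLUMNS` are hypothetical, and 'contract' is taken as the category represented by all zeros, as in the example.

```python
# Hypothetical names for illustration; the category labels come from the example.
REFERENCE = "contract"  # category encoded as all zeros
DUMMY_COLUMNS = {
    "Is_full_time": "full-time",
    "Is_part_time": "part-time",
}


def encode_status(status):
    """Return [Is_full_time, Is_part_time] for one employment status."""
    known = set(DUMMY_COLUMNS.values()) | {REFERENCE}
    if status not in known:
        # Reject unseen labels instead of silently encoding them as the reference.
        raise ValueError(f"unknown employment status: {status!r}")
    # One 0/1 flag per dummy column, in the column order defined above.
    return [int(status == category) for category in DUMMY_COLUMNS.values()]


statuses = ["full-time", "part-time", "contract"]
encoded = [encode_status(s) for s in statuses]
print(encoded)  # [[1, 0], [0, 1], [0, 0]]
```

Rejecting unknown labels is a design choice in this sketch: without the check, a misspelled status would be encoded as [0, 0] and become indistinguishable from a contract employee.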