What Is Underfitting

Learn about underfitting in data modeling, a common problem where a model is too simple to capture the underlying patterns in the data, leading to poor performance.


Definition of Underfitting

Underfitting occurs when a statistical or machine learning model is too simple to capture the underlying trend or patterns in the training data. Because the model fails to learn even the training data, it performs poorly on both the training set and new, unseen data. Underfitting is characterized by high bias and low variance: the model makes strong, overly simple assumptions and its predictions change little across different training samples.

Causes and Characteristics

Common causes of underfitting include using a model that is not complex enough for the data (e.g., fitting a linear model to non-linear data), insufficient training duration, or insufficient features (input variables) for the model to learn from. An underfit model typically shows low accuracy or high error rates on the training set, indicating it hasn't learned the basic relationships.

Example of Underfitting

Imagine trying to model the relationship between a person's age and their income using a simple straight line (linear regression) when the actual relationship is parabolic (income increases, then plateaus, then slightly decreases after a certain age). The straight line would be too simple, failing to capture the initial rise and later plateau/decline, thus underfitting the data.
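The age/income example above can be sketched numerically. The snippet below uses hypothetical synthetic data with a parabolic relationship and fits a straight line to it with NumPy; the high training error illustrates underfitting (the exact coefficients and noise level are illustrative assumptions, not real income data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: income rises with age, then plateaus and
# declines -- a parabolic relationship, as described in the example above.
age = rng.uniform(20, 70, 200)
income = -0.05 * (age - 55) ** 2 + 80 + rng.normal(0, 2, 200)

# Fit a straight line (degree-1 polynomial) -- too simple for this curve.
linear_coeffs = np.polyfit(age, income, deg=1)
linear_pred = np.polyval(linear_coeffs, age)

# The training error stays well above the noise level because the line
# cannot bend to follow the rise and later decline in the data.
linear_mse = np.mean((income - linear_pred) ** 2)
print(f"Linear model training MSE: {linear_mse:.2f}")
```

Note that the error is measured on the training data itself: a large training error despite fitting on that very data is the hallmark of underfitting.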

Addressing Underfitting

To combat underfitting, one can increase model complexity (e.g., using polynomial regression instead of linear, or a more complex neural network architecture). Adding more relevant features, reducing regularization, or training the model for a longer period (with caution against overfitting) are also effective strategies.
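One of these remedies, increasing model complexity, can be sketched by comparing polynomial fits of different degrees on hypothetical parabolic data (the data-generating function is an illustrative assumption): the degree-2 model matches the true shape, so its training error drops to near the noise floor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parabolic data, as in the age/income example.
x = rng.uniform(20, 70, 200)
y = -0.05 * (x - 55) ** 2 + 80 + rng.normal(0, 2, 200)

def train_mse(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = np.polyfit(x, y, deg=degree)
    return np.mean((y - np.polyval(coeffs, x)) ** 2)

print(f"degree 1 MSE: {train_mse(1):.2f}")  # underfit: high error
print(f"degree 2 MSE: {train_mse(2):.2f}")  # matches the curve: low error
```

The caution in the text still applies in the other direction: pushing the degree much higher would eventually start fitting the noise, i.e. overfitting.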

Frequently Asked Questions

How is underfitting different from overfitting?
What is 'high bias' in the context of underfitting?
Can a model suffer from both underfitting and overfitting?
Why is underfitting considered a problem?