by Veronica Scerra
In a previous post, I introduced Partial Dependence Plots (PDPs) and discussed their value as a global interpretability tool. But as with most things in modeling, one tool rarely fits all purposes. If PDPs offer a bird’s-eye view of a model’s behavior from one variable’s perspective, ICE plots give us the chance to swoop down and examine the terrain at the level of individual data points. These tools for granular inspection are the focus of today’s deep dive.
| ICE | |
|---|---|
| What: | Shows how the model's prediction changes for a single observation as one feature varies |
| Use When: | You want local interpretability |
| Assumptions: | Few assumptions; can handle correlated features gracefully |
| Alternatives: | PDP, SHAP, LIME, ALE |
Time for a new analogy. Let’s say you have several dogs, and you, as a data enthusiast, collect all manner of data points about your dogs’ daily lives. You track the amount and type of food consumed, length and number of walks, hours of sleep and play, and treats given, along with breed and biometric data, and use those variables to predict health and vitality. With PDPs, you could track how changing one specific variable (say, length of walks) influences health overall. But what if your dogs are a bit different, and what affects one might not affect another in the same way? I have a medium-build, athletic, energetic dog and a small, lazy cuddle-beast of a dog - changing the length of their walks will affect each of them differently. This is where ICE comes in.
ICE plots don’t average effects across the dataset; instead, they trace the effect of changing one variable for each observation individually. While PDPs say “on average, longer walks are better for health,” ICE says “let me show you how walk length affects each dog, one at a time.”
This level of granularity allows us to see whether effects are consistent across populations, or if subgroups react differently - which can be critically important when your model’s decisions affect real people (or animals) in diverse ways.
The ICE function for a feature \( x_j \) and instance \( \mathbf{x}^{(i)} \) is:
\[ \hat{f}^{(i)}(x_j) = \hat{f}\big(x_j, \mathbf{x}_{-j}^{(i)}\big) \]
In practice: take one observation, sweep the feature of interest across a grid of values while holding all of its other features fixed at their observed values, and record the model's prediction at each grid point. Repeat for every observation - one line each.
Simple 🙂
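To make the mechanics concrete, here's a minimal sketch of that loop using synthetic data and a scikit-learn random forest (the data, model choice, and variable names are illustrative assumptions, not part of the original analysis):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor

# Illustrative synthetic data: 50 observations, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X[:, 0] * rng.uniform(0.5, 1.5, size=50) + rng.normal(scale=0.1, size=50)

model = RandomForestRegressor(random_state=0).fit(X, y)

feature_idx = 0  # the feature x_j to vary
grid = np.linspace(X[:, feature_idx].min(), X[:, feature_idx].max(), 30)

# One curve per observation: sweep x_j over the grid while holding that
# observation's other features x_{-j} fixed at their observed values.
for x_i in X:
    X_sweep = np.tile(x_i, (len(grid), 1))
    X_sweep[:, feature_idx] = grid
    plt.plot(grid, model.predict(X_sweep), alpha=0.3, color="steelblue")

plt.xlabel("feature value $x_j$")
plt.ylabel("model prediction")
plt.title("ICE curves (one line per observation)")
plt.show()
```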
ICE analysis is great for:
- Surfacing interactions or subgroup differences - effects that might get washed out by PDP averaging
- Auditing how your model behaves for individual cases
- Validating fairness or consistency across subpopulations
- Working with black-box models, when you need more than just global interpretability
PDPs and ICE plots are useful together - you can overlay the PDP on top of your ICE plot for a dual view: one line per individual, and the average across all lines (PDP) to provide context.
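If you're working in scikit-learn (assuming version ≥ 1.0), `PartialDependenceDisplay` can draw this dual view directly; a quick sketch, reusing the `model`, `X`, and `feature_idx` from the example above:

```python
from sklearn.inspection import PartialDependenceDisplay

# kind="both" overlays the PDP (average line) on the individual ICE lines.
PartialDependenceDisplay.from_estimator(
    model, X, features=[feature_idx], kind="both"
)
plt.show()
```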
ICE plots can become cluttered with many data points or noisy data. The same diversity that makes them powerful can also make them harder to interpret without smoothing or subgrouping. You also have to be careful when assuming that varying one feature while holding others fixed is actually meaningful. Too much correlation between features can still make interpretability challenging.
Too much noise and variance in your ICE plots can obscure insights - just imagine trying to interpret 50 observations, each with its own unique trajectory... As with many methods, it's best to start with some principled reason for wanting to compare observations in this way; otherwise you may just end up with a tangle of lines that overwhelms without adding much interpretability.
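When the lines do pile up, two decluttering options worth knowing in scikit-learn's display are subsampling the ICE lines and centering each curve at the start of the grid (so-called c-ICE). A sketch, again reusing the objects from above (the `centered` parameter assumes scikit-learn ≥ 1.1):

```python
# Subsample the ICE lines and center each curve at the grid's start,
# so every line begins at zero and differences in shape stand out.
PartialDependenceDisplay.from_estimator(
    model, X, features=[feature_idx],
    kind="individual",
    subsample=25,     # draw at most 25 ICE lines
    centered=True,    # c-ICE: subtract each curve's first value
    random_state=0,
)
plt.show()
```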
ICE plots are an essential complement to PDPs - one gives you the forest, the other shows you the trees. In fields where trust, fairness, or user-level behavior is paramount, ICE plots can reveal otherwise obscure model dynamics and help you understand when, and for whom, your model is working (or not working). Combined with domain expertise and thoughtful storytelling, they're incredibly useful. Remember! Modeling is not just about prediction, it's also about understanding.