Noise Triggers Tool Dropping

Models systematically drop enhancement features when processing semantic noise

90% of clean requests use 4 tools
67% of poem-noise requests use 4 tools
14/27 4-tool requests dropped to 3

🔍 The Pattern

When models encounter semantic noise, they maintain the core workflow (Search → Check → Reserve) but systematically drop "nice-to-have" enhancement features.

The drop is statistically significant (p = 0.028), which suggests that models prioritize the essential steps of a task when the input contains noise.
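One way to measure this pattern is to compare the tools a model actually called against the tools the request needed. The sketch below is illustrative only: the tool identifiers (`search`, `check`, `reserve`, and the fourth tool `notify`) and the function names `classify_trace` and `drop_rate` are hypothetical, and the example traces are made up for demonstration, not taken from the experiment.

```python
# Assumption: the three core steps named in the text; identifiers are hypothetical.
CORE_TOOLS = ("search", "check", "reserve")


def classify_trace(tools_called, expected_tools):
    """Summarize which of the expected tools one response actually called."""
    called = set(tools_called)
    core = [t for t in expected_tools if t in CORE_TOOLS]
    extras = [t for t in expected_tools if t not in CORE_TOOLS]
    return {
        # True when every expected core step was called.
        "core_preserved": all(t in called for t in core),
        # Expected enhancement tools that were never called.
        "dropped_extras": [t for t in extras if t not in called],
        # Number of distinct expected tools that were called.
        "tools_used": len(called & set(expected_tools)),
    }


def drop_rate(traces, expected_tools):
    """Fraction of traces that keep the core workflow but drop an enhancement."""
    if not traces:
        return 0.0
    hits = 0
    for trace in traces:
        result = classify_trace(trace, expected_tools)
        if result["core_preserved"] and result["dropped_extras"]:
            hits += 1
    return hits / len(traces)


# Illustrative input only: "notify" stands in for an unnamed enhancement tool.
expected = ["search", "check", "reserve", "notify"]
traces = [
    ["search", "check", "reserve"],            # enhancement dropped
    ["search", "check", "reserve", "notify"],  # all four tools used
]
print(classify_trace(traces[0], expected))
print(drop_rate(traces, expected))
```

Counting only traces that keep all core steps separates "shed an enhancement" from "failed the task outright", which is the distinction the pattern depends on.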

🎯 Why This Matters

This suggests that LLMs have an implicit task hierarchy. When noise makes a request harder to process, they shed non-essential features while preserving core functionality, much as humans do under stress.