From c04e6d7778d5838d9e155f013c67654332ac3ed2 Mon Sep 17 00:00:00 2001
From: Lukas May <lukas.may@carealytix.com>
Date: Wed, 18 Feb 2026 16:54:10 +0900
Subject: [PATCH] refactor: Replace file-count task sizing with lines-changed
 heuristic

Anchor on ~150 lines changed as the sweet spot, based on SWE-bench Pro
data (about 107 lines across 4.1 files per task, 46% success for the
best agents). The old rules used file count as the primary proxy, which
correlates poorly with task difficulty compared to lines changed.
---
 src/agent/prompts/detail.ts | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/agent/prompts/detail.ts b/src/agent/prompts/detail.ts
index d5865d2..93835ba 100644
--- a/src/agent/prompts/detail.ts
+++ b/src/agent/prompts/detail.ts
@@ -65,10 +65,13 @@ If two tasks need to modify the same file or need the functionality another task
 
 ## Task Sizing
 
-- **1-5 files**: Good task size
-- **7+ files**: Too big — split into smaller tasks
-- **1 sentence description**: Too small — merge with related work or add more detail
-- **500+ words**: Probably overspecified — simplify or split
+Size tasks by expected lines changed, which predicts difficulty far better than file count.
+
+- **Under ~150 lines changed across 1-3 files**: Sweet spot. High confidence an agent completes this in one shot.
+- **~150-300 lines or 4-5 files**: Risky. Acceptable only if the work is highly mechanical (e.g., repetitive migrations, boilerplate) and the spec is very precise.
+- **300+ lines or more than 5 files**: Too big. Split it. Agent success drops sharply at this scale.
+- **1 sentence description**: Too vague — merge with related work or add concrete detail.
+- **Under ~20 lines**: Too small — merge with a related task to avoid per-task overhead.
 
 ## Checkpoint Tasks