Lush vs Bash Benchmark Report
Model: claude-sonnet-4-20250514 · Latest run: 20260401T183152Z · Tasks: 18
Summary
| Task | Category | Bash | Bash Turns | Lush | Lush Turns |
| fizzbuzz | algorithm | PASS | 1 | PASS | 1 |
| reverse_string | algorithm | PASS | 1 | PASS | 1 |
| two_sum | algorithm | PASS | 1 | PASS | 1 |
| env_config | environment | FAIL | 4 | PASS | 2 |
| env_path_builder | environment | PASS | 0 | PASS | 1 |
| path_normalizer | environment | PASS | 0 | PASS | 1 |
| file_organizer | filesystem | FAIL | 4 | PASS | 1 |
| multi_file_search | filesystem | PASS | 1 | PASS | 2 |
| todo_manager | filesystem | PASS | 0 | PASS | 1 |
| csv_transform | pipeline | PASS | 0 | PASS | 1 |
| currency_converter | pipeline | PASS | 0 | PASS | 1 |
| locale_weather_url | pipeline | PASS | 0 | PASS | 1 |
| log_parser | pipeline | PASS | 0 | PASS | 1 |
| network_info_parser | pipeline | PASS | 0 | PASS | 1 |
| pipeline_transform | pipeline | PASS | 1 | PASS | 1 |
| pipeline_word_freq | pipeline | PASS | 0 | PASS | 1 |
| url_normalizer | pipeline | PASS | 0 | PASS | 1 |
| process_exit_codes | process | PASS | 4 | PASS | 1 |
| Total | | 16/18 | | 18/18 | |
Per-Category Summary
| Category | Bash Pass | Lush Pass | Bash Avg Turns | Lush Avg Turns | Bash Avg Score | Lush Avg Score |
| algorithm | 3/3 | 3/3 | 1.0 | 1.0 | 3.5 | 3.9 |
| environment | 2/3 | 3/3 | 4.0 | 1.3 | 2.8 | 3.9 |
| filesystem | 2/3 | 3/3 | 2.5 | 1.3 | 3.1 | 3.8 |
| pipeline | 8/8 | 8/8 | 1.0 | 1.0 | 3.0 | 3.9 |
| process | 1/1 | 1/1 | 4.0 | 1.0 | 3.2 | 4.0 |
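The pass counts and Avg Turns figures above can be recomputed from the Summary table. The sketch below is not the benchmark's own code. It assumes, as an inference from the numbers rather than anything the report states, that a turn count of 0 means no agent turns were recorded and that such tasks are left out of the average. The names `ROWS` and `summarize` are hypothetical.

```python
from collections import defaultdict

# (task, category, bash_pass, bash_turns, lush_pass, lush_turns),
# copied from the Summary table.
ROWS = [
    ("fizzbuzz", "algorithm", True, 1, True, 1),
    ("reverse_string", "algorithm", True, 1, True, 1),
    ("two_sum", "algorithm", True, 1, True, 1),
    ("env_config", "environment", False, 4, True, 2),
    ("env_path_builder", "environment", True, 0, True, 1),
    ("path_normalizer", "environment", True, 0, True, 1),
    ("file_organizer", "filesystem", False, 4, True, 1),
    ("multi_file_search", "filesystem", True, 1, True, 2),
    ("todo_manager", "filesystem", True, 0, True, 1),
    ("csv_transform", "pipeline", True, 0, True, 1),
    ("currency_converter", "pipeline", True, 0, True, 1),
    ("locale_weather_url", "pipeline", True, 0, True, 1),
    ("log_parser", "pipeline", True, 0, True, 1),
    ("network_info_parser", "pipeline", True, 0, True, 1),
    ("pipeline_transform", "pipeline", True, 1, True, 1),
    ("pipeline_word_freq", "pipeline", True, 0, True, 1),
    ("url_normalizer", "pipeline", True, 0, True, 1),
    ("process_exit_codes", "process", True, 4, True, 1),
]


def summarize(rows):
    """Group rows by category; return pass counts and mean turns per shell."""
    buckets = defaultdict(
        lambda: {"n": 0, "bash_pass": 0, "lush_pass": 0, "bash_turns": [], "lush_turns": []}
    )
    for _task, cat, b_ok, b_turns, l_ok, l_turns in rows:
        b = buckets[cat]
        b["n"] += 1
        b["bash_pass"] += int(b_ok)
        b["lush_pass"] += int(l_ok)
        # Assumption: 0 turns means "not recorded", so it is excluded from the mean.
        if b_turns > 0:
            b["bash_turns"].append(b_turns)
        if l_turns > 0:
            b["lush_turns"].append(l_turns)

    summary = {}
    for cat, b in buckets.items():
        summary[cat] = {
            "bash_pass": f"{b['bash_pass']}/{b['n']}",
            "lush_pass": f"{b['lush_pass']}/{b['n']}",
            "bash_avg_turns": round(sum(b["bash_turns"]) / len(b["bash_turns"]), 1),
            "lush_avg_turns": round(sum(b["lush_turns"]) / len(b["lush_turns"]), 1),
        }
    return summary


if __name__ == "__main__":
    for category, stats in summarize(ROWS).items():
        print(category, stats)
```

Under this assumption the output matches the table for all five categories. Averaging over every task, zeros included, would give 1.3 rather than 4.0 for Bash in the environment category.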
Questionnaire Scores
[Figures: Questionnaire Scores by Category; Agent Turns (Solve Mode); Score Difference Heatmap (Lush - Bash)]
Per-Category Breakdown
[Figures: one panel per category: algorithm, environment, filesystem, pipeline, process]
Per-Task Detail
fizzbuzz [algorithm/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 3 | 5 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 4 | 5 | +1 |
| Built-in operations | 4 | 5 | +1 |
| String operations | 4 | 4 | 0 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 5 | 4 | -1 |
| Data structures | 4 | 4 | 0 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 3 | 5 | +2 |
| Fitness for task | 4 | 4 | 0 |
reverse_string [algorithm/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 4 | 4 | 0 |
| Signal-to-noise | 5 | 4 | -1 |
| Familiar conventions | 2 | 5 | +3 |
| Built-in operations | 5 | 5 | 0 |
| String operations | 4 | 5 | +1 |
| Composition | 5 | 4 | -1 |
| I/O ergonomics | 5 | 4 | -1 |
| Data structures | 4 | 4 | 0 |
| Error model | 3 | 3 | 0 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 4 | 5 | +1 |
| Fitness for task | 4 | 4 | 0 |
two_sum [algorithm/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 3 | +1 |
| Familiar conventions | 3 | 4 | +1 |
| Built-in operations | 4 | 4 | 0 |
| String operations | 3 | 4 | +1 |
| Composition | 4 | 3 | -1 |
| I/O ergonomics | 4 | 4 | 0 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 2 | 0 |
| Learnability | 3 | 4 | +1 |
| Fitness for task | 2 | 4 | +2 |
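The Diff column and the algorithm row of the Per-Category Summary can be cross-checked against the three algorithm tables above. This is a minimal sketch, assuming that Diff is the Lush score minus the Bash score and that the category average is the unweighted mean of all twelve metrics across the category's tasks. The report does not state either formula, and the names `SCORES`, `diffs`, and `category_average` are hypothetical.

```python
# Per-task scores as (bash, lush) lists in table row order
# (Syntax clarity ... Fitness for task), copied from the three tables above.
SCORES = {
    "fizzbuzz": (
        [3, 3, 4, 4, 4, 5, 5, 4, 2, 2, 3, 4],
        [5, 4, 5, 5, 4, 3, 4, 4, 3, 3, 5, 4],
    ),
    "reverse_string": (
        [4, 5, 2, 5, 4, 5, 5, 4, 3, 3, 4, 4],
        [4, 4, 5, 5, 5, 4, 4, 4, 3, 3, 5, 4],
    ),
    "two_sum": (
        [2, 2, 3, 4, 3, 4, 4, 3, 2, 2, 3, 2],
        [4, 3, 4, 4, 4, 3, 4, 4, 3, 2, 4, 4],
    ),
}


def diffs(bash, lush):
    """Per-metric difference, assumed to be Lush minus Bash."""
    return [l - b for b, l in zip(bash, lush)]


def category_average(scores):
    """Mean score over every metric of every task, rounded to one decimal."""
    bash_all = [s for bash, _lush in scores.values() for s in bash]
    lush_all = [s for _bash, lush in scores.values() for s in lush]
    return (
        round(sum(bash_all) / len(bash_all), 1),
        round(sum(lush_all) / len(lush_all), 1),
    )


if __name__ == "__main__":
    print(diffs(*SCORES["fizzbuzz"]))
    print(category_average(SCORES))
```

With these inputs the averages come out to 3.5 for Bash and 3.9 for Lush, which agrees with the algorithm row of the Per-Category Summary.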
env_config [environment/solve]
bash=FAIL
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 3 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 3 | 4 | +1 |
| String operations | 3 | 4 | +1 |
| Composition | 4 | 5 | +1 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 3 | 5 | +2 |
env_path_builder [environment/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 4 | +2 |
| Familiar conventions | 3 | 4 | +1 |
| Built-in operations | 3 | 3 | 0 |
| String operations | 3 | 3 | 0 |
| Composition | 4 | 3 | -1 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 4 | 4 | 0 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 4 | 0 |
path_normalizer [environment/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 4 | +2 |
| Familiar conventions | 2 | 5 | +3 |
| Built-in operations | 2 | 4 | +2 |
| String operations | 3 | 5 | +2 |
| Composition | 4 | 3 | -1 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 3 | 4 | +1 |
file_organizer [filesystem/solve]
bash=FAIL
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 2 | 5 | +3 |
| Built-in operations | 4 | 5 | +1 |
| String operations | 3 | 4 | +1 |
| Composition | 4 | 3 | -1 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 5 | +2 |
| Error model | 2 | 4 | +2 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 5 | +1 |
multi_file_search [filesystem/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 4 | 4 | 0 |
| Signal-to-noise | 4 | 3 | -1 |
| Familiar conventions | 3 | 4 | +1 |
| Built-in operations | 5 | 2 | -3 |
| String operations | 4 | 4 | 0 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 5 | 4 | -1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 5 | 3 | -2 |
todo_manager [filesystem/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 4 | +2 |
| Familiar conventions | 3 | 4 | +1 |
| Built-in operations | 2 | 3 | +1 |
| String operations | 2 | 4 | +2 |
| Composition | 4 | 3 | -1 |
| I/O ergonomics | 4 | 4 | 0 |
| Data structures | 2 | 4 | +2 |
| Error model | 2 | 2 | 0 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 3 | 4 | +1 |
| Fitness for task | 3 | 4 | +1 |
csv_transform [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 4 | 4 | 0 |
| String operations | 4 | 4 | 0 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 4 | 0 |
currency_converter [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 3 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 1 | 2 | +1 |
| String operations | 3 | 4 | +1 |
| Composition | 4 | 4 | 0 |
| I/O ergonomics | 4 | 4 | 0 |
| Data structures | 2 | 4 | +2 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 2 | 4 | +2 |
locale_weather_url [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 2 | 5 | +3 |
| String operations | 3 | 5 | +2 |
| Composition | 4 | 4 | 0 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 4 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 5 | +1 |
log_parser [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 4 | 4 | 0 |
| String operations | 4 | 5 | +1 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 4 | 0 |
network_info_parser [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 4 | +2 |
| Familiar conventions | 3 | 5 | +2 |
| Built-in operations | 2 | 5 | +3 |
| String operations | 2 | 5 | +3 |
| Composition | 4 | 4 | 0 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 4 | 5 | +1 |
pipeline_transform [pipeline/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 3 | 4 | +1 |
| Signal-to-noise | 5 | 3 | -2 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 5 | 3 | -2 |
| String operations | 4 | 4 | 0 |
| Composition | 5 | 2 | -3 |
| I/O ergonomics | 5 | 4 | -1 |
| Data structures | 4 | 4 | 0 |
| Error model | 2 | 3 | +1 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 3 | 4 | +1 |
| Fitness for task | 5 | 3 | -2 |
pipeline_word_freq [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 4 | 4 | 0 |
| Familiar conventions | 2 | 5 | +3 |
| Built-in operations | 5 | 4 | -1 |
| String operations | 4 | 5 | +1 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 5 | 4 | -1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 3 | 0 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 5 | 4 | -1 |
url_normalizer [pipeline/convert]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 2 | 4 | +2 |
| Familiar conventions | 3 | 4 | +1 |
| Built-in operations | 2 | 4 | +2 |
| String operations | 3 | 5 | +2 |
| Composition | 4 | 4 | 0 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 2 | 3 | +1 |
| Edge case support | 3 | 4 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 3 | 4 | +1 |
process_exit_codes [process/solve]
bash=PASS
lush=PASS
| Metric | Bash | Lush | Diff |
| Syntax clarity | 2 | 4 | +2 |
| Signal-to-noise | 3 | 4 | +1 |
| Familiar conventions | 2 | 4 | +2 |
| Built-in operations | 4 | 5 | +1 |
| String operations | 4 | 4 | 0 |
| Composition | 5 | 3 | -2 |
| I/O ergonomics | 4 | 5 | +1 |
| Data structures | 3 | 4 | +1 |
| Error model | 3 | 3 | 0 |
| Edge case support | 2 | 3 | +1 |
| Learnability | 2 | 4 | +2 |
| Fitness for task | 5 | 5 | 0 |