lush_grading/tasks/category_a/fizzbuzz.toml
Cormac Shannon be8d657b24 Initial commit: Lush vs Bash AI benchmarking framework
Benchmark harness that uses LLM agents to solve shell scripting tasks
in both Bash and Lush, then compares correctness and code quality.

- CLI with run, run-all, list-tasks, report, and export commands
- Agent loop with retry support via Anthropic Claude provider
- Test harness executing solutions in sandboxed subprocesses
- LLM-driven questionnaire for subjective code quality evaluation
- HTML report export with charts (matplotlib)
- 8 Category A tasks (write-from-scratch in both languages)
- 4 Category B tasks (verify provided Bash, convert to Lush)
- Lush language reference for agent context
2026-03-29 17:56:30 +01:00


name = "fizzbuzz"
category = "a"
description = """
Read a single integer N from stdin. Print numbers from 1 to N, one per line.
For multiples of 3, print "Fizz" instead of the number.
For multiples of 5, print "Buzz" instead of the number.
For multiples of both 3 and 5, print "FizzBuzz" instead of the number (this rule takes precedence over the two above).
"""
[[test_cases]]
stdin = "15"
expected_stdout = """1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz"""
[[test_cases]]
stdin = "5"
expected_stdout = """1
2
Fizz
4
Buzz"""
[[test_cases]]
stdin = "1"
expected_stdout = "1"
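
The task description and test cases above can be sanity-checked with a small reference implementation. The following is a minimal Python sketch, not the repository's actual harness: the names `fizzbuzz`, `CASES`, and `check_all` are hypothetical, and it assumes outputs are compared after stripping trailing whitespace, since the `expected_stdout` values carry no trailing newline.

```python
def fizzbuzz(n: int) -> str:
    """Return the expected stdout for input n, one entry per line."""
    lines = []
    for i in range(1, n + 1):
        if i % 15 == 0:        # multiple of both 3 and 5: checked first
            lines.append("FizzBuzz")
        elif i % 3 == 0:
            lines.append("Fizz")
        elif i % 5 == 0:
            lines.append("Buzz")
        else:
            lines.append(str(i))
    return "\n".join(lines)


# (stdin, expected_stdout) pairs copied from the three test cases above.
CASES = [
    ("15", "1\n2\nFizz\n4\nBuzz\nFizz\n7\n8\nFizz\nBuzz\n11\nFizz\n13\n14\nFizzBuzz"),
    ("5", "1\n2\nFizz\n4\nBuzz"),
    ("1", "1"),
]


def check_all() -> bool:
    """Compare the reference output with each expected_stdout.

    Assumption: trailing whitespace is ignored on both sides.
    """
    return all(
        fizzbuzz(int(stdin.strip())).rstrip() == expected.rstrip()
        for stdin, expected in CASES
    )


if __name__ == "__main__":
    print(check_all())
```

Testing divisibility by 15 first is one way to give the "both 3 and 5" rule precedence. A Bash or Lush solution could equally build the word by concatenating "Fizz" and "Buzz" and fall back to the number when the result is empty.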