Initial commit: Lush vs Bash AI benchmarking framework
Benchmark harness that uses LLM agents to solve shell scripting tasks in both Bash and Lush, then compares correctness and code quality.

- CLI with run, run-all, list-tasks, report, and export commands
- Agent loop with retry support via Anthropic Claude provider
- Test harness executing solutions in sandboxed subprocesses
- LLM-driven questionnaire for subjective code quality evaluation
- HTML report export with charts (matplotlib)
- 8 Category A tasks (write-from-scratch in both languages)
- 4 Category B tasks (verify provided Bash, convert to Lush)
- Lush language reference for agent context
tasks/category_a/env_config.toml
Normal file
@@ -0,0 +1,32 @@
name = "env_config"
category = "a"
description = """
Read a config format from stdin where each line is "KEY=VALUE".
For each line, set an environment variable with that key and value.
After processing all lines, run the command `env` and print only the variables
that were set from the input, sorted alphabetically by key, in "KEY=VALUE" format.

You must actually set these as environment variables and retrieve them back
(not just echo the input).
"""

[[test_cases]]
stdin = """APP_NAME=myapp
APP_PORT=8080
APP_DEBUG=true"""
expected_stdout = """APP_DEBUG=true
APP_NAME=myapp
APP_PORT=8080"""
env = {}

[[test_cases]]
stdin = """DB_HOST=localhost
DB_PORT=5432"""
expected_stdout = """DB_HOST=localhost
DB_PORT=5432"""
env = {}

[[test_cases]]
stdin = "SINGLE_VAR=hello"
expected_stdout = "SINGLE_VAR=hello"
env = {}
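As a sanity check on the task definition above, here is one Bash solution the agents could plausibly produce (a sketch only, not the benchmark's reference answer; the `solve` function name is mine):

```shell
#!/usr/bin/env bash
set -euo pipefail

# solve: read KEY=VALUE lines on stdin, export each variable, then
# fetch the values back via `env` (not by echoing the input) and print
# the variables we set, sorted alphabetically by key.
solve() {
  local keys=() key value
  while IFS='=' read -r key value; do
    [ -z "$key" ] && continue          # skip blank lines
    export "$key=$value"
    keys+=("$key")
  done
  for key in $(printf '%s\n' "${keys[@]}" | sort); do
    env | grep -m1 "^$key="
  done
}

# Exercise it against the first test case from env_config.toml.
solve <<'EOF'
APP_NAME=myapp
APP_PORT=8080
APP_DEBUG=true
EOF
```

Splitting with `IFS='='` on `read -r key value` keeps any `=` inside the value intact, and reading the values back through `env` satisfies the task's requirement that the variables actually live in the environment.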