From 41b2095ed983b5b13413e3dae97416f50ba81429 Mon Sep 17 00:00:00 2001 From: Cormac Shannon <> Date: Sun, 1 Mar 2026 19:35:09 +0000 Subject: [PATCH] Rewrite issue #10 for scripts+REPL, add issue #12, add interactive command tests MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Redesign issue #10: bare-word commands now work in both scripts and the REPL via a parser-level heuristic (identifier + non-exception-list token → shell command). Add runtime fallback for string-arg syntax (echo "hello"), double-dash flag handling, and classification examples. Add issue #12 for path-based command execution (./script, /bin/ls, ~/bin/deploy). Add testes/lush/commands-interactive.lua as a design playground covering result table structure, exit codes, commands inside Lua blocks, _ behaviour, runtime fallback, Lua variable shadowing, and interleaved Lua/shell. --- issues/10-interactive-command-execution.md | 188 ++++++++++- issues/12-path-based-command-execution.md | 77 +++++ testes/lush/commands-interactive.lua | 361 +++++++++++++++++++++ 3 files changed, 610 insertions(+), 16 deletions(-) create mode 100644 issues/12-path-based-command-execution.md create mode 100644 testes/lush/commands-interactive.lua diff --git a/issues/10-interactive-command-execution.md b/issues/10-interactive-command-execution.md index 45682166..83a508bb 100644 --- a/issues/10-interactive-command-execution.md +++ b/issues/10-interactive-command-execution.md @@ -23,9 +23,9 @@ Lush has two distinct ways to run external commands: This mirrors how traditional shells work: commands run interactively by default, and you explicitly capture output when you want it (in bash: `var=$(cmd)`; in lush: `` var = `cmd` ``). -## Syntax — bare-word REPL fallback +## Syntax — bare-word fallback -If a line fails to parse as valid Lua, try to interpret it as an interactive shell command: +If a line fails to parse as valid Lua, try to interpret it as an interactive shell command. 
This works in both the REPL and scripts — bare commands can appear anywhere a Lua statement can, including inside `if`/`for`/`while`/`do`/`function` blocks. ``` > vim foo.lua -- not valid Lua → runs as shell command @@ -34,7 +34,164 @@ If a line fails to parse as valid Lua, try to interpret it as an interactive she > x = `ls`.stdout -- valid Lua → runs as Lua (captured) ``` -This only applies in the REPL. In scripts, use backtick syntax for captured commands (interactive commands in scripts are a separate question — possibly via `!` prefix or `exec()`, deferred). +```lua +-- deploy.lua +local env = os.getenv("ENV") or "staging" +print("deploying to " .. env) +ssh deploy@prod ./restart.sh +ls -la /var/log +if _.code ~= 0 then + print("deploy failed") + os.exit(1) +end +``` + +Bare commands also work inside Lua blocks: + +```lua +do + ls + ssh site.com + btop + ls -lha / + print("hello world") +end +ls +print("hello world again") +``` + +## Parser-level detection + +The bare-word fallback is implemented in the parser, not as a post-hoc retry. This allows bare commands inside Lua blocks and preserves single-chunk compilation for scripts. + +### The heuristic + +When the parser sees an identifier (not a keyword) at statement position, it peeks at the next token. If the next token is in the **exception list**, the statement is parsed as Lua. If not, it's a shell command. + +The **exception list** (tokens that can follow an identifier in a valid Lua statement): + +| Token | Lua meaning | +|-------|-------------| +| `(` | function call: `f(...)` | +| string literal | function call: `f "..."` | +| `{` | function call: `f {...}` | +| `.` | field access: `t.field` | +| `:` | method call: `obj:method()` | +| `[` | index: `t[k]` | +| `=` | assignment: `x = ...` | +| `,` | multi-assignment: `x, y = ...` | + +This is exhaustive — every valid Lua statement starting with an identifier must have one of these tokens second. 
Anything else (another identifier, `-`, `/`, `+`, a keyword, EOF) means the line cannot be valid Lua, so it's safe to treat as a shell command. No valid Lua syntax is stolen. + +### No newline awareness needed for detection + +The heuristic doesn't need to distinguish same-line vs different-line tokens. A bare identifier like `ls` on its own line is followed by whatever comes next — an identifier, keyword, or EOF — none of which are in the exception list. So `ls` is correctly detected as a shell command. + +Multi-line Lua function calls are preserved because their continuation tokens are in the exception list: + +```lua +f +("hello") -- f + ( → exception list → Lua ✓ + +f +"hello" -- f + string → exception list → Lua ✓ + +f +{1, 2, 3} -- f + { → exception list → Lua ✓ +``` + +### Newline awareness for argument capture + +Once the parser detects a shell command, it needs to know where the arguments end. Shell commands are newline-terminated — the parser extracts the raw text of the rest of the line from the source buffer (rather than reconstructing from tokens). This preserves the exact argument string, including characters the Lua lexer can't tokenize (like `@` in `ssh user@host`). + +The parser emits bytecode equivalent to `_ = __interactive("raw line text")`. + +### Double-dash flags (`--`) + +`--` starts a comment in Lua. This conflicts with double-dash flags in shell commands: `git --version`, `ls --color=auto`, `grep --include="*.lua"`. + +The problem: when the parser detects a shell command and peeks at the next token, the lexer may encounter `--` and consume the rest of the line as a comment. The argument text is lost before the parser can capture it. + +The solution: the parser must record the source buffer position **before** peeking at the next token. 
When the heuristic determines "shell command," the parser extracts the raw text starting from the saved position (the character right after the first identifier), scanning to the end of the line in the source buffer. This bypasses the lexer entirely for the argument portion — `--version` is captured as raw text, not interpreted as a comment. + +This means the lexer's position may be ahead of the raw-captured text (it already skipped the "comment"). The parser must advance the lexer to the correct position (next line) after extracting the raw command. + +## Classification examples + +### Shell — identifier + non-exception token + +| Input | Next token | Result | +|-------|-----------|--------| +| `ls -la` | `-` | shell | +| `ls` | next statement / EOF | shell | +| `git status` | `status` (identifier) | shell | +| `git commit -m "fix bug"` | `commit` (identifier) | shell | +| `cd /tmp` | `/` | shell | +| `grep -r "pattern" .` | `-` | shell | +| `docker compose up -d` | `compose` (identifier) | shell | +| `tar xzf archive.tar.gz` | `xzf` (identifier) | shell | +| `sudo ls` | `ls` (identifier) | shell | +| `ssh user@host` | lexer error on `@` → raw capture | shell | +| `curl https://example.com` | `https` (identifier) | shell | +| `git --version` | `--` (lexer sees comment) | shell (raw source capture) | +| `ls --color=auto /tmp` | `--` (lexer sees comment) | shell (raw source capture) | +| `nonexistent_cmd` | next statement / EOF | shell (runs, exits 127) | + +### Lua — identifier + exception-list token + +| Input | Next token | Result | +|-------|-----------|--------| +| `print("hello")` | `(` | Lua | +| `print "hello"` | string literal | Lua | +| `f {1, 2, 3}` | `{` | Lua | +| `io.write("x")` | `.` | Lua | +| `obj:method()` | `:` | Lua | +| `t[1] = 5` | `[` | Lua | +| `x = 5` | `=` | Lua | +| `x, y = 1, 2` | `,` | Lua | + +### Lua — keywords (heuristic doesn't apply) + +| Input | Result | +|-------|--------| +| `local x = 1` | Lua | +| `if x then ... 
end` | Lua | +| `for i = 1, 10 do ... end` | Lua | +| `return x` | Lua | +| `while true do ... end` | Lua | + +### Lua syntax + runtime fallback to shell + +Lua's syntactic sugar means `echo "hello"` parses as `echo("hello")` — the string literal is in the exception list, so the parser treats it as Lua. A **runtime fallback** catches the error: + +1. Lua executes the call → "attempt to call a nil value" +2. Extract the undefined name, look it up in PATH +3. Found → run as interactive shell command, assign result to `_` +4. Not found → report the original Lua error + +| Input | Parses as | Runtime | +|-------|----------|---------| +| `echo "hello"` | `echo("hello")` | `echo` nil → PATH → found → shell | +| `grep "pattern"` | `grep("pattern")` | `grep` nil → PATH → found → shell | +| `man "ls"` | `man("ls")` | `man` nil → PATH → found → shell | +| `pirnt "hello"` | `pirnt("hello")` | `pirnt` nil → PATH → not found → Lua error | + +### Bare expressions (already invalid Lua) + +| Input | Next token | Notes | +|-------|-----------|-------| +| `x -y` | `-` | shell — `x - y` as a bare statement is a syntax error in Lua anyway | +| `a + b` | `+` | shell — same; `a` probably not in PATH → exits 127 | + +## The fallback chain + +1. **Parser heuristic** — identifier + non-exception token at statement position → emit `_ = __interactive("line")` bytecode. Handles most bare commands during compilation. Works in both REPL and scripts, inside any block. + +2. **Normal Lua execution** — if the parser accepted the line as Lua, execute it. If OK → done. + +3. **Runtime "attempt to call a nil value"** — extract the undefined name, check PATH. Found → re-execute as `_ = __interactive("line")`. Not found → report the original Lua error. Catches the `echo "hello"` edge case. + +4. **Post-hoc fallback (REPL only)** — if the lexer itself errors before the parser heuristic can run (e.g., unusual characters not after an identifier), try the raw input line as a shell command. 
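Step 1 of the chain, the exception-list check, can be approximated outside the real parser as follows. This is a minimal sketch under stated assumptions, not the `lparser.c` change itself: `StmtKind`, `is_lua_keyword`, and `classify_statement` are hypothetical names, and the function looks at raw characters rather than lexer tokens. It skips whitespace (including newlines) after the first identifier and checks whether the next character can begin an exception-list token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: how a statement starting at `src` is routed. */
typedef enum { STMT_LUA, STMT_SHELL } StmtKind;

/* Reserved words of Lua 5.4; the heuristic never applies to these. */
static int is_lua_keyword(const char *s, size_t n) {
  static const char *const kw[] = {
    "and", "break", "do", "else", "elseif", "end", "false", "for",
    "function", "goto", "if", "in", "local", "nil", "not", "or",
    "repeat", "return", "then", "true", "until", "while"
  };
  size_t i;
  for (i = 0; i < sizeof kw / sizeof kw[0]; i++)
    if (strlen(kw[i]) == n && memcmp(kw[i], s, n) == 0) return 1;
  return 0;
}

/* Character-level approximation of the identifier + next-token check. */
static StmtKind classify_statement(const char *src) {
  const char *p = src;
  const char *start;
  while (isspace((unsigned char)*p)) p++;
  /* Not an identifier at statement position: heuristic does not apply. */
  if (!(isalpha((unsigned char)*p) || *p == '_')) return STMT_LUA;
  start = p;
  while (isalnum((unsigned char)*p) || *p == '_') p++;
  if (is_lua_keyword(start, (size_t)(p - start))) return STMT_LUA;
  /* No newline awareness: the next token may sit on a later line. */
  while (isspace((unsigned char)*p)) p++;
  switch (*p) {
    case '(': case '{': case '[': case ',': case '"': case '\'':
      return STMT_LUA;                      /* call, index, multi-assign */
    case '=':                               /* `=` assigns, `==` does not */
      return p[1] == '=' ? STMT_SHELL : STMT_LUA;
    case '.':                               /* `.` indexes; `..`, `.5` do not */
      return (p[1] == '.' || isdigit((unsigned char)p[1])) ? STMT_SHELL
                                                           : STMT_LUA;
    case ':':                               /* `:` is a method call, `::` a label */
      return p[1] == ':' ? STMT_SHELL : STMT_LUA;
    default:                                /* identifier, `-`, `/`, `@`, EOF, ... */
      return STMT_SHELL;
  }
}
```

With this sketch, `classify_statement("git --version")` yields `STMT_SHELL` and `classify_statement("echo \"hello\"")` yields `STMT_LUA`, which is the case step 3 (the runtime fallback) exists to catch. A real implementation would work on lexer tokens and would also perform the raw-line capture, which this sketch leaves out.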
## Return value and `_` @@ -70,27 +227,26 @@ Add `luaB_interactive` to `lcmd.c`, registered as `__interactive` global: - Parent: ignore `SIGINT`/`SIGQUIT` (so Ctrl-C goes to child, not lush), then `waitpid()` and return result table - Reuses existing `parse_argv()` for command string tokenization -## REPL implementation +Add `luaB_command_exists` to `lcmd.c` — PATH lookup utility: -In `lua.c`'s `loadline()`, after both `addreturn()` and `multiline()` fail: - -1. Get the raw input line -2. Pass it to `__interactive(line)` -3. Assign the result to `_` -4. Continue the REPL loop (don't report a Lua syntax error) +- Takes a command name, searches each directory in `PATH` +- Returns 1 (found) or 0 (not found) +- Used by the runtime fallback (step 3) to decide whether to attempt a shell command or report a Lua error ## Open questions - Should `_` be set after captured (backtick) commands too? (So `_.code` always reflects the most recent command regardless of mode.) -- Should the bare-word fallback support `${}` interpolation? (Probably not in the REPL fallback — the line isn't parsed by the lexer at all, it's just a raw string.) +- Should the bare-word fallback support `${}` interpolation? (Probably not — the line isn't parsed by the lexer at all, it's just a raw string.) - Job control: should Ctrl-Z suspend the child and return to lush? Requires `tcsetpgrp()` / process group work. Defer to a later issue. -- How to handle bare commands in scripts (not the REPL)? Defer — `!` prefix or `exec()` builtin are options for later. +- Path-based commands (`./script.sh`, `/usr/bin/foo`, `~/bin/script`) — first token isn't an identifier, so the parser heuristic doesn't apply. See issue #12. 
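The PATH lookup behind `luaB_command_exists` could look roughly like the sketch below. It is an illustration only: `command_exists` and `is_executable_file` are hypothetical helper names, the Lua C API wrapper that would push the result onto the stack is omitted, and treating an empty `PATH` entry as the current directory follows the usual POSIX convention rather than anything stated in this issue.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Returns 1 if `path` names a regular file the caller may execute. */
static int is_executable_file(const char *path) {
  struct stat st;
  return stat(path, &st) == 0 && S_ISREG(st.st_mode) &&
         access(path, X_OK) == 0;
}

/* Hypothetical helper: returns 1 if `name` is found in some PATH
   directory, 0 otherwise. */
static int command_exists(const char *name) {
  const char *path = getenv("PATH");
  size_t nlen;
  if (name == NULL || *name == '\0' || path == NULL) return 0;
  nlen = strlen(name);
  for (;;) {
    const char *colon = strchr(path, ':');
    size_t dlen = colon ? (size_t)(colon - path) : strlen(path);
    const char *dir = dlen ? path : ".";  /* empty entry: current directory */
    char buf[PATH_MAX];
    if (dlen == 0) dlen = 1;
    /* Skip entries whose joined path would not fit in the buffer. */
    if (dlen + 1 + nlen < sizeof buf) {
      snprintf(buf, sizeof buf, "%.*s/%s", (int)dlen, dir, name);
      if (is_executable_file(buf)) return 1;
    }
    if (colon == NULL) break;
    path = colon + 1;
  }
  return 0;
}
```

The `S_ISREG` check is there because `access(path, X_OK)` also succeeds for searchable directories, which should not count as commands. The runtime fallback would call this with the undefined name and only re-execute the line as a shell command when it returns 1.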
## Files touched | File | Description | |------|-------------| -| `lcmd.c` | Add `luaB_interactive` — fork/exec without pipes, return result table | -| `lcmd.h` | Declare `luaB_interactive` | -| `linit.c` | Register `__interactive` global in `opencommand()` | -| `lua.c` | REPL fallback: on parse failure, try as interactive command, assign result to `_` | +| `lcmd.c` | Add `luaB_interactive` (fork/exec without pipes), `luaB_command_exists` (PATH lookup) | +| `lcmd.h` | Declare new functions | +| `linit.c` | Register `__interactive` and `__command_exists` globals in `opencommand()` | +| `llex.c` | Track line boundaries so parser can extract raw source text for shell command arguments | +| `lparser.c` | Bare-word heuristic in statement parsing: detect shell commands via exception list, emit `__interactive` calls with raw source text | +| `lua.c` | Runtime fallback: catch "attempt to call a nil value" in `docall`, check PATH, re-execute as shell | diff --git a/issues/12-path-based-command-execution.md b/issues/12-path-based-command-execution.md new file mode 100644 index 00000000..1f1d2701 --- /dev/null +++ b/issues/12-path-based-command-execution.md @@ -0,0 +1,77 @@ +# Issue #12 — Path-based command execution + +**Status:** open +**Related:** #10 (interactive command execution) + +## Problem + +Issue #10's parser heuristic detects bare-word commands when an **identifier** at statement position is followed by a non-exception-list token. This covers `ls -la`, `git status`, etc. — but not commands invoked by path, because the first token isn't an identifier: + +``` +./script/update -- first token is . +../lua -- first token is . +/bin/ls -- first token is / +~/bin/deploy -- first token is ~ +``` + +In standard Lua, none of these are valid statement starts, so claiming them for shell commands doesn't steal any Lua syntax. + +## Proposed heuristic + +Add parser rules for path-prefixed commands. 
When the parser sees one of these tokens at statement position, it treats the rest of the line as a shell command: + +| First token(s) | Pattern | Example | +|----------------|---------|---------| +| `.` `/` | dot-slash | `./script.sh` | +| `..` `/` | dot-dot-slash | `../other/build` | +| `/` | absolute path | `/bin/ls -la` | +| `~` `/` | home-relative path | `~/bin/deploy` | + +### Detection + +- `.` at statement position → peek next token: + - `/` on same line → path command (`./ ...`) + - anything else → parse as Lua (`.` is the field-access operator, which isn't valid at statement start anyway) +- `..` at statement position (the lexer emits `..` as a single concat token) → peek next token: + - `/` on same line → path command (`../ ...`) + - anything else → parse as Lua (concat needs a left operand, so this is a syntax error either way) +- `/` at statement position → path command (not valid Lua statement start) +- `~` at statement position → peek next token: + - `/` on same line → path command (`~/ ...`) + - anything else → parse as Lua (`~` is bitwise NOT in Lua 5.4, though `~expr` as a bare statement is already a syntax error) + +### Argument capture + +Same as issue #10: once detected, extract the raw text of the entire line from the source buffer. Emit bytecode equivalent to `_ = __interactive("raw line text")`. + +## Examples + +```lua +-- all of these run as shell commands +./configure --prefix=/usr/local +../build/lush test.lua +/usr/bin/env python3 -c "print('hello')" +~/bin/deploy staging + +-- check exit code like any other interactive command +./run-tests +if _.code ~= 0 then + print("tests failed") +end +``` + +## Non-conflicts with Lua + +- `/` at statement start: not valid Lua (`/` is division, needs a left operand) +- `./` at statement start: `.` is field access, `/` is division; `./foo` as Lua would be `. / foo`, and both operators need a left operand. Not valid at statement start.
+- `../` at statement start: `..` is the concat operator, which also needs a left operand. Not valid at statement start. +- `~/` at statement start: `~` is bitwise NOT and `/` is division, so `~/foo` has no operand between the two operators, and Lua has no bare expression statements in any case. Not valid at statement start. + +## Open questions + +- Should `~` without `/` be supported? (e.g., `~user/bin/script` — tilde expansion for other users). Probably not initially. +- Should bare `.` or `..` (without `/`) be treated as commands? In POSIX shells, `.` is the builtin that sources a file (`source` is the bash synonym). Probably not — too ambiguous. + +## Files touched + +| File | Description | +|------|-------------| +| `lparser.c` | Add path-prefix detection alongside the identifier heuristic in statement parsing | diff --git a/testes/lush/commands-interactive.lua b/testes/lush/commands-interactive.lua new file mode 100644 index 00000000..31d671de --- /dev/null +++ b/testes/lush/commands-interactive.lua @@ -0,0 +1,361 @@ +-- testes/lush/commands-interactive.lua +-- Tests for interactive command execution (issue #10). +-- This file serves as a design playground: it documents how bare-word +-- commands should behave alongside Lua in both scripts and the REPL.
+ +print "testing interactive commands" + +-- ===== RESULT TABLE STRUCTURE ===== + +-- basic command, result is a table with code/stdout/stderr +do + echo hello + assert(type(_) == "table") + assert(type(_.code) == "number") + assert(type(_.stdout) == "string") + assert(type(_.stderr) == "string") +end + +-- ===== EXIT CODES ===== + +-- successful command returns exit code 0 +do + sh -c "exit 0" + assert(_.code == 0) +end + +-- failed command returns non-zero exit code +do + sh -c "exit 1" + assert(_.code == 1) +end + +-- specific exit codes are preserved +do + sh -c "exit 42" + assert(_.code == 42) +end + +-- command not found returns 127 +do + nonexistent_command_xyz_999 + assert(_.code == 127) +end + +-- ===== INTERACTIVE MODE: NO STDOUT/STDERR CAPTURE ===== +-- interactive commands inherit the terminal; stdout/stderr go directly +-- to the user's screen, so _.stdout and _.stderr are always empty. + +do + echo hello + assert(_.stdout == "") +end + +do + sh -c "echo err >&2" + assert(_.stderr == "") + assert(_.stdout == "") +end + +do + sh -c "echo out; echo err >&2" + assert(_.stdout == "") + assert(_.stderr == "") +end + +do + echo hello world + assert(_.stdout == "") + assert(_.code == 0) +end + +-- ===== PARSER HEURISTIC: IDENTIFIER + NON-EXCEPTION TOKEN ===== +-- the parser detects shell commands when an identifier at statement +-- position is followed by a token NOT in the exception list: +-- ( string { . : [ = , + +-- bare identifier, no arguments (next token is keyword/identifier/EOF) +do + ls + assert(_.code == 0) +end + +-- identifier + dash flag +do + ls -la / + assert(_.code == 0) +end + +-- identifier + slash (path argument) +do + ls /tmp + assert(_.code == 0) +end + +-- identifier + identifier (subcommand pattern) +do + git version + assert(_.code == 0) +end + +-- ===== COMMANDS INSIDE LUA BLOCKS ===== +-- bare commands work anywhere a Lua statement can appear.
+ +-- inside do/end +do + echo inside-do + assert(_.code == 0) +end + +-- inside if/end +do + if true then + echo inside-if + assert(_.code == 0) + end +end + +-- inside for/end +do + for i = 1, 3 do + echo loop + assert(_.code == 0) + end +end + +-- inside while/end +do + local n = 0 + while n < 2 do + echo while-loop + assert(_.code == 0) + n = n + 1 + end +end + +-- inside function body +do + local function run_cmd() + ls / + return _.code + end + assert(run_cmd() == 0) +end + +-- nested blocks +do + if true then + for i = 1, 2 do + echo nested + assert(_.code == 0) + end + end +end + +-- ===== _ BEHAVIOR ===== + +-- _ is overwritten by subsequent commands +do + sh -c "exit 0" + assert(_.code == 0) + sh -c "exit 5" + assert(_.code == 5) +end + +-- _ persists across block boundaries (it's a global) +do + sh -c "exit 3" +end +assert(_.code == 3) + +-- ===== RUNTIME FALLBACK: STRING-ARG FUNCTION CALL SYNTAX ===== +-- echo "hello" parses as Lua echo("hello") because string literal +-- is in the exception list. at runtime, echo is nil → "attempt to +-- call a nil value" → check PATH → found → run as shell command. + +do + echo "hello" + assert(_.code == 0) + assert(_.stdout == "") +end + +do + printf "hello\n" + assert(_.code == 0) +end + +-- undefined name NOT in PATH → original Lua error preserved +do + local ok, err = pcall(function() + pirnt "hello" + end) + assert(not ok) + assert(string.find(err, "pirnt")) +end + +-- ===== LUA VARIABLE SHADOWS SHELL COMMAND ===== +-- if a Lua variable with the same name as a shell command is defined, +-- the Lua variable wins. the shell fallback only triggers on nil. + +-- string-arg sugar: echo "hello" parses as echo("hello"). +-- echo is a local function, so Lua calls it — NOT /bin/echo. 
+do + local func_called = false + local echo = function(x) func_called = true end + echo "hello" + assert(func_called == true) +end + +-- table-arg sugar: same principle with { } syntax +do + local received = nil + local grep = function(t) received = t end + grep {"pattern", "file.txt"} + assert(received[1] == "pattern") +end + +-- paren call: unambiguously Lua, local wins +do + local func_called = false + local ls = function(...) func_called = true end + ls("/tmp") + assert(func_called == true) +end + +-- global function shadows command name +do + local func_called = false + function echo(x) func_called = true end + echo "test" + assert(func_called == true) + echo = nil -- clean up global +end + +-- ===== LUA SYNTAX PRESERVED ===== +-- all exception-list tokens correctly route to Lua parsing. + +-- multi-line function call: string arg on next line +do + local function my_func(arg) + return arg + end + + local r = my_func + "hello world" + assert(r == "hello world") +end + +-- multi-line function call: paren arg on next line +do + local function add(a, b) return a + b end + + local r = add + (1, 2) + assert(r == 3) +end + +-- multi-line function call: table arg on next line +do + local function first(t) return t[1] end + + local r = first + {42} + assert(r == 42) +end + +-- assignment +do + local x = 5 + assert(x == 5) +end + +-- multi-assignment +do + local a, b = 1, 2 + assert(a == 1 and b == 2) +end + +-- field access +do + local t = {field = 10} + assert(t.field == 10) + t.field = 20 + assert(t.field == 20) +end + +-- method calls +do + local s = "hello" + assert(s:upper() == "HELLO") +end + +-- indexing +do + local t = {10, 20, 30} + assert(t[2] == 20) + t[2] = 99 + assert(t[2] == 99) +end + +-- table-arg function call +do + local function f(t) return t[1] end + assert(f {42} == 42) +end + +-- keyword-led statements +do + local x = 1 + if x == 1 then x = 2 end + assert(x == 2) + for i = 1, 1 do x = 3 end + assert(x == 3) + while x > 3 do x = x - 1 end + 
assert(x == 3) + repeat x = x - 1 until x == 0 + assert(x == 0) +end + +-- ===== INTERLEAVED LUA AND SHELL ===== + +do + local x = 10 + ls / + assert(_.code == 0) + local y = x + 20 + assert(y == 30) + echo hello + assert(_.code == 0) + local z = y * 2 + assert(z == 60) +end + +-- ===== EDGE CASES ===== + +-- double-dash flags (--) look like Lua comments to the lexer. +-- the parser must capture raw source text BEFORE the lexer consumes +-- the comment, so the full argument string is preserved. +do + git --version + assert(_.code == 0) + ls --color=auto /tmp + assert(_.code == 0) -- may fail if ls doesn't support --color +end + +-- commands where first arg is another known command name +do + env ls + -- env runs ls; both are valid commands, this is identifier + identifier + assert(type(_.code) == "number") +end + +-- semicolons: Lua uses ; as optional statement separator. +-- with the heuristic, ls followed by ; is ambiguous. +-- for now, use separate lines instead: +do + ls /tmp + echo done + assert(_.code == 0) +end + +print "OK"