From 08df164692e9c0b261a244d4320b8e6bb888b44f Mon Sep 17 00:00:00 2001
From: Cormac Shannon <>
Date: Sun, 15 Mar 2026 19:33:19 +0000
Subject: [PATCH] Add issues #25, #26, #27; update #7 and #22 statuses

- #25: $VAR expansion in commands
- #26: Shell abbreviations and user-extensible builtins
- #27: $(cmd) subcommand syntax
- #22: Updated with implementation details (in progress)
- #7: Standardize status label
---
 issues/07-redirection.md                      |  20 +++
 issues/22-implicit-interactive-commands.md    |  79 +++++++++--
 issues/25-envvar-expansion-in-commands.md     |  51 +++++++
 ...6-shell-abbreviations-and-user-builtins.md |  87 ++++++++++++
 issues/27-subcommand-syntax.md                | 129 ++++++++++++++++++
 5 files changed, 353 insertions(+), 13 deletions(-)
 create mode 100644 issues/25-envvar-expansion-in-commands.md
 create mode 100644 issues/26-shell-abbreviations-and-user-builtins.md
 create mode 100644 issues/27-subcommand-syntax.md

diff --git a/issues/07-redirection.md b/issues/07-redirection.md
index a5748f51..a7128e64 100644
--- a/issues/07-redirection.md
+++ b/issues/07-redirection.md
@@ -13,6 +13,16 @@
 `cmd` < "input.txt" -- redirect stdin from file
 ```
 
+### Alternatively
+
+```lua lush
+`ls -l > output.txt` -- redirect stdout to file
+`ls -l >> output.txt` -- append stdout to file
+`cmd 2> err.txt` -- redirect stderr to file
+`cmd 2>&1` -- merge stderr into stdout
+`cmd < input.txt` -- redirect stdin from file
+```
+
 ## Implementation
 
 - Before `execvp()` in the child process, use `dup2()` to redirect file descriptors
@@ -26,3 +36,13 @@
 ## Challenge
 
 `>` and `>>` conflict with Lua's greater-than and right-shift operators. Like piping, these operators must only be valid in command context. The parser can disambiguate because the left-hand side is a command expression.
+
+## Alternatives
+
+- Although shell redirection isn’t available, simple redirection can be achieved using the `io` library.
+
+```lua lush
+file = io.open("OUTFILE.txt", "w")
+file:write(`ls /`.stdout)
+file:close()
+```
diff --git a/issues/22-implicit-interactive-commands.md b/issues/22-implicit-interactive-commands.md
index 8db216fe..4f909513 100644
--- a/issues/22-implicit-interactive-commands.md
+++ b/issues/22-implicit-interactive-commands.md
@@ -1,21 +1,74 @@
 # 22 — Implicit interactive commands (drop `!` prefix)
 
-**Status:** open
+**Status:** in progress
 
-In the REPL, maybe we can get away with not needing the `!` prefix.
+## Goal
 
-Lua already attempts to run input as a statement. If that fails, it assumes it might be an expression (e.g. `1 + 2`) and wraps it in `return(...)`. If *that* also fails, we could try wrapping it with `!` prefix semantics as a third fallback.
+In the REPL, treat unrecognized input as shell commands so users can type `git status` or `ls -la` without the `!` prefix.
 
-## Execution order
+## Implementation
 
-1. Try as Lua statement
-2. Try as Lua expression (`return ...`)
-3. Try as shell command (interactive execution)
+### REPL fallback chain (`loadline()` in `lua.c`)
 
-## Considerations
+The standard Lua REPL tries two compilations. We add a third:
 
-- Ambiguity: `ls` is not valid Lua, so it would fall through to shell — this is the desired behaviour
-- But `print` is valid Lua (it's a value) — so `print` alone wouldn't trigger shell
-- What about `git status`? Not valid Lua, would correctly fall through to shell
-- Error messages: if all three fail, which error do we show?
-- Performance: three parse attempts per input line
+1. Try as expression (`return `)
+2. Try as statement (with continuation for incomplete input)
+3. **Try as shell command (`!`)**
+
+This is done by adding `addshellcmd()` which prepends `!` to the line and compiles it, triggering the existing lexer/parser path for interactive commands.
+
+### The bare identifier problem
+
+Multi-word input like `git status` fails both Lua paths and correctly falls through to shell. But single-word commands like `ls` compile as `return ls;` — a valid Lua expression that returns `nil` — so they never reach the shell fallback.
+
+Worse: if `addreturn` is bypassed, `ls` as a statement is *incomplete* Lua (the parser expects `ls(...)` or `ls = ...`), so `multiline()` enters continuation mode and the REPL hangs waiting for more input.
+
+### Fix: check `_G` before the expression path
+
+Before trying `return `, check if the line is a bare identifier (single word matching `[a-zA-Z_][a-zA-Z0-9_]*`). If it is, look it up in `_G`:
+
+- **In `_G`** → proceed normally (`return print` shows the function, `return x` shows its value)
+- **Not in `_G`** → skip expression and statement paths entirely, go straight to shell
+
+The check runs before the line is compiled: no runtime error interception, no flags, no special subroutines. ~10 lines of C in `loadline()`:
+
+```c
+line = lua_tostring(L, 1);
+bare = isbareid(line);
+if (bare && lua_getglobal(L, line) == LUA_TNIL) {
+  lua_pop(L, 1);
+  status = addshellcmd(L); /* straight to shell */
+}
+else {
+  if (bare) lua_pop(L, 1);
+  /* normal expression → statement → shell fallback chain */
+}
+```
+
+### Summary of behavior
+
+| Input | Path taken | Result |
+|-------|-----------|--------|
+| `print("hi")` | expression | Lua expression |
+| `x = 42` | statement | Lua statement |
+| `if true then print("x") end` | statement | Lua statement |
+| `print` | expression (in `_G`) | shows function value |
+| `x` (after `x=42`) | expression (in `_G`) | shows 42 |
+| `git status` | expression fails → statement fails → shell | runs git |
+| `ls -la` | expression fails → statement fails → shell | runs ls |
+| `echo hello` | expression fails → statement fails → shell | runs echo |
+| `ls` | bare id, not in `_G` → shell | runs ls |
+| `!pwd` | expression (already valid `!` syntax) | runs pwd |
+
+### Edge cases
+
+- A global explicitly set to `nil` (`x = nil`) cannot be told apart from an undefined name: `lua_getglobal` returns `LUA_TNIL` in both cases, so with the check above a bare `x` goes to the shell instead of the expression path.
+- Lua keywords (`if`, `for`, `while`) are not valid identifiers in `isbareid` terms — they fail `addreturn`, enter `multiline` for continuation, and behave normally.
+- `!cmd` still works — it compiles as a valid expression in the first path.
+
+## Files
+
+| File | Changes |
+|------|---------|
+| `lua.c` | `isbareid()`, `addshellcmd()`, modified `loadline()` |
diff --git a/issues/25-envvar-expansion-in-commands.md b/issues/25-envvar-expansion-in-commands.md
new file mode 100644
index 00000000..d603b92c
--- /dev/null
+++ b/issues/25-envvar-expansion-in-commands.md
@@ -0,0 +1,51 @@
+# 25 — Environment variable expansion in commands
+
+**Status:** open
+
+## Problem
+
+`$VAR` syntax works in Lua expressions but not inside commands:
+
+```lua
+print($PATH) -- works: $PATH is lexed as TK_ENVVAR, expanded via getenv()
+!echo $PATH -- broken: $PATH is kept as literal text, passed unexpanded
+```
+
+The same applies to backtick commands:
+
+```lua
+local r = `echo $HOME` -- $HOME is literal, not expanded
+```
+
+## Why it happens
+
+The lexer has two separate paths for `$`:
+
+1. **In Lua code** (`llex.c:667`): `$NAME` → `TK_ENVVAR` token → compiled to `getenv("NAME")` call
+2. **In command mode** (`llex.c:518`): `$` without `{` is saved as a literal character in the command string buffer
+
+Only `${expr}` interpolation works in commands — it enters a Lua expression context and returns the result inline. Bare `$NAME` is passed through verbatim.
+
+Since commands are executed via `fork`/`exec` (not through `/bin/sh`), there is no shell to expand `$PATH` at runtime either.
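Because neither the lexer nor a shell performs the substitution, something has to do it explicitly. The following is a minimal standalone sketch of that substitution on a plain C string, not lush's lexer code: `expand_envvars` is a hypothetical name, and the choices to leave `${` untouched and to expand unset variables to the empty string are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L /* only needed if the caller uses setenv() */
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of lush: copy `src` into `dst`, replacing
   each $NAME with getenv("NAME"). A '$' that is not followed by an
   identifier (including "${") is copied through unchanged, so ${expr}
   interpolation could still be handled elsewhere. Unset variables expand
   to the empty string (an assumption). Returns 0 on success, -1 if `dst`
   or the internal name buffer is too small. */
static int expand_envvars(const char *src, char *dst, size_t dstsz) {
  size_t out = 0;
  assert(src != NULL && dst != NULL && dstsz > 0);
  while (*src != '\0') {
    if (src[0] == '$' && (isalpha((unsigned char)src[1]) || src[1] == '_')) {
      char name[128];
      size_t n = 0;
      const char *val;
      size_t len;
      src++; /* skip '$' */
      while (isalnum((unsigned char)*src) || *src == '_') {
        if (n + 1 >= sizeof(name)) return -1; /* variable name too long */
        name[n++] = *src++;
      }
      name[n] = '\0';
      val = getenv(name);
      if (val == NULL) val = "";
      len = strlen(val);
      if (out + len >= dstsz) return -1; /* no room for value + '\0' */
      memcpy(dst + out, val, len);
      out += len;
    }
    else {
      if (out + 1 >= dstsz) return -1; /* no room for char + '\0' */
      dst[out++] = *src++;
    }
  }
  dst[out] = '\0';
  return 0;
}
```

For example, after `setenv("LUSH_DEMO", "/tmp/x", 1)`, expanding `echo $LUSH_DEMO/bin` yields `echo /tmp/x/bin`, while `echo ${x}` comes back unchanged. A real fix would more likely split the command fragment in the lexer than rewrite a finished string.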
+
+## Expected behavior
+
+`$VAR` should expand in commands the same way it does in Lua expressions:
+
+```lua
+!echo $HOME -- should print /Users/nik
+!echo $PATH -- should print the PATH value
+local r = `echo $HOME` -- r.stdout should contain /Users/nik
+```
+
+## Workaround
+
+Use `${expr}` interpolation:
+
+```lua
+!echo ${$HOME}
+```
+
+## Possible fix
+
+In `read_command_body()` (`llex.c:518`), when `$` is followed by an identifier, emit the current command fragment and produce a `TK_ENVVAR`-equivalent expansion inline — similar to how `${expr}` already splits the command string to insert interpolated values.
diff --git a/issues/26-shell-abbreviations-and-user-builtins.md b/issues/26-shell-abbreviations-and-user-builtins.md
new file mode 100644
index 00000000..9fa06a47
--- /dev/null
+++ b/issues/26-shell-abbreviations-and-user-builtins.md
@@ -0,0 +1,87 @@
+# 26 — Shell abbreviations and user-extensible builtins
+
+**Status:** open
+
+## Problem
+
+Users cannot define shell abbreviations or custom commands. For example, there's no way to alias `gs` to `git status` or add a custom `mkcd` that creates a directory and cd's into it.
+
+## Current architecture
+
+Commit `f88b1795` moved all shell internals (`__command`, `__interactive`, `__getenv`, `__setenv`) out of `_G` and into a hidden registry table at `LUA_RIDX_LUSH`. Builtins (`cd`, `exec`, `umask`) live at `LUA_RIDX_LUSH.builtins`. The dispatch path in `try_builtin()` (`lcmd.c:875`) already does a table lookup:
+
+```c
+lua_getfield(L, -1, "builtins"); /* get builtins table */
+lua_getfield(L, -1, pa->argv[0]); /* look up command name */
+```
+
+This means builtins are **already data-driven** — if a function appears in the table, it gets called. But the table is invisible to user code, so there's no way to add entries.
+
+## Proposal: expose `lush.builtins` to user code
+
+Expose the `LUA_RIDX_LUSH` table (or a curated view of it) as a global like `lush` or `__lush`. Users extend the shell by adding functions to `lush.builtins`:
+
+```lua
+-- abbreviation
+lush.builtins.gs = function(cmd)
+  !git status
+end
+
+-- custom builtin
+lush.builtins.mkcd = function(cmd, dir)
+  !mkdir -p ${dir}
+  !cd ${dir}
+end
+
+-- override (with access to original)
+local orig_cd = lush.builtins.cd
+lush.builtins.cd = function(cmd, dir)
+  orig_cd(cmd, dir)
+  print("now in: " .. $PWD)
+end
+```
+
+This requires no new dispatch mechanism — `try_builtin()` already does the right thing. The only change is making the table accessible.
+
+## Design considerations
+
+### What to expose
+
+Option A: Expose the entire `LUA_RIDX_LUSH` table as `lush`:
+```lua
+lush.builtins -- cd, exec, umask, user additions
+lush.command -- __command (backtick execution)
+lush.interactive -- __interactive (! execution)
+lush.getenv -- __getenv ($VAR read)
+lush.setenv -- __setenv ($VAR write)
+```
+
+This gives power users access to the shell primitives directly, which is useful for building abstractions. But it also means users could break internals.
+
+Option B: Expose only `lush.builtins` — safer, sufficient for the abbreviation use case.
+
+### Relationship to f88b1795
+
+That commit deliberately hid these from `_G` to avoid polluting the namespace. Exposing a single `lush` global is a middle ground: one clean entry point instead of scattered `__double_underscore` globals, and users can only extend via a structured API rather than accidentally shadowing internals.
+
+### Abbreviations vs builtins
+
+Shell abbreviations (fish-style text expansion) and builtins (function dispatch) are different features in traditional shells. But in lush, a builtin that calls `!git status` achieves the same effect as an abbreviation — the `!` prefix runs the command interactively with terminal inheritance. So a single mechanism covers both use cases.
+
+### Builtin protocol
+
+Current builtins receive `(cmd_name, arg1, arg2, ...)` and return a result table `{code, stdout, stderr}`. User builtins should follow the same protocol. Document this as the contract.
+
+## Files likely affected
+
+| File | Changes |
+|------|---------|
+| `linit.c` | Expose `LUA_RIDX_LUSH` table as `lush` global |
+| `lcmd.c` | No changes — `try_builtin()` already works via table lookup |
+| `lbuiltin.c` | No changes — existing builtins stay as-is |
+
+## Open questions
+
+- Name: `lush`, `shell`, `__shell`? `lush` is clean and matches the project name.
+- Should `lush.command` / `lush.interactive` be exposed or kept hidden?
+- Should there be a `builtin` command to bypass user overrides (like bash's `builtin cd`)?
diff --git a/issues/27-subcommand-syntax.md b/issues/27-subcommand-syntax.md
new file mode 100644
index 00000000..f398c44f
--- /dev/null
+++ b/issues/27-subcommand-syntax.md
@@ -0,0 +1,129 @@
+# 27 — Subcommand syntax in commands
+
+**Status:** open
+
+## Problem
+
+Running a command inside another command is unnecessarily verbose:
+
+```lua
+`ls ${`pwd`.stdout}`
+```
+
+The inner backtick returns a result table (`{code, stdout, stderr}`), so `.stdout` is required to extract the string. This is clunky compared to other shells.
+
+## Proposed syntax: `$(cmd)`
+
+Use `$()` for inline subcommands, consistent with the existing `$` interpolation family:
+
+```lua
+`ls $(pwd)`
+!echo $(whoami)
+`tar -czf $(date +%F).tar.gz src/`
+```
+
+This is consistent with existing lush syntax:
+
+| Syntax | Context | Meaning |
+|--------|---------|---------|
+| `$VAR` | Lua code | `getenv("VAR")` |
+| `${expr}` | command body | interpolate Lua expression |
+| **`$(cmd)`** | command body | **run subcommand, insert stdout** |
+
+### Comparison with other shells
+
+| Shell | Syntax |
+|-------|--------|
+| bash | `$(cmd)` |
+| fish | `(cmd)` |
+| lush current | `` `ls ${`pwd`.stdout}` `` |
+| **lush proposed** | `` `ls $(pwd)` `` |
+
+## Behavior
+
+`$(cmd)` runs a shell command (same as backtick) and inserts its stdout into the outer command, with trailing newline stripped.
+
+```lua
+`ls $(pwd)` -- list files in pwd's output
+!echo $(whoami)@$(hostname) -- multiple subcommands
+`echo $(ls $(pwd))` -- nested: inner runs first
+```
+
+`$(cmd)` is **not** for Lua expressions — that's what `${expr}` is for. `$()` runs a shell command; `${}` evaluates Lua.
+
+### Nesting
+
+Nested subcommands are supported:
+
+```lua
+`echo $(ls $(pwd))`
+```
+
+This works naturally because `$(cmd)` enters command parsing, which can itself contain `$()`.
+
+## Syntax clash analysis
+
+Currently in `read_command_body()` (`llex.c:508`), `$` followed by anything other than `{` saves a literal `$`. Adding `(` as a second trigger alongside `{` is a minimal change. No conflicts:
+
+- `$VAR` in command mode is currently a literal `$` + `VAR` (see issue #25) — not affected
+- `${expr}` continues to work unchanged
+- Literal `$(` in commands is not meaningful today (falls through as literal text)
+
+## Implementation sketch
+
+### Lexer (`llex.c`)
+
+In `read_command_body()`, extend the `$` case to also trigger on `(`. This starts a new command body parse (not a Lua expression like `${}`):
+
+```c
+case '$': {
+  next(ls);
+  if (ls->current == '{') {
+    next(ls); /* skip '{' */
+    /* existing ${expr} interpolation path */
+    seminfo->ts = luaX_newstring(ls, luaZ_buffer(ls->buff),
+                                     luaZ_bufflen(ls->buff));
+    ls->saved_cmd_mode = ls->cmd_mode;
+    return interactive ? TK_INTERACTIVE : TK_COMMAND;
+  }
+  else if (ls->current == '(') {
+    next(ls); /* skip '(' */
+    /* subcommand: start a new command parse, closed by ')' */
+    seminfo->ts = luaX_newstring(ls, luaZ_buffer(ls->buff),
+                                     luaZ_bufflen(ls->buff));
+    ls->saved_cmd_mode = ls->cmd_mode;
+    /* signal parser that this is a subcommand, not a Lua expr */
+    ...
+    return interactive ? TK_INTERACTIVE : TK_COMMAND;
+  }
+  else {
+    save(ls, '$');
+  }
+  break;
+}
+```
+
+The `$()` body is parsed as a command (like backtick), terminated by `)` instead of `` ` ``. The result is run via `lushCmd_command` and `.stdout` is extracted with trailing newline stripped.
+
+### Parser (`lparser.c`)
+
+The parser needs to distinguish `$()` from `${}`:
+
+- `${expr}` → parse Lua expression, `tostring()` the result (existing behavior)
+- `$(cmd)` → parse as a command (like backtick), run it, extract `.stdout`, strip trailing `\n`
+
+### Lexer state (`llex.h`)
+
+Track whether the current interpolation is a subcommand (`$(`) or expression (`${`) so the parser knows which path to take.
+
+## Files affected
+
+| File | Changes |
+|------|---------|
+| `llex.h` | Add field to `LexState` to distinguish `$()` from `${}` |
+| `llex.c` | Extend `$` case in `read_command_body()`, handle `)` as command terminator |
+| `lparser.c` | Add subcommand path: parse as command, extract `.stdout` |
+
+## Related
+
+- Issue #25 — `$VAR` expansion in commands (also touches `$` handling in `read_command_body()`)
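As a companion to the Behavior and Nesting sections of issue #27, here is a small standalone sketch of the two string-level steps involved: finding the `)` that closes a `$(`, and stripping the trailing newline from captured stdout. Both function names are hypothetical and neither is lush code. The sketch assumes that exactly one trailing `\n` is removed (POSIX command substitution removes all of them) and it uses a depth counter instead of the recursive command parse the issue describes, ignoring quoting.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: `body` points just past "$(". Returns the offset of
   the ')' that closes the subcommand, or -1 if it is unterminated. Every
   '(' raises the depth, so nested $(...) is skipped over correctly.
   Quoted or escaped parentheses are NOT handled (simplifying assumption). */
static long find_subcmd_end(const char *body) {
  long depth = 1;
  long i;
  assert(body != NULL);
  for (i = 0; body[i] != '\0'; i++) {
    if (body[i] == '(') {
      depth++;
    }
    else if (body[i] == ')' && --depth == 0) {
      return i;
    }
  }
  return -1;
}

/* Hypothetical helper: remove a single trailing '\n' from `s` in place and
   return the resulting length. */
static size_t strip_trailing_newline(char *s) {
  size_t len;
  assert(s != NULL);
  len = strlen(s);
  if (len > 0 && s[len - 1] == '\n') {
    s[--len] = '\0';
  }
  return len;
}
```

Whether one newline or all trailing newlines should be stripped is worth settling in the issue itself, since output such as `printf 'a\n\n'` behaves differently under the two rules.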