Add issues #25, #26, #27; update #7 and #22 statuses

- #25: $VAR expansion in commands
- #26: Shell abbreviations and user-extensible builtins
- #27: $(cmd) subcommand syntax
- #22: Updated with implementation details (in progress)
- #7: Standardize status label
Cormac Shannon
2026-03-15 19:33:19 +00:00
parent 5135b16375
commit 08df164692
5 changed files with 353 additions and 13 deletions

@@ -13,6 +13,16 @@
`cmd` < "input.txt" -- redirect stdin from file
```
### Alternatively
```lua lush
`ls -l > output.txt` -- redirect stdout to file
`ls -l >> output.txt` -- append stdout to file
`cmd 2> err.txt` -- redirect stderr to file
`cmd 2>&1` -- merge stderr into stdout
`cmd < input.txt` -- redirect stdin from file
```
## Implementation
- Before `execvp()` in the child process, use `dup2()` to redirect file descriptors
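
A minimal sketch of that step, assuming a POSIX system. The helper name `run_redirected` and its parameters are hypothetical and not taken from the lush sources, and only stdout (`>` and `>>`) is covered:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the lush sources): run argv[0] with stdout
 * redirected to `path`. append != 0 models `>>`, otherwise `>`.
 * Returns the child's exit code, or -1 on failure. */
static int run_redirected(char *const argv[], const char *path, int append) {
  assert(argv != NULL && argv[0] != NULL && path != NULL);
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {  /* child process */
    int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
    int fd = open(path, flags, 0644);
    if (fd < 0) _exit(126);
    if (dup2(fd, STDOUT_FILENO) < 0) _exit(126);  /* stdout now refers to the file */
    close(fd);                                    /* the duplicate is enough */
    execvp(argv[0], argv);
    _exit(127);  /* only reached if execvp() failed */
  }
  int status;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Redirecting stderr or stdin would follow the same pattern with `STDERR_FILENO` or `STDIN_FILENO` as the `dup2()` target.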
@@ -26,3 +36,13 @@
## Challenge
`>` and `>>` conflict with Lua's greater-than and right-shift operators. Like piping, these operators must only be valid in command context. The parser can disambiguate because the left-hand side is a command expression.
## Alternatives
- Although shell redirection isn't available, simple redirection can be achieved with the `io` library:
```lua lush
file = io.open("OUTFILE.txt", "w")
file:write(`ls /`.stdout)
file:close()
```

@@ -1,21 +1,74 @@
# 22 — Implicit interactive commands (drop `!` prefix)
**Status:** open
**Status:** in progress
In the REPL, maybe we can get away with not needing the `!` prefix.
## Goal
Lua already attempts to run input as a statement. If that fails, it assumes it might be an expression (e.g. `1 + 2`) and wraps it in `return(...)`. If *that* also fails, we could try wrapping it with `!` prefix semantics as a third fallback.
In the REPL, treat unrecognized input as shell commands so users can type `git status` or `ls -la` without the `!` prefix.
## Execution order
## Implementation
1. Try as Lua statement
2. Try as Lua expression (`return ...`)
3. Try as shell command (interactive execution)
### REPL fallback chain (`loadline()` in `lua.c`)
## Considerations
The standard Lua REPL tries two compilations. We add a third:
- Ambiguity: `ls` is not valid Lua, so it would fall through to shell — this is the desired behaviour
- But `print` is valid Lua (it's a value) — so `print` alone wouldn't trigger shell
- What about `git status`? Not valid Lua, would correctly fall through to shell
- Error messages: if all three fail, which error do we show?
- Performance: three parse attempts per input line
1. Try as expression (`return <line>`)
2. Try as statement (with continuation for incomplete input)
3. **Try as shell command (`!<line>`)**
This is done by adding `addshellcmd()` which prepends `!` to the line and compiles it, triggering the existing lexer/parser path for interactive commands.
### The bare identifier problem
Multi-word input like `git status` fails both Lua paths and correctly falls through to shell. But single-word commands like `ls` compile as `return ls;` — a valid Lua expression that returns `nil` — so they never reach the shell fallback.
Worse: if `addreturn` is bypassed, `ls` as a statement is *incomplete* Lua (the parser expects `ls(...)` or `ls = ...`), so `multiline()` enters continuation mode and the REPL hangs waiting for more input.
### Fix: check `_G` before the expression path
Before trying `return <line>`, check if the line is a bare identifier (single word matching `[a-zA-Z_][a-zA-Z0-9_]*`). If it is, look it up in `_G`:
- **In `_G`** → proceed normally (`return print` shows the function, `return x` shows its value)
- **Not in `_G`** → skip expression and statement paths entirely, go straight to shell
This check runs before the line is compiled, so it needs no runtime error interception, no flags, and no special subroutines. It is about 10 lines of C in `loadline()`:
```c
line = lua_tostring(L, 1);
bare = isbareid(line);
if (bare && lua_getglobal(L, line) == LUA_TNIL) {
  lua_pop(L, 1);
  status = addshellcmd(L);  /* straight to shell */
}
else {
  if (bare) lua_pop(L, 1);
  /* normal expression → statement → shell fallback chain */
}
```
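
The snippet above relies on `isbareid()`, whose body is not shown. A minimal sketch of what it might look like, assuming the line has already been trimmed of surrounding whitespace and that reserved words are rejected here (the list is the Lua 5.4 one); everything else about this helper is an assumption:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Lua 5.4 reserved words; these are never treated as bare identifiers. */
static const char *const reserved[] = {
  "and", "break", "do", "else", "elseif", "end", "false", "for", "function",
  "goto", "if", "in", "local", "nil", "not", "or", "repeat", "return",
  "then", "true", "until", "while"
};

/* Hypothetical sketch of isbareid(): returns 1 if `line` is a single word
 * matching [a-zA-Z_][a-zA-Z0-9_]* (in the "C" locale) and is not a reserved
 * word, otherwise 0. No whitespace trimming is done here. */
static int isbareid(const char *line) {
  assert(line != NULL);
  if (!(isalpha((unsigned char)line[0]) || line[0] == '_')) return 0;
  for (const char *p = line + 1; *p != '\0'; p++)
    if (!(isalnum((unsigned char)*p) || *p == '_')) return 0;
  for (size_t i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
    if (strcmp(line, reserved[i]) == 0) return 0;
  return 1;
}
```

With this definition `ls` and `print` are bare identifiers, while `git status`, `!pwd`, and `if` are not.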
### Summary of behavior
| Input | Path taken | Result |
|-------|-----------|--------|
| `print("hi")` | expression | Lua expression |
| `x = 42` | statement | Lua statement |
| `if true then print("x") end` | statement | Lua statement |
| `print` | expression (in `_G`) | shows function value |
| `x` (after `x=42`) | expression (in `_G`) | shows 42 |
| `git status` | expression fails → statement fails → shell | runs git |
| `ls -la` | expression fails → statement fails → shell | runs ls |
| `echo hello` | expression fails → statement fails → shell | runs echo |
| `ls` | bare id, not in `_G` → shell | runs ls |
| `!pwd` | expression (already valid `!` syntax) | runs pwd |
### Edge cases
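- A global explicitly set to `nil` (`x = nil`) is indistinguishable from one that was never defined: assigning `nil` removes the key from `_G`, so `lua_getglobal()` returns `LUA_TNIL`. With the check as written, a bare `x` in that state goes to the shell rather than the expression path.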
- Lua keywords (`if`, `for`, `while`) are not valid identifiers in `isbareid` terms — they fail `addreturn`, enter `multiline` for continuation, and behave normally.
- `!cmd` still works — it compiles as a valid expression in the first path.
## Files
| File | Changes |
|------|---------|
| `lua.c` | `isbareid()`, `addshellcmd()`, modified `loadline()` |

@@ -0,0 +1,51 @@
# 25 — Environment variable expansion in commands
**Status:** open
## Problem
`$VAR` syntax works in Lua expressions but not inside commands:
```lua
print($PATH) -- works: $PATH is lexed as TK_ENVVAR, expanded via getenv()
!echo $PATH -- broken: $PATH is kept as literal text, passed unexpanded
```
The same applies to backtick commands:
```lua
local r = `echo $HOME` -- $HOME is literal, not expanded
```
## Why it happens
The lexer has two separate paths for `$`:
1. **In Lua code** (`llex.c:667`): `$NAME` → `TK_ENVVAR` token → compiled to `getenv("NAME")` call
2. **In command mode** (`llex.c:518`): `$` without `{` is saved as a literal character in the command string buffer
Only `${expr}` interpolation works in commands — it enters a Lua expression context and returns the result inline. Bare `$NAME` is passed through verbatim.
Since commands are executed via `fork`/`exec` (not through `/bin/sh`), there is no shell to expand `$PATH` at runtime either.
## Expected behavior
`$VAR` should expand in commands the same way it does in Lua expressions:
```lua
!echo $HOME -- should print /Users/nik
!echo $PATH -- should print the PATH value
local r = `echo $HOME` -- r.stdout should contain /Users/nik
```
## Workaround
Use `${expr}` interpolation:
```lua
!echo ${$HOME}
```
## Possible fix
In `read_command_body()` (`llex.c:518`), when `$` is followed by an identifier, emit the current command fragment and produce a `TK_ENVVAR`-equivalent expansion inline — similar to how `${expr}` already splits the command string to insert interpolated values.
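
The intended effect can be modeled at the string level. This is only a sketch: the real fix would emit tokens from the lexer rather than rewrite a buffer, the helper name `expand_envvars` is hypothetical, and treating an unset variable as empty text is an assumption that the issue does not settle.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static int isnamestart(int c) { return isalpha(c) || c == '_'; }
static int isnamechar(int c) { return isalnum(c) || c == '_'; }

/* Hypothetical helper: copy `cmd` into `out`, replacing each $NAME with the
 * value of getenv("NAME") (nothing if unset). "${" and any other '$' are
 * copied through unchanged. Returns 0 on success, -1 if `out` is too small
 * or a variable name does not fit the local buffer. */
static int expand_envvars(const char *cmd, char *out, size_t outsz) {
  assert(cmd != NULL && out != NULL);
  size_t o = 0;
  const char *p = cmd;
  while (*p != '\0') {
    if (p[0] == '$' && isnamestart((unsigned char)p[1])) {
      char name[128];
      size_t n = 0;
      p++;  /* skip '$' */
      while (isnamechar((unsigned char)*p)) {
        if (n + 1 >= sizeof name) return -1;
        name[n++] = *p++;
      }
      name[n] = '\0';
      const char *val = getenv(name);
      size_t len = val ? strlen(val) : 0;
      if (o + len >= outsz) return -1;  /* keep room for the final NUL */
      if (len > 0) memcpy(out + o, val, len);
      o += len;
    } else {
      if (o + 1 >= outsz) return -1;
      out[o++] = *p++;
    }
  }
  if (o >= outsz) return -1;
  out[o] = '\0';
  return 0;
}
```

Because `${` is left untouched, the existing `${expr}` interpolation path would still see its input unchanged.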

@@ -0,0 +1,87 @@
# 26 — Shell abbreviations and user-extensible builtins
**Status:** open
## Problem
Users cannot define shell abbreviations or custom commands. For example, there's no way to alias `gs` to `git status` or add a custom `mkcd` that creates a directory and cd's into it.
## Current architecture
Commit `f88b1795` moved all shell internals (`__command`, `__interactive`, `__getenv`, `__setenv`) out of `_G` and into a hidden registry table at `LUA_RIDX_LUSH`. Builtins (`cd`, `exec`, `umask`) live at `LUA_RIDX_LUSH.builtins`. The dispatch path in `try_builtin()` (`lcmd.c:875`) already does a table lookup:
```c
lua_getfield(L, -1, "builtins");    /* get builtins table */
lua_getfield(L, -1, pa->argv[0]);   /* look up command name */
```
This means builtins are **already data-driven** — if a function appears in the table, it gets called. But the table is invisible to user code, so there's no way to add entries.
## Proposal: expose `lush.builtins` to user code
Expose the `LUA_RIDX_LUSH` table (or a curated view of it) as a global like `lush` or `__lush`. Users extend the shell by adding functions to `lush.builtins`:
```lua
-- abbreviation
lush.builtins.gs = function(cmd)
  !git status
end
-- custom builtin
lush.builtins.mkcd = function(cmd, dir)
  !mkdir -p ${dir}
  !cd ${dir}
end
-- override (with access to original)
local orig_cd = lush.builtins.cd
lush.builtins.cd = function(cmd, dir)
  orig_cd(cmd, dir)
  print("now in: " .. $PWD)
end
```
This requires no new dispatch mechanism — `try_builtin()` already does the right thing. The only change is making the table accessible.
## Design considerations
### What to expose
Option A: Expose the entire `LUA_RIDX_LUSH` table as `lush`:
```lua
lush.builtins -- cd, exec, umask, user additions
lush.command -- __command (backtick execution)
lush.interactive -- __interactive (! execution)
lush.getenv -- __getenv ($VAR read)
lush.setenv -- __setenv ($VAR write)
```
This gives power users access to the shell primitives directly, which is useful for building abstractions. But it also means users could break internals.
Option B: Expose only `lush.builtins` — safer, sufficient for the abbreviation use case.
### Relationship to f88b1795
That commit deliberately hid these from `_G` to avoid polluting the namespace. Exposing a single `lush` global is a middle ground: one clean entry point instead of scattered `__double_underscore` globals, and users can only extend via a structured API rather than accidentally shadowing internals.
### Abbreviations vs builtins
Shell abbreviations (fish-style text expansion) and builtins (function dispatch) are different features in traditional shells. But in lush, a builtin that calls `!git status` achieves the same effect as an abbreviation — the `!` prefix runs the command interactively with terminal inheritance. So a single mechanism covers both use cases.
### Builtin protocol
Current builtins receive `(cmd_name, arg1, arg2, ...)` and return a result table `{code, stdout, stderr}`. User builtins should follow the same protocol. Document this as the contract.
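
To make the contract concrete, here is a plain-C model of the same idea: a name-to-function table, a lookup on the command name, and a result carrying `code`, stdout, and stderr. It is an analogy only. The real dispatch goes through the Lua registry table, and every name below (`CmdResult`, `register_builtin`, `try_builtin_sketch`, `hello_builtin`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the {code, stdout, stderr} result table described above. */
typedef struct {
  int code;
  const char *out;  /* stdout text */
  const char *err;  /* stderr text */
} CmdResult;

/* A builtin receives argv with argv[0] = command name: (cmd_name, arg1, ...). */
typedef CmdResult (*BuiltinFn)(int argc, const char *const argv[]);

typedef struct { const char *name; BuiltinFn fn; } BuiltinEntry;

#define MAX_BUILTINS 32
static BuiltinEntry builtins[MAX_BUILTINS];
static size_t nbuiltins = 0;

/* Add or override an entry; returns 0 on success, -1 if the table is full. */
static int register_builtin(const char *name, BuiltinFn fn) {
  assert(name != NULL && fn != NULL);
  for (size_t i = 0; i < nbuiltins; i++)
    if (strcmp(builtins[i].name, name) == 0) { builtins[i].fn = fn; return 0; }
  if (nbuiltins == MAX_BUILTINS) return -1;
  builtins[nbuiltins].name = name;
  builtins[nbuiltins].fn = fn;
  nbuiltins++;
  return 0;
}

/* Look up argv[0]; returns 1 and fills *res if a builtin handled it, else 0. */
static int try_builtin_sketch(int argc, const char *const argv[], CmdResult *res) {
  assert(argc > 0 && argv != NULL && res != NULL);
  for (size_t i = 0; i < nbuiltins; i++)
    if (strcmp(builtins[i].name, argv[0]) == 0) {
      *res = builtins[i].fn(argc, argv);
      return 1;
    }
  return 0;
}

/* Example user builtin: always succeeds and reports a fixed string. */
static CmdResult hello_builtin(int argc, const char *const argv[]) {
  (void)argc; (void)argv;
  CmdResult r = {0, "hello\n", ""};
  return r;
}
```

The point of the model is that registering a function is the only step needed to make a new command dispatchable; the lookup code does not change.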
## Files likely affected
| File | Changes |
|------|---------|
| `linit.c` | Expose `LUA_RIDX_LUSH` table as `lush` global |
| `lcmd.c` | No changes — `try_builtin()` already works via table lookup |
| `lbuiltin.c` | No changes — existing builtins stay as-is |
## Open questions
- Name: `lush`, `shell`, `__shell`? `lush` is clean and matches the project name.
- Should `lush.command` / `lush.interactive` be exposed or kept hidden?
- Should there be a `builtin` command to bypass user overrides (like bash's `builtin cd`)?

@@ -0,0 +1,129 @@
# 27 — Subcommand syntax in commands
**Status:** open
## Problem
Running a command inside another command is unnecessarily verbose:
```lua
`ls ${`pwd`.stdout}`
```
The inner backtick returns a result table (`{code, stdout, stderr}`), so `.stdout` is required to extract the string. This is clunky compared to other shells.
## Proposed syntax: `$(cmd)`
Use `$()` for inline subcommands, consistent with the existing `$` interpolation family:
```lua
`ls $(pwd)`
!echo $(whoami)
`tar -czf $(date +%F).tar.gz src/`
```
The table below shows how it fits alongside the existing forms:
| Syntax | Context | Meaning |
|--------|---------|---------|
| `$VAR` | Lua code | `getenv("VAR")` |
| `${expr}` | command body | interpolate Lua expression |
| **`$(cmd)`** | command body | **run subcommand, insert stdout** |
### Comparison with other shells
| Shell | Syntax |
|-------|--------|
| bash | `$(cmd)` |
| fish | `(cmd)` |
| lush current | `` `ls ${`pwd`.stdout}` `` |
| **lush proposed** | `` `ls $(pwd)` `` |
## Behavior
`$(cmd)` runs a shell command (same as backtick) and inserts its stdout into the outer command, with trailing newline stripped.
```lua
`ls $(pwd)` -- list the directory printed by pwd
!echo $(whoami)@$(hostname) -- multiple subcommands
`echo $(ls $(pwd))` -- nested: inner runs first
```
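
The trailing-newline rule can be sketched as a small helper. The name `strip_trailing_newline` is hypothetical, and removing exactly one `\n` is an assumption based on the wording above:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: remove a single trailing '\n' from `s` in place and
 * return the new length. Strings without a trailing newline are unchanged. */
static size_t strip_trailing_newline(char *s) {
  assert(s != NULL);
  size_t len = strlen(s);
  if (len > 0 && s[len - 1] == '\n') s[--len] = '\0';
  return len;
}
```

POSIX shells remove every trailing newline in command substitution, not just one, so matching that behaviour would mean looping here instead of stripping once.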
`$(cmd)` is **not** for Lua expressions — that's what `${expr}` is for. `$()` runs a shell command; `${}` evaluates Lua.
### Nesting
Nested subcommands are supported:
```lua
`echo $(ls $(pwd))`
```
This works naturally because `$(cmd)` enters command parsing, which can itself contain `$()`.
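
As a stand-alone illustration of why nesting needs either recursion or a depth count, the sketch below finds the `)` that closes a `$(`. It is not the proposed lexer change. The name `find_subcmd_end` is hypothetical, and quoting, escapes, and bare parentheses are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: `s` points just past a "$(". Returns the index of the
 * ')' that closes it, or -1 if the subcommand is unterminated.
 * Only a nested "$(" raises the depth. */
static long find_subcmd_end(const char *s) {
  assert(s != NULL);
  int depth = 1;
  for (size_t i = 0; s[i] != '\0'; i++) {
    if (s[i] == '$' && s[i + 1] == '(') {
      depth++;
      i++;  /* skip the '(' as well */
    } else if (s[i] == ')') {
      if (--depth == 0) return (long)i;
    }
  }
  return -1;
}
```

For the body of `` `echo $(ls $(pwd))` ``, the text after the first `$(` is ``ls $(pwd))` ``, and the match is the second `)`, not the first.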
## Syntax clash analysis
Currently in `read_command_body()` (`llex.c:508`), `$` followed by anything other than `{` saves a literal `$`. Adding `(` as a second trigger alongside `{` is a minimal change. No conflicts:
- `$VAR` in command mode is currently a literal `$` + `VAR` (see issue #25) — not affected
- `${expr}` continues to work unchanged
- Literal `$(` in commands is not meaningful today (falls through as literal text)
## Implementation sketch
### Lexer (`llex.c`)
In `read_command_body()`, extend the `$` case to also trigger on `(`. This starts a new command body parse (not a Lua expression like `${}`):
```c
case '$': {
  next(ls);
  if (ls->current == '{') {
    next(ls);  /* skip '{' */
    /* existing ${expr} interpolation path */
    seminfo->ts = luaX_newstring(ls, luaZ_buffer(ls->buff),
                                 luaZ_bufflen(ls->buff));
    ls->saved_cmd_mode = ls->cmd_mode;
    return interactive ? TK_INTERACTIVE : TK_COMMAND;
  }
  else if (ls->current == '(') {
    next(ls);  /* skip '(' */
    /* subcommand: start a new command parse, closed by ')' */
    seminfo->ts = luaX_newstring(ls, luaZ_buffer(ls->buff),
                                 luaZ_bufflen(ls->buff));
    ls->saved_cmd_mode = ls->cmd_mode;
    /* signal parser that this is a subcommand, not a Lua expr */
    ...
    return interactive ? TK_INTERACTIVE : TK_COMMAND;
  }
  else {
    save(ls, '$');
  }
  break;
}
```
The `$()` body is parsed as a command (like backtick), terminated by `)` instead of `` ` ``. The result is run via `lushCmd_command` and `.stdout` is extracted with trailing newline stripped.
### Parser (`lparser.c`)
The parser needs to distinguish `$()` from `${}`:
- `${expr}` → parse Lua expression, `tostring()` the result (existing behavior)
- `$(cmd)` → parse as a command (like backtick), run it, extract `.stdout`, strip trailing `\n`
### Lexer state (`llex.h`)
Track whether the current interpolation is a subcommand (`$(`) or expression (`${`) so the parser knows which path to take.
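
One possible shape for that state is a small enum. The names `InterpKind`, `classify_interp`, and `interp_terminator` are hypothetical, since the issue only says a field is added to `LexState`:

```c
#include <assert.h>

/* Hypothetical names; the actual LexState field is not specified above. */
typedef enum {
  INTERP_NONE,   /* '$' followed by anything else: literal '$' */
  INTERP_EXPR,   /* "${": Lua expression, closed by '}' */
  INTERP_SUBCMD  /* "$(": subcommand, closed by ')' */
} InterpKind;

/* Classify the character that follows '$' inside a command body. */
static InterpKind classify_interp(int next_char) {
  switch (next_char) {
    case '{': return INTERP_EXPR;
    case '(': return INTERP_SUBCMD;
    default:  return INTERP_NONE;
  }
}

/* The terminator the lexer would look for, or 0 when there is none. */
static char interp_terminator(InterpKind kind) {
  assert(kind <= INTERP_SUBCMD);
  switch (kind) {
    case INTERP_EXPR:   return '}';
    case INTERP_SUBCMD: return ')';
    default:            return 0;
  }
}
```

Storing the kind rather than a boolean leaves room for further `$` forms without another field.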
## Files affected
| File | Changes |
|------|---------|
| `llex.h` | Add field to `LexState` to distinguish `$()` from `${}` |
| `llex.c` | Extend `$` case in `read_command_body()`, handle `)` as command terminator |
| `lparser.c` | Add subcommand path: parse as command, extract `.stdout` |
## Related
- Issue #25 — `$VAR` expansion in commands (also touches `$` handling in `read_command_body()`)