From 34bfabccbd29a1489e18f5ecc7c412793dd97e1f Mon Sep 17 00:00:00 2001 From: Cormac Shannon <> Date: Thu, 12 Mar 2026 19:26:05 +0000 Subject: [PATCH] Expand issue #23 with detailed cleanup analysis; close issues #5 and #24 Issue #23 now covers function naming conventions, file organisation, global namespace cleanup options, and a concrete implementation plan. --- issues/05-env-var-access.md | 2 +- issues/23-cleanup-refactor.md | 165 ++++++++++++++++++++++++++-- issues/24-test-lua-compatibility.md | 2 +- 3 files changed, 158 insertions(+), 11 deletions(-) diff --git a/issues/05-env-var-access.md b/issues/05-env-var-access.md index 3ef5655e..92ae6682 100644 --- a/issues/05-env-var-access.md +++ b/issues/05-env-var-access.md @@ -1,6 +1,6 @@ # 05 — Implement environment variable access (read/write) -**Status:** open +**Status:** done **Blocked by:** #01 Add syntax for reading and writing environment variables directly from Lua. diff --git a/issues/23-cleanup-refactor.md b/issues/23-cleanup-refactor.md index 7199cb0a..ed9e7a03 100644 --- a/issues/23-cleanup-refactor.md +++ b/issues/23-cleanup-refactor.md @@ -4,17 +4,164 @@ Housekeeping pass over the codebase. -## Function names +--- -Function names should be reviewed and improved for consistency and clarity. +## 1. Function naming conventions -## Global namespace (`_G`) +### Current state -Review what's exposed in `_G`. Do we need to expose everything? Candidates for cleanup: +Lua's convention: every non-static function in a file uses a common prefix (e.g. `luaX_` for llex.c, `luaK_` for lcode.c, `luaB_` for lbaselib.c). -- `__command` — internal, could be hidden -- `__interactive` — internal, could be hidden -- `__getenv` / `__setenv` — internal, could be hidden -- `__builtins` — should this be user-facing or internal? +Our custom files don't follow a consistent prefix scheme: + + +| File | Current names | Issue | +| ------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | +| `lcmd.c` | `luaB_command`, `luaB_interactive` | Uses `luaB_` prefix, which belongs to `lbaselib.c` | +| `linit.c` | `luaB_getenv`, `luaB_setenv` | Same — `luaB_` is for the base library | +| `lbuiltin.c` | `builtin_cd`, `builtin_exec`, `builtin_umask`, `luaopen_builtins` | `builtin_` prefix is fine for statics; `luaopen_builtins` follows the `luaopen_` convention correctly | + + +### Proposed fix + +Introduce a `luaCmd_` prefix for lcmd.c exports and `luaEnv_` (or keep in lcmd) for env functions: + +- `luaB_command` → `luaCmd_command` (lcmd.c) +- `luaB_interactive` → `luaCmd_interactive` (lcmd.c) +- `luaB_getenv` → `luaCmd_getenv` (linit.c, or move to lcmd.c) +- `luaB_setenv` → `luaCmd_setenv` (linit.c, or move to lcmd.c) + +Update `lcmd.h` exports and all call sites (`linit.c`). + +Static helper functions in lcmd.c (`expand_tilde`, `parse_argv`, `exec_pipeline`, etc.) don't need prefixes — they're already file-scoped. This is consistent with Lua's approach. + +### Files to change + +- `lcmd.c` — rename `luaB_command`, `luaB_interactive` +- `lcmd.h` — update declarations +- `linit.c` — rename `luaB_getenv`, `luaB_setenv`; update `opencommand()` + +--- + +## 2. File organisation + +### Current state + +The custom code lives in two files: + +- `**lcmd.c`** (~1100 lines) — command execution, argv parsing, glob/tilde/brace expansion, pipeline execution, signal handling, dynamic buffer management, builtin dispatch +- `**lbuiltin.c`** (~160 lines) — cd, exec, umask implementations + +Plus env var functions live in `**linit.c**` rather than in a dedicated file. + +### Questions to resolve + +- **Should `luaB_getenv`/`luaB_setenv` move out of linit.c?** They're the only non-init logic in that file. Could move to `lcmd.c` (they're command-adjacent) or a new `lenv.c`. +- **Is lcmd.c too large?** At ~1100 lines it bundles argv parsing, shell expansion, pipeline execution, and signal handling. Consider whether expansion logic (`expand_tilde`, `expand_braces`, `glob_token`, `expand_argv`) should be split into a separate `lexpand.c`. +- `**try_builtin` in lcmd.c** looks up `__builtins` from `_G` at runtime. This couples lcmd.c to lbuiltin.c's global registration. Could use registry instead. + +### Recommendation + +Keep two files for now. The code is manageable. Move env functions into lcmd.c (or a small lenv.c) to keep linit.c clean. + +--- + +## 3. Global namespace (`_G`) cleanup + +### Current state + +Five entries are added to `_G`: + + +| Global | Set in | Used by | User-callable? | +| --------------- | ---------- | ----------------------------- | ------------------------------------------------------------- | +| `__command` | linit.c | parser (backtick syntax) | No — compiler-generated calls only | +| `__interactive` | linit.c | parser (`!cmd` syntax) | No — compiler-generated calls only | +| `__getenv` | linit.c | parser (`$VAR` syntax) | No — compiler-generated calls only | +| `__setenv` | linit.c | parser (`$VAR = expr` syntax) | No — compiler-generated calls only | +| `__builtins` | lbuiltin.c | lcmd.c `try_builtin()` | Possibly — user might want to call `__builtins.cd()` directly | + + +All four `__`-prefixed functions are **internal implementation details** — the parser compiles syntactic sugar into calls to these. Users should never need to call them directly. Exposing them in `_G` means: + +1. Users can accidentally overwrite them, breaking shell syntax +2. They pollute `_G` with implementation details +3. They show up in `pairs(_G)` / tab completion + +### Option A: Move to Lua registry (recommended) + +Store internal functions in `LUA_REGISTRYINDEX` instead of `_G`. The parser already looks up globals via `_ENV` — we'd change it to emit registry lookups instead. + +**Pros:** Clean `_G`, functions can't be accidentally overwritten +**Cons:** Requires parser changes (how `commandexp()`, `codeenvget()`, `codeenvset()`, `interactivestat()` emit lookups) + +Implementation sketch: + +- In `linit.c`: store functions in registry with known string keys +- In `lparser.c`: change `singlevar()` calls to registry lookups via `luaK_codek` + `OP_GETFIELD` on the registry + +### Option B: Group under `__lush` table + +Put all internal functions in a single `__lush` table in `_G`: + +```lua +__lush = { + command = function(...) end, + interactive = function(...) end, + getenv = function(...) end, + setenv = function(...) end, + builtins = { cd = ..., exec = ..., umask = ... }, +} +``` + +**Pros:** Reduces `_G` pollution to 1 entry; still inspectable for debugging +**Cons:** Still in `_G`; still overwritable + +Implementation: + +- In `linit.c`: create table, register functions as fields, `lua_setglobal(L, "__lush")` +- In `lparser.c`: change generated code from `__command(...)` to `__lush.command(...)` +- In `lcmd.c`: change `try_builtin` to look up `__lush.builtins` instead of `__builtins` + +### Option C: Keep as-is but protect + +Use `__newindex` metamethod on `_G` to prevent overwriting `__`-prefixed keys, or mark them as read-only. + +**Pros:** Minimal code change +**Cons:** Still pollutes `_G`; metamethod approach is fragile + +### Recommendation + +**Option B** is the best balance — simple to implement, reduces global pollution from 5 entries to 1, keeps things debuggable. Option A is cleaner but requires more invasive parser changes. + +--- + +## 4. `__builtins` — user-facing or internal? + +### Current state + +`__builtins` is a table with `cd`, `exec`, `umask` functions. It's looked up by `try_builtin()` in lcmd.c at runtime via `lua_getglobal(L, "__builtins")`. + +### Questions + +- Should users be able to add their own builtins? (e.g. `__builtins.pushd = function(cmd, dir) ... end`) +- Should users be able to override builtins? (e.g. `__builtins.cd = my_custom_cd`) + +If yes to either, `__builtins` should remain accessible (but possibly under `__lush.builtins`). +If no, it should be in the registry. + +### Recommendation + +Keep `__builtins` user-accessible (under `__lush.builtins` if we go with Option B above). Allowing users to extend/override builtins is a useful shell feature. + +--- + +## Implementation plan + +1. **Rename functions** — change `luaB_` prefix to `luaCmd_` in lcmd.c/h and linit.c +2. **Move env functions** — relocate `luaB_getenv`/`luaB_setenv` from linit.c into lcmd.c +3. **Consolidate globals** — implement Option B (group under `__lush` table) +4. **Update parser** — change generated code to reference `__lush.command` etc. +5. **Update lcmd.c** — change `try_builtin` to use `__lush.builtins` +6. **Test** — run existing test suite, verify backtick/env/interactive/builtin functionality -Consider whether internal functions should live in a single `__lush` table or use the registry instead of polluting globals. diff --git a/issues/24-test-lua-compatibility.md b/issues/24-test-lua-compatibility.md index cbad24dc..13c5d58b 100644 --- a/issues/24-test-lua-compatibility.md +++ b/issues/24-test-lua-compatibility.md @@ -1,6 +1,6 @@ # 24 — Test against major Lua projects -**Status:** open +**Status:** done ## Problem