# 23 — General cleanup and refactor **Status:** open Housekeeping pass over the codebase. --- ## 1. Function naming conventions ### Current state Lua's convention: every non-static function in a file uses a common prefix (e.g. `luaX_` for llex.c, `luaK_` for lcode.c, `luaB_` for lbaselib.c). Our custom files don't follow a consistent prefix scheme: | File | Current names | Issue | | ------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | | `lcmd.c` | `luaB_command`, `luaB_interactive` | Uses `luaB_` prefix, which belongs to `lbaselib.c` | | `linit.c` | `luaB_getenv`, `luaB_setenv` | Same — `luaB_` is for the base library | | `lbuiltin.c` | `builtin_cd`, `builtin_exec`, `builtin_umask`, `luaopen_builtins` | `builtin_` prefix is fine for statics; `luaopen_builtins` follows the `luaopen_` convention correctly | ### Proposed fix Introduce a `luaCmd_` prefix for lcmd.c exports and `luaEnv_` (or keep in lcmd) for env functions: - `luaB_command` → `luaCmd_command` (lcmd.c) - `luaB_interactive` → `luaCmd_interactive` (lcmd.c) - `luaB_getenv` → `luaCmd_getenv` (linit.c, or move to lcmd.c) - `luaB_setenv` → `luaCmd_setenv` (linit.c, or move to lcmd.c) Update `lcmd.h` exports and all call sites (`linit.c`). Static helper functions in lcmd.c (`expand_tilde`, `parse_argv`, `exec_pipeline`, etc.) don't need prefixes — they're already file-scoped. This is consistent with Lua's approach. ### Files to change - `lcmd.c` — rename `luaB_command`, `luaB_interactive` - `lcmd.h` — update declarations - `linit.c` — rename `luaB_getenv`, `luaB_setenv`; update `opencommand()` --- ## 2. File organisation ### Current state The custom code lives in two files: - `**lcmd.c`** (~1100 lines) — command execution, argv parsing, glob/tilde/brace expansion, pipeline execution, signal handling, dynamic buffer management, builtin dispatch - `**lbuiltin.c`** (~160 lines) — cd, exec, umask implementations Plus env var functions live in `**linit.c**` rather than in a dedicated file. ### Questions to resolve - **Should `luaB_getenv`/`luaB_setenv` move out of linit.c?** They're the only non-init logic in that file. Could move to `lcmd.c` (they're command-adjacent) or a new `lenv.c`. - **Is lcmd.c too large?** At ~1100 lines it bundles argv parsing, shell expansion, pipeline execution, and signal handling. Consider whether expansion logic (`expand_tilde`, `expand_braces`, `glob_token`, `expand_argv`) should be split into a separate `lexpand.c`. - `**try_builtin` in lcmd.c** looks up `__builtins` from `_G` at runtime. This couples lcmd.c to lbuiltin.c's global registration. Could use registry instead. ### Recommendation Keep two files for now. The code is manageable. Move env functions into lcmd.c (or a small lenv.c) to keep linit.c clean. --- ## 3. Global namespace (`_G`) cleanup ### Current state Five entries are added to `_G`: | Global | Set in | Used by | User-callable? | | --------------- | ---------- | ----------------------------- | ------------------------------------------------------------- | | `__command` | linit.c | parser (backtick syntax) | No — compiler-generated calls only | | `__interactive` | linit.c | parser (`!cmd` syntax) | No — compiler-generated calls only | | `__getenv` | linit.c | parser (`$VAR` syntax) | No — compiler-generated calls only | | `__setenv` | linit.c | parser (`$VAR = expr` syntax) | No — compiler-generated calls only | | `__builtins` | lbuiltin.c | lcmd.c `try_builtin()` | Possibly — user might want to call `__builtins.cd()` directly | All four `__`-prefixed functions are **internal implementation details** — the parser compiles syntactic sugar into calls to these. Users should never need to call them directly. Exposing them in `_G` means: 1. Users can accidentally overwrite them, breaking shell syntax 2. They pollute `_G` with implementation details 3. They show up in `pairs(_G)` / tab completion ### Option A: Move to Lua registry (recommended) Store internal functions in `LUA_REGISTRYINDEX` instead of `_G`. The parser already looks up globals via `_ENV` — we'd change it to emit registry lookups instead. **Pros:** Clean `_G`, functions can't be accidentally overwritten **Cons:** Requires parser changes (how `commandexp()`, `codeenvget()`, `codeenvset()`, `interactivestat()` emit lookups) Implementation sketch: - In `linit.c`: store functions in registry with known string keys - In `lparser.c`: change `singlevar()` calls to registry lookups via `luaK_codek` + `OP_GETFIELD` on the registry ### Option B: Group under `__lush` table Put all internal functions in a single `__lush` table in `_G`: ```lua __lush = { command = function(...) end, interactive = function(...) end, getenv = function(...) end, setenv = function(...) end, builtins = { cd = ..., exec = ..., umask = ... }, } ``` **Pros:** Reduces `_G` pollution to 1 entry; still inspectable for debugging **Cons:** Still in `_G`; still overwritable Implementation: - In `linit.c`: create table, register functions as fields, `lua_setglobal(L, "__lush")` - In `lparser.c`: change generated code from `__command(...)` to `__lush.command(...)` - In `lcmd.c`: change `try_builtin` to look up `__lush.builtins` instead of `__builtins` ### Option C: Keep as-is but protect Use `__newindex` metamethod on `_G` to prevent overwriting `__`-prefixed keys, or mark them as read-only. **Pros:** Minimal code change **Cons:** Still pollutes `_G`; metamethod approach is fragile ### Recommendation **Option B** is the best balance — simple to implement, reduces global pollution from 5 entries to 1, keeps things debuggable. Option A is cleaner but requires more invasive parser changes. --- ## 4. `__builtins` — user-facing or internal? ### Current state `__builtins` is a table with `cd`, `exec`, `umask` functions. It's looked up by `try_builtin()` in lcmd.c at runtime via `lua_getglobal(L, "__builtins")`. ### Questions - Should users be able to add their own builtins? (e.g. `__builtins.pushd = function(cmd, dir) ... end`) - Should users be able to override builtins? (e.g. `__builtins.cd = my_custom_cd`) If yes to either, `__builtins` should remain accessible (but possibly under `__lush.builtins`). If no, it should be in the registry. ### Recommendation Keep `__builtins` user-accessible (under `__lush.builtins` if we go with Option B above). Allowing users to extend/override builtins is a useful shell feature. --- ## Implementation plan 1. **Rename functions** — change `luaB_` prefix to `luaCmd_` in lcmd.c/h and linit.c 2. **Move env functions** — relocate `luaB_getenv`/`luaB_setenv` from linit.c into lcmd.c 3. **Consolidate globals** — implement Option B (group under `__lush` table) 4. **Update parser** — change generated code to reference `__lush.command` etc. 5. **Update lcmd.c** — change `try_builtin` to use `__lush.builtins` 6. **Test** — run existing test suite, verify backtick/env/interactive/builtin functionality