Issue #23 now covers function naming conventions, file organisation, global namespace cleanup options, and a concrete implementation plan.
168 lines
7.5 KiB
Markdown
168 lines
7.5 KiB
Markdown
# 23 — General cleanup and refactor
|
|
|
|
**Status:** open
|
|
|
|
Housekeeping pass over the codebase.
|
|
|
|
---
|
|
|
|
## 1. Function naming conventions
|
|
|
|
### Current state
|
|
|
|
Lua's convention: every non-static function in a file uses a common prefix (e.g. `luaX_` for llex.c, `luaK_` for lcode.c, `luaB_` for lbaselib.c).
|
|
|
|
Our custom files don't follow a consistent prefix scheme:
|
|
|
|
|
|
| File | Current names | Issue |
|
|
| ------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
|
|
| `lcmd.c` | `luaB_command`, `luaB_interactive` | Uses `luaB_` prefix, which belongs to `lbaselib.c` |
|
|
| `linit.c` | `luaB_getenv`, `luaB_setenv` | Same — `luaB_` is for the base library |
|
|
| `lbuiltin.c` | `builtin_cd`, `builtin_exec`, `builtin_umask`, `luaopen_builtins` | `builtin_` prefix is fine for statics; `luaopen_builtins` follows the `luaopen_` convention correctly |
|
|
|
|
|
|
### Proposed fix
|
|
|
|
Introduce a `luaCmd_` prefix for lcmd.c exports and `luaEnv_` (or keep in lcmd) for env functions:
|
|
|
|
- `luaB_command` → `luaCmd_command` (lcmd.c)
|
|
- `luaB_interactive` → `luaCmd_interactive` (lcmd.c)
|
|
- `luaB_getenv` → `luaCmd_getenv` (linit.c, or move to lcmd.c)
|
|
- `luaB_setenv` → `luaCmd_setenv` (linit.c, or move to lcmd.c)
|
|
|
|
Update `lcmd.h` exports and all call sites (`linit.c`).
|
|
|
|
Static helper functions in lcmd.c (`expand_tilde`, `parse_argv`, `exec_pipeline`, etc.) don't need prefixes — they're already file-scoped. This is consistent with Lua's approach.
|
|
|
|
### Files to change
|
|
|
|
- `lcmd.c` — rename `luaB_command`, `luaB_interactive`
|
|
- `lcmd.h` — update declarations
|
|
- `linit.c` — rename `luaB_getenv`, `luaB_setenv`; update `opencommand()`
|
|
|
|
---
|
|
|
|
## 2. File organisation
|
|
|
|
### Current state
|
|
|
|
The custom code lives in two files:
|
|
|
|
- `**lcmd.c`** (~1100 lines) — command execution, argv parsing, glob/tilde/brace expansion, pipeline execution, signal handling, dynamic buffer management, builtin dispatch
|
|
- `**lbuiltin.c`** (~160 lines) — cd, exec, umask implementations
|
|
|
|
Plus env var functions live in `**linit.c**` rather than in a dedicated file.
|
|
|
|
### Questions to resolve
|
|
|
|
- **Should `luaB_getenv`/`luaB_setenv` move out of linit.c?** They're the only non-init logic in that file. Could move to `lcmd.c` (they're command-adjacent) or a new `lenv.c`.
|
|
- **Is lcmd.c too large?** At ~1100 lines it bundles argv parsing, shell expansion, pipeline execution, and signal handling. Consider whether expansion logic (`expand_tilde`, `expand_braces`, `glob_token`, `expand_argv`) should be split into a separate `lexpand.c`.
|
|
- `**try_builtin` in lcmd.c** looks up `__builtins` from `_G` at runtime. This couples lcmd.c to lbuiltin.c's global registration. Could use registry instead.
|
|
|
|
### Recommendation
|
|
|
|
Keep two files for now. The code is manageable. Move env functions into lcmd.c (or a small lenv.c) to keep linit.c clean.
|
|
|
|
---
|
|
|
|
## 3. Global namespace (`_G`) cleanup
|
|
|
|
### Current state
|
|
|
|
Five entries are added to `_G`:
|
|
|
|
|
|
| Global | Set in | Used by | User-callable? |
|
|
| --------------- | ---------- | ----------------------------- | ------------------------------------------------------------- |
|
|
| `__command` | linit.c | parser (backtick syntax) | No — compiler-generated calls only |
|
|
| `__interactive` | linit.c | parser (`!cmd` syntax) | No — compiler-generated calls only |
|
|
| `__getenv` | linit.c | parser (`$VAR` syntax) | No — compiler-generated calls only |
|
|
| `__setenv` | linit.c | parser (`$VAR = expr` syntax) | No — compiler-generated calls only |
|
|
| `__builtins` | lbuiltin.c | lcmd.c `try_builtin()` | Possibly — user might want to call `__builtins.cd()` directly |
|
|
|
|
|
|
All four `__`-prefixed functions are **internal implementation details** — the parser compiles syntactic sugar into calls to these. Users should never need to call them directly. Exposing them in `_G` means:
|
|
|
|
1. Users can accidentally overwrite them, breaking shell syntax
|
|
2. They pollute `_G` with implementation details
|
|
3. They show up in `pairs(_G)` / tab completion
|
|
|
|
### Option A: Move to Lua registry (recommended)
|
|
|
|
Store internal functions in `LUA_REGISTRYINDEX` instead of `_G`. The parser already looks up globals via `_ENV` — we'd change it to emit registry lookups instead.
|
|
|
|
**Pros:** Clean `_G`, functions can't be accidentally overwritten
|
|
**Cons:** Requires parser changes (how `commandexp()`, `codeenvget()`, `codeenvset()`, `interactivestat()` emit lookups)
|
|
|
|
Implementation sketch:
|
|
|
|
- In `linit.c`: store functions in registry with known string keys
|
|
- In `lparser.c`: change `singlevar()` calls to registry lookups via `luaK_codek` + `OP_GETFIELD` on the registry
|
|
|
|
### Option B: Group under `__lush` table
|
|
|
|
Put all internal functions in a single `__lush` table in `_G`:
|
|
|
|
```lua
|
|
__lush = {
|
|
command = function(...) end,
|
|
interactive = function(...) end,
|
|
getenv = function(...) end,
|
|
setenv = function(...) end,
|
|
builtins = { cd = ..., exec = ..., umask = ... },
|
|
}
|
|
```
|
|
|
|
**Pros:** Reduces `_G` pollution to 1 entry; still inspectable for debugging
|
|
**Cons:** Still in `_G`; still overwritable
|
|
|
|
Implementation:
|
|
|
|
- In `linit.c`: create table, register functions as fields, `lua_setglobal(L, "__lush")`
|
|
- In `lparser.c`: change generated code from `__command(...)` to `__lush.command(...)`
|
|
- In `lcmd.c`: change `try_builtin` to look up `__lush.builtins` instead of `__builtins`
|
|
|
|
### Option C: Keep as-is but protect
|
|
|
|
Use `__newindex` metamethod on `_G` to prevent overwriting `__`-prefixed keys, or mark them as read-only.
|
|
|
|
**Pros:** Minimal code change
|
|
**Cons:** Still pollutes `_G`; metamethod approach is fragile
|
|
|
|
### Recommendation
|
|
|
|
**Option B** is the best balance — simple to implement, reduces global pollution from 5 entries to 1, keeps things debuggable. Option A is cleaner but requires more invasive parser changes.
|
|
|
|
---
|
|
|
|
## 4. `__builtins` — user-facing or internal?
|
|
|
|
### Current state
|
|
|
|
`__builtins` is a table with `cd`, `exec`, `umask` functions. It's looked up by `try_builtin()` in lcmd.c at runtime via `lua_getglobal(L, "__builtins")`.
|
|
|
|
### Questions
|
|
|
|
- Should users be able to add their own builtins? (e.g. `__builtins.pushd = function(cmd, dir) ... end`)
|
|
- Should users be able to override builtins? (e.g. `__builtins.cd = my_custom_cd`)
|
|
|
|
If yes to either, `__builtins` should remain accessible (but possibly under `__lush.builtins`).
|
|
If no, it should be in the registry.
|
|
|
|
### Recommendation
|
|
|
|
Keep `__builtins` user-accessible (under `__lush.builtins` if we go with Option B above). Allowing users to extend/override builtins is a useful shell feature.
|
|
|
|
---
|
|
|
|
## Implementation plan
|
|
|
|
1. **Rename functions** — change `luaB_` prefix to `luaCmd_` in lcmd.c/h and linit.c
|
|
2. **Move env functions** — relocate `luaB_getenv`/`luaB_setenv` from linit.c into lcmd.c
|
|
3. **Consolidate globals** — implement Option B (group under `__lush` table)
|
|
4. **Update parser** — change generated code to reference `__lush.command` etc.
|
|
5. **Update lcmd.c** — change `try_builtin` to use `__lush.builtins`
|
|
6. **Test** — run existing test suite, verify backtick/env/interactive/builtin functionality
|
|
|