diff --git a/issues/29-lush-table-cleanup.md b/issues/29-lush-table-cleanup.md new file mode 100644 index 00000000..1bef43e0 --- /dev/null +++ b/issues/29-lush-table-cleanup.md @@ -0,0 +1,80 @@ +# Issue 29: Clean Up Integer Keys in lush Global Table + +**Status:** open + +## Problem + +The `lush` global table currently contains both string-keyed entries (for user access) and integer-keyed entries (for fast VM access via `OP_LUSH`). This is confusing when inspecting the table: + +``` +lush> vars(lush) +1 function: 0x104a5d098 +2 function: 0x104a5ed34 +3 function: 0x104a5f17c +4 function: 0x104a5f1c8 +5 function: 0x104a5ec7c +builtins table: 0x15a707f50 +command function: 0x104a5d098 +getenv function: 0x104a5f17c +interactive function: 0x104a5ed34 +setenv function: 0x104a5f1c8 +subcmd function: 0x104a5ec7c +``` + +The integer keys 1-5 are meaningless to users and clutter the table. + +## Why Integer Keys Exist + +`OP_LUSH` in the VM (`lvm.c`) needs to look up lush functions by a compile-time constant index. Currently it does: + +```c +luaH_getint(hvalue(&lushtv), idx + 1, &func); /* 1-based */ +``` + +`luaH_getint` for small integers hits the table's array part directly — a single bounds check + pointer offset. This is as fast as table access gets. + +## Performance of String-Keyed Lookup + +Using `luaH_getshortstr` instead would involve: +1. Compute hash position: `lmod(key->hash, sizenode(t))` — a single AND operation since the hash is pre-computed on the TString +2. Walk the hash chain (usually 1 node for a small table with 5 entries) +3. Compare the string pointer (interned strings use pointer equality) + +For a 5-entry table, the hash part will have 8 slots (next power of 2), so collisions are rare. The practical overhead per call is ~2-5 extra instructions vs the array path. This runs once per command execution, so the difference is negligible compared to the cost of fork/exec. + +## Options + +### A. VM looks up by interned string key + +Change `OP_LUSH` to use pre-interned `TString*` pointers (stored in registry or as constants) and call `luaH_getshortstr`. Remove the integer keys from the lush table. + +- Pro: Clean table, users can wrap functions and the VM picks up the wrapper +- Pro: Negligible performance cost for a 5-entry table +- Con: Need to store/manage the interned string pointers somewhere + +This is the best-of-both: the VM resolves through the same string keys users see, so `lush.command = my_wrapper` works transparently and there are no mysterious integer entries. + +### B. Cache function references at compile time + +Store the function TValues directly (e.g. as upvalues or in a fixed registry slot per function) so the VM doesn't go through the lush table at all. + +- Pro: Fastest possible (no table lookup at all) +- Con: Users wrapping `lush.command` would NOT affect VM behavior — breaks the power-user story + +### C. Keep current dual-key approach, hide integers from iteration + +Override `__pairs` on the lush table to skip integer keys. + +- Pro: Minimal change +- Con: Integer keys still exist and can be discovered; `ipairs` or raw access still shows them + +## Recommendation + +**Option A** — it's the cleanest solution, has negligible performance cost, and preserves the wrapping/interception use case that motivated exposing the table in the first place. + +## Implementation Notes + +- Store the 5 interned `TString*` pointers in a fixed location (e.g. adjacent registry slots, or a small struct in `global_State`) +- In `luaopen_lush`, intern the strings once and store them +- In `OP_LUSH`, map the `LUSH_OP_*` index to the corresponding `TString*` and call `luaH_getshortstr` +- Remove the `lua_rawseti` integer-key registration from `luaopen_lush`