Add(issue): lush table cleanup

This commit is contained in:
Cormac Shannon
2026-03-20 00:59:11 +00:00
parent 8a61a47543
commit 5c1eca312f

View File

@@ -0,0 +1,80 @@
# Issue 29: Clean Up Integer Keys in lush Global Table
**Status:** open
## Problem
The `lush` global table currently contains both string-keyed entries (for user access) and integer-keyed entries (for fast VM access via `OP_LUSH`). This is confusing when inspecting the table:
```
lush> vars(lush)
1 function: 0x104a5d098
2 function: 0x104a5ed34
3 function: 0x104a5f17c
4 function: 0x104a5f1c8
5 function: 0x104a5ec7c
builtins table: 0x15a707f50
command function: 0x104a5d098
getenv function: 0x104a5f17c
interactive function: 0x104a5ed34
setenv function: 0x104a5f1c8
subcmd function: 0x104a5ec7c
```
The integer keys 1-5 are meaningless to users and clutter the table.
## Why Integer Keys Exist
`OP_LUSH` in the VM (`lvm.c`) needs to look up lush functions by a compile-time constant index. Currently it does:
```c
luaH_getint(hvalue(&lushtv), idx + 1, &func); /* 1-based */
```
`luaH_getint` for small integers hits the table's array part directly — a single bounds check + pointer offset. This is as fast as table access gets.
## Performance of String-Keyed Lookup
Using `luaH_getshortstr` instead would involve:
1. Compute hash position: `lmod(key->hash, sizenode(t))` — a single AND operation since the hash is pre-computed on the TString
2. Walk the hash chain (usually 1 node for a small table with 5 entries)
3. Compare the string pointer (interned strings use pointer equality)
For a 5-entry table, the hash part will have 8 slots (next power of 2), so collisions are rare. The practical overhead per call is ~2-5 extra instructions vs the array path. This runs once per command execution, so the difference is negligible compared to the cost of fork/exec.
## Options
### A. VM looks up by interned string key
Change `OP_LUSH` to use pre-interned `TString*` pointers (stored in registry or as constants) and call `luaH_getshortstr`. Remove the integer keys from the lush table.
- Pro: Clean table, users can wrap functions and the VM picks up the wrapper
- Pro: Negligible performance cost for a 5-entry table
- Con: Need to store/manage the interned string pointers somewhere
This is the best-of-both: the VM resolves through the same string keys users see, so `lush.command = my_wrapper` works transparently and there are no mysterious integer entries.
### B. Cache function references at compile time
Store the function TValues directly (e.g. as upvalues or in a fixed registry slot per function) so the VM doesn't go through the lush table at all.
- Pro: Fastest possible (no table lookup at all)
- Con: Users wrapping `lush.command` would NOT affect VM behavior — breaks the power-user story
### C. Keep current dual-key approach, hide integers from iteration
Override `__pairs` on the lush table to skip integer keys.
- Pro: Minimal change
- Con: Integer keys still exist and can be discovered; `ipairs` or raw access still shows them
## Recommendation
**Option A** — it's the cleanest solution, has negligible performance cost, and preserves the wrapping/interception use case that motivated exposing the table in the first place.
## Implementation Notes
- Store the 5 interned `TString*` pointers in a fixed location (e.g. adjacent registry slots, or a small struct in `global_State`)
- In `luaopen_lush`, intern the strings once and store them
- In `OP_LUSH`, map the `LUSH_OP_*` index to the corresponding `TString*` and call `luaH_getshortstr`
- Remove the `lua_rawseti` integer-key registration from `luaopen_lush`