From 0d237418916ae396c33179517556057d479b4cb1 Mon Sep 17 00:00:00 2001
From: Cormac Shannon <>
Date: Mon, 2 Mar 2026 00:37:13 +0000
Subject: [PATCH] Add issue #13: shell globbing (*, ?, **, {a,b}, [abc])

---
 issues/13-shell-globbing.md | 117 ++++++++++++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)
 create mode 100644 issues/13-shell-globbing.md

diff --git a/issues/13-shell-globbing.md b/issues/13-shell-globbing.md
new file mode 100644
index 00000000..9a7f0551
--- /dev/null
+++ b/issues/13-shell-globbing.md
@@ -0,0 +1,117 @@
+# Issue #13 — Shell globbing
+
+**Status:** open
+**Blocked by:** #03, #04
+
+## Problem
+
+Commands inside backticks have no glob expansion. The argv parser treats `*`, `?`, and `{...}` as literal characters, so a command like `ls *.lua` passes `*.lua` to `ls` as a literal string rather than expanding it to matching filenames.
+
+```lua
+-- currently: passes literal "*.lua" to ls
+`ls *.lua`
+
+-- expected: expands to matching files, like a shell would
+`ls *.lua` -- becomes ls foo.lua bar.lua ...
+```
+
+## Glob patterns to support
+
+| Pattern | Description | Example |
+|---------|-------------|---------|
+| `*` | Match any characters (not `/`) | `*.lua` → `foo.lua bar.lua` |
+| `?` | Match single character | `?.c` → `a.c b.c` |
+| `**` | Recursive directory match | `**/*.lua` → `src/a.lua lib/b.lua` |
+| `{a,b}` | Brace expansion | `*.{o,h}` → `foo.o bar.h` |
+| `[abc]` | Character class | `[Mm]akefile` → `Makefile makefile` |
+
+### Brace expansion details
+
+Brace expansion happens *before* glob matching (same as bash). It's purely textual — not a filesystem operation:
+
+```
+echo {o,h} → o h (two words)
+echo hello{o,h} → helloo helloh (prefix attached)
+echo hello{world,go} → helloworld hellogo (prefix attached)
+echo *.{o,h} → (expand braces first, then glob each: *.o *.h)
+echo hello{world, x} → hello{world, x} (space inside braces = literal, no expansion)
+```
+
+Rules:
+- Braces with commas and no spaces expand: `{a,b,c}` → `a b c`
+- Prefix/suffix attach to each alternative: `pre{a,b}suf` → `preasuf prebsuf`
+- Spaces inside braces cancel the expansion (treated literally)
+- Nested braces: not needed initially
+
+## Implementation
+
+Glob expansion should happen in `lcmd.c` during argv construction, after tokenizing but before `execvp()`. Each token that contains glob metacharacters (`*`, `?`, `[`, `{`) is expanded into one or more argv entries; a pattern with no matches stays as a single literal entry.
+
+### Approach
+
+1. **Brace expansion** (textual, first pass): scan each token for `{a,b,...}` patterns. Expand into multiple tokens. No filesystem access needed.
+
+2. **Glob matching** (filesystem, second pass): for each token containing `*`, `?`, or `[`, call `glob(3)` (POSIX) to expand against the filesystem. If no matches, keep the literal token (like bash default).
+
+3. **`**` recursive matching**: POSIX `glob(3)` does not define `**`, so it cannot be relied on for recursive matching. May need a custom recursive walk, or use `GLOB_ALTDIRFUNC` where available. Could also use `nftw()` + `fnmatch()`.
+
+### Where it runs
+
+Expansion happens inside `parse_argv()` or in a post-processing step after `parse_argv()` returns. Each original token potentially becomes multiple argv entries.
+
+### Quoting suppresses globbing
+
+Quoted strings are already literal in `parse_argv()`:
+- `"*.lua"` → literal `*.lua` (no expansion)
+- `'*.o'` → literal `*.o` (no expansion)
+- `\*` → literal `*` (backslash escape)
+
+This matches shell behaviour — quoting suppresses glob expansion.
+
+## Edge cases
+
+- **No matches**: keep the literal pattern (bash default behaviour with `nullglob` off)
+- **Dot files**: `*` should not match files starting with `.` unless the pattern starts with `.` (standard shell convention)
+- **Expansion in pipelines**: each pipeline stage gets its own glob expansion
+- **Very large expansions**: `**/*` in a large tree could produce thousands of entries — may need a reasonable limit
+
+## Tests
+
+```lua
+-- basic wildcard
+local r = `echo *.lua`
+-- stdout should contain space-separated .lua files (non-empty)
+
+-- no match keeps literal
+local r = `echo *.nonexistent_extension_xyz`
+assert(r.stdout == "*.nonexistent_extension_xyz\n")
+
+-- quoted glob is literal
+local r = `echo "*.lua"`
+assert(r.stdout == "*.lua\n")
+
+-- single-quoted glob is literal
+local r = `echo '*.lua'`
+assert(r.stdout == "*.lua\n")
+
+-- brace expansion
+local r = `echo {a,b,c}`
+assert(r.stdout == "a b c\n")
+
+-- brace expansion with prefix
+local r = `echo hello{world,there}`
+assert(r.stdout == "helloworld hellothere\n")
+
+-- brace with spaces = no expansion
+local r = `echo {a, b}`
+assert(r.stdout == "{a, b}\n")
+
+-- ? single char match
+-- [abc] character class
+```
+
+## Files to modify
+
+| File | Change |
+|------|--------|
+| `lcmd.c` | Add brace expansion + glob expansion in argv construction |
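
The two-pass approach described in the issue's Implementation section (textual brace expansion, then `glob(3)`) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the helper names `expand_braces` and `expand_glob` are hypothetical and do not come from `lcmd.c`, the brace helper handles a single non-nested group per token, and the caller is assumed to pass only unquoted tokens.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from lcmd.c): expand the first {a,b,...} group
 * in `token`. Stores up to `max` malloc'd words in `out` and returns how
 * many were stored; alternatives beyond `max` are silently dropped.
 * A token with no group, no comma, or a space inside the braces comes
 * back unchanged as one word. Nested braces are not handled. */
static size_t expand_braces(const char *token, char **out, size_t max)
{
    assert(token != NULL && out != NULL && max > 0);

    const char *open = strchr(token, '{');
    const char *close = open ? strchr(open, '}') : NULL;
    size_t inner = close ? (size_t)(close - open - 1) : 0;

    /* Literal case: no braces, no comma, or a space between the braces. */
    if (!close || !memchr(open + 1, ',', inner) || memchr(open + 1, ' ', inner)) {
        out[0] = strdup(token);
        return out[0] ? 1 : 0;
    }

    size_t prefix_len = (size_t)(open - token);
    const char *suffix = close + 1;
    const char *alt = open + 1;
    size_t n = 0;

    while (n < max) {
        /* Each alternative ends at the next comma or at the closing brace. */
        const char *comma = memchr(alt, ',', (size_t)(close - alt));
        const char *end = comma ? comma : close;
        size_t alt_len = (size_t)(end - alt);
        size_t len = prefix_len + alt_len + strlen(suffix) + 1;

        char *word = malloc(len);
        if (!word)
            break;
        /* word = prefix + alternative + suffix */
        snprintf(word, len, "%.*s%.*s%s",
                 (int)prefix_len, token, (int)alt_len, alt, suffix);
        out[n++] = word;

        if (!comma)
            break;
        alt = comma + 1;
    }
    return n;
}

/* Hypothetical helper: expand one word against the filesystem.
 * GLOB_NOCHECK makes glob() return the pattern itself when nothing
 * matches, mirroring bash with nullglob off. The caller must call
 * globfree() on `result` afterwards. */
static int expand_glob(const char *word, glob_t *result)
{
    return glob(word, GLOB_NOCHECK, NULL, result);
}
```

A caller would run `expand_braces` on each unquoted token, then pass every resulting word to `expand_glob` and append the `gl_pathv` entries to argv. Without `GLOB_PERIOD`, `glob(3)` leaves leading-dot files out of `*` matches, which lines up with the dot-file edge case. Recursive `**` is not covered here and would still need the separate walk mentioned in the issue.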