This changes the parse/callback flow so that subcommands are run
iteratively from the base command rather than recursively. The primary
advantages of this approach are some stack space savings and much less
convoluted backtraces for deeply nested command hierarchies. The
overall order of operations has not changed, i.e. the full command
line is parsed before command callback dispatch starts.
Fixes: #12