6 Commits

Author SHA1 Message Date
1683197bc0
state: parse whitespace in flow objects a bit differently
There were (and probably still are) some weird and ugly edge cases
here. For example, `[ 1 ]` would parse to a list of `1 `. This
implementation allows a single space to precede the closing ] and
errors out if there is more than one. Additionally, it rejects any
spaces before the item separator comma. This also applies to flow
maps, with the addition that they do not permit whitespace before `:`
now, either.

Leading spaces are still consumed with reckless abandon, so, for
example, `[   lopsided]` is valid. There is also some state sloppiness
flying around so `[   val,    ]` probably currently works as well.
Tightening up the handling of leading whitespace will be a bigger
restructuring that may involve state machine changes. I'll have to
think about it.
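
The whitespace rules described above can be sketched in a few lines. This is a minimal Python illustration, not the parser's actual code: `parse_flow_list` and `FlowError` are hypothetical names, it only handles flat lists of scalars, and treating an all-space body as an empty list is an assumption. Unlike the state machine described above, it also rejects `[   val,    ]`.

```python
class FlowError(ValueError):
    """Raised when whitespace placement violates the rules sketched here."""


def parse_flow_list(text):
    """Parse a flat flow list such as "[a, b]" into a list of scalar strings."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise FlowError("expected a bracketed flow list")
    body = text[1:-1]
    if body.strip(" ") == "":
        return []  # assumption: an all-space body is an empty list
    pieces = body.split(",")
    items = []
    for index, piece in enumerate(pieces):
        is_last = index == len(pieces) - 1
        item = piece.lstrip(" ")  # leading spaces are consumed freely
        trailing = len(item) - len(item.rstrip(" "))
        if is_last:
            # a single space may precede the closing bracket, but no more
            if trailing > 1:
                raise FlowError("more than one space before ']'")
        elif trailing > 0:
            # no spaces are allowed before the item separator
            raise FlowError("space before ','")
        item = item.rstrip(" ")
        if item == "":
            raise FlowError("empty item")
        items.append(item)
    return items
```

Under these assumptions, `[ 1 ]` and `[   lopsided]` are accepted, while `[1  ]` and `[1 , 2]` raise `FlowError`.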
2023-10-03 23:25:58 -07:00
c5e8921eb2
state: use inferred error sets
As far as I can tell, the only reason ever not to use an inferred error
set is when you would get a dependency loop otherwise.
2023-10-03 23:19:01 -07:00
34ec58e0d2
value: implement parsing to objects
There are still some untested codepaths here, but this does seem to
work for nontrivial objects, so, woohoo. It's worth noting that this
is a recursive implementation (which seems silly after I hand-rolled
the non-recursive main parser). The thinking is that if you have a
deeply-enough nested object that you run out of stack space here, you
probably shouldn't be converting it directly to an object.

I may revisit this, though I am still not 100% certain how
straightforward it would be to make this nonrecursive with all the
weird comptime objects. Basically the "parse stack" would have to be
created at comptime.
2023-10-03 23:17:37 -07:00
0028092a4e
parser: in theory, hook up the rest of the diagnostics
In practice, there are probably still things I missed here, and I
should audit this to make sure there aren't any egregious copy paste
errors remaining. Also, it's pretty likely that the diagnostics'
`line_offset` field isn't correct in most of these messages. More work
will need to be done to update that correctly.
2023-10-01 21:15:21 -07:00
01f98f9aff
parser: start the arduous journey of hooking up diagnostics
The errors in the line buffer and tokenizer now have diagnostics. The
line number is trivial to keep track of due to the line buffer, but
the column index requires quite a bit of juggling, as we pass
successively trimmed down buffers to the internals of the parser.
There will probably be some column index counting problems in the
future. Also, handling the diagnostics is a bit awkward, since it's a
mandatory out-parameter of the parse functions now. The user must
provide a valid diagnostics object that survives for the life of the
parser.
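
One way to picture the column juggling: carry the starting column alongside each trimmed buffer, so that an error deep in the parser can still be mapped back to the original line. The sketch below is a hypothetical Python illustration, not the project's code. `Slice`, `Diagnostics`, `find_tab`, and the "tab in value" rule are all invented for the example; only the out-parameter shape mirrors what is described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    text: str
    column: int  # 0-based column of text[0] within the original line

    def lstrip_spaces(self):
        """Trim leading spaces and advance the column by the amount trimmed."""
        stripped = self.text.lstrip(" ")
        return Slice(stripped, self.column + len(self.text) - len(stripped))

    def drop(self, count):
        """Drop the first `count` characters and advance the column to match."""
        return Slice(self.text[count:], self.column + count)


@dataclass
class Diagnostics:
    line: int = 0
    column: int = 0
    message: str = ""


def find_tab(line_number, line, diag):
    """Return True and fill `diag` if the value part of "key: value" has a tab."""
    rest = Slice(line, 0).lstrip_spaces()
    colon = rest.text.find(":")
    if colon < 0:
        return False
    value = rest.drop(colon + 1).lstrip_spaces()
    tab = value.text.find("\t")
    if tab < 0:
        return False
    # the caller-provided diagnostics object acts as an out-parameter
    diag.line = line_number
    diag.column = value.column + tab
    diag.message = "tab in value"
    return True
```

Because every trimming helper returns a new `Slice` with an adjusted `column`, the reported position stays relative to the untrimmed line no matter how many times the buffer has been cut down.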
2023-09-27 23:44:06 -07:00
1d65b072ee
parser: stateful reentrancy
Finally the flow parser has been "integrated" with the main parser in
that they now share a stack. The bigger thing is that the parsing has
been decoupled from the tokenization, which will allow parsing
documents without loading them fully into memory first.

I've been calling this the streaming parser, but it's worth noting that
I am referring to streaming input, not streaming output. It would
certainly be possible to do streaming output, but I am not interested
in that at the moment (it would be the lowest-memory-overhead
approach, but it's a lot of work for little gain, and it is less
flexible for converting input to objects).
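
To illustrate what streaming input means here, the following toy Python sketch separates a tokenizer from a parser that keeps its own stack between calls, so input can arrive in arbitrary chunks. It is an assumption-laden illustration, not the real design: `Tokenizer`, `Parser`, and the bracket-only grammar are hypothetical, and there is no error handling.

```python
class Tokenizer:
    """Splits chunks into tokens, holding an unfinished scalar between chunks."""

    def __init__(self):
        self._pending = ""

    def feed(self, chunk):
        """Yield each complete token found in `chunk`."""
        for char in chunk:
            if char in "[],":
                if self._pending.strip():
                    yield self._pending.strip()
                self._pending = ""
                yield char
            else:
                self._pending += char


class Parser:
    """Consumes one token at a time; the stack persists across calls."""

    def __init__(self):
        self._stack = []
        self.result = None

    def push(self, token):
        if token == "[":
            self._stack.append([])
        elif token == "]":
            done = self._stack.pop()
            if self._stack:
                self._stack[-1].append(done)  # close a nested list
            else:
                self.result = done  # close the outermost list
        elif token != ",":
            self._stack[-1].append(token)


def parse_chunks(chunks):
    """Drive the tokenizer and parser over an iterable of input chunks."""
    tokenizer = Tokenizer()
    parser = Parser()
    for chunk in chunks:
        for token in tokenizer.feed(chunk):
            parser.push(token)
    return parser.result
```

The input is consumed incrementally, but the output is still a fully built object, which matches the distinction between streaming input and streaming output drawn above.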
2023-09-24 22:24:33 -07:00