10 Commits

Author SHA1 Message Date
d6e1e85ea1
parser: make tagged union field names respect expect_enum_dot
It's possible that this change may get reverted in the future, but I
think it makes things more consistent and has some other minor
benefits, so it probably won't be.

Consistency: tagged union fields are enum members by definition in Zig,
so this change makes them act like enumerations that accept values,
which is really how tagged unions work in Zig.

Other benefits: tagged unions do not behave like structs, and having
their key start with a leading . helps to distinguish them visually.
You could say that it makes communicating intent more precise.

Here's an example: by default, given the following type:

    union(enum) {
        any: []const u8,
        int: i32,
    };

A corresponding nice document would now look like:

    .int: 42069

Whereas it used to be:

    int: 42069

My only concern here is that this potentially makes the serialization
noisier. But if so, that's true of the enum handling, too.
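
A minimal sketch of the idea, not the parser's actual code: because the
field names of a tagged union are members of its tag enum, a document key
can be resolved with the same lookup an enum value would use. The helper
name `tagFromKey` and its `expect_enum_dot` parameter are assumptions made
for illustration; only the union itself comes from the example above.

```zig
const std = @import("std");

// The union from the example above.
const Example = union(enum) {
    any: []const u8,
    int: i32,
};

// Hypothetical helper: resolve a document key to the union's tag.
// When expect_enum_dot is set, the key must carry a leading '.',
// exactly as an enum value would.
fn tagFromKey(comptime U: type, key: []const u8, expect_enum_dot: bool) ?std.meta.Tag(U) {
    var name = key;
    if (expect_enum_dot) {
        if (name.len == 0 or name[0] != '.') return null;
        name = name[1..];
    }
    // The field names of a tagged union are the members of its tag enum.
    return std.meta.stringToEnum(std.meta.Tag(U), name);
}
```

With `expect_enum_dot` set, `.int` resolves to the `int` tag and a bare
`int` is rejected; with it unset, the bare key is accepted instead.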
2023-11-06 20:43:21 -08:00
f371aa281c
parser: default expect enum values with leading .
I prefer this, personally. And this is all about personal preference.
2023-10-22 16:48:45 -07:00
ce65dee71f
parser: ostensibly fix sentinel handling
I guess arrays don't need special handling because their memory is
explicitly accounted for, but it would probably be good to check that
a sentinel-terminated array initialized as `undefined` does get the
correct sentinel value.
2023-10-22 16:38:41 -07:00
f371f16e2f
slam dunk that minimum viable product vibe
2023-10-22 16:16:57 -07:00
6d2c08878d
examples: add parsing to an object example
2023-10-22 15:36:34 -07:00
4690f0b808
parser: add option for case-insensitive scalar comparison
This does not support Unicode case folding, which is very much a
sorry-not-sorry situation because Unicode is a disgusting labyrinthine
chaotic hellformat. Actually, our Unicode support isn't very good in
general, since we don't do any form of normalization, so specifying
non-ASCII values for scalar comparisons is probably asking for trouble.
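
A rough sketch of what ASCII-only case-insensitive comparison means in
practice. The helper `scalarMatches` and its `case_insensitive` flag are
hypothetical names, not the parser's real option; `std.ascii.eqlIgnoreCase`
is the standard library call assumed to do the work.

```zig
const std = @import("std");

// Hypothetical helper: compare a scalar from the document against an
// expected literal, optionally ignoring ASCII case.
fn scalarMatches(scalar: []const u8, expected: []const u8, case_insensitive: bool) bool {
    return if (case_insensitive)
        // Only folds A-Z/a-z; every other byte must match exactly.
        std.ascii.eqlIgnoreCase(scalar, expected)
    else
        std.mem.eql(u8, scalar, expected);
}
```

Under this sketch `TRUE` matches `true` when the flag is set, but `É` does
not match `é`, because their UTF-8 encodings differ in a non-ASCII byte
that is never folded.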
2023-10-18 21:34:07 -07:00
25386ac87a
rename flow_(list|map) to inline_(list|map)
This is simply better word choice.
2023-10-18 00:07:12 -07:00
8dd5463683
parser: change string and | semantics and expose slices in Value
The way I implemented these changes ended up being directly coupled and
I am not interested in trying to decouple them, so instead here's a
single commit that makes changes to both the API and the format. Let's
go over these.

| now acts as a direct concatenation operator, rather than
concatenating with a space. This is because the format allows the
specification of a trailing space (by using | to fence the string just
before the newline). So it's now possible to spread a long string
without spaces over multiple lines, which couldn't be done before.
This does have the downside that the common pattern of concatenating
strings with a space now requires some extra trailing line noise. I
may introduce a THIRD type of concatenating string (thinking of
using + as the prefix) because I am a jerk. We will see.

The way multi-line strings are concatenated has changed. Partly this
simplifies the implementation of the aforementioned change: the parser
forgets the string type from the tokenizer, which worked before because
there was always a trailing character that could be popped off. Since
one type now appends no character, the type would have had to be
tracked through parsing to determine whether a character needed to be
popped at the end. But I was also not terribly satisfied with the
semantics of multi-line strings before. I wrote several words about
this in 429734e6e813b225654aa71c283f4a8b4444609f, where I reached the
opposite conclusion from what is implemented in this commit.

Basically, when different types of string concatenation are mixed, the
results may be surprising. The previous approach would append the line
terminator at the end of the line specified. The new approach prepends
the line terminator at the beginning of the line specified. Since the
specifier character is at the beginning of the line, I feel like this
reads a little better simply due to the colocation of information. As
an example:

  > first
  | second
  > third

Would previously have resulted in "first\nsecondthird" but it will now
result in "firstsecond\nthird". The only mildly baffling part about
this is that the string signifier on the first line has absolutely no
impact on the string. In the old design, it was the last line that had
no impact.
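
The difference between the two approaches can be modelled in a few lines.
This is a sketch under stated assumptions, not the parser's
implementation: `Kind`, `Line`, `sep`, `append`, and `join` are all
hypothetical names, `>` is taken to contribute a newline, and `|` is taken
to contribute nothing in both modes, matching the example above. The
caller must supply a buffer large enough for the result.

```zig
const std = @import("std");

// '>' joins with a newline, '|' joins with nothing (direct concatenation).
const Kind = enum { newline, concat };
const Line = struct { kind: Kind, text: []const u8 };

fn sep(kind: Kind) []const u8 {
    return switch (kind) {
        .newline => "\n",
        .concat => "",
    };
}

// Copy bytes into buf at offset len and advance len.
fn append(buf: []u8, len: *usize, bytes: []const u8) void {
    @memcpy(buf[len.* .. len.* + bytes.len], bytes);
    len.* += bytes.len;
}

// prepend = false: old approach, each line's separator follows that line
// (so the last line's specifier has no effect).
// prepend = true: new approach, each line's separator precedes that line
// (so the first line's specifier has no effect).
fn join(buf: []u8, lines: []const Line, prepend: bool) []const u8 {
    var len: usize = 0;
    for (lines, 0..) |line, i| {
        if (prepend and i > 0) append(buf, &len, sep(line.kind));
        append(buf, &len, line.text);
        if (!prepend and i + 1 < lines.len) append(buf, &len, sep(line.kind));
    }
    return buf[0..len];
}
```

Feeding it the three lines from the example gives "first\nsecondthird"
with `prepend` unset and "firstsecond\nthird" with it set.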

Finally, this commit also changes Value so that it uses []const u8
slices directly to store strings instead of ArrayLists. Everything
downstream of the value was just reaching into string.items to access
the slice directly, so this cuts out the middleman. It was unintuitive
to access a field named .string and get an ArrayList rather than a
slice, anyway.
2023-10-08 16:57:52 -07:00
34ec58e0d2
value: implement parsing to objects
There are still some untested codepaths here, but this does seem to
work for nontrivial objects, so, woohoo. It's worth noting that this
is a recursive implementation (which seems silly after I hand-rolled
the non-recursive main parser). The thinking is that if you have a
deeply-enough nested object that you run out of stack space here, you
probably shouldn't be converting it directly to an object.

I may revisit this, though I am still not 100% certain how
straightforward it would be to make this nonrecursive with all the
weird comptime objects. Basically the "parse stack" would have to be
created at comptime.
2023-10-03 23:17:37 -07:00
38e47b39dc
all: do some restructuring
I don't like big monolithic source files, so let's restructure a bit.
parser.zig is still bigger than I would like it to be, but there isn't
a good way to break up the two state machine parsers, which take up
most of the space. This is the last junk commit before I am seriously
going to implement the "streaming" parser. Which is the last change
before implementing deserialization to object. I am definitely not
just spinning my wheels here.
2023-09-24 18:22:12 -07:00