nice-data

factotum/nice-data

Commit Graph

Author

SHA1

Message

Date

torque

0e60719c85

linebuffer: add strictness options

When the buffer was separated from the tokenizer, we lost some
validation, including really aggressive carriage return detection.
This brings that validation back in full force and adds some more on
top of it.

2023-09-26 00:06:39 -07:00

torque

7f82c24584

parser: implement streaming parser

With my pathological 50MiB, 10_000-line nested list test, this is
definitely slower than the one-shot parser, but it has peak memory
usage of 5MiB compared to the 120MiB of the one-shot parsing. Not bad.
Obviously this result is largely dependent on the fact that this
particular benchmark is 99% whitespace, which does not get copied into
the resulting document. A (significantly) smaller improvement will be
observed in files that are mostly data with little indentation or
empty lines.

But a win is a win.

2023-09-25 01:18:09 -07:00

torque

38e47b39dc

all: do some restructuring

I don't like big monolithic source files, so let's restructure a bit.
parser.zig is still bigger than I would like it to be, but there isn't
a good way to break up the two state machine parsers, which take up
most of the space. This is the last junk commit before I am seriously
going to implement the "streaming" parser, which is the last change
before implementing deserialization to object. I am definitely not
just spinning my wheels here.

2023-09-24 18:22:12 -07:00

3 Commits