3 Commits

Author SHA1 Message Date
95aa6d01c6
tokenator: contextualize struct fields
This also decouples the CLI from the tokenization functions so they can
be called from other programs.
2023-05-11 10:13:13 -07:00
e142fb5676
docs: functioning contextualization of more tokens
This recognizes block labels. Actually implementing this successfully
took more attempts than I'd like to admit. I originally had a
streaming version using a tail queue, but it had problems that seemed
to be intractable. So instead, everything is just jammed in an
arraylist and processed as a whole once the tokenizing is complete. It
increases the maximum memory usage to store all the intermediates
while tokenizing the whole file, but it does work, and frankly I'm ok
with it using a few MB of memory. It can tokenize itself.
2023-04-06 18:31:29 -07:00
d011974b1f
docs: start a boondoggle
Trying to make this smarter is a rabbit hole I may not survive.
2023-04-06 18:31:29 -07:00