Most articles about parsers never talk about the API used to feed them data. They only cover what the parser is doing inside (the type of grammar, LR, the algorithm, ...).
Unfortunately, lots of things that do parsing only offer their users the choice of parsing a file, as in:
    object *parse_file(const char *filename);
    object *parse_stream(FILE *file);   /* C has no overloading, so this one needs its own name */
What if your source is available in memory only? There's no easy way to feed the data to this API short of dumping everything to disk and reading it back, or creating a pipe to yourself. Pretty convoluted.
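To show how roundabout the "dump to disk and read it back" workaround is, here is a minimal sketch in C. `file_only_parse` is a hypothetical stand-in for a file-only parsing API (it merely counts lines), and `parse_memory_via_tmpfile` is an illustrative name, not something from a real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a file-only parsing API: it just counts lines. */
long file_only_parse(FILE *file)
{
    long lines = 0;
    int c;

    assert(file != NULL);
    while ((c = getc(file)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* Workaround: copy the in-memory buffer into a temporary file, rewind it,
 * and hand it to the file-only API. Returns -1 on I/O failure. */
long parse_memory_via_tmpfile(const char *buf, size_t len)
{
    FILE *tmp;
    long result;

    assert(buf != NULL);
    tmp = tmpfile();              /* standard C; removed when closed */
    if (tmp == NULL)
        return -1;
    if (fwrite(buf, 1, len, tmp) != len) {
        fclose(tmp);
        return -1;
    }
    rewind(tmp);                  /* go back to the start before reading */
    result = file_only_parse(tmp);
    fclose(tmp);
    return result;
}
```

Every byte is written out and read back once before the parser does any useful work, which is the cost the file-only API forces on you.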
Sometimes you get the choice of parsing an in-memory string, which is slightly more useful than being limited to parsing a file. Whilst it's a nice improvement over file or file descriptor only, it's not as nice as it could be.
Imagine your data is several gigabytes, or that you have major memory constraints that prevent you from loading the whole data in memory: how can you use this string API? That's correct, you just can't, and most of the time people fall back to the file-feeding API.
What happens if your data is not available as a file descriptor either, there's no way to store it, and having it all in memory is just not going to work? At this stage you really want to be able to feed data incrementally, so that you only need to keep one character at a time (or a small string) in memory.
    int c;   /* int, not char, so that EOF can be told apart from data */
    while ((c = read_next_char()) != EOF) {
        parse_data(c);
    }
Incremental parsers are very simple to build when the underlying technology is a state machine. Each time the parsing function returns to the caller, the state is kept (either by the caller, which makes the parser reentrant, or inside the parser itself, which is less recommended), and when the parser is called again the state is passed back to it.
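As an illustration of a state machine whose state is held by the caller, here is a minimal sketch in C. The grammar (unsigned integers separated by whitespace) and all the `num_parser` names are assumptions made up for this example, and overflow is deliberately ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical grammar: unsigned integers separated by whitespace. */
enum num_state { NUM_BETWEEN, NUM_IN_NUMBER, NUM_ERROR };

/* All parser state lives here, owned by the caller (reentrant). */
struct num_parser {
    enum num_state state;   /* current state of the machine */
    unsigned long current;  /* value of the number being read */
    unsigned long sum;      /* sum of all completed numbers */
    size_t count;           /* how many numbers were completed */
};

void num_parser_init(struct num_parser *p)
{
    assert(p != NULL);
    p->state = NUM_BETWEEN;
    p->current = 0;
    p->sum = 0;
    p->count = 0;
}

/* Feed one character. Returns 0 on success, -1 once the input is invalid. */
int num_parser_feed(struct num_parser *p, char ch)
{
    unsigned char c = (unsigned char)ch;

    assert(p != NULL);
    switch (p->state) {
    case NUM_BETWEEN:
        if (isdigit(c)) {
            p->current = (unsigned long)(c - '0');
            p->state = NUM_IN_NUMBER;
        } else if (!isspace(c)) {
            p->state = NUM_ERROR;
        }
        break;
    case NUM_IN_NUMBER:
        if (isdigit(c)) {
            p->current = p->current * 10 + (unsigned long)(c - '0');
        } else if (isspace(c)) {
            p->sum += p->current;
            p->count++;
            p->state = NUM_BETWEEN;
        } else {
            p->state = NUM_ERROR;
        }
        break;
    case NUM_ERROR:
        break;              /* stay in the error state */
    }
    return p->state == NUM_ERROR ? -1 : 0;
}

/* Signal end of input so that a trailing number is completed. */
int num_parser_finish(struct num_parser *p)
{
    assert(p != NULL);
    if (p->state == NUM_IN_NUMBER) {
        p->sum += p->current;
        p->count++;
        p->state = NUM_BETWEEN;
    }
    return p->state == NUM_ERROR ? -1 : 0;
}
```

Because the state sits in the struct between calls, a number can be split across two separately delivered chunks (say "12 3" followed by "4 5") and still be read as 34. A separate finish call is one common way to tell the parser that no more input is coming.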
This is the holy grail, since every other API can be developed on top of this very simple API, and even better, the wrappers are completely trivial (a couple of lines each):
file API:

    void parse_fd(int fd)
    {
        int c;   /* int so that EOF is distinguishable from data */
        while ((c = read_fd_next_char(fd)) != EOF)
            parse_data(c);
    }

string API:

    void parse_string(const char *s)
    {
        size_t i;
        for (i = 0; s[i] != '\0'; i++)
            parse_data(s[i]);
    }
In more abstracted languages, parsers usually take a stream, an abstracted API that hides where the data is coming from and when it is available. This results in more or less the same behavior as the incremental API. However, you lose control of the parser's execution, meaning it is hard to stop the parser short of injecting unexpected data through the input stream.
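To make that loss of control concrete, here is a minimal sketch in C of a pull-style parser driven by a stream callback. The `next_fn` type, `parse_from_stream`, and `mem_stream` are all hypothetical names for this illustration, and the "parser" only counts bytes as a stand-in for real work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stream abstraction: returns the next byte (0..255),
 * or -1 when no more data is available. */
typedef int (*next_fn)(void *ctx);

/* Pull-style parser: it owns the loop, so the caller only regains
 * control once it returns. Counts the bytes it consumed. */
size_t parse_from_stream(next_fn next, void *ctx)
{
    size_t consumed = 0;

    assert(next != NULL);
    while (next(ctx) >= 0)
        consumed++;
    return consumed;
}

/* An in-memory stream with a cancel flag. */
struct mem_stream {
    const unsigned char *data;
    size_t len;
    size_t pos;
    int cancelled;
};

int mem_stream_read(void *ctx)
{
    struct mem_stream *s = ctx;

    assert(s != NULL);
    /* Cancellation has to be disguised as end of input: the parser
     * offers no other way to be stopped. */
    if (s->cancelled || s->pos >= s->len)
        return -1;
    return s->data[s->pos++];
}
```

The only lever the caller has is the stream itself: to stop the parser, the stream must pretend the input has ended, and the parser cannot tell a cancellation from a genuine end of data.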
A simple example of the control you keep with the incremental API would be cancelling the parsing of some arbitrary data because the user requested it. You get this for free, just by the fact that you're processing the data in small chunks and that the caller is in charge of scheduling the parsing function.
    void parse(const char *s)
    {
        size_t i;
        for (i = 0; s[i] != '\0'; i++) {
            if (user_cancel)   /* checked between characters, set elsewhere */
                break;
            parse_data(s[i]);
        }
    }
As a user of parsing functions, you should ask for nothing less than an incremental parser. As a developer of parsing functions, you should offer nothing less.