Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add loop_parser and loop_gen #318

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

uzytkownik
Copy link

I run into trouble using combine when writing a parser which used internal state to parse next element and accumulate them together. There are several ways to achieve it but each has some drawbacks:

Use then or otherwise use right-recursion

enum Token { A, B, C }

fn combine_state((tok, mut state): (Token, HashMap<Token, usize>))
{
    *state.entry(token).or_insert(0) += 1;
    state
}

parser!{
    fn tokens_parser_<Input>[Input](last_token: Option<Token>)(Input) -> HashMap<Token, usize>
    where
         Input: Stream<Token = char>
    {
         let mut choices = vec![];
         if last_token != Some(Token::A) {
             choices.push(token('a').map(|_| Token::A)).left());
         }
         if last_token != Some(Token::B) {
             choices.push(token('b').map(|_| Token::B).left().right());
        }
        if last_token != Some(Token::C) {
             choices.push(token('c').map(|_| Token::C).right().right());
        }
        choices((
            choice(choices).then_ref(|tok| token_parser(Some(tok))).map(combine_state),
            value(HashMap::new())
         ))
    }
}

let mut tokens_parser = tokens_parser_(None);

This however has a drawback of potentially being expensive if the combine is expensive (e.g. pushing to the front of vector). It's possible to avoid by temporarily using data structures such as list but it is additional structures you need to convert to/from.

Write custom function using parser and managed the state yourself

enum Token { A, B, C }

fn token_parser<Input>(last_token: Option<Token>)(Input) -> impl Parser<Input, Output = Option<Token>>
where
     Input: Stream<Token = char>
{
    let mut choices = vec![];
    if last_token != Some(Token::A) {
        choices.push(token('a').map(|_| Token::A)).left());
    }
    if last_token != Some(Token::B) {
        choices.push(token('b').map(|_| Token::B).left().right());
    }
    if last_token != Some(Token::C) {
         choices.push(token('c').map(|_| Token::C).right().right());
    }
    optional(choice(choices))
}

let mut tokens_parser = parser(|input| {
    let mut state = HashMap::new();
    let (first_token, commit) = token_parser(None).parse_stream(input).into_result()?;
    let mut last_token = match first_token {
        Some(first_token) => first_token,
        None => return Ok((state, commit))
    };
    loop {
        *state.entry(token).or_insert(0) += 1;
        match token_parser(Some(last_token)).into_result() {
            Ok((current_token, commit2)) => {
                commit = commit.merge(commit2);
                last_token = current_token;
            },
            Err(Commit(e)) =>
                return Err(Commit(e)),
            Err(Peek(e)) =>
                return state
        }
    }
});

This makes code-flow a bit clearer and avoids the ordering issue but it is probably a bit error-prone due to need of managing commitment yourself. It also require full reparse for partial state

Write custom parser

Most of the methods needed to implement the custom parsers are internal.

@uzytkownik
Copy link
Author

I will take a look at failure tomorrow. For whatever reason it passed locally.

@uzytkownik uzytkownik changed the title Add while_parser Add loop_parser and loop_gen Jun 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant