B.log

Random notes mostly on Machine Learning

Crazy Expression Parsing

Suppose we have an expression like (5+5 * (x^x-5 | y && 3)) and we'd like to get some computer-understandable representation of that expression, like:

ADD Token[5] (MUL Token[5] (AND (BIT_OR (XOR Token[x] (SUB Token[x] Token[5])) Token[y]) Token[3])

In case if you don't know how to do that or are looking for the solutin right now, you should know that I'm not going to present a correct solution. This post is just a joke. You should use either a Shunting-yard algorithm or a recursive descent parser.

So if you're ready for madness... Let's go!

Let's take Don't repeat yourself principle as a justification. Moreover, let's take it to extreme "Don't repeat". Indeed, why do we need to repeat what compiler's developers already did?

Here we go

In case you're wondering what the heck is going on: all constants are converted to instances of Token class, for which I overloaded all the operators. Overloading is done in a way to preserve structure of an expression. The only thing we have to do then is to extract that information. In case you're not familiar with C++, I recommend you to read something about operator overloading.

As you can see, g++ and python are required for that "parser". Unfortunatelly, priority of a bitwise xor is too low to serve as a power operator.

Tagged as python, madness
comments powered by Disqus