Advanced Sheet Formulas: Expresso 2000 Expression Parser, a Progress Snapshot

In this post, I am going to dive into a recent project that I’ve been working on: a google sheet cell formula that will evaluate text math expressions.

A note to Excel users (Click to Expand)

This entire effort of creating a formula to solve text math expressions has already been addressed in Excel. There is a function “EVALUATE” that you can supposedly use. I haven’t tested it out, and from various references online you may need to make a named function and wrap it in a LAMBDA, but I will leave that as an exercise to the reader. Note, however, that although the Excel function may by more polished, my project has a few features, such as named variables, that the Excel function does not.

Did that make sense? Lets try an example. We can all solve math expressions in Google Sheets and Excel, right? Just type an ‘=’, some expression, and press enter:

This is what sheets were designed for. But let me ask you this: what if you wanted the formula to be user selectable? What if you wanted the user to be able to use a dropdown to make a selection, and you want to use a different expression based on their selection? You might end up with something like this:

And this works in a pinch. But what about when there are more than a handful of expressions? What if you have ten unique math expressions? What about 100? I have a personal project planned where I know I will have 23 expressions to start, and where I could potentially collect a library of hundreds. It would be a huge burden to develop and maintain the library of expressions as a giant set of nested if statements. It would be far easier if I could set up a simple lookup table to reference my formulas:

This brings in the expression from Column E based on the user selection in cell A5, but it doesn’t evaluate the expression. That is because the expression is text, and Google Sheets doesn’t have a mechanism for interpreting text as an expression to be resolved. The astute might come up with the idea that we could evaluate the hard coded expressions in a new column F, essentially calculate the answer for every single expression and then have the VLOOKUP grab the result. And that would work, except that I want this system to scale; I want it to be able to be put anywhere on a sheet, with different input variables, without being locked into a certain region of the sheet. I need to be able to handle expressions as text, and then evaluate the expressions when I want to.

What I need is a single cell formula that can parse and evaluate math expressions where the math expressions are stored as simple text.

After a number of weekends, I am happy to introduce the (work in progress) Expresso 2000 (name subject to change):

(Click to reveal the wall of code)

=LET(str,Q133,
USERSYMRANGE,K133:P133,
DEBUG,R133,
CREDIT,"Compjinx.com",
NAME,"Expresso 2000",
TOK,SPLIT(REGEXREPLACE(str,"([\^\-+*\/%\),]|\w*\()","♥$1♥"),"♥"),
SET,LAMBDA(arr,i,val,LET(b,SEQUENCE(1,COUNTA(arr)),c,ARRAYFORMULA(IF(b=i,val,arr)),c)),
GET,LAMBDA(arr,i,INDEX(arr,1,i)),
HASi,LAMBDA(arr,i,AND(i>0,i<=COUNTA(arr))),
HAS,LAMBDA(arr,v,IFERROR(SUM(ARRAYFORMULA(IF(arr=v,1,0)))>0,FALSE)),
AWRITE,LAMBDA(ar,aroff,data,LET(arlen,COUNTA(ar),dlen,COUNTA(data),b,SEQUENCE(1,MAX(arlen,dlen+aroff-1)),c,MAP(b,LAMBDA(i,IF(HASi(data,i-aroff+1),GET(data,i-aroff+1),IF(HASi(ar,i),GET(ar,i),"")))),c)),
PAIRIFY,LAMBDA(ar,IF(ROWS(ar)=1,LET(l,SEQUENCE(1,COUNTA(ar)/2),TRANSPOSE({MAP(l,LAMBDA(i,GET(ar,i*2-1)));MAP(l,LAMBDA(i,GET(ar,i*2)))})),ar)),
USERSYM,PAIRIFY(USERSYMRANGE),
BISYMLIST,{"e","pi"},
BISYMVAL,{EXP(1),PI()},
SYMLIST,AWRITE(BISYMLIST,COUNTA(BISYMLIST)+1,TRANSPOSE(INDEX(USERSYM,0,1))),
SYMVAL,AWRITE(BISYMVAL,COUNTA(BISYMVAL)+1,TRANSPOSE(INDEX(USERSYM,0,2))),
GETSYMVAL,LAMBDA(s,IFERROR(FILTER(SYMVAL,SYMLIST=s),NA())),
SUBSYMBOLS,LAMBDA(arr,MAP(arr,LAMBDA(ar,if(HAS(SYMLIST,ar),GETSYMVAL(ar),ar)))),
FUNCLIST,{"sin(","cos(","tan(","pi(","sqrt("},
OPLIST,{"+","-","/","*","%","^"},
GETD,LAMBDA(arr,i,default,IF(NOT(HASi(arr,i)),default,INDEX(arr,1,i))),
HASPATTERN,LAMBDA(ar,aroff,pat,LET(len,SEQUENCE(1,COUNTA(pat)),REDUCE(1,len,LAMBDA(ac,i,LET(p,GET(pat,i),arrv,GETD(ar,aroff+i-1,"_"),IF(AND(NOT(HASi(ar,aroff+i-1)),p="$"),TRUE,SWITCH(p,"N",ISNUMBER(arrv),"!N",NOT(ISNUMBER(arrv)),"O",HAS(OPLIST,arrv),"!O",NOT(HAS(OPLIST,arrv)),"F",HAS(FUNCLIST,arrv),"",TRUE,p=arrv)))*ac)))),
FUNC,LAMBDA(f,a,b,SWITCH(f,"sin(",sin(a),"cos(",cos(a),"tan(",tan(a),"pi(",PI(),"sqrt(",sqrt(a),NaN)),
FUNCLEN,LAMBDA(f,SWITCH(f,"sin(",1,"cos(",1,"tan(",1,"pi(",0,"sqrt(",1,-1)),
FUNCABLE,LAMBDA(K,i,IF(NOT(HASi(K,i)),FALSE,AND(HASPATTERN(K,i,{"F","N",")"}),FUNCLEN(GET(K,i))=1))),
FUNCTOKLEN,LAMBDA(K,i,IF(NOT(HASi(K,i)),0,LET(flen,FUNCLEN(GET(K,i)),IF(OR(flen=-1,NOT(HASi(K,i+flen*2+IF(flen=0,1,0)))),0,SWITCH(flen,0,IF(GET(K,i+1)=")",2,0),1,IF(AND(ISNUMBER(GET(K,i+1)),GET(K,i+2)=")"),3,0),2,IF(AND(ISNUMBER(GET(K,i+1)),GET(K,i+2)=",",ISNUMBER(GET(K,i+3)),GET(K,i+4)=")"),5,0),0))))),
FUNCR,LAMBDA(K,i,IF(NOT(HASi(K,i)),FALSE,LET(me,GET(K,i),com,AND(GET(K,i)=",",FUNCTOKLEN(K,i-2)=5),len,FUNCTOKLEN(i-1),leno,FUNCTOKLEN(K,i-3),lent,FUNCTOKLEN(K,i-5),OR(com,AND(GET(K,i)=")",OR(len=2,leno=3,lent=5)))))),
OPORDER,LAMBDA(t,SWITCH(t,"^",6,"*",5,"/",5,"%",3,"+",2,"-",2,0)),
DOP,LAMBDA(l,t,r,SWITCH(t,"%",mod(l,r),"*",l*r,"/",l/r,"+",l+r,"-",l-r,"^",l^r,"")),
DUN,LAMBDA(u,r,SWITCH(u,"-",-r,"")),
BADPAREN,LAMBDA(K,i,IF(AND(GET(K,i)="(",i+2<=COUNTA(K)),GET(K,i+2)=")",IF(AND(GET(K,i)=")",i>2),GET(K,i-2)="(",FALSE))),
OLDOPABLE,LAMBDA(K,i,t,IF(i+2>COUNTA(K),FALSE,LET(isnums,AND(ISNUMBER(GET(K,i)),ISNUMBER(GET(K,i+2))),isop,IF(t="",TRUE,t=GET(K,i+1)),ord,OPORDER(GET(K,i+1)),leftorder,IF(i-1<1,0,OPORDER(GET(K,i-1))),rightorder,IF(i+3>COUNTA(K),0,OPORDER(GET(K,i+3))),AND(isnums,isop,NOT(leftorder>ord),NOT(rightorder>=ord))))),
UNARY,LAMBDA(K,i,IF(OR(HASPATTERN(K,i-1,{"O","-","N"}),HASPATTERN(K,i-1,{"(","-","N"}),HASPATTERN(K,i-1,{"F","-",","}),HASPATTERN(K,i-1,{"$","-","N"})),TRUE,FALSE)),
OPABLE,LAMBDA(K,i,t,IF(HASPATTERN(K,i,{"N","O","N"}),LET(op,GET(K,i+1),ord,OPORDER(op),leftorder,OPORDER(GETD(K,i-1,"!")),rightorder,OPORDER(GETD(K,i+3,"!")),AND(NOT(leftorder>=ord),NOT(rightorder>ord))),FALSE)),
OPMAP,LAMBDA(K,LET(ind,SEQUENCE(1,COUNTA(K)),ops,MAP(ind,LAMBDA(in,IF(UNARY(K,in),"U",IF(OPABLE(K,in,""),"O",IF(FUNCABLE(K,in),"F","N"))))),oprem,REDUCE(ops,ind,LAMBDA(op,i,IF(GET(op,i)="O",AWRITE(op,i,{"O","R","R"}),IF(GET(op,i)="U",AWRITE(op,i,{"U","R"}),IF(GET(op,i)="F",AWRITE(op,i,{"F","R","R"}),IF(HASPATTERN(K,i,{"(","N",")"}),AWRITE(op,i,{"R","N","R"}),op)))))),oprem)),
OPMAPEX,LAMBDA(K,ops,LET(ind,SEQUENCE(1,COUNTA(K)),Kn,MAP(ind,K,ops,LAMBDA(i,tok,op,IF(op="O",DOP(tok,GET(K,i+1),GET(K,i+2)),IF(op="U",DUN(tok,GET(K,i+1)),IF(op="F",FUNC(tok,GET(K,i+1),0),tok))))),IF(FALSE,Kn,FILTER(Kn,ops<>"R")))),
LOOPOPS,LAMBDA(self,K,its,IF(its=0,K,LET(ops,OPMAP(K),Kn,OPMAPEX(K,ops),IF(OR(COUNTA(Kn)=1,its=0),Kn,self(self,Kn,its-1))))),
res,LOOPOPS(LOOPOPS,SUBSYMBOLS(TOK),DEBUG),
IF(DEBUG=100,{res},{OPMAP(res);res}))

Simple, right? Ok, yes, I admit, it is more of a Frankenstein’s Monster than a work of art. Maybe I will figure out some tricks to make it a little more digestible. I could certainly “farm out” some of the LAMBDA functions into named functions, but ultimately I wanted to keep everything self-contained for easy copy-paste usage. You can copy this into a Google Sheet as a cell formula, or even make a named function with it. No extra doo-dads, extensions, or scripts need to be activated or installed. The first three lines are the inputs: a cell for the text math expression, a range of custom variables, and finally a debug number (set to 100 to just solve without any debug). To continue the previous example:

In this case, column D is used as a validation range for cell A5, and with a simple VLOOKUP grabs the expression from column E and dumps it into our Expresso 2000. The range A1:B3 is also passed to Expresso 2000 to define our variables. One thing that I want to point out is that the expressions have the aesthetic readability that you can only get with spreadsheets when using named ranges, but we aren’t locked into the sheet-wide static nature of named ranges. With every call to the Expresso 2000, we can provide it a different range than A1:B3 for the variables.

A potential use case in the engineering world could involve making a worksheet that calculates friction loss of fittings in a pipe, where the equation can change based on a library of fittings, and the properties of the fitting. My library of expressions could look something like this:

Note that this table could be huge. It doesn’t matter. If you wanted to get fancy, you could even dynamically concatenate several expressions together if the application required it. The next part is the implementation:

Amazing! Note that the user inputs are A-G. You select the proper fitting with a dropdown on A, then define different variables in columns B-G. The expression populates column H from our library (simple VLOOKUP), and then that behemoth of a cell formula that I provided earlier lives in column I. On each row, the Expresso 2000 parses the expression, substitutes in the user defined variables for each row, and provides a resulting value. Note that column H doesn’t need to exist; I could just as easily include the VLOOKUP directly into the Expresso 2000. However, one thing that I believe is very important in a number of technical professions, like engineering, is having self-documenting calculation sheets. Showing the formula, although sometimes excessive, can be a great way to make things easy on anyone reviewing your calculations.

I would love to do a deep dive into how the Expresso 2000 works, but that will have to wait for another time.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *