Part 1: Fixing the Stella R Python Script

To follow along yourself, first download StellaR.py v. 1.3. I’ll start with in-place changes so I can reference the original line numbers. Line 321 needs an added test to see if there’s really anything “to” the model, even if the INIT is found:

if init_match and len(lines[i-1].strip().split()[6:-2]) > 0:

Line 354: I added the R modulus %% to the optlist, since my version of the script has already translated MOD to %% by this point.

In various places in the script, negative lookaheads are needed so that a bare > or < isn’t split off from >= or <=:

Line 357: txtline=re.split('(=|\*|>(?!=)|<(?!=)|,|\^|\+|\-|[)]|[(]|[{]|[}]|/|\s+|>=|<=)',txtline)
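As a minimal sketch of why the lookaheads matter, here is a shortened version of the pattern applied to a made-up line (the variable names are hypothetical, not from any real model). Without the lookaheads, the bare > alternative is tried first and breaks >= into two tokens:

```python
import re

# Hypothetical Stella-style line, used only for illustration.
txtline = "flow = if stock >= limit then 0 else rate"

# Naive pattern: '>' is tried before '>=', so '>=' is split into '>' and '='.
naive = re.split(r'(=|>|<|\s+|>=|<=)', txtline)

# With negative lookaheads, a bare '>' or '<' only matches when no '=' follows,
# which leaves '>=' and '<=' to be matched as single tokens.
guarded = re.split(r'(=|>(?!=)|<(?!=)|\s+|>=|<=)', txtline)

# Drop the empty and whitespace-only pieces that re.split leaves behind.
naive_tokens = [tok for tok in naive if tok.strip()]
guarded_tokens = [tok for tok in guarded if tok.strip()]

print(naive_tokens)
print(guarded_tokens)
```

In the guarded version the comparison operator survives as one token, which is what the later translation steps rely on.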

Also, at various places, a parenthesis is needed after “if” when searching a line. Otherwise, variable names that contain the string “if” get matched as well.

Line 363: if_num=len(re.findall('(if\s+|IF\s+|If\s+|if\(|IF|If)',txtline))
Line 367: conv_temp = re.split('(=|\*|>(?!=)|<(?!=)|,|\^|\+|\-|[)]|[(]|[{]|[}]|/|\s+|>=|<=|%%)', txtline)
Line 395: if (len(re.findall('(if\s+|IF\s+|If\s+|if\(|IF|If)',txtline)) > 0):
Line 420: if (len(re.findall('(if\s+|IF\s+|If\s+|if\(|IF|If)',txtline)) > 0):
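To see the difference, here is a small sketch comparing a naive pattern with the one used above. The input line and its variable names are made up for illustration:

```python
import re

# Hypothetical line: "drift" and "diff_rate" both contain the letters "if".
txtline = "if(drift > 0) then diff_rate else 0"

# Naive pattern counts every "if", including the ones inside variable names.
naive_count = len(re.findall(r'(if|IF|If)', txtline))

# Pattern from the script: a lowercase "if" only counts when it is followed
# by whitespace or an opening parenthesis.
anchored_count = len(re.findall(r'(if\s+|IF\s+|If\s+|if\(|IF|If)', txtline))

print(naive_count, anchored_count)
```

Note that the uppercase IF and If alternatives are still unanchored in this pattern, so the protection only applies to the lowercase spelling.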

There are some errors with temporary variable naming starting at the loop in line 443. The uses of the variable names “l” and “c” (line 634 on) don’t work for Python. Search carefully for these occurrences as variable names and replace them throughout the rest of the script. Where “l” occurs in the mentioned loop, I’d recommend replacing it with 2 different variable names, since the two uses in fact reference different things and deserve different names. I get that maybe “parts” isn’t important and could be overwritten, but for the sake of clarity, let’s keep it for now.

parts=lines[ln].strip().split() #splits the line into a list of strings by spaces
var=re.split('([)]|[(]|[{]|[}]|\s+)',parts[0])[0] #pulls the key out of the string
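Here is a quick sketch of what those two names end up holding, using a made-up stock equation as the input line (the equation itself is an assumption, only there to show the shapes of the values):

```python
import re

# Hypothetical line from an equation layer.
line = "Population(t) = Population(t - dt) + (births - deaths) * dt\n"

# First name: the whitespace-separated pieces of the line.
parts = line.strip().split()

# Second name: the key, i.e. everything in the first piece before a
# parenthesis, brace, or whitespace.
var = re.split(r'([)]|[(]|[{]|[}]|\s+)', parts[0])[0]

print(parts[0])
print(var)
```

Keeping the list and the extracted key under separate names makes it obvious later in the loop which one is being tested.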

The method used to check for an installed package seems deprecated, so I just changed this line to load the package, with a comment to install it if it’s missing (this should be basic knowledge for anyone wanting to use an R script).

Line 495: ff.writelines(["library(deSolve) #If you don't have the package 'deSolve' installed, install it.\n"])

Before, this line didn’t check for the possibility that the referenced “xname” variable was itself data.

Line 523: if (convertors[conv].Data.xname not in convT2 and convertors[conv].Data.xname in convlist and not convertors[(convertors[conv].Data.xname)].isData):
Line 620: ff.writelines(["\t return(list(c(d", EqsL[0]])
Line 622: ff.writelines([")))\n}"])

From here on, I can’t easily reference line numbers from the original script because I changed the blank line spacing and formatting. The whole script will be posted, though, so this is just an explanation of the changes. I only kept 1 of the “explore” functions since they were redundant, and updated it to be more efficient.

def convExplore(lines, convname):
    i = [line for line,s in enumerate(lines) if s.startswith(convname + " =")][0]
    convEq = lines[i].split("=",1)[1].strip()
    tempc=i+1
    multilineconv=True
    while (multilineconv):
       if tempc == len(lines): multilineconv=False
       elif ('=' not in lines[tempc].strip().split() or re.search("(ELSE|else)$",lines[tempc-1].strip()) or re.search("^else",lines[tempc].strip()) or re.search("^then",lines[tempc].strip())):
          convEq = convEq + ' ' + ' '.join(lines[tempc].strip().split())
          tempc+=1
       else: multilineconv=False
    convEq=re.sub('{.*}', '', convEq).strip()
    return (convEq)

I deleted “single_string()” because join basically does the same thing. I rewrote the “if_extract” function, and deleted the “convWrite” function.

def if_extract(stella_var,conv):
   stella = stella_var
   txtline = str(stella[conv].Eq)
   if stella[conv].isIf:
      pattern = '(if\s+|IF\s+|If\s+|if\(|IF|If)'
      if_pattern = re.compile(pattern)
      function_pattern = '(\^|\*|DELAY|exp|min|max|mean|sum|abs|sin|cos|tan|log10|sqrt|round|log|atan|acos|floor)\(.*\)'
      protectedstrings = []
      for m in re.finditer(function_pattern,txtline):
         if m.group().count("(") < m.group().count(")"):
            endpos = [m.start() for m in re.finditer('\)', m.group())][m.group().count("(")-m.group().count(")")] + m.start()
         else: endpos = m.group().rfind(")") + m.start() + 1
         startpos = m.group().find("(") + m.start()
         protectedstrings.append(startpos)
         protectedstrings.append(endpos)
      protectedstrings.insert(0,0)
      if protectedstrings[-1] == len(txtline):
         parts = [txtline[i:j] for i,j in zip(protectedstrings, protectedstrings[1:])]
      else: parts = [txtline[i:j] for i,j in zip(protectedstrings, protectedstrings[1:]+[None])] 
      for i in range(0, len(parts), 2):
         if ('(' in parts[i]): parts[i] = parts[i].replace('(', ' ')
         if (')' in parts[i]): parts[i] = parts[i].replace(')', ' ')
      txtline = ''.join(parts)
      ifs = if_pattern.finditer(txtline)
      ifspos = []
      for y in ifs: ifspos.append(y.start())
      for ifst in reversed(ifspos):
         if_clause = re.match('(if\s+|IF\s+|If\s+|if(?!else)|IF|If)(?P<ifclause>.*?)\s+(then\s+|THEN\s+|Then\s+|then|THEN|Then)', txtline[ifst:])
         ifclause = if_clause.group('ifclause')
         then_clause = re.search('(then\s+|THEN\s+|Then\s+|then|THEN|Then)(?P<thenclause>.*?)\s*((?<!if)else\s+|ELSE\s+|Else\s+|(?<!if)else|ELSE|Else)', txtline[ifst:])
         thenclause = then_clause.group('thenclause')
         if len(re.split('((?<!if)else\s+|ELSE\s+|Else\s+|(?<!if)else|ELSE|Else)',txtline[ifst:])) > 2:
            elseclause = re.split('((?<!if)else\s+|ELSE\s+|Else\s+|(?<!if)else|ELSE|Else)',txtline[ifst:])[2].strip()
         else: elseclause = None
         ifelse = "ifelse(" + ifclause + "," + thenclause + "," + elseclause + ")" + ' '.join(re.split('((?<!if)else\s+|ELSE\s+|Else\s+|(?<!if)else|ELSE|Else)',txtline[ifst:])[3:])
         txtline = txtline[:ifst] + ifelse
      txtline = re.sub('(?<!<|>)=', '==', txtline) 
      if (' OR ' in txtline): txtline = txtline.replace(' OR ', ' | ')
      if (' or ' in txtline): txtline = txtline.replace(' or ', ' | ')
      if (' AND ' in txtline): txtline = txtline.replace(' AND ', ' & ')
      if (' and ' in txtline): txtline = txtline.replace(' and ', ' & ')
   return(txtline)

I deleted all the “function” functions. I wrote my own function to broaden the criteria to find meaningless lines.

def special_match(strg, search=re.compile(r'^(?:(?:.*[^A-Za-z0-9:()_\s\\])|(?:THEN|ELSE|then|else)).*$').search):
   return not bool(search(strg))
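To make the criteria concrete, here is a small sketch that repeats the function and runs it on a few made-up lines (the sample lines are assumptions, not taken from a real equation layer). A line counts as “meaningless” when it contains only letters, digits, colons, parentheses, underscores, whitespace, or backslashes, and doesn’t start with THEN or ELSE:

```python
import re

def special_match(strg, search=re.compile(r'^(?:(?:.*[^A-Za-z0-9:()_\s\\])|(?:THEN|ELSE|then|else)).*$').search):
    # True when the line has no "interesting" character and no leading THEN/ELSE.
    return not bool(search(strg))

# Hypothetical sample lines.
samples = ["Sector 1:", "", "x = 3", "ELSE 0"]
results = {s: special_match(s) for s in samples}

for s, is_meaningless in results.items():
    print(repr(s), is_meaningless)
```

A heading-like line or an empty line is flagged, while anything with an operator such as = or a leading ELSE is kept.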

Accordingly, I deleted the “WN” line, and changed the test in the first loop to…

lines[ln] = lines[ln].expandtabs()
if special_match(lines[ln]) and not re.search("ELSE$",lines[ln-1].strip()): lines.remove(lines[ln])

Since at this point, the script is already looping over all lines in the file, I just did in-place line changes in lieu of all the “function” functions. Also, % isn’t a special character in Stella, so it was popping up in variable names, etc. In my equation layer, I had arrays, but in this case, I didn’t see much difference between the way an array was being used vs. just an ordinary variable. So, that’s why the brackets are being changed/replaced.

lines[ln]=lines[ln].replace('%', 'percent')
lines[ln]=lines[ln].replace('[', '_')
lines[ln]=lines[ln].replace(']', '')
lines[ln]=lines[ln].replace(' MOD ', ' %% ')
lines[ln]=lines[ln].replace('delay(', 'DELAY(')
lines[ln]=lines[ln].replace(' EXP', ' exp')
lines[ln]=lines[ln].replace(' MIN', ' min')
lines[ln]=lines[ln].replace(' MAX', ' max')
lines[ln]=lines[ln].replace(' MEAN', ' mean')
lines[ln]=lines[ln].replace(' SUM', ' sum')
lines[ln]=lines[ln].replace(' ABS', ' abs')
lines[ln]=lines[ln].replace(' SIN', ' sin')
lines[ln]=lines[ln].replace(' COS', ' cos')
lines[ln]=lines[ln].replace(' TAN', ' tan') #leading space so ARCTAN isn't turned into ARCtan before the ARCTAN line below
lines[ln]=lines[ln].replace('LOG10', 'log10')
lines[ln]=lines[ln].replace(' SQRT', ' sqrt')
lines[ln]=lines[ln].replace(' ROUND', ' round')
lines[ln]=lines[ln].replace(' LOGN', ' log')
lines[ln]=lines[ln].replace(' ARCTAN', ' atan')
lines[ln]=lines[ln].replace(' ARCCOS', ' acos')
lines[ln]=lines[ln].replace('TIME', 't')
lines[ln]=lines[ln].replace('(time', '(t')
lines[ln]=lines[ln].replace(' PI', ' pi')
lines[ln]=lines[ln].replace(' INT', ' floor')
lines[ln]=re.sub(r'\(0\-([a-zA-Z0-9_]+)\)<0', r'\1>0', lines[ln])
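As a sketch of what the final regular-expression substitution in that list does, here it is applied to two made-up lines (the variable name is hypothetical). The pattern only matches the exact spacing-free form, so a spaced-out version of the same test is left alone:

```python
import re

pattern = r'\(0\-([a-zA-Z0-9_]+)\)<0'

# Hypothetical lines for illustration.
compact = "flag = if (0-stock_level)<0 then 1 else 0"
spaced = "flag = if (0 - stock_level) < 0 then 1 else 0"

# "(0-x)<0" is rewritten to the equivalent "x>0".
rewritten_compact = re.sub(pattern, r'\1>0', compact)

# No match here because of the extra spaces, so the line is unchanged.
rewritten_spaced = re.sub(pattern, r'\1>0', spaced)

print(rewritten_compact)
print(rewritten_spaced)
```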

The last substitution in that list is somewhat of a stylistic preference: the authors were testing if 0 - x was less than 0, so I just changed it to test if x > 0. To go along with the in-place line changes above, I made a list called…

supported_func=['DELAY','delay','exp','min','max','mean','sum','abs','sin','cos','tan','log10','sqrt','round','log','atan','acos','t','pi','floor','dt']

I made this for use in a restricted word list, to test what’s in the equation. In the equation layer I was given, the previous authors multiplied by 0 to “turn off” flows. So when going through the flows, I added…

if "0*" in txtline:
 init_position = txtline.find("0*")
 txtline = txtline.replace(txtline[init_position:], "0")

…to just change anything multiplied by 0 to 0. When finding a function in the line, I updated the format so that it would be anchored to the parenthesis.

if ((conv_temp[i]) in supported_func):
   flows[fl].hasFunction=True
   Rformatting = []
   for x in re.split('(\()',flows[fl].Eq): Rformatting.append(x.strip())
   flows[fl].Eq = ''.join(Rformatting)
conv_temp = re.split('=|\*|>|<|,|\+|\^|\-|\)|\(|[{]|[}]|/|\s+|%%', convertors[conv].Eq)
restricted_words = ['if','If','IF','AND','and','THEN','then','ELSE','else','OR','','or'] + supported_func
if ((conv_temp[i] not in restricted_words) and (not is_number(conv_temp[i]))):
   if (conv_temp[i] not in list(convertors.keys()) and conv_temp[i] not in list(models.keys()) and conv_temp[i] not in list(flows.keys())):  convertors[conv_temp[i]]=convertor(conv_temp[i])
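Here is a rough, standalone sketch of the restricted-word idea: split an equation into tokens, then keep only the tokens that are neither keywords, supported functions, nor numbers. The `is_number` helper below is a stand-in I wrote for the example (I’m assuming it behaves like the script’s helper of the same name), and the equation is made up:

```python
import re

supported_func = ['DELAY', 'delay', 'exp', 'min', 'max', 'mean', 'sum', 'abs',
                  'sin', 'cos', 'tan', 'log10', 'sqrt', 'round', 'log', 'atan',
                  'acos', 't', 'pi', 'floor', 'dt']
restricted_words = ['if', 'If', 'IF', 'AND', 'and', 'THEN', 'then',
                    'ELSE', 'else', 'OR', '', 'or'] + supported_func

def is_number(token):
    # Stand-in helper (assumed behavior): True if the token parses as a float.
    try:
        float(token)
        return True
    except ValueError:
        return False

def variable_tokens(equation):
    # Same delimiter set as the split above; empty strings are dropped
    # because '' is in the restricted word list.
    tokens = re.split(r'=|\*|>|<|,|\+|\^|\-|\)|\(|[{]|[}]|/|\s+|%%', equation)
    return [tok for tok in tokens
            if tok not in restricted_words and not is_number(tok)]

# Hypothetical equation.
names = variable_tokens("if (birth_rate > 0.5) then exp(growth) * pop else 0")
print(names)
```

Whatever is left after filtering is a candidate variable name, which is what gets checked against the existing convertors, models, and flows.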

I changed that block of text for each loop where the convertors are analyzed. When determining what will go into the “additional lines”…

var=re.split('([)]|[(]|[{]|[}]|\s+|\*)', parts[0])[0]
special_words = ['INIT','','THEN','then','if','IF','else','ELSE'] + supported_func + list(flows.keys()) + list(convertors.keys()) + list(models.keys())
if (var not in special_words and not lines[ln].startswith(var + '(t)')):

As mentioned above, in the data write out loop, I had to add some complexity in case the variable it references is itself data:

if (convertors[conv].Data.xname not in convT2 and convertors[conv].Data.xname in convlist and not convertors[(convertors[conv].Data.xname)].isData):
   convT2.append(convertors[conv].Data.xname)
   ff.writelines(["\t", convertors[conv].Data.xname, " <- ", convertors[convertors[conv].Data.xname].Eq, "\n"])
elif (convertors[conv].Data.xname not in convT2 and convertors[conv].Data.xname in convlist and convertors[(convertors[conv].Data.xname)].isData):
   ff.writelines(["\t", convertors[conv].Data.xname, " <- inputData(", convertors[convertors[conv].Data.xname].Data.xname, ", '", convertors[conv].Data.xname, "')\n"])

Then, I overhauled the part that writes out the convertors and flows.

convsflows = convlist + flowlist
while convsflows:
   for fl in list(convsflows): #iterate over a copy, since items are removed inside the loop
      if fl in flowlist: dependents = flows[fl].in_flows + flows[fl].in_convertors
      elif fl in convlist: dependents = convertors[fl].in_convertors
      #hold the item back while it has uninitialized dependents that are still waiting to be written
      if len([i for i in dependents if i in initialized]) < len(dependents) and [i for i in dependents if i in convsflows]: flowWrite=False
      else: flowWrite = True
      if (flowWrite):
         if fl in flowlist:
            if flows[fl].isIf: flows[fl].Eq = if_extract(flows,fl)
            converted = flows[fl]
         elif fl in convlist:
            if convertors[fl].isIf: convertors[fl].Eq = if_extract(convertors,fl)
            converted = convertors[fl]
         ff.writelines(["\t", fl, " <- ", str(converted.Eq), "\n"])
         initialized.append(fl)
         convsflows.remove(fl)
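The idea behind that loop is to keep sweeping over the pending convertors and flows, writing out any whose dependencies are no longer waiting. Here is a simplified, standalone sketch of that ordering on a made-up dependency table. The function name, the dictionary layout, and the circular-dependency check are my own assumptions for the example and aren’t part of the script:

```python
def write_order(dependencies):
    """Return names ordered so each one comes after the pending names it uses.

    `dependencies` maps a name to the list of names its equation refers to.
    Names that are not keys (e.g. parameters or stocks) count as already
    initialized.
    """
    pending = list(dependencies)
    order = []
    while pending:
        # An item is ready once none of its dependencies is still pending.
        ready = [name for name in pending
                 if not any(dep in pending for dep in dependencies[name])]
        if not ready:
            # Added safeguard for the sketch: avoid looping forever on a cycle.
            raise ValueError("circular dependency among: " + ", ".join(pending))
        for name in ready:
            order.append(name)
            pending.remove(name)
    return order

# Hypothetical model pieces: net_growth needs births and deaths first.
order = write_order({
    "net_growth": ["births", "deaths"],
    "births": ["birth_rate"],
    "deaths": [],
})
print(order)
```

This matters because the generated R code assigns each value in sequence, so anything used on the right-hand side has to be written out before the line that uses it.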

I had a line change earlier that went along with the fact that I decided to keep parameter names separate, and then join them up…

ff.write('parm_names <- c("' + convs[0] + '"')
for i in range(1, len(convs)): ff.write(',\n"' + convs[i] + '"')
ff.writelines(")\n")
ff.write("names(parms) <- parm_names\n")

This avoids some mess in assigning tricky names. I took the liberty of taking out several functions of the “x_functions.R” file; for instance, the MOD function at the end is just as easily replaced with the %% in line. There were also several others that I thought would warrant translation within the script written out.
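For reference, here is a sketch of what the parameter-name lines shown earlier write out, wrapped in a small function that writes to an in-memory buffer instead of the output file. The wrapper function and the two parameter names are hypothetical:

```python
import io

def render_parm_names(convs):
    # Same write calls as in the script, but aimed at a StringIO buffer
    # so the generated R text can be inspected.
    ff = io.StringIO()
    ff.write('parm_names <- c("' + convs[0] + '"')
    for i in range(1, len(convs)): ff.write(',\n"' + convs[i] + '"')
    ff.writelines(")\n")
    ff.write("names(parms) <- parm_names\n")
    return ff.getvalue()

# Hypothetical parameter names.
r_code = render_parm_names(["birth_rate", "death_rate"])
print(r_code)
```

The names are written as a separate quoted vector and attached afterwards with names(parms), rather than being embedded in the definition of parms itself.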
