python cgi interaction fails to find graphviz program twopi

Time:11-14

I've got a maddening bug where my python script succeeds in doing something when run independently, but fails when run as a cgi script called by jquery's $.ajax(). Any insights would be welcome.

I'm developing this web application locally on my new MacbookPro (macOS 11.6) using the local apache2 server which I've configured to run .py files in the relevant directories as cgi programs.

The relevant working parts are these:

  • local on my new MacbookPro (macOS 11.6)
  • the graphviz binaries are installed: /opt/homebrew/bin/twopi:
    twopi - graphviz version 2.49.2 (20211016.1639)
  • pygraphviz 1.7
  • Python 3.9.7: /opt/homebrew/opt/python@3.9/bin/python3.9
  • rdflib version : '6.03a'

The aim of this application, driven by a python cgi script, is to retrieve some data from the local file system, and some RDF data from an AllegroGraph instance on the web using python's requests module, and then layout and display a graph visualization in a web page using graphviz and python's pygraphviz module.

The JavaScript makes a GET request like this:

function graphMe(charter){
    $.ajax({
        type: "get",
        url: "cartametallon.py",
        data: {"graphMe": charter},
        dataType: 'json',
        success: deploySVG,
        error: function(jqXHR, textStatus, errorThrown) {
            console.log(jqXHR.responseText, textStatus, errorThrown);
        }
    });
}

The python cgi script fields this request using the cgi module like this:

import cgi, cgitb
import json, sys, traceback
cgitb.enable(format="text")
form = cgi.FieldStorage()

try:
    if 'graphMe' in form:
        charter = form.getvalue('graphMe')
        uri = "<http://chartex.org/graphid/" + charter + ">"
        
        print ("Content-Type: application/json\r\n\r\n")
        print (json.dumps(visualizeDocumentGraph(uri)))
        
except Exception:
    print ("Content-Type: text/plain\n")
    print("Exception in user code:")
    print("~"*20, __file__.split('/')[-1], "~"*20)
    traceback.print_exc(file=sys.stdout)
    print("~"*60)

The visualizeDocumentGraph() function assembles graph data and metadata from several sources and stores it in a dict, which should then be returned to the referring page as a JSON object. One of the things stored in this object is an SVG string of the graph as laid out by Graphviz's twopi layout program. I've verified that each element of this python function works as expected, and when run at the command line, it returns the expected object; however, the response to the jQuery.ajax() request looks like this:

Content-Type: text/plain

Exception in user code:
~~~~~~~~~~~~~~~~~~~~ cartametallon.py ~~~~~~~~~~~~~~~~~~~~
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1344, in _get_prog
    runprog = self._which(prog)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1800, in _which
    raise ValueError(f"No prog {name} in path.")
ValueError: No prog twopi in path.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/Sites/cartametallon/cartametallon.py", line 661, in <module>
    print (json.dumps(visualizeDocumentGraph(uri)))
  File "~/Sites/cartametallon/cartametallon.py", line 292, in visualizeDocumentGraph
    dgsvg = makedot(g).draw(format='svg', prog='twopi')
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1596, in draw
    data = self._run_prog(prog, args)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1360, in _run_prog
    runprog = r'"%s"' % self._get_prog(prog)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1346, in _get_prog
    raise ValueError(f"Program {prog} not found in path.")
ValueError: Program twopi not found in path.

This is the puzzle then: in the cgi interaction it "appears" that the twopi program can't be found, but run on its own, my python script has no trouble. The twopi binary is installed at /opt/homebrew/bin/twopi and is readily accessible via the env variable PATH:

% echo $PATH
~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

It's clear that my python script knows about it too. Not only does it execute it successfully when the script runs on its own, it knows explicitly where to find it:

>>> os.get_exec_path()
['~/opt/anaconda3/condabin', '/opt/homebrew/bin', '/opt/homebrew/sbin', '/usr/local/bin', '/usr/bin', '/bin', '/usr/sbin', '/sbin']

I can't get my python script to return the json I need for the web page, ARGH! This is made all the more maddening by the fact that what I'm trying to do is refactor and update an existing application that runs just fine at https://neolography.com/chartex/. This working program was recently transferred to a new web host, which required a little tinkering to get it working again, and the graph output is not as good as on my old host because A2 hosting insisted on installing an ancient version of graphviz (don't ask). So, the relevant working parts of this working program are these:

  • twopi - graphviz version 2.30.1 (20201013.1554)
  • pygraphviz 1.5
  • Python 2.7.18 (default, Jul 8 2021, 01:00:23)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux2
    (this python had to be running in a virtual env so that I could install this:)
  • RDFLib Version: 5.0.0

Answer:

It took me days, but thanks to Thomas's comment, some of the observations made here, and the docs for the python os module itself, I finally understood that the "PATH" environment variable I get when I run a script at the command line is quite different from the same variable in the context of an executing cgi program.

 % echo $PATH
~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

and

>>> os.environ["PATH"]
'~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin'

are not the same as the os.environ["PATH"] that the script sees when apache runs it as a cgi program.

Worse, I was conflating the $PATH variable with python's sys.path which has quite a different purpose. My cgi program, using pygraphviz, was trying to execute twopi, a binary at /opt/homebrew/bin, and in the context of an executing cgi program, the PATH variable available to it looks like this:

"/usr/bin:/bin:/usr/sbin:/sbin:"
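
One way to see this difference directly is to have the script report the PATH it actually receives and whether a given program resolves on it. The following is a minimal diagnostic sketch using only the standard library; path_report is a hypothetical helper name, not something from pygraphviz or the original script.

```python
import os
import shutil


def path_report(prog, environ=None):
    """Describe where (if anywhere) `prog` resolves on the PATH in `environ`.

    `path_report` is a hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    search_path = environ.get("PATH", "")
    return {
        # Individual PATH entries, with empty entries dropped for readability.
        "path_entries": [p for p in search_path.split(os.pathsep) if p],
        # Full path to the executable, or None if it is not found.
        "resolved": shutil.which(prog, path=search_path),
    }


if __name__ == "__main__":
    # Run this once in a terminal and once as a cgi program, then compare.
    print("Content-Type: text/plain\n")
    print(path_report("twopi"))
```

Comparing the two outputs should show /opt/homebrew/bin present in the terminal run and missing in the cgi run, assuming the setup described in the question.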

It's easy enough to add the necessary path to that variable within the cgi program like this:

os.environ["PATH"] = f"{os.environ['PATH']}:/opt/homebrew/bin"

And that solved my problem. But, I'm still uneasy. This seems like a hacky approach. I still don't fully understand why apache's PATH is different from the $PATH available to the python interpreter, or in a script run at the command line. I gather that the PATH available to a cgi program is different because apache executes it as a different user (_www).
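
A likely explanation, offered here as an assumption rather than something verified on this machine, is that environment variables are inherited from whatever process starts a program. A terminal session gets its PATH from shell startup files (where Homebrew adds /opt/homebrew/bin), while apache is normally started by the system's service manager and never reads those files, so its cgi children inherit a shorter PATH. The sketch below imitates that by starting a child Python process with a PATH like the one reported above; child_path and CGI_LIKE_PATH are hypothetical names.

```python
import os
import subprocess
import sys

# PATH as reported inside the cgi program above (trailing colon dropped).
CGI_LIKE_PATH = "/usr/bin:/bin:/usr/sbin:/sbin"


def child_path(env):
    """Start a child Python process with environment `env`; return the PATH it sees."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ.get('PATH', ''))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child sees only what its parent hands it, not the shell's PATH.
    print(child_path(dict(os.environ, PATH=CGI_LIKE_PATH)))
```

The child reports the PATH it was given, regardless of what the launching shell had, which is the same relationship a cgi program has to the apache process that spawns it.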

It would be good to know if there's a more canonical way to solve this problem. Any suggestions as to documentation that would clarify my understanding of this issue would be gratefully received.
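
Short of changing how apache itself is launched, one slightly more defensive variant of the os.environ fix is sketched below. It adds the directory only when it is missing, drops empty PATH entries (the trailing colon in the cgi PATH produces one, and an empty entry is conventionally treated as the current directory), and fails early with a clear message if the program still cannot be found. ensure_on_path, require_program and GRAPHVIZ_BIN are hypothetical names, and the directory is the Homebrew location from the question.

```python
import os
import shutil

# Assumed install location, taken from the question; adjust for other setups.
GRAPHVIZ_BIN = "/opt/homebrew/bin"


def ensure_on_path(directory, environ=None):
    """Append `directory` to PATH in `environ` unless it is already listed."""
    environ = os.environ if environ is None else environ
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


def require_program(prog, environ=None):
    """Return the full path to `prog`, or raise RuntimeError if it is not on PATH."""
    environ = os.environ if environ is None else environ
    resolved = shutil.which(prog, path=environ.get("PATH", ""))
    if resolved is None:
        raise RuntimeError(f"{prog} not found on PATH: {environ.get('PATH', '')}")
    return resolved


# Typical use near the top of the cgi script, before any pygraphviz drawing call:
#     ensure_on_path(GRAPHVIZ_BIN)
#     require_program("twopi")
```

Calling ensure_on_path more than once is harmless, since the directory is only added when absent.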
