Upload csv for processing in PyScript

The following is my failed attempt at writing HTML/JavaScript/PyScript code that lets a user upload a CSV (or Excel) file, which is then available for use in PyScript (e.g., with pandas). As you might notice, I am not very experienced with JavaScript.

<html>
  <head>
    <link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
        
    <script defer src="https://pyscript.net/latest/pyscript.js"></script>
    
    <py-config>
        packages = ["pandas"]
    </py-config>
  </head>
  <body>
    <input type="file" id="fileinput" accept=".csv, .xlsx"/>

    <script>
        var test = "test";

        function assignFile(e) {
            // Assign file to js variable
            window.file = e.target.files[0];
            console.log("File assigned")
        }
    </script>

    <py-script>
import pandas as pd
import js
import pyodide
from js import test
js.console.log(test)

def get_file(e):
    js.console.log("Attempting file upload: " + e.target.value)

    # Assign file to a js variable
    js.assignFile(e)
    # Import variable to python
    from js import file

    display(pd.read_csv(file))

get_file_proxy = pyodide.ffi.create_proxy(get_file)

js.document.getElementById("fileinput").addEventListener("change", get_file_proxy)
    </py-script>
  </body>
</html>

Some notes on this code: I am aware that the part written in the script element could also be written in the py-script element, but I decided not to do that. Furthermore, the line from js import file imports the variable created in JavaScript, but when I try to use this variable with pandas, the console shows the error Invalid file path or buffer object type: <class 'pyodide.JsProxy'>. This is in contrast to the line from js import test, which works properly. However, the specific error is unimportant to my question, which is:

What would be a simple way to allow a user to upload a csv (or xlsx) file for use in PyScript?

CodePudding user response：

First, you need to use the event.target.files attribute of the change event fired by the file input. Each entry is a File object whose content can be read with the usual asynchronous JavaScript methods, such as .text() or .arrayBuffer(). To read a CSV file from text with pandas, you need to wrap the content in an in-memory stream, like io.StringIO (for text) or io.BytesIO (for bytes).
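
As a minimal sketch of the in-memory stream idea, here is the pandas side on its own, outside the browser. The CSV content is made up for illustration: the string stands in for what file.text() would give you, and the byte string stands in for the content of file.arrayBuffer().

```python
import io

import pandas as pd

# Hypothetical CSV content, standing in for the result of `await file.text()`.
csv_text = "name,score\nalice,3\nbob,5\n"

# Text is wrapped in io.StringIO so pandas sees a file-like object ...
df_from_text = pd.read_csv(io.StringIO(csv_text))

# ... and raw bytes (e.g. taken from an ArrayBuffer) are wrapped in io.BytesIO.
csv_bytes = csv_text.encode("utf-8")
df_from_bytes = pd.read_csv(io.BytesIO(csv_bytes))

print(df_from_text)
print(df_from_text.equals(df_from_bytes))
```

Both calls produce the same DataFrame, so which one to use mostly depends on whether you read the file as text or as bytes.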

I adapted your PyScript code like this:

import pandas as pd
import pyodide
import io
import js

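async def get_file(e):
    # e.target.files is a JavaScript FileList with the selected files
    files = e.target.files.to_py()
    for file in files:
        # file.text() returns a Promise, so it has to be awaited
        file_content = await file.text()
        # pandas needs a file-like object, so wrap the text in a StringIO
        df = pd.read_csv(io.StringIO(file_content))
        js.console.log(df)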

get_file_proxy = pyodide.ffi.create_proxy(get_file)

js.document.getElementById("fileinput").addEventListener("change", get_file_proxy)

Keep in mind that large files might be better read using a buffer, but I do not know the best practices for this in PyScript.
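
One possible approach for larger files, offered as a sketch under assumptions rather than as established PyScript practice: read the upload as bytes and let pandas parse it in chunks through the chunksize parameter of read_csv. The helper name iter_csv_chunks and the sample data are hypothetical. In PyScript the bytes would presumably come from something like (await file.arrayBuffer()).to_bytes(), which is an assumption to check against your Pyodide version. Note that this only keeps the parsed DataFrames small; the raw bytes are still held in memory.

```python
import io

import pandas as pd


def iter_csv_chunks(raw_bytes, rows_per_chunk=1000):
    """Yield DataFrames of at most rows_per_chunk rows parsed from CSV bytes."""
    buffer = io.BytesIO(raw_bytes)
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of parsing everything into one DataFrame.
    yield from pd.read_csv(buffer, chunksize=rows_per_chunk)


# Made-up data standing in for the uploaded file's bytes: ten rows of id,value.
raw = ("id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(10))).encode("utf-8")

total = 0
chunk_sizes = []
for chunk in iter_csv_chunks(raw, rows_per_chunk=4):
    chunk_sizes.append(len(chunk))
    total += int(chunk["value"].sum())

print(chunk_sizes)  # [4, 4, 2]
print(total)        # 90
```

Each chunk can be aggregated or filtered and then discarded, so only one chunk's worth of parsed rows exists at a time.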