I have a use case in which I want to read only the top 5 rows of a large CSV file stored on one of my SFTP servers, and I don't want to download the complete file just to read those rows. I am using pysftp in Python 3 to interact with my SFTP server. Is there a way to download only a chunk of the file instead of the complete file in pysftp?
If there are other Python libraries or techniques I can use, please guide me. Thanks.
CodePudding user response:
Yes, it is possible to read only a portion of a file on an SFTP server using pysftp. The getfo method is not the right tool here, because it downloads the whole file into a file-like object. Instead, use the open method, which returns a file-like object for the remote file that you can read line by line. You can combine it with the io module's StringIO class, which gives you an in-memory file-like object to collect the lines you read.
Here is an example of how you might use these methods to download the first 5 lines of a CSV file from an SFTP server:
    import pysftp
    import io

    # Connect to the SFTP server
    cnopts = pysftp.CnOpts()
    # Disables host key verification; insecure outside of testing
    cnopts.hostkeys = None
    with pysftp.Connection('sftp.example.com', username='user', password='pass', cnopts=cnopts) as sftp:
        # Open the CSV file on the SFTP server
        with sftp.open('path/to/file.csv', 'r') as f:
            # Create a file-like object in memory
            output = io.StringIO()
            # Read the first 5 lines of the remote file and write them to the file-like object
            for i in range(5):
                line = f.readline()
                output.write(line)
            # Reset the file pointer to the beginning of the file-like object
            output.seek(0)
            # Read the contents of the file-like object
            print(output.read())
This example reads the first 5 lines of the remote file and writes them to a file-like object in memory. You can then read the contents of that object using the read method, or process the lines in any other way you like.
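If you need parsed rows rather than raw lines, a minimal sketch along the following lines may help. The helper names iter_text_lines and head_rows are hypothetical, UTF-8 encoding is an assumption, and io.BytesIO stands in for the object returned by sftp.open so the snippet runs without a server. Feeding lines lazily into csv.reader means a quoted field that spans several lines still counts as one row.

```python
import csv
import io
import itertools


def iter_text_lines(f, encoding="utf-8"):
    """Yield lines from a file-like object one at a time, decoding bytes if needed."""
    while True:
        line = f.readline()
        if not line:  # empty str/bytes signals end of file
            return
        yield line.decode(encoding) if isinstance(line, bytes) else line


def head_rows(f, n=5, encoding="utf-8"):
    """Return the first n parsed CSV rows, reading only as many lines as the parser needs."""
    reader = csv.reader(iter_text_lines(f, encoding))
    return list(itertools.islice(reader, n))


# Local stand-in for the remote file (assumption: the real object would come from sftp.open)
sample = io.BytesIO(b'id,note\n1,"two\nlines"\n2,plain\n3,more\n')
print(head_rows(sample, n=2))
```

With a real connection, you would pass the object from sftp.open('path/to/file.csv', 'r') to head_rows in place of sample.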
CodePudding user response:
First, do not use pysftp. It is a dead, unmaintained project. Use Paramiko instead. See pysftp vs. Paramiko.
If you want to read data from a specific point in the file, you can open a file-like object representing the remote file using the Paramiko SFTPClient.open method (or the equivalent pysftp Connection.open) and then use it as if you were accessing data from any local file:
- Use .seek to set the read pointer to the desired offset.
- Use .read to read data.
    with sftp.open("/remote/path/file", "r", bufsize=32768) as f:
        f.seek(offset)
        data = f.read(count)
For the purpose of bufsize, see: Writing to a file on SFTP server opened using Paramiko/pysftp "open" method is slow
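To tie this back to the original question, here is a rough sketch of using seek and read to fetch just enough bytes to cover the first few lines. The function names read_first_lines and fetch_head, the 32768-byte chunk size, and the password-based login are illustrative assumptions, not part of either library. The Paramiko import is deferred so the chunking helper can be tried locally against any seekable binary file-like object.

```python
def read_first_lines(f, n_lines=5, chunk_size=32768, offset=0):
    """Read fixed-size chunks starting at offset until n_lines newlines have been seen."""
    f.seek(offset)
    buf = b""
    while buf.count(b"\n") < n_lines:
        chunk = f.read(chunk_size)
        if not chunk:  # end of file reached before n_lines were found
            break
        buf += chunk
    # Keep line endings and drop anything read past the requested lines
    return buf.splitlines(keepends=True)[:n_lines]


def fetch_head(host, username, password, remote_path, n_lines=5):
    """Hypothetical wrapper: open the remote file with Paramiko and read its first lines."""
    import paramiko  # third-party; imported lazily so the helper above works without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # verify the server against known hosts
    client.connect(host, username=username, password=password)
    try:
        sftp = client.open_sftp()
        with sftp.open(remote_path, "rb", bufsize=32768) as f:
            return read_first_lines(f, n_lines)
    finally:
        client.close()
```

Only the chunks actually requested are transferred, so for a large CSV the cost is roughly one chunk_size read rather than the whole file. Note that this version counts physical lines, so a quoted CSV field containing a newline would be split across two entries.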