I want to run some commands with the AWS CLI on large files stored in an S3 bucket, without copying the files to a local directory first (I'm familiar with aws s3 cp; that's not what I want).
For example, say I want to use simple commands like "head" or "more". If I try something like:
head s3://bucketname/file.txt
but then I get:
head: cannot open ‘s3://bucketname/file.txt’ for reading: No such file or directory
How else can I do it?
Thanks in advance.
CodePudding user response:
Whether a command can access a file in an S3 bucket depends entirely on the command itself. Under the hood, every command is just a program. When you run something like head filename
, the string filename
is passed as an argument to head's main() function. You can check out the source code here: https://github.com/coreutils/coreutils/blob/master/src/head.c
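You can see this directly from the shell: the URI reaches head only as an ordinary argument string, and head tries to open it as a local path (a small demo, using the placeholder URI from the question):

```shell
# head has no S3 support; the URI is just an ordinary argument,
# so head tries to open "s3://bucketname/file.txt" as a local
# file path, which fails with "No such file or directory".
head 's3://bucketname/file.txt'
echo "head exit status: $?"
```

The error comes from the local filesystem lookup, not from anything S3-related.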
Essentially, since the head command does not support S3 URIs, you cannot do this. You can either:
- Copy the S3 object to stdout and pipe it to head:
aws s3 cp s3://bucketname/file.txt - | head
(aws may report a broken-pipe error once head has read enough lines and closed its end of the pipe, but only a prefix of the object is actually transferred.)
- Use s3curl to fetch only a range of bytes: how to access files in s3 bucket for other commands using cli
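For concreteness, here is a sketch of both options (bucketname and file.txt are placeholders from the question; the commands are guarded because they need the AWS CLI configured with valid credentials). The second option uses aws s3api get-object with --range, which issues an HTTP ranged GET so only the requested bytes are downloaded:

```shell
# Placeholders: bucketname / file.txt. Guarded so the snippet is a
# no-op when no configured AWS CLI is available.
if command -v aws >/dev/null 2>&1 && aws sts get-caller-identity >/dev/null 2>&1; then
    # Option 1: stream the object to stdout; head closes the pipe
    # after 10 lines, so only a prefix of the object is transferred.
    aws s3 cp s3://bucketname/file.txt - | head

    # Option 2: ranged GET -- download only the first 1 KiB of the object.
    aws s3api get-object --bucket bucketname --key file.txt \
        --range bytes=0-1023 /dev/stdout
fi
```

Option 2 is the closer equivalent of a true partial read: the transfer itself is limited server-side, rather than being cut off client-side when head exits.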