I have to use a .sh script to unpack and prepare some databases. The code is the following:
#
# Downloads and unzips all required data for AlphaFold.
#
# Usage: bash download_all_data.sh /path/to/download/directory
set -e
DOWNLOAD_DIR="$1"
for f in $(ls ${DOWNLOAD_DIR}/*.tar.gz)
do
tar --extract --verbose --file="${DOWNLOAD_DIR}/${f}" /
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
rm "${f}"
BASENAME="$(basename {f%%.*})"
DB_NAME="${BASENAME}_db"
OLD_PWD=$(pwd)
cd "${DOWNLOAD_DIR}/mmseqs_dbs"
mmseqs tar2exprofiledb "${BASENAME}" "${DB_NAME}"
mmseqs createindex "${DB_NAME}" "${DOWNLOAD_DIR}/tmp/"
cd "${OLD_PWD}"
done
When I run the code, I get this error:
(openfold_venv) watson@watson:~/pedro/openfold$ sudo bash scripts/prep_mmseqs_dbs.sh data/
tar: data//data//colabfold_envdb_202108.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
I don't understand why the code repeats my "DOWNLOAD_DIR". The correct path should be:
data/colabfold_envdb_202108.tar.gz
and not
data//data//colabfold_envdb_202108.tar.gz
Could anyone help me?
New code:
set -e
DOWNLOAD_DIR="$1"
for f in ${DOWNLOAD_DIR}/*.tar.gz;
do
tar --extract --verbose --file="$f" /
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
rm "${f}"
BASENAME="$(basename {f%%.*})"
DB_NAME="${BASENAME}_db"
OLD_PWD=$(pwd)
cd "${DOWNLOAD_DIR}/mmseqs_dbs"
mmseqs tar2exprofiledb "${BASENAME}" "${DB_NAME}"
mmseqs createindex "${DB_NAME}" "${DOWNLOAD_DIR}/tmp/"
cd "${OLD_PWD}"
done
Answer:
To answer your first question: why is it repeating? Because you are repeating it in your code:
for f in ${DOWNLOAD_DIR}/*.tar.gz;
do
tar --extract --verbose --file="${DOWNLOAD_DIR}/$f"
If f is downloads/file.tar.gz, then ${DOWNLOAD_DIR}/${f} will resolve to downloads/downloads/file.tar.gz.
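The doubling can be reproduced in isolation. The following is a minimal sketch, not the original script: it uses a throwaway directory from mktemp and a hypothetical file name, example.tar.gz, in place of the real download directory. It shows that the loop variable already carries the directory prefix, so prepending it again yields a path that does not exist. In the question, the argument was "data/" with a trailing slash, so the pattern became data//*.tar.gz, which is where the double slashes in the error message come from.

```shell
set -e

# Hypothetical scratch directory standing in for the real download directory.
DOWNLOAD_DIR="$(mktemp -d)"
touch "${DOWNLOAD_DIR}/example.tar.gz"

for f in "${DOWNLOAD_DIR}"/*.tar.gz
do
    # f already holds the directory prefix, because the glob pattern included it.
    echo "f       = ${f}"
    # Prepending the directory a second time produces a path that does not exist.
    doubled="${DOWNLOAD_DIR}/${f}"
    echo "doubled = ${doubled}"
done

# Clean up the scratch directory.
rm -r "${DOWNLOAD_DIR}"
```

The same applies to the $(ls ${DOWNLOAD_DIR}/*.tar.gz) form in the original script: ls prints each match with the directory it was given, so the prefix is already there.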
As to your second question: the escape character that continues a command onto the next line is the backslash \, not the forward slash /. Your multiline command should look like this:
tar --extract --verbose --file="${f}" \
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
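Putting both points together, the extraction step of the loop could look roughly like the sketch below. It is self-contained for illustration only: the scratch directory, the archive name example_db.tar.gz, and the FILE_NAME and EXTRACTED variables are hypothetical, and the mmseqs calls are left out because they need the real databases. The basename handling assumes that {f%%.*} in the question was meant to be a parameter expansion; without a leading $, the shell passes that text to basename literally.

```shell
set -e

# Hypothetical scratch layout standing in for the real download directory.
DOWNLOAD_DIR="$(mktemp -d)"
mkdir "${DOWNLOAD_DIR}/src" "${DOWNLOAD_DIR}/mmseqs_dbs"
echo "placeholder" > "${DOWNLOAD_DIR}/src/payload.txt"
tar --create --gzip --file="${DOWNLOAD_DIR}/example_db.tar.gz" \
    --directory="${DOWNLOAD_DIR}/src" payload.txt

for f in "${DOWNLOAD_DIR}"/*.tar.gz
do
    # Use f as-is: the glob already produced the full path.
    # The trailing backslash continues the command on the next line.
    tar --extract --verbose --file="${f}" \
        --directory="${DOWNLOAD_DIR}/mmseqs_dbs"

    # Strip the directory first, then everything from the first dot onward.
    # Assumption: this is what {f%%.*} in the question was meant to do.
    FILE_NAME="$(basename "${f}")"
    BASENAME="${FILE_NAME%%.*}"
    echo "BASENAME=${BASENAME}"
done

# Record what was extracted, then clean up the scratch directory.
EXTRACTED="$(ls "${DOWNLOAD_DIR}/mmseqs_dbs")"
echo "extracted: ${EXTRACTED}"
rm -r "${DOWNLOAD_DIR}"
```

Taking the basename before stripping the suffix avoids cutting the path short when a directory name itself contains a dot.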