I have a dataset of MRI images, contained within directories (one for each subject; over 200 subjects) and multiple subdirectories (for structural, functional, and RS data, each containing more subdirectories), which contain the files from the scans. Some of these directories and files contain subject numbers (e.g. sub-001 > structural > sub-001_T1_structural > sub-001_T1_structural.nii.gz).
As this pre-existing data was taken from 3 separate studies, there is overlap in some of the participant numbers. What I need is to rename all of the folders and files containing subject numbers into a numerical sequence starting at 501 (to avoid overlap with the original datasets). So my desired output for sub-001 would be sub-501 > structural > sub-501_T1_structural > sub-501_T1_structural.nii.gz. I cannot simply add 500 to each subject number, as some subjects have been excluded, so there are gaps in the sequence of numbers (e.g. it jumps from sub-011 to sub-013).
Currently I have tried this short script, which works for each individual subject, but I can't figure out a way to apply it to all subjects at once:
find . -depth -name "*sub-001*" | \
while IFS= read -r ent; do mv $ent ${ent%sub-001*}sub-501${ent##*sub-001}; done
Apologies for any mistakes, misunderstandings, or basic errors -- I'm completely new to scripting so am having to figure it out myself to complete my analysis.
Edit: code changed to most recent edit
Edit: added context for clarity
I attempted this command to apply it to each directory (plus subdirectories and files) containing the original subject number, with the correct renamed subject number (e.g. sub-001 > sub-501, sub-002 > sub-502, sub-003 > sub-503, etc.):
for a in 001 002 003 004 005 006 007 008 009 010 011 013 014 015
016 017 018 019 020 021 022 023 024 026 027 028 029 030 031 032 034
036 037 038 039 045 046 047 049 050 051 052 053 054 055 056 057 058
059 060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075
076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092
093 094 095 096 097 098 099 100 101 102 104 105 106 107 109 110 111
112 113 114 115 116 117 118 119 120
for b in 501 502 503 504 505 506 507 508 509 510 511 512 513 514
515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531
532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548
549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565
566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582
583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599
600 601 602 603 604 605 606 607 608
find . -depth -name "*sub-${a}*" | \
while IFS= read -r ent; do mv $ent ${ent%sub-${a}*}sub-${b}${ent##*sub-${a}}; done
But I'm met with the error: line 7: syntax error near unexpected token 'for'. Is there a way to apply the command to each subject in order, and ensure that the renamed folders and files follow sequentially from 501 upwards, as attempted above?
Final edit: Thanks to @tripleee for their answer -- a few changes and it seems to work. What I used can be found below:
i=500
for a in 001 002 003 004 005 006 007 008 009 010 011 013 014 015 016 \
  017 018 019 020 021 022 023 024 026 027 028 029 030 031 032 034 036 \
  037 038 039 045 046 047 049 050 051 052 053 054 055 056 057 058 059 \
  060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 \
  077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 \
  094 095 096 097 098 099 100 101 102 104 105 106 107 109 110 111 112 \
  113 114 115 116 117 118 119 120 ; do
  ((i++))
  b=$i
  find . -depth -name "*sub-${a}*" | \
  while IFS= read -r ent; do mv $ent ${ent%sub-${a}*}sub-$b${ent##*sub-${a}};
  done
done
CodePudding user response:
You need a do after the for list but before the loop body, but you also probably want just one loop, not two nested loops. Something like this?
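i=500
# Each $a is a directory name such as sub-001 (prefix included)
for a in sub-[0-9][0-9][0-9]; do
    # Advance the counter first, so the first subject becomes sub-501
    ((i++))
    b=sub-$i
    # -depth lists a directory's contents before the directory itself
    find . -depth -name "*${a}*" |
    while IFS= read -r ent; do
        # Replace the last occurrence of $a in the path with $b
        dest="${ent%${a}*}${b}${ent##*${a}}"
        mkdir -p "${dest%/*}"
        mv "$ent" "$dest"
    done
done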
The (( ... )) expression is a mathematical integer expression evaluation; we simply add one to the previous value of i, and create the new value of b by prepending sub- to the result.
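To make the counter and the path rewriting concrete, here is a minimal sketch that runs both on a single path without touching any files. The sample path is hypothetical (taken from the layout described in the question), and i=$((i + 1)) is used as a portable stand-in for the Bash-only ((i++)).

```shell
# Hypothetical sample path, following the layout described in the question.
ent="./sub-001/structural/sub-001_T1_structural/sub-001_T1_structural.nii.gz"
a=sub-001

i=500
i=$((i + 1))            # portable equivalent of ((i++)); i is now 501
b=sub-$i

prefix=${ent%"$a"*}     # strip the shortest trailing match of "sub-001*"
suffix=${ent##*"$a"}    # strip the longest leading match of "*sub-001"
dest=$prefix$b$suffix

printf '%s\n' "$dest"
# prints ./sub-001/structural/sub-001_T1_structural/sub-501_T1_structural.nii.gz
```

Note that only the last occurrence of the subject number in the path is replaced. The parent directories still carry the old number at this point, which is why the order in which find lists the entries (contents before their directory, via -depth) matters.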
This loops over the sub-[0-9][0-9][0-9] values in the current directory, which as per your specification should get what you want without explicitly enumerating all of them.
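Before renaming anything, it may be worth printing the mapping the loop would produce. The following is a rough dry-run sketch; the throwaway directory and the four subject folders in it are made up for illustration, with gaps to mimic the excluded subjects.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir sub-001 sub-002 sub-004 sub-013   # hypothetical subjects, with gaps

i=500
mapping=""
for a in sub-[0-9][0-9][0-9]; do
    [ -d "$a" ] || continue     # skip the literal pattern if nothing matched
    i=$((i + 1))                # portable equivalent of ((i++))
    mapping="$mapping$a -> sub-$i
"
done

printf '%s' "$mapping"
```

The shell expands the glob in sorted order, so the gaps in the old numbering disappear and the new numbers run consecutively from 501.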
This takes care of creating the destination directory if it does not exist, but will leave the empty source directories behind. If you want to rename directories, you need a slightly more sophisticated approach (first do the files within a directory, then rename the directories moving upwards in the directory tree) but perhaps for the time being this is simple and straightforward enough. If you need to do this repeatedly, automating it in full would probably make sense.
The while read obviously will not work for file names with newlines in them. A somewhat more elegant and robust solution would use find ... -exec sh -c '...' _ {} but that's perhaps for a time when you are more familiar with the basics.
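For reference, one way the find ... -exec sh -c variant could look is sketched below for a single subject. This is an illustration under assumptions, not a tested drop-in for the real dataset: the temporary directory and sample file are created only for the demonstration, and the old and new names are passed to the inner shell as ordinary arguments.

```shell
# Build a tiny throwaway tree that mimics one subject from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p sub-001/structural/sub-001_T1_structural
touch sub-001/structural/sub-001_T1_structural/sub-001_T1_structural.nii.gz

a=sub-001
b=sub-501

# find hands the matching paths straight to the inner shell as arguments,
# so no line-based reading is involved and odd file names survive intact.
find . -depth -name "*$a*" -exec sh -c '
    old=$1 new=$2
    shift 2
    for ent do
        pre=${ent%"$old"*}      # everything before the last occurrence
        post=${ent##*"$old"}    # everything after the last occurrence
        mv "$ent" "$pre$new$post"
    done
' _ "$a" "$b" {} +
```

Because -depth lists a directory's contents before the directory itself, every path is still valid at the moment it is renamed. Wrapping the find command in the same for loop and counter as above would extend it to all subjects.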