How to remove leading whitespace denoted with ? using rename in MacOS

Time:08-22

I have directories that look like this on macOS:


For example, ???9-24_v_hMrgprx2, where ??? is actually whitespace. I want to use rename to remove that leading whitespace.

I tried this but failed.

rename  "s/\s*//g"  *

What's the right way to do it?


Update

Hexdump looks like this:

$ ls | hexdump -C
00000000  e3 80 80 39 2d 32 34 5f  76 5f 68 4d 72 67 70 72  |...9-24_v_hMrgpr|
00000010  78 32 0a                                          |x2.|
00000013

CodePudding user response:

Verify what those characters are first, since macOS doesn't display ASCII whitespace characters in filenames as ? (unless you have some weird encoding issue going on). It would help if you added information like this to your question:

$ touch "   touched"
$ ls -l *touched
-rw-rw-r--@ 1 brian  staff  0 Aug 18 13:52    touched
$ ls *touched | hexdump -C
00000000  20 20 20 74 6f 75 63 68  65 64 0a                 |   touched.|
0000000b

For rename, you've almost got it right if those leading characters were whitespace. However, you want to anchor the pattern so you only match whitespace at the beginning of the name:

rename 's/\A\s+//' *

Now that we know your filenames start with U+3000 (which is whitespace), I can see what's going on.

There are various versions of rename. Larry Wall wrote one, @tchrist wrote one based on that (and I use that), and File::Rename is another modification of Larry's original. Then there is Aristotle's version.

The problem with my rename (from @tchrist) is that it doesn't interpret the filenames as UTF-8. So U+3000 looks like the three bytes you see: e3 80 80. I'm guessing that your font might not support any of those. There could be all sorts of things going on. See @tchrist's Unicode answer.

I can create the file:

% perl -CS -le 'print qq(\x{3000}abc)' | xargs touch

I can easily see the file, but I have a font that can display that character:

$ ls -l *abc
-rw-rw-r--  1 brian  staff  0 Aug 22 02:44  abc

But, when I try to rename it, using the -n for a dry run, I get no output (so, no matching files to change):

$ rename -n 's/\A\s+//' *abc

If I run perl directly and give it -CSA to treat the standard file handles (-CS) and the command-line arguments (-CA) as UTF-8, the file matches and the replacement happens:

$ perl -CSA `which rename` -n 's/\A\s+//' *abc
rename  abc abc

So, for my particular version, I can edit the shebang line to have the options I need. This works for me because I know my terminal settings, so it might not work everywhere for all settings:

#!/usr/bin/env perl -CSA

But the real question is how I got that version of rename. I'm pretty sure I installed something from CPAN that gave it to me, but what? I'd supply a patch if I could.

CodePudding user response:

E3 80 80 is the UTF-8 encoding of U+3000 (IDEOGRAPHIC SPACE), which is a CJK whitespace character.
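The arithmetic behind that, as a small sketch: a lead byte of the form 1110xxxx starts a three-byte sequence, and the payload bits of the three bytes are concatenated to give the code point. The variable names are illustrative.

```shell
# Payload bits: low 4 bits of the lead byte, low 6 bits of each continuation byte.
b1=$(( 0xe3 & 0x0f ))
b2=$(( 0x80 & 0x3f ))
b3=$(( 0x80 & 0x3f ))

# Concatenate 4 + 6 + 6 bits and show the result in U+XXXX notation.
codepoint=$(printf 'U+%04X' $(( (b1 << 12) | (b2 << 6) | b3 )))
echo "$codepoint"
```

This prints U+3000, matching the three bytes at the start of the hexdump.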

rename is not a standard utility on macOS, and there are several popular utilities with this name, so what exactly works will depend on which version you have installed. The syntax suggests you have the one from the Perl distribution. Maybe try

rename 's/\xe3\x80\x80//' *
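If it is unclear which rename you have, a plain shell loop avoids the question. This is only a sketch: the function name strip_leading_u3000 is made up, it handles just U+3000 rather than whitespace in general, and it works on the current directory. Try it on a copy of your data first.

```shell
# Hypothetical helper: strip every leading U+3000 from names in the current directory.
strip_leading_u3000() {
    ideo=$(printf '\343\200\200')          # U+3000 as its three UTF-8 bytes
    for old in "$ideo"*; do
        [ -e "$old" ] || continue          # the glob matched nothing
        new=$old
        while :; do                        # peel off one U+3000 per pass
            rest=${new#"$ideo"}
            [ "$rest" != "$new" ] || break
            new=$rest
        done
        [ -n "$new" ] || continue          # name consisted only of U+3000
        if [ -e "$new" ]; then
            echo "skip: $new already exists" >&2
        else
            mv -- "$old" "$new"
        fi
    done
}
```

Run it as strip_leading_u3000 from inside the directory that holds the badly named entries. It refuses to overwrite an existing name instead of guessing what you want.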