Ruby IO redirection to a file by using Kernel.puts: how can I ensure that the stored file's enco

Time:01-04

On Windows, in the Visual Studio Code IDE, I ran this command:

ruby -E UTF-8 -e "puts '汤姆,Welcome to my home'" > test.txt

Then I ran this command:

ruby -E UTF-8 -e "puts gets" < test.txt

But the text came back garbled when reading it, like this:

��ķ,Welcome to my home.


Finally, I found that the encoding of the "test.txt" file is Unicode.

If I insist on using IO redirection to a file together with Kernel.puts, how can I ensure that the stored file is encoded as UTF-8?

What should I do to ensure that the file encoding after redirection is UTF-8? Please help me.

CodePudding user response:

To ensure that the file encoding type after redirection is UTF-8, you can use the force_encoding method to set the encoding of the output string to UTF-8 before writing it to the file. For example:

ruby -e "puts 'Welcome to my home'.force_encoding('UTF-8')" > test.txt

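This labels the output string as UTF-8 before it is written to the file. Note that force_encoding only changes the string's encoding label; it does not convert the string's bytes.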
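To make the difference between relabelling and converting concrete, here is a minimal sketch. The variable names are illustrative, and UTF-16LE is used only as an example target encoding; the two characters are written with \u escapes so that the result does not depend on the console code page.

```ruby
# "\u" escapes always produce a UTF-8 string, whatever the console code page is.
name = "\u6C64\u59C6"                              # the two Chinese characters from the question

relabelled = name.dup.force_encoding("UTF-16LE")   # same bytes, different label
transcoded = name.encode("UTF-16LE")               # new bytes, same characters

puts name.bytes.inspect         # bytes of the UTF-8 string
puts relabelled.bytes.inspect   # identical bytes: force_encoding converted nothing
puts transcoded.bytes.inspect   # different bytes: encode really converted the text
puts transcoded.encode("UTF-8") == name
```

In other words, force_encoding is only useful when the bytes are already in the target encoding and merely carry the wrong label; encode is the method that actually converts.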

You can also set Ruby's default external encoding using the -E option. For example:

ruby -E UTF-8 -e "puts 'Welcome to my home'" > test.txt

This sets Ruby's default external encoding (Encoding.default_external) to UTF-8.
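As a rough illustration of what that default does, the sketch below sets Encoding.default_external in code (standing in for -E UTF-8) and uses IO.pipe as a stand-in for a redirected stream. It assumes no default internal encoding is configured; the variable names and the two sample bytes are illustrative only.

```ruby
# Set in code here only so the sketch is self-contained; -E UTF-8 sets the same default.
Encoding.default_external = Encoding::UTF_8

reader, writer = IO.pipe        # stand-in for a redirected stdin
writer.binmode
writer.write("\xCC\xC0".b)      # two bytes that are not valid UTF-8
writer.close

data = reader.read
reader.close

puts data.encoding              # the string is labelled with the default external encoding
puts data.valid_encoding?       # the label does not make the bytes valid UTF-8
puts data.bytes.inspect         # the bytes arrive unchanged
```

So the option controls how Ruby labels data coming in; it does not by itself guarantee what bytes the shell stores in the file.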

To read the file as UTF-8, you can use the force_encoding method again to label the string returned by gets as UTF-8. For example:

ruby -e "puts gets.force_encoding('UTF-8')" < test.txt

Alternatively, you can specify the encoding of the input file explicitly using the -E option again. For example:

ruby -E UTF-8 -e "puts gets" < test.txt

This will tell Ruby to treat the input as UTF-8.
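If the redirected file is not actually UTF-8, labelling the input as UTF-8 will not repair it. The question reports the file as "Unicode", which Windows tools commonly use to mean UTF-16LE with a byte order mark; that reading is an assumption here. Under that assumption, a minimal sketch of decoding such bytes could look like this. decode_redirected is a hypothetical helper, not a Ruby API, and it handles only UTF-16LE with a BOM or plain UTF-8.

```ruby
# Hypothetical helper: turn raw file bytes into a UTF-8 string.
# Handles only two cases: UTF-16LE with a BOM, or UTF-8 (with or without a BOM).
def decode_redirected(raw)
  bytes = raw.dup.force_encoding(Encoding::BINARY)
  if bytes.start_with?("\xFF\xFE".b)
    # Drop the 2-byte BOM, then convert UTF-16LE to UTF-8.
    body = bytes.byteslice(2, bytes.bytesize - 2)
    body.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  else
    # Already UTF-8: strip an optional UTF-8 BOM and fix the label.
    bytes.delete_prefix("\xEF\xBB\xBF".b).force_encoding(Encoding::UTF_8)
  end
end

greeting = "\u6C64\u59C6,Welcome to my home\n"

# Simulated content of a file stored as UTF-16LE with a BOM.
utf16_file_bytes = "\xFF\xFE".b + greeting.encode("UTF-16LE").b

decoded = decode_redirected(utf16_file_bytes)
puts decoded.encoding
puts decoded == greeting
```

With a real file, the raw bytes could come from File.binread("test.txt"), which returns the content without any encoding conversion.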

I hope this helps! :)

CodePudding user response:

Your screenshot implies that you're using cmd.exe, the legacy Windows shell.

By default, cmd.exe uses the system's active legacy OEM code page, as reported when you run chcp. This is typically a fixed 8-bit, single-byte character encoding limited to 256 characters, e.g. Code Page 437 on US-English systems.

If you want cmd.exe to use UTF-8 encoding instead, run chcp 65001 first.
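A code-page mismatch of this kind can be reproduced in Ruby itself. The sketch below assumes, purely for illustration, that the console was on code page 936 (GBK); the question does not state which code page was active. It shows that GBK bytes are not valid UTF-8 when read back with a UTF-8 label, and that decoding them as GBK recovers the text.

```ruby
name = "\u6C64\u59C6"                 # the two Chinese characters, as a UTF-8 string

# Assumption: bytes as a GBK (code page 936) console would supply them.
gbk_bytes = name.encode("GBK").b

# Reading those bytes back with a UTF-8 label does not make them UTF-8.
mislabelled = gbk_bytes.dup.force_encoding("UTF-8")
puts mislabelled.valid_encoding?      # invalid sequences are shown as replacement marks

# Decoding with the encoding the bytes are really in recovers the text.
repaired = gbk_bytes.dup.force_encoding("GBK").encode("UTF-8")
puts repaired == name
```

Switching the console to UTF-8 with chcp 65001 avoids the mismatch at the source, so no repair step is needed afterwards.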


For character-encoding considerations with respect to PowerShell, cmd.exe's modern successor, see this answer to your related later question.
