I have this string in a code.txt file.
"class Solution {\u000Apublic:\u000A vector\u003Cvector\u003Cint\u003E\u003E insert(vector\u003Cvector\u003Cint\u003E\u003E\u0026 intervals, vector\u003Cint\u003E\u0026 newInterval) {\u000A int len \u003D intervals.size()\u003B\u000A int index \u003D 0\u003B\u000A vector\u003Cvector\u003Cint\u003E \u003E ans\u003B\u000A \u000A\u000A while(index \u003C len \u0026\u0026 intervals[index][1] \u003C newInterval[0]) ans.push_back(intervals[index++])\u003B\u000A \u000A while(index \u003C len \u0026\u0026 intervals[index][0] \u003C\u003D newInterval[1]) {\u000A newInterval[0] \u003D min(intervals[index][0], newInterval[0])\u003B\u000A newInterval[1] \u003D max(intervals[index][1], newInterval[1])\u003B\u000A index++\u003B\u000A }\u000A \u000A ans.push_back(newInterval)\u003B\u000A \u000A while(index \u003C len) ans.push_back(intervals[index++])\u003B\u000A\u000A return ans\u003B \u000A }\u000A}\u003B "
I would like to convert this string to C++ syntax and write it to a solution.cpp file.
The content of solution.cpp should look like this:
class Solution {
public:
    vector<vector<int>> insert(vector<vector<int>>& intervals, vector<int>& newInterval) {
        int len = intervals.size();
        int index = 0;
        vector<vector<int> > ans;
        while(index < len && intervals[index][1] < newInterval[0]) ans.push_back(intervals[index++]);
        while(index < len && intervals[index][0] <= newInterval[1]) {
            newInterval[0] = min(intervals[index][0], newInterval[0]);
            newInterval[1] = max(intervals[index][1], newInterval[1]);
            index++;
        }
        ans.push_back(newInterval);
        while(index < len) ans.push_back(intervals[index++]);
        return ans;
    }
};
I have tried forcing the encoding to UTF-8, but the string stays the same.
code = File.read('code.txt')
code = code.encode('UTF-8')
file = File.open('solution.cpp', "w:UTF-8")
file.write(code)
How can I do this? Thank you.
Answer:
I reproduced your problem and got the same result with your code.
I noticed that \u003B, for example, is the Unicode escape for the semicolon character. So I scanned the string for every "U+"-style escape with the regex /\\u(.{4})/, which captures the four characters after \u, that is, the hexadecimal digits of the code point. Then I used gsub! and Array#pack to convert each escape and substitute the character it encodes.
[$1.to_i(16)].pack('U') # => "\n", "\n", "<", "&", "\n", "=" ...etc.
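To see what that expression does for a single escape, here is a minimal sketch. The helper name decode_escape is hypothetical and only used for illustration; it takes the four hex digits that the regex captures.

```ruby
# Hypothetical helper (not part of the original answer): converts the four
# hex digits of a \uXXXX escape into the character they encode.
def decode_escape(hex)
  code_point = hex.to_i(16)   # "003B" -> 59
  [code_point].pack('U')      # 59 -> ";" (packed as a UTF-8 character)
end

puts decode_escape('003B')  # prints ;
p decode_escape('000A')     # prints "\n"
```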
Finally, I wrote the result to a file. My final approach looks like this:
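code = File.read('code.txt')
# Replace every \uXXXX escape with the character it encodes.
code.gsub!(/\\u(.{4})/) do
  [$1.to_i(16)].pack('U')
end
# Use gsub here, not gsub!: gsub! returns nil when nothing is replaced.
File.open('solution.cpp', 'w') { |f| f.puts code.gsub(/\A"|"\Z/, '') }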
Also note that I used gsub again at the end to find the leading and trailing quotes and replace them with an empty string when writing to the file.
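As an alternative to the regex approach above, the quoted text looks like a JSON string literal, so it may be possible to let a JSON parser do the decoding. This is only a sketch under that assumption: it expects the file to contain exactly one valid JSON string, and it relies on JSON.parse accepting a bare string at the top level, which the json library bundled with recent Ruby versions does. The helper name decode_json_string is hypothetical.

```ruby
require 'json'

# Hypothetical helper: treats the input as a JSON string literal and lets the
# parser resolve \uXXXX and other JSON escapes, dropping the surrounding quotes.
def decode_json_string(raw)
  JSON.parse(raw.strip)
end

# Single-quoted Ruby strings keep \u003D as literal backslash-u text,
# which mimics what File.read('code.txt') returns.
sample = '"int len \u003D intervals.size()\u003B\u000Areturn ans\u003B"'
puts decode_json_string(sample)
```

With this variant the quotes do not need to be stripped separately, because the parser returns only the string's content. If the file is not valid JSON, JSON.parse raises JSON::ParserError, in which case the regex approach is the safer choice.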