Remove quotation mark from .txt

Time:12-15

I have a .txt file with rows like these:

"Hello I'm in Tensorflow"
"My name is foo"
'Mr "alias" is running'
...

As can be seen, there is just one string per row. When I try to create a tf.data.Dataset, the output looks like this:

import tensorflow as tf

conver = tf.data.TextLineDataset('path_to.txt')
for utter in conver:
    print(utter)
    break
# tf.Tensor(b'"Hello I'm in Tensorflow"', shape=(), dtype=string)

Notice that the quotation mark " is still present at the beginning and end of the string (in addition to the ' that delimits the bytes literal when the tensor is printed). My desired output would be:

# tf.Tensor(b'Hello I'm in Tensorflow', shape=(), dtype=string)

That is, without the quotation marks. Thank you in advance.

CodePudding user response:

You could use tf.strings.regex_replace:

import tensorflow as tf
conver = tf.data.TextLineDataset('/content/text.txt')

def remove_quotes(text):
  text = tf.strings.regex_replace(text, '\"', '')
  text = tf.strings.regex_replace(text, '\'', '')
  return text

conver = conver.map(remove_quotes)
for s in conver:
  print(s)
# tf.Tensor(b'Hello Im in Tensorflow', shape=(), dtype=string)
# tf.Tensor(b'My name is foo', shape=(), dtype=string)
# tf.Tensor(b'Mr alias is running', shape=(), dtype=string)

Or, if you only want to remove the leading and trailing quotes, use this single replacement inside remove_quotes instead:

text = tf.strings.regex_replace(text, '^[\"\']*|[\"\']*$', '')
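
To see what that anchored pattern does without building a dataset, here is a minimal sketch that applies the same pattern with Python's re module to the three sample rows from the question. Using re as a stand-in is an assumption: tf.strings.regex_replace uses RE2 syntax, but this pattern only relies on anchors, a character class, and alternation, which behave the same way in both.

```python
import re

# The pattern from the answer: a run of quotes at the start, or a run at the end.
EDGE_QUOTES = r'''^["']*|["']*$'''

# The sample rows from the question, as plain Python strings.
rows = [
    '"Hello I\'m in Tensorflow"',
    '"My name is foo"',
    '\'Mr "alias" is running\'',
]

# Replacing every match with '' strips only the outer quotes;
# the apostrophe in "I'm" and the inner "alias" quotes are kept.
stripped = [re.sub(EDGE_QUOTES, '', row) for row in rows]

for line in stripped:
    print(line)
```

Unlike the two-step remove_quotes version, this keeps the apostrophe in I'm and the quotes around alias.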

CodePudding user response:

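The eval() function should do it, but only after the tensor is converted to a Python string, because eval() does not accept a tf.Tensor. ast.literal_eval is the safer variant, since it only evaluates literals; it requires every row to be a valid Python string literal.

import ast

for utter in conver:
    print(ast.literal_eval(utter.numpy().decode('utf-8')))
    break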

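or you can simply use replace on the decoded string (a tf.Tensor has no replace method). Note that this removes every double quote, not only the outer ones:

for utter in conver:
    print(utter.numpy().decode('utf-8').replace('"', ''))
    break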

CodePudding user response:

If you want to preserve quotation marks that are not at the start or the end of the string:

for utter in conver:
    s = utter.numpy().decode('utf-8')  # convert the scalar tensor to a Python str
    # drop a double quote only when it is the first or last character
    print(''.join(c for i, c in enumerate(s) if not (c == '"' and i in (0, len(s) - 1))))
    break
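
The snippet above only handles double quotes, while one of the sample rows is wrapped in single quotes. A small sketch of a helper that strips one matching pair of outer quotes of either kind might look like the following. The name strip_outer_quotes is hypothetical, and it works on an already decoded Python string (for example the result of utter.numpy().decode('utf-8')), not on a tensor.

```python
def strip_outer_quotes(line: str) -> str:
    """Remove one pair of matching outer quotes; leave everything else alone.

    Hypothetical helper for illustration: expects a plain Python str.
    """
    if len(line) >= 2 and line[0] == line[-1] and line[0] in ('"', "'"):
        return line[1:-1]
    return line


# The sample rows from the question, as plain Python strings.
samples = [
    '"Hello I\'m in Tensorflow"',
    '"My name is foo"',
    '\'Mr "alias" is running\'',
]

cleaned = [strip_outer_quotes(s) for s in samples]
for line in cleaned:
    print(line)
```

Rows that are not wrapped in a matching pair, such as a line with a quote on one side only, are returned unchanged.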