How can I get rid of elements containing exact symbols in the list? (python)

Time:03-08

I have a list called "list", which consists of about 20K strings, and I need to remove the elements that contain "text": "" from it.

I'm creating a new clean list like this

clean_list = []

for i in list:
  if '"text": ""' in i == False:
    clean_list.append(i)
  print(i)

But no elements are appended and clean_list stays empty. What could the problem be? Something is wrong with the loop.

How else can I get rid of some elements in the list?

CodePudding user response:

The reason this doesn't work is that in and == are both comparison operators, so Python chains them instead of grouping them the way you expect:

>>> '"text": ""' in "foo" == False
False
>>> ('"text": ""' in "foo") == False
True

Using in ... == False is awkward/un-Pythonic in any case; it's better to do the more natural not in ...:

>>> '"text": ""' not in "foo"
True
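
Applied to the loop from the question, the fix might look like the sketch below. The name `records` and the three sample strings are made up for illustration; they stand in for the asker's 20K strings.

```python
# Hypothetical sample data standing in for the asker's list of strings.
records = [
    '{"id": 1, "text": ""}',
    '{"id": 2, "text": "hello"}',
    '{"id": 3, "text": ""}',
]

clean_list = []
for item in records:
    # "not in" is a single operator, so no comparison chaining happens here.
    if '"text": ""' not in item:
        clean_list.append(item)

print(clean_list)  # only the record whose text is non-empty remains
```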

CodePudding user response:

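First, you should not name variables after built-in names like list. list is not a reserved keyword, so Python allows it, but the variable then shadows the built-in list type.

For your use case you could use a list comprehension: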

clean_list = [string for string in list if '"text": ""' not in string]
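
To illustrate both points, here is a small sketch. The function `shadowing_demo`, the name `strings`, and the sample data are assumptions for illustration only. The shadowing is kept inside a function so that the built-in is not hidden for the rest of the script.

```python
def shadowing_demo():
    """Show what happens when a variable named `list` hides the built-in."""
    list = ["a", "b"]  # local name shadows the built-in type in this function
    try:
        list("xyz")  # tries to call a list object, not the list constructor
    except TypeError as exc:
        return str(exc)
    return None


# The same comprehension with a non-shadowing name for the input list.
strings = ['{"text": ""}', '{"text": "ok"}']
clean_list = [s for s in strings if '"text": ""' not in s]

print(shadowing_demo())
print(clean_list)
```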

CodePudding user response:

if '"text": ""' in i == False:

Don't use that syntax. The == False comparison is unnecessary (and looks awkward), and in this specific case, it actually causes the problem you're having.

Use this syntax instead:

if '"text": ""' not in i:

If you want to know why this happens, keep reading.

This problem is due to operator chaining.

When you have an expression that contains two (or more) comparison operators, such as this:

a < b < c

Python treats that expression as if you had typed this (except that b is evaluated only once):

a < b and b < c

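In your example, in and == are both comparison operators, so Python treated your expression as though you had typed this:

if '"text": ""' in i and i == False:

The first part is true only for strings that contain the pattern, and the second part is never true, because i is a string and a string never equals False. So the expression as a whole is always false, and nothing is ever appended.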
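
The chaining described above can be checked directly. This is a minimal sketch; the names `needle` and `results` and the two sample strings are assumptions for illustration.

```python
needle = '"text": ""'
results = []

for i in ['{"text": ""}', '{"text": "hi"}']:
    chained = needle in i == False              # what the question wrote
    expanded = (needle in i) and (i == False)   # how Python reads the chain
    grouped = (needle in i) == False            # what was probably intended
    negated = needle not in i                   # the idiomatic spelling
    results.append((chained, expanded, grouped, negated))

for row in results:
    print(row)
```

The chained and expanded forms agree with each other and are False for both strings, while the grouped and negated forms agree with each other and are True only for the string that lacks the pattern.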

CodePudding user response:

You shouldn't use built-in names as variable names in Python.

s = 'This is a sample "text": "" and it should not have it.'
if r'"text": ""' in s:
    print("Found.")

The output will be `Found.`

Now, with the help of this you can use:

clean_list = [i for i in list if r'"text": ""' not in i]
# This builds a new list from every item `i` that does not contain the pattern '"text": ""'. The r prefix marks a raw string, which only changes how backslashes are treated; this pattern contains none, so the prefix is optional here.
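
As a quick check of what the r prefix does, the sketch below compares raw and plain literals. All variable names are made up for illustration.

```python
# Without backslashes, a raw literal and a plain literal are the same string.
plain_pattern = '"text": ""'
raw_pattern = r'"text": ""'
same_pattern = plain_pattern == raw_pattern

# With a backslash, they differ: '\n' is one newline character,
# while r'\n' is two characters, a backslash followed by the letter n.
plain_newline = '\n'
raw_newline = r'\n'

print(same_pattern)
print(len(plain_newline), len(raw_newline))
```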