Here is a toy example. I have a string like this:
import numpy as np
z = str([np.nan, "ab", "abc"])
Printed, it looks like "[nan, 'ab', 'abc']", and that string is what I have to process.
From z I want to get the list of strings, excluding nan:
zz = ["ab", "abc"]
To be clear: z is the input (a string that looks like a list) and zz is the desired output (a list).
There is no problem if z doesn't contain nan; in that case ast.literal_eval(z) does the job, but with nan I get an error about a malformed node or string.
Note: np.nan doesn't have to be first.
CodePudding user response:
ast.literal_eval is suggested over eval exactly because it allows only a very limited set of expressions. As stated in the docs: "Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None and Ellipsis." A bare nan is none of those, so it cannot be evaluated.
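To make that concrete, here is a minimal sketch of the failure, assuming z has exactly the text shown in the question. The string is written out by hand so the snippet does not need numpy.

```python
import ast

# Same text that str([np.nan, "ab", "abc"]) produces.
z = "[nan, 'ab', 'abc']"

try:
    ast.literal_eval(z)
    failed = False
except ValueError as exc:
    # The bare name nan is not a literal, so literal_eval refuses it.
    failed = True
    print(f"literal_eval rejected the input: {exc}")

# Without the bare nan the same call succeeds.
ok = ast.literal_eval("['ab', 'abc']")
print(ok)
```

The quoted strings are fine; only the unquoted name nan trips the parser.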
There are a few ways to handle this.
- Remove nan by operating on the string before evaluating it. This might be problematic if you also want to avoid removing nan from inside the actual strings.
- NOT ADVISED - SECURITY RISKS - standard eval can handle this if you define a nan variable in the namespace.
- And finally, what I think is the best choice but also the hardest to implement: as explained here, you take the source code for ast, subclass it and reimplement literal_eval in such a way that it knows how to handle the nan name on its own.
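The last option can be approximated without copying the ast source. The following is only a sketch under my own assumptions: the names NanToConstant and parse_list_with_nan are made up for illustration, and the idea is to rewrite the bare nan name into a float constant in the parsed tree, then hand the tree to the ordinary ast.literal_eval.

```python
import ast


class NanToConstant(ast.NodeTransformer):
    """Hypothetical helper: turn the bare name nan into a float constant."""

    def visit_Name(self, node):
        if node.id == "nan":
            return ast.copy_location(ast.Constant(value=float("nan")), node)
        return node  # any other name is left alone and will be rejected later


def parse_list_with_nan(text):
    """Hypothetical helper: parse a list-like string and keep only the strings."""
    tree = ast.parse(text, mode="eval")
    tree = NanToConstant().visit(tree)
    values = ast.literal_eval(tree)  # still refuses calls, other names, etc.
    return [v for v in values if isinstance(v, str)]


z = "[nan, 'ab', nan, 'x nan y', 'abc', nan]"
zz = parse_list_with_nan(z)
print(zz)
```

Because the replacement happens on the syntax tree rather than on the text, a nan inside a quoted string is untouched, and the position of nan in the list does not matter.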
CodePudding user response:
As I understand it, your goal is to parse CSV or something similar.
If you want a trade-off solution that should work in most cases, you can use a regex to get rid of the "nan". It will alter strings that contain the substring "nan," (with a comma), but this seems to be a reasonably unlikely edge case. Note that the pattern only removes a nan that is followed by a comma, so a nan in the last position is left in place. Worth exploring with your real data.
import re
from ast import literal_eval
import numpy as np
z = str([np.nan, "ab", np.nan, "nan,", "abc", "x nan , y", "x nan y"])
literal_eval(re.sub(r'\bnan\s*,\s*', '', z))
# output: ['ab', '', 'abc', 'x y', 'x nan y']
CodePudding user response:
What about:
eval(z,{'nan':'nan'}) # if you can tolerate then:
[i for i in eval(z,{'nan':'nan'}) if i != 'nan']
Note that eval has security implications if z can come from an untrusted source.
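A possible variant of the same idea, as a sketch only: map nan to a real float instead of the string 'nan', so that a genuine 'nan' string in the data is not thrown away, and filter by type. Emptying __builtins__ is my own addition here; it narrows what the evaluated text can reach but does not make eval safe for untrusted input.

```python
z = "[nan, 'ab', 'nan', 'abc']"

# Caution: eval runs arbitrary expressions. An empty __builtins__ reduces the
# exposed surface, but this is still not safe for untrusted input.
namespace = {"__builtins__": {}, "nan": float("nan")}
values = eval(z, namespace)

# Keep only real strings; the float nan values are dropped by the type check.
zz = [v for v in values if isinstance(v, str)]
print(zz)
```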
CodePudding user response:
There are many solutions; one of them is:
nan = float("nan")
z = [nan, 'string', 'another_one']
string_list = []
for item in z:
    # keep only the items that are instances of str
    if isinstance(item, str):
        string_list.append(item)
CodePudding user response:
Something like this:
import numpy as np
z = [item for item in [np.nan, "ab", "abc"] if type(item) == str]
print(z)
CodePudding user response:
Use the filter() function:
list(filter(lambda f: type(f)==str, z))