I have a list of characters (sed), which is a pandas Series taken from a DataFrame. I am trying to iterate over it like a normal list to get the positions of the elements that are not spaces, with this code:
for i in sed:
    if j != " ":
        print(j)
        pos = sed.index(j)
        print(pos)
but with print(j), I see that it retrieves the whole list. When I try the same thing with a normal list (built manually, without the DataFrame), I get the right result:
seq = "   M "
chars = list(seq)
for j in chars:
    if j != " ":
        print(j)
        pos = chars.index(j)
        print(pos)
result: j = M and pos = 3
CodePudding user response:
You can filter for the rows where the value is different from a space ' ':
import pandas as pd
df = pd.DataFrame({'a':list('my name is')})
a
0 m
1 y
2
3 n
4 a
5 m
6 e
7
8 i
9 s
# Get only the values that are not a single space
print(df[df['a'].ne(' ')])
Output:
a
0 m
1 y
3 n
4 a
5 m
6 e
8 i
9 s
Or, if the cells can hold varying amounts of whitespace (for example one, two or three spaces), you can use the str methods on the pandas Series. On this data it yields the same result:
print(df[df['a'].str.contains(r'\S')])
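If you need the positions rather than the filtered rows, the same kind of boolean mask can be applied to the Series' index. This is only a minimal sketch: it assumes a Series named sed as in the question, the sample data is made up, and with a default integer index the labels coincide with the positions.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's `sed` Series
sed = pd.Series(list("  M a"))

# Boolean mask: True where the element is not made of whitespace only
mask = sed.str.strip().ne("")

# Index labels of the non-space elements (equal to positions for a default index)
positions = sed.index[mask].tolist()
print(positions)
```

If the Series has a custom index, the labels returned here will not be integer positions, so this would need adjusting.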
CodePudding user response:
You can use a simpler method with a list comprehension and plain Python:
result = [ch for ch in seq.split(" ") if ch != '']
If you have more special characters, you can add more filters afterwards.
To get the position of the "non-space" element, simply reuse the split list:
position = seq.split(" ").index([a for a in seq.split(" ") if a != ''][0])
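Keep in mind that index only returns the first match. If there can be several non-space characters, something like enumerate may be easier. A small sketch, assuming seq is a plain string (the sample value is only illustrative):

```python
# Illustrative sample string; replace with your own data
seq = "   M  a"

# Collect the position of every character that is not a space
positions = [i for i, ch in enumerate(seq) if ch != " "]
print(positions)
```

This walks the string once and keeps every matching position, instead of searching again for each element.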