I have this DF:
Unnamed: 0 Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6 Unnamed: 7 Unnamed: 8 Unnamed: 9 ... Unnamed: 23 Unnamed: 24 Unnamed: 25 Unnamed: 26 Unnamed: 27 Unnamed: 28 Unnamed: 29 Unnamed: 30 Unnamed: 31 Unnamed: 32
0 NaN NaN NaN NaN NaN NaN CMO & KPI NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN FMS PSO Zywiec 1
1 NaN Year 2019 NaN NaN NaN Entity: NaN 1268 FMS - PSO Zywiec 1 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN FMS PSO Zywiec 1
2 NaN Month 12 NaN NaN NaN Month: NaN 2019.12 December 2019 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN FMS PSO Zywiec 1
3 NaN Period-1 2019.11 NaN NaN NaN Scope: NaN SC_NONE None ... NaN NaN NaN NaN NaN NaN NaN NaN NaN FMS PSO Zywiec 1
4 NaN Month-1 11 NaN NaN NaN Currency: NaN LC Local Currency ... NaN NaN NaN NaN NaN NaN NaN NaN NaN FMS PSO Zywiec 1
entity = df[df["Unnamed: 6"]=="Entity:"]["Unnamed: 9"].values[0]
I looked up the row where column "Unnamed: 6" is "Entity:", took its value from column "Unnamed: 9", and put it into another column:
df["Unnamed: 32"] = entity
The thing is, I need to split this value so I can put each part into its own column.
I used entity = df[df["Unnamed: 6"]=="Entity:"]["Unnamed: 9"].values[0].replace("-","")
to remove the "-" separator and replace it with white space.
Then I split the value like this:
data_entity = entity.split(" ")
first_data = second_data = ""
if len(data_entity) > 1:
    first_data = data_entity[0].strip()
    second_data = data_entity[1].strip()
    third_data = data_entity[2].strip()
    fourth_data = data_entity[3].strip()
    fifth_data = data_entity[4].strip()
But with .replace("-"," ")
I receive output like this:
['FMS', '', 'PSO', 'Zywiec', '1']
FMS
PSO
Zywiec
1
.replace did not work for me because there is an empty string between FMS and PSO that I can't remove. What should I use instead of .replace to get rid of it? I already tried .replace("-","") instead of .replace("-"," "), but there is still white space between them.
Thanks for the help.
CodePudding user response:
When you use just .replace('-', ''), the result is:
'FMS  PSO Zywiec 1'
Because two spaces are left between FMS and PSO, .split(' ') returns an empty string between them. Use .split() with no argument instead: it splits on runs of whitespace and discards empty strings, so you get only the words.
>>> "FMS - PSO Zywiec 1".replace('-', '').split()
['FMS', 'PSO', 'Zywiec', '1']
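To make the difference concrete, here is a minimal sketch comparing the two ways of splitting. The helper name split_entity and the padding to four fields are assumptions made for illustration; they are not part of the original code.

```python
def split_entity(entity, separator="-"):
    """Drop the separator and return the whitespace-separated words.

    Hypothetical helper, named here only for illustration.
    """
    # str.split() with no argument splits on any run of whitespace,
    # so the extra spaces left around the removed "-" produce no empty strings.
    return entity.replace(separator, " ").split()


entity = "FMS - PSO Zywiec 1"

# Splitting on a single space keeps an empty string for the extra space.
print(entity.replace("-", "").split(" "))  # ['FMS', '', 'PSO', 'Zywiec', '1']

# Splitting on whitespace runs returns only the words.
parts = split_entity(entity)
print(parts)  # ['FMS', 'PSO', 'Zywiec', '1']

# Pad to a fixed number of fields (4 is an assumption) so that unpacking
# does not fail when an entity has fewer words.
first_data, second_data, third_data, fourth_data = (parts + [""] * 4)[:4]
```

Each of the resulting variables can then be assigned to its own DataFrame column in the same way df["Unnamed: 32"] = entity is assigned in the question.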