Using split function to extract values in a dataframe by delimiters in python but function giving ro

Time:10-14

I converted an XML file to CSV and got this result as a dataframe column "data[column]".

`0 Jan:2018,000/XXX|Dec:2017,000/XXX|Nov:2017,000...

1 Apr:2018,000/XXX|Mar:2018,000/STD|Feb:2018,000...

2 Apr:2019,000/XXX|Mar:2019,000/XXX|Feb:2019,000...

3 Jan:2019,000/XXX|

4 Dec:2018,000/XXX|Nov:2018,000/XXX|Oct:2018,000...

5 Feb:2019,000/XXX|Jan:2019,000/XXX|Dec:2018,000...

6 May:2015,XXX/XXX|Apr:2015,XXX/XXX|Mar:2015,XXX...`

I want to split each value in this column by "|" and take the first value after the comma in each piece.

Example:

000,000,000.....

000,000,000...

000,000,000...

000...

000,000,000...

XXX,XXX,XXX...

and store the result in the dataframe.

I used this function:

def my_split(string):
    for x in new.str.split("|"):
        for y in x:
            print(y.split(",")[-1][0:3])

new.apply(my_split)

But the values for every row are printed one after the other, instead of being grouped per row:

000

000

000

000

000

000

000

CodePudding user response:

import pandas as pd

s = """0 Jan:2018,000/XXX|Dec:2017,000/XXX|Nov:2017,000...
1 Apr:2018,000/XXX|Mar:2018,000/STD|Feb:2018,000...
2 Apr:2019,000/XXX|Mar:2019,000/XXX|Feb:2019,000...
3 Jan:2019,000/XXX|
4 Dec:2018,000/XXX|Nov:2018,000/XXX|Oct:2018,000...
5 Feb:2019,000/XXX|Jan:2019,000/XXX|Dec:2018,000...
6 May:2015,XXX/XXX|Apr:2015,XXX/XXX|Mar:2015,XXX..."""

# One row per input line; the leading "0 ", "1 ", ... stays part of the text.
df = pd.DataFrame({'col': s.split('\n')})

def custom_strip_fnc(m):
    # Split the cell on '|', skip pieces without a comma (such as the empty
    # piece after a trailing '|'), and keep the three characters after the comma.
    # Returning the list (rather than printing) lets apply() store it per row.
    ar = [k.split(',')[1][0:3] for k in m.split('|') if ',' in k]
    return ar

df['splitted'] = df['col'].apply(custom_strip_fnc)
df
   col  splitted
0   0 Jan:2018,000/XXX|Dec:2017,000/XXX|Nov:2017,0...   [000, 000, 000]
1   1 Apr:2018,000/XXX|Mar:2018,000/STD|Feb:2018,0...   [000, 000, 000]
2   2 Apr:2019,000/XXX|Mar:2019,000/XXX|Feb:2019,0...   [000, 000, 000]
3   3 Jan:2019,000/XXX| [000]
4   4 Dec:2018,000/XXX|Nov:2018,000/XXX|Oct:2018,0...   [000, 000, 000]
5   5 Feb:2019,000/XXX|Jan:2019,000/XXX|Dec:2018,0...   [000, 000, 000]
6   6 May:2015,XXX/XXX|Apr:2015,XXX/XXX|Mar:2015,X...   [XXX, XXX, XXX]
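
The question asks for comma-separated values such as 000,000,000 rather than lists. A minimal sketch of that variant follows, assuming every piece has at least three characters after its comma. The sample rows, the function name first_codes, and the column names codes and codes_regex are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical sample rows modeled on the question's data, without truncation.
df = pd.DataFrame({
    "col": [
        "Jan:2018,000/XXX|Dec:2017,000/XXX|Nov:2017,000/XXX",
        "Jan:2019,000/XXX|",
        "May:2015,XXX/XXX|Apr:2015,XXX/XXX",
    ]
})

def first_codes(cell):
    """Return the three characters after each comma, joined by commas."""
    parts = [piece.split(",")[1][:3] for piece in cell.split("|") if "," in piece]
    return ",".join(parts)

# apply() calls first_codes once per cell and stores the returned string.
df["codes"] = df["col"].apply(first_codes)

# Alternative using pandas string methods: capture the three characters
# that follow each comma, then join each row's list into one string.
df["codes_regex"] = df["col"].str.findall(r",(.{3})").str.join(",")

print(df[["codes", "codes_regex"]])
```

Both columns should hold the same strings for data shaped like the sample. The regex version would skip a piece with fewer than three characters after the comma, while the slicing version would keep the shorter value.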