I have a dataframe in which one columns contains day and time and I want to put each day and its tim

Time:12-02

I have a dataframe in which one column contains each day together with its opening times. I want to put each day's times into that day's own column.

I have put a '$' between the days so it can be used either to split the string or to assign each entry to its respective column.

import pandas as pd

data = [{'timings' : 'Friday 10 am - 6:30 pm$Saturday 10am-6:30pm$Sunday Closed$Monday 10am-6:30pm$Tuesday 10am-6:30pm$Wednesday 10am-6:30pm$Thursday 10am-6:30pm',
'monday':'','tuesday':'','wednesday':'','thursday':'','friday':'','saturday':'','sunday':''
}]

df = pd.DataFrame.from_dict(data)

For example, if df['timings'] contains "friday 10 am$saturday 6:30pm", then df['friday'] should be '10 am' and df['saturday'] should be '6:30pm'.

I don't know how else to put it into words.

Please help me solve this problem.

CodePudding user response:

Use a nested list comprehension to build a list of dictionaries (one per row), then pass it to the DataFrame constructor:

L = [dict(y.split(maxsplit=1) for y in x.split('$')) for x in df['timings']]

df = pd.DataFrame(L, index=df.index)
print(df)
            Friday     Saturday  Sunday       Monday      Tuesday  \
0  10 am - 6:30 pm  10am-6:30pm  Closed  10am-6:30pm  10am-6:30pm   

     Wednesday     Thursday  
0  10am-6:30pm  10am-6:30pm  
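
If the goal is to keep the original timings column and fill the lowercase day columns from the question, the same splitting idea can be sketched as below. This is only an illustration under a few assumptions: the helper name split_timings is hypothetical, the sample row is shortened to three days, and every '$'-separated chunk is assumed to start with the day name followed by whitespace.

```python
import pandas as pd

# Shortened sample row (assumption: same "<Day> <hours>" layout as the question).
df = pd.DataFrame(
    {'timings': ['Friday 10 am - 6:30 pm$Saturday 10am-6:30pm$Sunday Closed']}
)

def split_timings(text):
    """Hypothetical helper: map lowercase day name -> hours for one row."""
    # Split on '$' to get one chunk per day, then split each chunk once
    # on whitespace so the hours keep their internal spaces.
    pairs = (chunk.split(maxsplit=1) for chunk in text.split('$'))
    return {day.lower(): hours for day, hours in pairs}

# One dict per row, aligned on the original index, then joined back.
parsed = pd.DataFrame([split_timings(t) for t in df['timings']], index=df.index)
result = df[['timings']].join(parsed)
print(result.columns.tolist())
```

The join keeps the original column next to the new per-day columns; days missing from a row would simply come out as NaN.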

CodePudding user response:

You can use str.extractall to extract the day names and times, then reshape the DataFrame:

(df['timings'].str.extractall(r'(?P<day>[^$\s]+)\s+([^$]+)')
 .droplevel('match')
 .set_index('day', append=True)[1].unstack('day')
)

Output:

day           Friday       Monday     Saturday  Sunday     Thursday      Tuesday    Wednesday
0    10 am - 6:30 pm  10am-6:30pm  10am-6:30pm  Closed  10am-6:30pm  10am-6:30pm  10am-6:30pm
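
To see what the pattern captures before any reshaping, here is a small sketch using only the standard re module on a shortened sample string (the sample is an assumption for illustration). The first group takes the run of characters up to the first whitespace (the day), and the second group takes everything up to the next '$' (the hours).

```python
import re

# Day = non-'$', non-space characters; hours = everything up to the next '$'.
pattern = re.compile(r'(?P<day>[^$\s]+)\s+([^$]+)')

text = 'Friday 10 am - 6:30 pm$Sunday Closed$Monday 10am-6:30pm'

# findall returns one (day, hours) tuple per match.
pairs = pattern.findall(text)
print(pairs)
# [('Friday', '10 am - 6:30 pm'), ('Sunday', 'Closed'), ('Monday', '10am-6:30pm')]
```

str.extractall applies the same matching to every row and returns one line per match, which is why the result then needs unstack to become one column per day.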

If you want to keep the original order of the days:

(df['timings'].str.extractall(r'(?P<day>[^$\s]+)\s+([^$]+)')
 .set_index('day', append=True)[1].unstack(['match', 'day'])
 .droplevel('match', axis=1)
)

Output:

day           Friday     Saturday  Sunday       Monday      Tuesday    Wednesday     Thursday
0    10 am - 6:30 pm  10am-6:30pm  Closed  10am-6:30pm  10am-6:30pm  10am-6:30pm  10am-6:30pm

Alternative to sort based on a custom order (here Friday first):

from calendar import day_name

sorter = pd.Series({d: (i+3)%7 for i,d in enumerate(day_name)})

out = (df['timings']
 .str.extractall(r'(?P<day>[^$\s]+)\s+([^$]+)')
 .droplevel('match')
 .set_index('day', append=True)[1].unstack('day')
 .sort_index(axis=1, key=sorter.get)
)
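
The offset in the sorter may be easier to follow in isolation. The sketch below uses a plain dict instead of a Series and assumes the default English locale for calendar.day_name, where Monday has index 0 and Friday has index 4, so (4 + 3) % 7 gives Friday rank 0.

```python
from calendar import day_name

# Shift every weekday index by 3 (mod 7) so that Friday gets rank 0.
order = {d: (i + 3) % 7 for i, d in enumerate(day_name)}

# Sorting the day names by that rank starts the week on Friday.
ordered = sorted(day_name, key=order.get)
print(ordered)
```

To start on a different day, change the offset so that the chosen day's index plus the offset is a multiple of 7.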