Using a list index to create a column in Pandas

Time:05-20

I've been stuck on this for longer than I would like to admit. I'm trying to use the index of a list to make a new column based on the day column. I'm sure this is super simple. Really, all I am trying to do is get the day difference between today and these other days.

There may even be a way to get my result with datetime, but I haven't been able to find either solution yet.

import pandas as pd
from datetime import datetime


today = datetime.today().strftime('%Y/%m/%d')
todays_week_day = str.upper(str(datetime.today().strftime('%a')))

# Let's assume today is "THU" for this example

todays_week_day = "THU"

day_abrivs = list(["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"])

todays_week_day_num = day_abrivs.index(todays_week_day)


df=
    attendance          day
 0     1546             FRI 
 1     1978             SAT 
 2     2150             SUN

df['day_num'] = day_abrivs.index(df['day'])
df['day_diff'] = df['day_num'] - todays_week_day_num

# This gives the following error on the day_num column, so I don't even get to day_diff

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Python_Projects\Shell-B\venv\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Desired output is as follows:

df=
    attendance          day     day_num    day_diff
 0     1546             FRI        5          1
 1     1978             SAT        6          2
 2     2150             SUN        0         -4

CodePudding user response:

You are getting that error because you are passing a Series, not a single string value, to the list's index method. I recommend the Series.apply method to look up the position of each day individually. Look at this:

import io

# Your initial dataframe
df = pd.read_csv(io.StringIO("""
attendance,day
1546,FRI
1978,SAT
2150,SUN
"""))

# Look up each day's position in day_abrivs, one value at a time
df['day_num'] = df['day'].apply(lambda d: day_abrivs.index(d))
df['day_diff'] = df['day_num'] - todays_week_day_num
print(df)

Output:

   attendance  day  day_num  day_diff
0        1546  FRI        5         1
1        1978  SAT        6         2
2        2150  SUN        0        -4
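
To see why passing the whole Series fails, here is a minimal sketch. It assumes only that list.index compares each element with its argument using ==; the names days, comparison, and error_message are made up for illustration.

```python
import pandas as pd

day_abrivs = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]
days = pd.Series(["FRI", "SAT", "SUN"])  # stand-in for df['day']

# Comparing a single string with a Series is done element-wise,
# so the result is a boolean Series rather than one True/False.
comparison = "SUN" == days
print(comparison.tolist())

# list.index needs a single truth value for each comparison,
# and a multi-element Series refuses to provide one.
try:
    day_abrivs.index(days)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Calling index once per value, as apply does above, avoids the problem because each comparison is then string against string.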

CodePudding user response:

You should not use apply here. Instead, you can craft a mapping dictionary and pass it to Series.map:

day_abrivs_dic = {k:v for v,k in enumerate(day_abrivs)}
# {'SUN': 0, 'MON': 1, 'TUE': 2, 'WED': 3, 'THU': 4, 'FRI': 5, 'SAT': 6}

df['day_num'] = df['day'].map(day_abrivs_dic)

df['day_diff'] = df['day_num'] - todays_week_day_num

Output:

   attendance  day  day_num  day_diff
0        1546  FRI        5         1
1        1978  SAT        6         2
2        2150  SUN        0        -4
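
Since the question also asks about datetime, here is a possible sketch that derives today's number from a date instead of looking up the abbreviation. It assumes the SUN=0 ordering used above; example_today is a made-up name holding an arbitrary date that falls on a Thursday, so swap in datetime.today() for real use.

```python
from datetime import datetime

import pandas as pd

day_abrivs = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]
day_abrivs_dic = {day: num for num, day in enumerate(day_abrivs)}

# Arbitrary example date (a Thursday); replace with datetime.today() in practice.
example_today = datetime(2022, 5, 19)

# datetime.weekday() counts Monday as 0, so shift by one to match the SUN=0 list.
todays_week_day_num = (example_today.weekday() + 1) % 7

df = pd.DataFrame({"attendance": [1546, 1978, 2150],
                   "day": ["FRI", "SAT", "SUN"]})
df["day_num"] = df["day"].map(day_abrivs_dic)
df["day_diff"] = df["day_num"] - todays_week_day_num
print(df)
```

Using weekday() rather than strftime('%a') sidesteps locale-dependent day names, at the cost of the small offset shown in the comment.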