Home > Enterprise >  map function on pandas column works fine on the column as a series, but gives None values in column
map function on pandas column works fine on the column as a series, but gives None values in column

Time:02-23

I have a df with column 'Month'. The notation in that column is for example 'Apr 2021'. This is the case for all months, so Jan, Feb, Mar etc. And for a lot of years. I need to change, for example 'Apr 2021' to 04-2021 (first as a str and afterwards as datetime). The function I made works fine on the column as a series. BUTTTT..... as you can see underneath: the series is fine, but the column in the Dataframe shows None for all rows in column 'Month'. What is wrong here/what am I doing wrong? All code below. Thanks!

df_all.Month (first rows)
0     Feb 2021
1     Mar 2021
2     Mar 2021
3     Apr 2021
4     Apr 2021
5     May 2021
6     May 2021
7     May 2021
8     Jun 2021
9     Jun 2021
10    Jun 2021

Function to change values:

def test(x):
    x.strip()
    deel = x.split()
    if deel[0] == 'Jan': 
        deel[0] = '01-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Feb': 
        deel[0] = '02-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Mar': 
        deel[0] = '03-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Apr': 
        deel[0] = '04-'
        x = deel[0]   deel[1]
    elif deel[0] == 'May': 
        deel[0] = '05-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Jun': 
        deel[0] = '06-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Jul': 
        deel[0] = '07-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Aug': 
        deel[0] = '08-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Sep': 
        deel[0] = '09-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Oct': 
        deel[0] = '10-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Nov': 
        deel[0] = '1-'
        x = deel[0]   deel[1]
    elif deel[0] == 'Dec': 
        deel[0] = '12-'
        x = deel[0]   deel[1]
    else: 
        print('nope')

This gives the changed Series, so this works (but as said Not in the df as shown beneath series)

df_all['Month'] = df_all['Month'].map(test)
df_all


02-2021
03-2021
03-2021
04-2021
04-2021
05-2021
05-2021
05-2021
06-2021
06-2021
06-2021
07-2021
07-2021
07-2021
08-2021
08-2021
08-2021
09-2021
09-2021
09-2021
10-2021
10-2021
10-2021
1-2021
1-2021
1-2021
12-2021
12-2021
12-2021
01-2022
01-2022
01-2022
02-2022
02-2022
02-2022

(first rows of the df:

Month   Access Type Users Accessing Apps
0   None    Analyzer    7
1   None    Analyzer    77
2   None    Professional    3
3   None    Analyzer    114
4   None    Professional    3
5   None    Analyzer    104

Adding a brand new column gives the same result: series is OK, but the new column only contains None values...

     Month     Access Type    Users Accessing Apps  Maand
0   Feb 2021    Analyzer          7                 None
1   Mar 2021    Analyzer         77                 None
2   Mar 2021    Professional      3                 None
3   Apr 2021    Analyzer        114                 None
4   Apr 2021    Professional      3                 None

Maybe this helps you:

df_all['Maand'] = df_all['Month'].map(test)
print(df_all['Maand'])
type(df_all.Maand)

02-2021
03-2021
03-2021
04-2021
04-2021
05-2021
05-2021
05-2021
06-2021
06-2021
06-2021
07-2021
07-2021
07-2021
08-2021
08-2021
08-2021
09-2021
09-2021
09-2021
10-2021
10-2021
10-2021
1-2021
1-2021
1-2021
12-2021
12-2021
12-2021
01-2022
01-2022
01-2022
02-2022
02-2022
02-2022
0     None
1     None
2     None
3     None
4     None
5     None
6     None
7     None
8     None
9     None
10    None
11    None
12    None
13    None
14    None
15    None
16    None
17    None
18    None
19    None
20    None
21    None
22    None
23    None
24    None
25    None
26    None
27    None
28    None
29    None
30    None
31    None
32    None
33    None
34    None
Name: Maand, dtype: object
pandas.core.series.Series

CodePudding user response:

I believe you should have a different approach to deal with your problem. A good solution involves working only with datetime, this way you can easily navigate your date, using proper functions adding efficiency to your code.

The code below does the formating job, without the need of using a function with a series of conditional statements:

df_all = pd.DataFrame({'Month': ['Feb 2021','Mar 2021','Mar 2021','Apr 2021','Apr 2021','May 2021','May 2021','May 2021','Jun 2021','Jun 2021','Jun 2021']})
df_all

       Month
0   Feb 2021
1   Mar 2021
2   Mar 2021
3   Apr 2021
4   Apr 2021
5   May 2021
6   May 2021
7   May 2021
8   Jun 2021
9   Jun 2021
10  Jun 2021

My suggestion: use this instead of your function.

df_all['Maand'] = pd.to_datetime(df_all['Month'],format='%b %Y').dt.to_period('M')

Output:

       Month      Maand
0   Feb 2021    2021-02
1   Mar 2021    2021-03
2   Mar 2021    2021-03
3   Apr 2021    2021-04
4   Apr 2021    2021-04
5   May 2021    2021-05
6   May 2021    2021-05
7   May 2021    2021-05
8   Jun 2021    2021-06
9   Jun 2021    2021-06
10  Jun 2021    2021-06
df_all.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype    
---  ------  --------------  -----    
 0   Month   11 non-null     object   
 1   Maand   11 non-null     period[M]
dtypes: object(1), period[M](1)
memory usage: 304.0  bytes

CodePudding user response:

There is an easier way to do this by using pandas' pd.to_datetime() function.

df_all['Month']  = pd.to_datetime(df_all['Month'], format= "%b %Y")
  • Related