Mapping intervals using the map function in Python

Time:02-10

Given a sequence of numbers, I want to first convert it into intervals, store those intervals in a dictionary keyed by the sequence values, and then map that dictionary onto a DataFrame column.

Query-1: How do I create the interval dictionary from the given sequence?

given_sequence = [-100, 2, 5, 8, 10, 15]  # given sequence

idict = {-100: '<2',
         2: '2-5',
         5: '5-8',
         8: '8-10',
         10: '10-15',
         15: '>15'}

Query-2: How do I map the dictionary onto a DataFrame column using the map function?

import pandas as pd
import random
# create an Empty DataFrame object
df = pd.DataFrame()
df['col1'] = [random.randint(1, 10) for i in range(0, 10)]
df['mapped']  = df['col1'].map(idict)

Expected output: [screenshot not available]

CodePudding user response:

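You can get the desired result with pd.cut, which bins a continuous variable into a categorical one. With right=False each bin includes its left edge and excludes its right edge, so a value of 5 falls in '5-8' rather than '2-5'.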

# append infinity as the upper edge of the last bin ('>15')
df['mapped'] = pd.cut(df['col1'], right=False, bins=given_sequence + [float('inf')],
                      labels=['<2', '2-5', '5-8', '8-10', '10-15', '>15'])

Output:

   col1 mapped
0     6    5-8
1     1     <2
2    10  10-15
3     7    5-8
4     7    5-8
5     8   8-10
6     4    2-5
7     7    5-8
8    10  10-15
9     6    5-8
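
Because col1 is random, a given run may not hit the bin edges. As a small sketch (the fixed values in demo are my own, chosen only for illustration), this shows how right=False assigns values that sit exactly on an edge:

```python
import pandas as pd

given_sequence = [-100, 2, 5, 8, 10, 15]
labels = ['<2', '2-5', '5-8', '8-10', '10-15', '>15']

# Illustrative fixed values on or next to the bin edges (not from the question)
demo = pd.DataFrame({'col1': [1, 2, 4, 5, 8, 10, 15, 20]})

# right=False makes every bin closed on the left: [2, 5), [5, 8), ...
demo['mapped'] = pd.cut(demo['col1'],
                        bins=given_sequence + [float('inf')],
                        right=False,
                        labels=labels)
print(demo)
```

Here 2, 5, 8, 10 and 15 each land in the bin that starts at that value, which is what the labels in idict imply.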

CodePudding user response:

If the dictionary keys are sorted, as in the sample data, pass the keys as bins and the values as labels to cut:

import numpy as np

# if necessary, sort the dict by key first
# idict = dict(sorted(idict.items()))

df['mapped'] = pd.cut(df['col1'],
                      right=False,
                      bins=[*idict.keys(), np.inf],  # np.inf closes the last bin
                      labels=list(idict.values()))
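
For Query-1, idict does not have to be typed by hand. A minimal sketch, assuming the sequence has at least two values and that the labels follow the '<b', 'a-b', '>a' pattern from the question (build_interval_dict is a hypothetical helper name, not a pandas function):

```python
given_sequence = [-100, 2, 5, 8, 10, 15]

def build_interval_dict(seq):
    """Map each left edge to a label: '<b' for the first edge,
    'a-b' for the middle ones, '>a' for the last."""
    edges = sorted(seq)
    labels = {}
    for i, left in enumerate(edges):
        if i == 0:
            # everything below the second edge
            labels[left] = f'<{edges[1]}'
        elif i == len(edges) - 1:
            # open-ended last interval
            labels[left] = f'>{left}'
        else:
            labels[left] = f'{left}-{edges[i + 1]}'
    return labels

idict = build_interval_dict(given_sequence)
print(idict)
```

The resulting dictionary is already sorted by key, so its keys and values can go straight into the bins and labels arguments of pd.cut.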