Home > Software design >  Map to a Trans capacity request from transit(Nodal) points using pandas dataframes
Map to a Trans capacity request from transit(Nodal) points using pandas dataframes

Time:05-18

Input Dataframe With TerminalID,TName,XY cordinate, PeopleID

import pandas as pd

data = {
        'TerminalID': ['5','21','21','2','21','2','5','22','22','22','2','32','41','41','42','50','50'],
        'TName': ['AD','AMBO','AMBO','PS','AMBO','PS','AD','AM','AM','AM','PS','BO','BA','BA','BB','AZ','AZ'],
        'xy': ['1.12731,1.153756','0.12731,0.153757','0.12731,0.153757','1.989385,1.201941','0.12731,0.153757','1.989385,1.201941','1.12731,1.153756','2.12731,1.153756','2.12731,1.153756','2.12731,1.153756','1.989385,1.201941','1.989385,1.201941','2.989385,1.201941','2.989385,1.201941','2.989385,3.201941','3.989385,3.201941','3.989385,3.201941'],
        'Pcode': [ 'None','Z014','Z015','Z016','Z017','Z018','None','Z020','Z021','Z022','Z023','Z024','Z025','Z026','Z027','Z028','Z029']
    }

df = pd.DataFrame.from_dict(data)

Out[55]:

output of DF1

   TerminalID TName                 xy Pcode
0           5    AD   1.12731,1.153756  None
1          21  AMBO   0.12731,0.153757  Z014
2          21  AMBO   0.12731,0.153757  Z015
3           2    PS  1.989385,1.201941  Z016
4          21  AMBO   0.12731,0.153757  Z017
5           2    PS  1.989385,1.201941  Z018
6           5    AD   1.12731,1.153756  None
7          22    AM   2.12731,1.153756  Z020
8          22    AM   2.12731,1.153756  Z021
9          22    AM   2.12731,1.153756  Z022
10          2    PS  1.989385,1.201941  Z023
11         32    BO  1.989385,1.201941  Z024
12         41    BA  2.989385,1.201941  Z025
13         41    BA  2.989385,1.201941  Z026
14         42    BB  2.989385,3.201941  Z027
15         50    AZ  3.989385,3.201941  Z028
16         50    AZ  3.989385,3.201941  Z029

DF2,

T_cap is the capacity requirement at the Terminal ID and the T_load is the Load details, Tcap is the running count increment and T_load is the actual request at the Termainal, The 0 at the start and the end are padding for the solution

data2= {
        'BusID': ['18','18','18','18','18','18','18','18','18'],
        'Tcap': ['0','2','3','6','7','8','10','12','12'],
        'T_Load': ['0','2','1','2','2','1','2','2','0'],
        'TerminalID': [ '5','21','33','2','32','42','41','50','5'],
        
        'TName':['AD','AMBO','AM','PS','BO','BB','BA','AZ','AD']
    }

df2 = pd.DataFrame.from_dict(data2)

Out[59]:

  BusID Tcap T_Load TerminalID TName
0    18    0      0          5    AD
1    18    2      2         21  AMBO
2    18    3      1         33    AM
3    18    6      2          2    PS
4    18    7      2         32    BO
5    18    8      1         42    BB
6    18   10      2         41    BA
7    18   12      2         50    AZ
8    18   12      0          5    AD
    

Data Frame # the Final output requested

The output is based on the T_Load contstrains.

data3 = {
        'BusID': ['18','18','18','18','18','18','18','18','18'],
        'Tcap': ['0','2','3','6','7','8','10','12','12'],
        'T_Load': ['0','2','1','3','1','1','2','2','0'],
        'TerminalID': [ '5','21','33','2','32','42','41','50','5'],
        
        'TName':['AD','AMBO','AM','PS','BO','BB','BA','AZ','AD'],
        'Pcode':['None','Z013,Z019','Z020','Z016,Z018,Z023','Z024','Z027','Z025,Z026','Z028,Z029','None']
    }
    
    df3 = pd.DataFrame.from_dict(data3)

Out[61]:

  BusID Tcap T_Load TerminalID TName           Pcode
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO       Z013,Z019
2    18    3      1         33    AM            Z020
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None

Thanking you

CodePudding user response:

My solution aggreagte join per TerminalID and TName and assign to another DataFrame by aggreagte list, last filter values by positions in list comprehension with join:

s = df.groupby(['TerminalID','TName'])['Pcode'].agg(list).rename('P_list')
df = df2.join(s, on=['TerminalID','TName'])

df['P_list'] = [','.join(x[:int(y)]) if int(y) != 0 else None 
                for x, y in zip(df['P_list'], df['T_Load'])]
print (df)
  BusID Tcap T_Load TerminalID TName          P_list
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO       Z014,Z015
2    18    3      1         22    AM            Z020
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None

CodePudding user response:

You can map the aggregated strings per TName:

df2['Plist'] = df2['TName'].map(df.groupby('TName')['Pcode'].agg(','.join))

or, if you want to replace multiple string None as single one:

df2['Plist'] = df2['TName'].map(df.groupby('TName')['Pcode']
                                  .agg(lambda x: ','.join(e for e in x if e != 'None'))
                                  .replace('', 'None')
                                )

output:

  BusID Tcap T_Load TerminalID TName           Plist
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO  Z014,Z015,Z017
2    18    3      1         22    AM  Z020,Z021,Z022
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None
update: limiting the output:

you can then trim the column with a regex, we can use a groupby to benefit from vectorized string operations within each group (this is mostly interesting if there are few groups and many rows):

df2['P_list'] = (df2.groupby('T_Load')['P_list']
                    .apply(lambda c: c.str.extract(rf'((?:[^,] ,?){{,{str(c.name)}}})',
                                                   expand=False)
                                    .str.strip(',')
                          )
                    .replace('', 'None')
                )

output:

  BusID Tcap T_Load TerminalID TName          P_list
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO       Z014,Z015
2    18    3      1         22    AM            Z020
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None
  • Related