I have been trying to create a new column in my dataframe base on a condition and I keep getting a key error. I suspect it might have something to do with the indexing or headers but have hit a roadblock.
What I am trying to achieve is a new column which says gas or coal based on if the technology field (which has three possible inputs: CCGT, OCGT or Coal). So I wrote a formula which will return coal if the technology column has coal or gas if not but it didn't seem to work.
DF with desired column:
| | settlementDate | 03/01/2022 | 04/01/2022 | 05/01/2022 | 06/01/2022 | 07/01/2022 | 08/01/2022 | |
|----------------|---------------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|----------|
| BM Unit ID 1 | Technology | | | | | | | Gas/Coal |
| CARR-2 | CCGT | 24 | 25 | 27 | 29 | 21 | 12 | Gas |
| CDCL-1 | CCGT | 13 | 22 | 8 | 25 | 12 | 16 | Gas |
| CNQPS-1 | CCGT | 25 | NaN | 33 | NaN | NaN | NaN | Gas |
| CNQPS-2 | CCGT | 20 | NaN | 32 | NaN | 32 | 26 | Gas |
| CNQPS-4 | CCGT | 27 | NaN | 41 | 33 | 35 | 31 | Gas |
| DAMC-1 | CCGT | 23 | 21 | 38 | 34 | NaN | NaN | Gas |
| DIDCB6 | CCGT | 28 | 11 | 11 | 8 | 19 | 15 | Gas |
| EECL-1 | CCGT | 5 | 29 | 4 | 1 | 1 | 2 | Gas |
| FAWN-1 | CCGT | 1 | 5 | 1 | 5 | 2 | 1 | Gas |
| FELL-1 | CCGT | 34 | 39 | 47 | 43 | 41 | 34 | Gas |
| GRAI-6 | CCGT | 4 | 18 | 2 | 4 | 3 | 3 | Gas |
| GRAI-7 | CCGT | 10 | 27 | 15 | 9 | 15 | 11 | Gas |
| GRAI-8 | CCGT | 11 | 28 | 36 | 16 | 26 | 25 | Gas |
| GRMO-1 | CCGT | 2 | 7 | 10 | 24 | 11 | 6 | Gas |
| HUMR-1 | CCGT | 22 | 30 | 37 | 37 | 33 | 28 | Gas |
| KEAD-2 | CCGT | 30 | NaN | NaN | NaN | NaN | NaN | Gas |
| LAGA-1 | CCGT | 15 | 13 | 29 | 32 | 23 | 24 | Gas |
| PEMB-11 | CCGT | 21 | 2 | 9 | 10 | 9 | 14 | Gas |
| PEMB-21 | CCGT | 7 | 1 | 6 | 13 | 8 | 8 | Gas |
| PEMB-31 | CCGT | 14 | 6 | 13 | 6 | 4 | 9 | Gas |
| PEMB-41 | CCGT | 9 | 4 | 7 | 7 | 10 | 13 | Gas |
| PEMB-51 | CCGT | 3 | 3 | 3 | 11 | 16 | NaN | Gas |
| ROCK-1 | CCGT | 31 | 34 | 42 | 38 | 38 | NaN | Gas |
| SCCL-2 | CCGT | 29 | NaN | 16 | 28 | 25 | NaN | Gas |
| SCCL-3 | CCGT | 17 | 16 | 31 | 3 | 18 | 10 | Gas |
| SHBA-1 | CCGT | 26 | 23 | 40 | 35 | 37 | NaN | Gas |
| SHOS-1 | CCGT | 16 | 15 | 28 | 15 | 29 | 27 | Gas |
| STAY-1 | CCGT | 19 | 12 | 5 | 23 | 6 | 7 | Gas |
| STAY-2 | CCGT | 18 | 20 | 18 | 21 | 24 | 20 | Gas |
| STAY-3 | CCGT | 8 | 17 | 17 | 20 | 13 | 22 | Gas |
| STAY-4 | CCGT | 12 | 8 | 20 | 18 | 14 | 23 | Gas |
| WBURB-1 | CCGT | 6 | 9 | 22 | 2 | 7 | 5 | Gas |
| WBURB-41 | COAL | 33 | 38 | 46 | 41 | 40 | 33 | coal |
| WBURB-43 | COAL | 32 | 37 | 45 | 40 | 39 | 32 | |
| CARR-1 | CCGT | NaN | 33 | 26 | 27 | 22 | 4 | |
| DIDCB5 | CCGT | NaN | 10 | 35 | 22 | NaN | NaN | |
| FDUNT-1 | OCGT | NaN | 36 | 44 | NaN | NaN | NaN | |
| GYAR-1 | CCGT | NaN | 26 | 14 | 17 | 20 | 21 | |
| KLYN-A-1 | CCGT | NaN | 24 | 12 | 19 | 27 | 29 | |
| LBAR-1 | CCGT | NaN | 19 | 25 | 31 | 28 | NaN | |
| PETEM1 | CCGT | NaN | 35 | NaN | NaN | NaN | NaN | |
| SEAB-1 | CCGT | NaN | 32 | 34 | 36 | NaN | 30 | |
| SEAB-2 | CCGT | NaN | 31 | 39 | 39 | 34 | NaN | |
| WBURB-2 | CCGT | NaN | 14 | 21 | 12 | 31 | 18 | |
| COSO-1 | CCGT | NaN | NaN | 30 | 42 | 36 | NaN | |
| KEAD-1 | CCGT | NaN | NaN | 43 | NaN | NaN | NaN | |
| MRWD-1 | CCGT | NaN | NaN | 19 | 26 | 5 | 19 | |
| SPLN-1 | CCGT | NaN | NaN | 23 | 30 | 30 | NaN | |
| WBURB-3 | CCGT | NaN | NaN | 24 | 14 | 17 | 17 | |
The code I have tried is:
rank_table_pivot['ave rank'] = rank_table_pivot.mean(axis=1)
rank_table_pivot['rank'] = rank_table_pivot['ave rank'].rank(method='dense', ascending=True)
rank_table_pivot.sort_values('ave rank')
def coal_gas(Technology):
if row['Technology'] == 'COAL':
return "coal"
else:
return "gas"
rank_table_pivot = rank_table_pivot['Technology'].apply(lambda Technology: map_Technology(Technology))
CodePudding user response:
You want to use apply
like this:
def func(value):
if value == "COAL":
return "coal"
else:
return "gas"
rank_table_pivot["Gas/Coal"] = rank_table_pivot['Technology'].apply(func)