Home > other >  Substituting values with conditions
Substituting values with conditions

Time:07-06

I have a dataframe like this one below

   Air Station Code  Humidity  Temperature  Latitude  Longitude

        St.1           20         10        10.00      10.00
        St.2            4         15        25.00      30.00
        St.3           16         21         8.00      15.00
        St.4           38          8        31.00      40.00
        St.5           10         18        10.00      10.00
        St.6           40          4        25.00      30.00
        St.7           10         13         8.00      15.00
        St.8           46         11        31.00      40.00
        St.9           28          9        10.00      10.00
        St.10          14         22        25.00      30.00
        St.11           5         40         8.00      15.00
        St.12          11         10        31.00      40.00
                                    ...

        St.89          61         35        10.00      10.00 
        St.90          23         29        25.00      30.00
        St.91          35         12         8.00      15.00
        St.92          31          7        31.00      40.00

I want to change the station codes by matching the coordinates, substituing the codes by repeating the first 4 codes, obtaining this

   Air Station Code  Humidity  Temperature  Latitude  Longitude

        St.1           20         10        10.00      10.00
        St.2            4         15        25.00      30.00
        St.3           16         21         8.00      15.00
        St.4           38          8        31.00      40.00
        St.1           10         18        10.00      10.00
        St.2           40          4        25.00      30.00
        St.3           10         13         8.00      15.00
        St.4           46         11        31.00      40.00
        St.1           28          9        10.00      10.00
        St.2           14         22        25.00      30.00
        St.3            5         40         8.00      15.00
        St.4           11         10        31.00      40.00
                                    ...

        St.1           61         35        10.00      10.00 
        St.2           23         29        25.00      30.00
        St.3           35         12         8.00      15.00
        St.4           31          7        31.00      40.00

Is there some way to implement an "if/else" substitution on the whole dataframe without going manually over every observation in python?

CodePudding user response:

There may be a better way to do this, but this solves the problem. I create a second dataframe without the duplicates, which keeps the first occurrence of each lat/long. I make lat/long the index and drop the other columns. I then do a "join", adding a new column with the matching lat/long. I then overwrite the original station code with the looked up one.

import pandas as pd

data = [
    ["St.1", 20, 10, 10.00, 10.00],
    ["St.2", 4, 15, 25.00, 30.00],
    ["St.3", 16, 21, 8.00, 15.00],
    ["St.4", 38, 8, 31.00, 40.00],
    ["St.5", 10, 18, 10.00, 10.00],
    ["St.6", 40, 4, 25.00, 30.00],
    ["St.7", 10, 13, 8.00, 15.00],
    ["St.8", 46, 11, 31.00, 40.00],
    ["St.9", 28, 9, 10.00, 10.00],
    ["St.10", 14, 22, 25.00, 30.00],
    ["St.11", 5, 40, 8.00, 15.00],
    ["St.12", 11, 10, 31.00, 40.00],
    ["St.89", 61, 35, 10.00, 10.00],
    ["St.90", 23, 29, 25.00, 30.00],
    ["St.91", 35, 12, 8.00, 15.00],
    ["St.92", 31, 7, 31.00, 40.00],
]

column = "Air_Station_Code  Humidity  Temperature  Latitude  Longitude".split()
df = pd.DataFrame(data,columns=column)
print(df)
df1 = df.drop_duplicates(['Latitude','Longitude'])
df1 = df1[['Air_Station_Code','Latitude','Longitude']]
df1.set_index(['Latitude','Longitude'], inplace=True)
print(df1)
df2 = df.join( df1, on=['Latitude','Longitude'], rsuffix='R' )
print(df2)

df['Air_Station_Code'] = df2['Air_Station_CodeR']
print(df)

Output:

   Air_Station_Code  Humidity  Temperature  Latitude  Longitude
0              St.1        20           10      10.0       10.0
1              St.2         4           15      25.0       30.0
2              St.3        16           21       8.0       15.0
3              St.4        38            8      31.0       40.0
4              St.5        10           18      10.0       10.0
5              St.6        40            4      25.0       30.0
6              St.7        10           13       8.0       15.0
7              St.8        46           11      31.0       40.0
8              St.9        28            9      10.0       10.0
9             St.10        14           22      25.0       30.0
10            St.11         5           40       8.0       15.0
11            St.12        11           10      31.0       40.0
12            St.89        61           35      10.0       10.0
13            St.90        23           29      25.0       30.0
14            St.91        35           12       8.0       15.0
15            St.92        31            7      31.0       40.0
                   Air_Station_Code
Latitude Longitude                 
10.0     10.0                  St.1
25.0     30.0                  St.2
8.0      15.0                  St.3
31.0     40.0                  St.4
   Air_Station_Code  Humidity  ...  Longitude  Air_Station_CodeR
0              St.1        20  ...       10.0               St.1
1              St.2         4  ...       30.0               St.2
2              St.3        16  ...       15.0               St.3
3              St.4        38  ...       40.0               St.4
4              St.5        10  ...       10.0               St.1
5              St.6        40  ...       30.0               St.2
6              St.7        10  ...       15.0               St.3
7              St.8        46  ...       40.0               St.4
8              St.9        28  ...       10.0               St.1
9             St.10        14  ...       30.0               St.2
10            St.11         5  ...       15.0               St.3
11            St.12        11  ...       40.0               St.4
12            St.89        61  ...       10.0               St.1
13            St.90        23  ...       30.0               St.2
14            St.91        35  ...       15.0               St.3
15            St.92        31  ...       40.0               St.4

[16 rows x 6 columns]
   Air_Station_Code  Humidity  Temperature  Latitude  Longitude
0              St.1        20           10      10.0       10.0
1              St.2         4           15      25.0       30.0
2              St.3        16           21       8.0       15.0
3              St.4        38            8      31.0       40.0
4              St.1        10           18      10.0       10.0
5              St.2        40            4      25.0       30.0
6              St.3        10           13       8.0       15.0
7              St.4        46           11      31.0       40.0
8              St.1        28            9      10.0       10.0
9              St.2        14           22      25.0       30.0
10             St.3         5           40       8.0       15.0
11             St.4        11           10      31.0       40.0
12             St.1        61           35      10.0       10.0
13             St.2        23           29      25.0       30.0
14             St.3        35           12       8.0       15.0
15             St.4        31            7      31.0       40.0

CodePudding user response:

df['Air Station Code'] = 'St.'   pd.Series(df[['Latitude','Longitude']].astype(str).agg(sum, axis=1).factorize()[0]   1).astype(str)

df
Out[77]: 
   Air Station Code  Humidity  Temperature  Latitude  Longitude
0              St.1        20           10      10.0       10.0
1              St.2         4           15      25.0       30.0
2              St.3        16           21       8.0       15.0
3              St.4        38            8      31.0       40.0
4              St.1        10           18      10.0       10.0
5              St.2        40            4      25.0       30.0
6              St.3        10           13       8.0       15.0
7              St.4        46           11      31.0       40.0
8              St.1        28            9      10.0       10.0
9              St.2        14           22      25.0       30.0
10             St.3         5           40       8.0       15.0
11             St.4        11           10      31.0       40.0
  • Related