Home > Enterprise >  Change Values in Python DataFrame to a binary Flag (For-Loop Problem)
Change Values in Python DataFrame to a binary Flag (For-Loop Problem)

Time:12-17

my question may be stupid but i am stuck for some time now.

I have a DataFrame, acquired from MySQL, in which the local weatherforecast is stored. Now i want to alter the Day to a binary float. Hereby a workday (Mon to Fri) should become a 1 and a weekend should become a 0.

I implemented the code before in the same script and it worked just fine. Now in this instance it just wouldn't.

I guess it's a pretty stupid mistake...

Here's my Code:

import mysql.connector
import pandas as pd

mydb = mysql.connector.connect(
  host="localhost",
  user="root",
  password="XXXXXX",
  database="Munich-Weather"
)

mycursor = mydb.cursor()

mycursor.execute("""SELECT timestamp, 
                 temperature-273.15 As temperature FROM weatherforecast""")

myresult = mycursor.fetchall()

data_weather = pd.DataFrame(data=myresult,
                    columns=['datetime','T_AMB'])

data_weather['datetime'] = pd.to_datetime(data_weather['datetime'])
data_weather['T_AMB']=pd.to_numeric(data_weather['T_AMB'])

'Wochentag und Stunde als Integer bestimmen'
data_weather['Hour']=data_weather['datetime'].dt.hour
data_weather['Day']=data_weather['datetime'].dt.dayofweek
data_weather['Hour']=pd.to_numeric(data_weather['Hour'], downcast="float")
data_weather['Day']=pd.to_numeric(data_weather['Day'], downcast="float")

for index, row in data_weather.iterrows():
    if row['Day'] < 5:
        row['Day']=1
    else:
        row['Day']=0

And here is the DataFrame data_weather after running the code:

runfile('C:/Simulation/Wettervorhersage/Regressionsanalyse.py', wdir='C:/Simulation/Wettervorhersage')
              datetime  T_AMB  Hour  Day
0  2021-12-04 12:00:00   1.94  12.0  5.0
1  2021-12-04 15:00:00   2.88  15.0  5.0
2  2021-12-04 18:00:00   5.37  18.0  5.0
3  2021-12-04 21:00:00   4.50  21.0  5.0
4  2021-12-05 00:00:00   2.12   0.0  6.0
5  2021-12-05 03:00:00   0.90   3.0  6.0
6  2021-12-05 06:00:00  -0.32   6.0  6.0
7  2021-12-05 09:00:00   0.98   9.0  6.0
8  2021-12-05 12:00:00   2.27  12.0  6.0
9  2021-12-05 15:00:00   1.45  15.0  6.0
10 2021-12-05 18:00:00   0.87  18.0  6.0
11 2021-12-05 21:00:00   0.16  21.0  6.0
12 2021-12-06 00:00:00  -1.26   0.0  0.0
13 2021-12-06 03:00:00  -0.91   3.0  0.0
14 2021-12-06 06:00:00  -1.45   6.0  0.0
15 2021-12-06 09:00:00  -0.13   9.0  0.0
16 2021-12-06 12:00:00   1.76  12.0  0.0
17 2021-12-06 15:00:00   1.25  15.0  0.0
18 2021-12-06 18:00:00   0.66  18.0  0.0
19 2021-12-06 21:00:00  -2.03  21.0  0.0
20 2021-12-07 00:00:00  -2.77   0.0  1.0
21 2021-12-07 03:00:00  -2.00   3.0  1.0
22 2021-12-07 06:00:00  -0.18   6.0  1.0
23 2021-12-07 09:00:00   0.71   9.0  1.0
24 2021-12-07 12:00:00   1.79  12.0  1.0
25 2021-12-07 15:00:00  -0.13  15.0  1.0
26 2021-12-07 18:00:00  -3.04  18.0  1.0
27 2021-12-07 21:00:00  -3.11  21.0  1.0
28 2021-12-08 00:00:00  -4.50   0.0  2.0
29 2021-12-08 03:00:00  -5.93   3.0  2.0
30 2021-12-08 06:00:00  -4.14   6.0  2.0
31 2021-12-08 09:00:00  -1.42   9.0  2.0
32 2021-12-08 12:00:00   1.05  12.0  2.0
33 2021-12-08 15:00:00  -1.52  15.0  2.0
34 2021-12-08 18:00:00  -2.22  18.0  2.0
35 2021-12-08 21:00:00  -2.61  21.0  2.0
36 2021-12-09 00:00:00  -1.73   0.0  3.0
37 2021-12-09 03:00:00  -2.58   3.0  3.0
38 2021-12-09 06:00:00  -2.59   6.0  3.0
39 2021-12-09 09:00:00  -1.43   9.0  3.0

As you can see the Column "Day" is not binary

Thank you in advance! Flaayor

CodePudding user response:

Try:

df = pd.DataFrame({'datetime': pd.date_range('2021-12-13', '2021-12-19', freq='D')})
df['Day'] = (df['datetime'].dt.weekday // 5 == 0).astype(float)
print(df)

# Output:
    datetime  Day
0 2021-12-13  1.0
1 2021-12-14  1.0
2 2021-12-15  1.0
3 2021-12-16  1.0
4 2021-12-17  1.0
5 2021-12-18  0.0
6 2021-12-19  0.0
  • Related