I'm new to Python and have the following problem. Source table:
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
... ... ...
17181 2022-02-01T22:56:54Z 79
17182 2022-02-01T22:59:16Z 79
17183 2022-02-01T22:59:17Z 80
17184 2022-02-01T22:59:18Z 80
17185 2022-02-01T22:59:19Z 79
[17186 rows x 2 columns]
I want to divide the values with the following code:
MAX = 79
SCALLING = 100
try:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
except:
    result = result
print(result)
The code won't divide the values. I tried iterating over the DataFrame with this code:
MAX = 79
SCALLING = 100
for i in result.index:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
print(result)
But then I got this error: TypeError: '>' not supported between instances of 'dict' and 'int'. Probably this is because result.loc[cond,'Value'] has a dictionary datatype. How can I select the specific value?
CodePudding user response:
Your initial code, though not extremely "pythonic", actually works just fine with the data you've provided. E.g.:
import pandas as pd
data = {'Timestamp': {0: '2022-01-31T23:00:37Z',
                      1: '2022-01-31T23:00:38Z',
                      2: '2022-01-31T23:00:39Z',
                      3: '2022-01-31T23:00:46Z',
                      4: '2022-01-31T23:00:47Z'},
        'Value': {0: 79,
                  1: 80,
                  2: 79,
                  3: 79,
                  4: 80}}
result = pd.DataFrame(data)
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
MAX = 79
SCALLING = 100
try:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
except:
    result = result
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z 79.0
1 2022-01-31T23:00:38Z 0.8
2 2022-01-31T23:00:39Z 79.0
3 2022-01-31T23:00:46Z 79.0
4 2022-01-31T23:00:47Z 0.8
However, this error message:
TypeError: '>' not supported between instances of 'dict' and 'int'
means that you have one or more dict values somewhere in result.Value.
E.g.:
data = {'Timestamp': {0: '2022-01-31T23:00:37Z',
                      1: '2022-01-31T23:00:38Z',
                      2: '2022-01-31T23:00:39Z',
                      3: '2022-01-31T23:00:46Z',
                      4: '2022-01-31T23:00:47Z'},
        'Value': {0: {'Value': 79},
                  1: 80,
                  2: 79,
                  3: 79,
                  4: 80}}
result = pd.DataFrame(data)
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z {'Value': 79}
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
With this data, result['Value']>MAX raises the aforementioned TypeError. In your first snippet the bare try ... except construction catches it silently, so result is simply left unchanged.
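To make the swallowed error visible, one option is to catch only TypeError and report it. This is a minimal sketch on a made-up three-row frame; the `error` variable is introduced here purely for illustration and is not part of the original code.

```python
import pandas as pd

MAX = 79
SCALLING = 100

# Hypothetical data: one dict is hiding in the otherwise numeric column
result = pd.DataFrame({"Value": [{"Value": 79}, 80, 79]})

error = None
try:
    cond = result["Value"] > MAX
    result.loc[cond, "Value"] = result["Value"].div(SCALLING).round(2)
except TypeError as exc:
    # Catch only the expected error type and keep it visible
    error = exc
    print(f"Comparison failed: {exc}")
```

With the bare except from the question, nothing is printed and the frame silently stays as it was, which is why it looks as if the division simply did nothing.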
So, the remedy is to find the dict values in your column, and deal with them. To locate them, you could use:
dict_values = result[result.Value.map(type)==dict]
print(dict_values)
Timestamp Value
0 2022-01-31T23:00:37Z {'Value': 79}
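How to deal with them depends on what the dicts look like in your real data. As a rough sketch, assuming every dict stores its number under the key 'Value' (as in the made-up example above), you could unwrap them and convert the column to floats before scaling:

```python
import pandas as pd

MAX = 79
SCALLING = 100

# Hypothetical data mirroring the example above
result = pd.DataFrame({
    "Timestamp": ["2022-01-31T23:00:37Z",
                  "2022-01-31T23:00:38Z",
                  "2022-01-31T23:00:39Z"],
    "Value": [{"Value": 79}, 80, 79],
})

# Unwrap dict entries; assumes the number sits under the key 'Value'
unwrapped = result["Value"].map(
    lambda v: v["Value"] if isinstance(v, dict) else v
)

# Make the column float so that scaled values fit without a dtype change
result["Value"] = pd.to_numeric(unwrapped).astype(float)

cond = result["Value"] > MAX
result.loc[cond, "Value"] = result.loc[cond, "Value"].div(SCALLING).round(2)
print(result)
```

If your dicts use a different key or a nested layout, the lambda needs to be adapted accordingly.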
CodePudding user response:
Here's an answer if your goal is to divide all values by SCALLING if they are larger than MAX. I don't know how you got a dict error. Maybe you have a dict called MAX somewhere?
import pandas as pd
import io
# Read in your example table
result = pd.read_csv(
    io.StringIO("""
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
17181 2022-02-01T22:56:54Z 79
17182 2022-02-01T22:59:16Z 79
17183 2022-02-01T22:59:17Z 80
17184 2022-02-01T22:59:18Z 80
17185 2022-02-01T22:59:19Z 79
"""),
    sep=r'\s+',  # same effect as delim_whitespace=True, which newer pandas deprecates
    index_col=0,
    parse_dates=['Timestamp'],
)
# Divide all values by SCALLING if they are larger than MAX
MAX = 79
SCALLING = 100
cond = result['Value']>MAX
result.loc[cond,'Value'] = result.loc[cond,'Value'].div(SCALLING).round(2)
print(result)
Output
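As a variation on the same idea, the conditional division can also be written with Series.mask, which replaces values where the condition is True. The sketch below uses a small made-up frame and casts the column to float first; that cast is my own precaution, since newer pandas versions may warn about (or reject) writing floats such as 0.8 into an integer column.

```python
import pandas as pd

MAX = 79
SCALLING = 100

# Hypothetical data with the same values as the first rows of the table
result = pd.DataFrame({"Value": [79, 80, 79, 79, 80]})

# Work on a float copy so scaled values fit the column dtype
values = result["Value"].astype(float)

# Where the value exceeds MAX, take the scaled value; otherwise keep it
result["Value"] = values.mask(values > MAX, values.div(SCALLING).round(2))
print(result)
```

This assigns the whole column at once instead of writing into a slice, so the .loc[cond, ...] selection is not needed on either side.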