I am writing a custom error message for when two pandas Series are not equal, and I want to use '<' to point at the differences.
Here's the workflow for a failed equality:
- Convert both lists to pandas Series:
series1, series2 = pd.Series(list1), pd.Series(list2)
- Side by side comparison in a dataframe:
table = pd.concat([series1, series2], axis=1)
- Add column and index names:
table.columns = ['...', '...']
table.index = ['...', '...']
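A minimal sketch of the steps above, assuming the sample values shown in the tables that follow. The names `yours` and `actual` are placeholders, and `keys=` is used here to label the columns during concatenation instead of assigning `table.columns` afterwards:

```python
import pandas as pd

# Placeholder data taken from the example tables in this question
yours = pd.Series([1, 2, 4])
actual = pd.Series([1, 2, 3])

# With axis=1, each Series becomes a column and `keys` supplies the column names
table = pd.concat([yours, actual], axis=1, keys=['Yours', 'Actual'])
print(table)
```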
Current output:
|Yours|Actual|
|1|1|
|2|2|
|4|3|
Desired output:
|Yours|Actual|-|
|1|1||
|2|2||
|4|3|<|
The naive solution is to iterate over each list index, append '<' to another list wherever the values are not equal, and then pass that list to pd.concat(),
but I am looking for a method that uses pandas directly. For example,
error_series = '<' if (abs(yours - actual) >= 1).all(axis=None) else ''
Ideally it would append '<' to a list if the difference between the results is greater than the margin of error of 1, and otherwise append nothing.
Note: Removed tables due to StackOverflow being picky and not letting me post my question
CodePudding user response:
You can create the DataFrame and set the column names in one line:
import pandas as pd
list1 = [1,2,4]
list2 = [1,2,10]
df = pd.DataFrame(zip(list1, list2), columns=['Yours', 'Actual'])
Create a boolean mask to find the rows whose difference is too large:
margin_of_error = 1
mask = df.diff(axis=1)['Actual'].abs()>margin_of_error
Add a column to the DataFrame and map the mask values to the markers you want:
df['too_different'] = mask.map({True: '<', False: ''})
Output:
   Yours  Actual too_different
0      1       1
1      2       2
2      4      10             <
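As a hedged alternative, `numpy.where` can build the marker column directly from the boolean condition. This sketch assumes the same sample data and a strict `>` comparison against the margin. Swap in `>=` if a difference equal to the margin should also be flagged:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Yours': [1, 2, 4], 'Actual': [1, 2, 10]})
margin_of_error = 1

# True where the two columns differ by more than the margin
mask = (df['Yours'] - df['Actual']).abs() > margin_of_error

# Pick '<' where the mask is True and an empty string elsewhere
df['too_different'] = np.where(mask, '<', '')
print(df)
```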
CodePudding user response:
Or you can do something like this (assuming the columns are named 'yours' and 'actual'):
df = df.assign(diffr=df.apply(
    lambda x: '<' if abs(x['yours'] - x['actual']) >= 1 else '',
    axis=1))
print(df)
'''
   yours  actual diffr
0      1       1
1      2       2
2      4       3     <
'''
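To tie this back to the original goal, the marker column can be wrapped into an error message. The following is only a sketch: `diff_report` is a hypothetical helper name, and the `>= margin` threshold mirrors the comparison used in the answer above.

```python
import pandas as pd

def diff_report(yours, actual, margin=1):
    """Hypothetical helper: side-by-side table with '<' on rows off by >= margin."""
    table = pd.DataFrame({'yours': list(yours), 'actual': list(actual)})
    # Vectorized comparison, no Python-level loop over rows
    off = (table['yours'] - table['actual']).abs() >= margin
    table['diffr'] = off.map({True: '<', False: ''})
    return table

report = diff_report([1, 2, 4], [1, 2, 3])
message = ''
if (report['diffr'] == '<').any():
    # Render the whole table as text for use in an error message
    message = "Series are not equal:\n" + report.to_string()
    print(message)
```

The resulting `message` string could then be passed to an exception such as `AssertionError`.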