I would like to create a column that holds the result of applying boolean logic to a list stored in another column.
import pandas as pd
import numpy as np
d = {'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0]}
df = pd.DataFrame(data=d)
#Storing Values in List
df['seq'] = df.agg(list, axis=1)
#Or
#df['seq'] = df.agg(np.array, axis=1)
df
The desired output is a new column (df['seqToFs']) holding a list of True/False values, one per value in the df['seq'] list, indicating whether that value is > 8000000.
import pandas as pd
d = {'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0],
'seq':[[7180516.0,433433740.0,5444119.0],[4868058.0,452632806.0,10000000.0]], 'seqToFs':[[False,True,False],[False,True,True]]}
df = pd.DataFrame(data=d)
df
Is it better to make df['seq'] a list or an np.array for performance?
My end goal is to analyze sequential orders of values meeting conditions. Is there a better way to perform such analysis than storing lists in a DataFrame?
Example framework of what I was trying to apply to each row (not my code):
original_prices = [1.25, -9.45, 10.22, 3.78, -5.92, 1.16]
prices = [True if i > 0 else False for i in original_prices]
prices
Here the original_prices list would be replaced with the row list, df['seq'], and prices would be the new column df['seqToFs']. I am getting errors because of the list format.
Help would be much appreciated.
CodePudding user response:
You can use the normal > operator and then use agg or apply to get the desired output:
(df > 8000000).apply(list, axis=1)
0 [False, True, False]
1 [False, True, True]
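One caveat, assuming the frame already contains the list column seq from the question: comparing the whole frame against a number would also try to compare each list with that number, which is expected to raise a TypeError. A minimal sketch that restricts the comparison to the numeric columns, and alternatively applies the question's list comprehension to seq directly (value_cols and seqToFs_from_seq are names introduced here for illustration only):

```python
import pandas as pd

d = {'202201': [7180516.0, 4868058.0],
     '202202': [433433740.0, 452632806.0],
     '202203': [5444119.0, 10000000.0]}
df = pd.DataFrame(data=d)

# Remember the numeric columns before any list columns are added
# (value_cols is a hypothetical helper name).
value_cols = list(df.columns)

# Store each row's values as a list, as in the question.
df['seq'] = pd.Series(df[value_cols].to_numpy().tolist(), index=df.index)

# Option 1: vectorized comparison on the numeric columns only.
mask = df[value_cols].to_numpy() > 8000000
df['seqToFs'] = pd.Series(mask.tolist(), index=df.index)

# Option 2: list comprehension applied to the existing list column.
df['seqToFs_from_seq'] = df['seq'].apply(lambda values: [v > 8000000 for v in values])
```

The NumPy comparison avoids a Python-level loop over the values, so it is likely the faster of the two on large frames; the list comprehension is mainly useful when only seq is available.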
Example:
df = pd.DataFrame({'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0]})
df['seqToFs'] = (df > 8000000).apply(list, axis=1)
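On the question of whether lists in a DataFrame are needed at all: the boolean frame produced by the comparison can often be analyzed directly. The question does not spell out what "sequential orders" means, so the sketch below assumes it means consecutive runs of values above the threshold within a row; longest_true_run and the longestRun column are hypothetical names, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'202201': [7180516.0, 4868058.0],
                   '202202': [433433740.0, 452632806.0],
                   '202203': [5444119.0, 10000000.0]})

# Boolean frame: True where the value exceeds the threshold.
mask = df > 8000000

def longest_true_run(flags):
    """Return the length of the longest consecutive run of True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

# One result per row, computed from the mask rather than from stored lists.
df['longestRun'] = [longest_true_run(row) for row in mask.to_numpy()]
```

With this sample data the first row has a single value above the threshold and the second row has two in a row, so longestRun comes out as 1 and 2.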