I've been using idxmax() after value_counts() to get the most common value in each column of a pandas DataFrame. Now I'd like to wrap this in a function and pass the method to call, either idxmax() or idxmin(), as an argument.
This code works fine and gives me the output I want:
import pandas as pd

test = ['00100',
        '11110',
        '10110',
        '10111',
        '10101',
        '01111',
        '00111',
        '11100',
        '10000',
        '11001',
        '00010',
        '01010']
split_lines = [list(x) for x in test]
inp = pd.DataFrame(split_lines)

def get_binary(x, y):
    df = x
    b = []
    for col in df.columns:
        res = df[col].value_counts()
        if res[0] == res[1]:
            # tie between the two values: fall back to y
            b.append(y)
        else:
            b.append(res.idxmax())
        # keep only the rows that match the value chosen for this column
        df = df[df[col] == b[col]]
    return b

answer = get_binary(inp, '1')
print(answer)
Output:
['1', '0', '1', '1', '1']
However, this doesn't work:
def get_binary(x, y, z):
    df = x
    b = []
    for col in df.columns:
        res = df[col].value_counts()
        if res[0] == res[1]:
            b.append(y)
        else:
            b.append(z)
        df = df[df[col] == b[col]]
    return b

answer = get_binary(inp, '1', 'res.idxmax()')
It returns this error:
  File "<input>", line 17, in <module>
  File "<input>", line 7, in get_binary
  File "/Users/user/adventofcode/lib/python3.8/site-packages/pandas/core/series.py", line 939, in __getitem__
    return self._values[key]
IndexError: index 0 is out of bounds for axis 0 with size 0
Any help appreciated!
CodePudding user response:
I am not sure why you want to do this, but yes, you can, using __getattribute__:
def get_binary(x, y, z):
    ....
        if res[0] == res[1]:
            b.append(y)
        else:
            b.append(res.__getattribute__(z)())
    ....
Calling:
answer = get_binary(inp, '1', 'idxmax')
This method works only in Python-3.x.
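For context, the original attempt fails because 'res.idxmax()' is appended as a literal string. No row ever equals that string, so the filtered DataFrame is empty and the next res[0] raises the IndexError. Below is a minimal, self-contained sketch of the same idea that uses the built-in getattr() to look the method up by name. The parameter names (tie_value, method_name), the single-value guard, and the use of .iloc for positional access are my own assumptions, not part of the question's code.

```python
import pandas as pd

test = ['00100', '11110', '10110', '10111', '10101', '01111',
        '00111', '11100', '10000', '11001', '00010', '01010']
inp = pd.DataFrame([list(x) for x in test])


def get_binary(df, tie_value, method_name):
    """Pick one value per column, filtering the rows after each pick.

    method_name is the name of a Series method, e.g. 'idxmax' or 'idxmin'.
    """
    chosen = []
    for col in df.columns:
        counts = df[col].value_counts()
        if len(counts) == 1:
            # Assumption: if only one distinct value is left, take it.
            chosen.append(counts.index[0])
        elif counts.iloc[0] == counts.iloc[1]:
            # Tie between the two values: fall back to tie_value.
            chosen.append(tie_value)
        else:
            # Look the method up by name on the Series, then call it.
            chosen.append(getattr(counts, method_name)())
        # Keep only the rows matching the value just chosen.
        df = df[df[col] == chosen[-1]]
    return chosen


most_common = get_binary(inp, '1', 'idxmax')
least_common = get_binary(inp, '0', 'idxmin')
print(most_common)   # ['1', '0', '1', '1', '1'], as in the question
print(least_common)
```

The single-value guard matters for 'idxmin': after a few rounds of filtering only one row may remain, and then there is no second count to compare against. Another option, if you would rather not pass strings around, is to pass the method itself (for example pd.Series.idxmax) and call it as method(counts).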