Hi, I have a column that contains both integer and text values. I am trying to write a function in Python that counts only the integer values for each value in another column.
Here is a sample of the data:
constructorId positionText
1 3
1 4
1 R
4 6
4 5
4 N
4 9
and I want the result to look like this:
constructorId positionText_count
1 2
4 3
Here is the code:
def not_finished(c):
    r = 0
    for c in hybrid_era_results['constructorId']:
        y = hybrid_era_results['positionText']
        if isinstance(y, int):
            r = r + 1
    return r
This code does not throw an error, but when I call the function it always returns 0. What am I doing wrong?
CodePudding user response:
As there are no clear details about the data, here is a simple example that checks which integer values from col1 also exist in col2, where both columns are of type str.
import pandas as pd
# simple example df
df = pd.DataFrame({"col1": ["value1", "4", "5", "3"], "col2": ["3", "6", "value2", "4"]})
# get boolean masks marking the integer-like values in each column
idx_col1 = df.col1.str.isdigit()
idx_col2 = df.col2.str.isdigit()
# retrieve integer values
df_col1_int = df.loc[idx_col1, "col1"]
df_col2_int = df.loc[idx_col2, "col2"]
# get only the values from col1 that exist in col2 (matching row positions are not required)
idx_exists = df_col1_int.isin(df_col2_int)
df_exists = df_col1_int[idx_exists]
# get the number of integers in common
len(df_exists)
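The same str.isdigit mask can also be counted per group, which is closer to what the question asks for. The following is only a minimal sketch: it assumes positionText is stored as strings, and the frame is rebuilt here from the sample rows in the question.

```python
import pandas as pd

# Sample rows from the question; positionText is assumed to hold strings.
hybrid_era_results = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# True where the string consists only of digits, False for "R", "N", etc.
is_int = hybrid_era_results["positionText"].str.isdigit()

# Summing booleans per constructorId counts the True values in each group.
digit_counts = is_int.groupby(hybrid_era_results["constructorId"]).sum()
print(digit_counts)
```

Note that str.isdigit treats values such as "-3" or "3.0" as non-integers, so this only fits data where positions are plain non-negative whole numbers.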
CodePudding user response:
IIUC, you can use to_numeric to filter the numeric values, then groupby.sum:
(pd.to_numeric(df['positionText'], errors='coerce')
   .notna()
   .groupby(df['constructorId'])
   .sum()
   .reset_index()
)
output:
constructorId positionText
0 1 2
1 4 3
To set a custom column name, use .agg(positionText_count='sum') in place of .sum() (or .agg(**{'positionText_count': 'sum'}) if the name contains spaces or special characters):
constructorId positionText_count
0 1 2
1 4 3
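For reference, here is the whole approach as one runnable sketch on the sample data from the question. The DataFrame name hybrid_era_results is taken from the question, and its contents are assumed to match the sample rows. As a side note on the original function: y there is the entire positionText column (a Series), so isinstance(y, int) is never True, which is why it returns 0.

```python
import pandas as pd

# Sample rows from the question; positionText mixes digits and letters as strings.
hybrid_era_results = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# Non-numeric entries become NaN with errors="coerce"; notna() marks the numeric ones.
is_numeric = pd.to_numeric(
    hybrid_era_results["positionText"], errors="coerce"
).notna()

# Count the numeric entries per constructorId and name the result column.
counts = (
    is_numeric
    .groupby(hybrid_era_results["constructorId"])
    .agg(positionText_count="sum")
    .reset_index()
)
print(counts)
```

Unlike a str.isdigit check, to_numeric also accepts values such as "3.5" or "-1", so this counts anything that parses as a number rather than strictly integers.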