How to use if to check for a value in a DataFrame after groupby?

This is my code. It works up to df = pd.DataFrame(df), and if I call print(df.loc["invoice type"]) before the if, that works too. But when I add the if, I get the error "KeyError: 'type'". After the groupby, the column name type appears on a line below the other column names.

What I need is to look for a type name that contains words and a number, for example invoice type - 5, and print only the value of column 2, using an if to do the search.

import pandas as pd

df = pd.read_excel("example.xlsx", header=0)
df = df.groupby("type")[["column 2", "column 3", "column 4"]].sum()

df = pd.DataFrame(df)

if "invoice type - 5" in df["type"]:  # raises KeyError: 'type'
    print(df.loc["invoice type"])

CodePudding user response：

First of all, you don't want to use df[type] - you want to use either df['type'] (note the quotes) or df.type.

Secondly, you can't check if "your string" in df.your_column, because in on a Series tests the index labels rather than the values. You need to use df.your_column.str.contains("your string") instead:

# Assumes 'type' is still a regular column, e.g. after df.reset_index().
mask = df['type'].str.contains("invoice type - 5")
if mask.any():
    print(df.loc[mask, "column 2"])
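For what it's worth, the KeyError in the question most likely comes from groupby moving type into the index, so df["type"] no longer exists as a column. Here is a minimal sketch under that assumption. The rows are made up to stand in for example.xlsx, and the names raw, grouped and mask are only illustrative:

```python
import pandas as pd

# Made-up rows standing in for example.xlsx (hypothetical values).
raw = pd.DataFrame({
    "type": ["invoice type - 5", "invoice type - 5", "receipt - 1"],
    "column 2": [10, 20, 7],
    "column 3": [1, 2, 3],
})

grouped = raw.groupby("type")[["column 2", "column 3"]].sum()

# After groupby, "type" is the index, not a column.
type_is_column = "type" in grouped.columns

# Exact match: test membership against the index labels.
if "invoice type - 5" in grouped.index:
    exact_value = grouped.loc["invoice type - 5", "column 2"]
    print(exact_value)

# Partial match: build a boolean mask from the index labels.
mask = grouped.index.str.contains("invoice type", regex=False)
if mask.any():
    partial_values = grouped.loc[mask, "column 2"]
    print(partial_values)
```

Checking against grouped.index keeps the original groupby call unchanged. The partial match covers the case where only part of the type name is known.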

CodePudding user response：

Let's use as_index=False in groupby, then query to filter the groupby results:

df.groupby('type', as_index=False)[["column 2","column 3", "column 4"]]\
  .sum().query('type == "invoice type - 5"')

MCVE:

import numpy as np
import pandas as pd

df = pd.DataFrame({'type':np.random.choice([*'abcde'], 100),
                   'col1':np.random.random(100),
                   'col2':np.random.random(100)})

df.groupby('type', as_index=False)[['col1', 'col2']]\
  .sum().query('type == "a"')

Output:

  type      col1      col2
0    a  9.226583  8.052578
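To tie this back to the question's if, one option is to test whether the filtered result is empty before printing column 2. This is only a sketch with fixed, made-up rows in place of example.xlsx, so the result is reproducible. The names raw, summed and match are illustrative:

```python
import pandas as pd

# Made-up rows standing in for example.xlsx (hypothetical values).
raw = pd.DataFrame({
    "type": ["invoice type - 5", "invoice type - 5", "receipt - 1"],
    "column 2": [10, 20, 7],
})

# as_index=False keeps "type" as a regular column after the groupby.
summed = raw.groupby("type", as_index=False)[["column 2"]].sum()

# Filter to the wanted type; the result is empty if nothing matches.
match = summed.query('type == "invoice type - 5"')

if not match.empty:
    column_2_values = match["column 2"].tolist()
    print(column_2_values)
```

Using .empty avoids the ambiguous truth value error that comes from putting a DataFrame directly in an if.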