Home > Net >  Creating new column with values that depends on other columns
Creating new column with values that depends on other columns

Time:01-10

i have a pandas dataframe with play by play basketball data.

I want to look in the game events column where a player missed a throw, and if he missed a throw, I want to add a new column "Missed throw" and in this row set the value from 0 to 1. If he missed the next throw I want to increase the value in the column from 1 to 2 etc.

This is my dataframe

import pandas as pd

url = 'https://www.basketball-reference.com/boxscores/pbp/200911060GSW.html'
dfs = pd.read_html(url)

df = dfs[0] 
df.columns = df.columns.droplevel() # drops the "1st Q" Multilevel header of the dataframe

df.rename(columns={'Unnamed: 2_level_1': 'PM1', 'Unnamed: 4_level_1': 'PM2'}, inplace=True)

df

then i have made a subset of curry because I focus on his actions.

df_curry = df.loc[df["Golden State"].str.contains("Curry", na=False)]
df_curry`

now i tried to insert the hit and not hit throws into a new column to calculate the quote later but i always get the error "str' object has no attribute 'str'. Maybe someone can help me or give me another approach

# Calculating Hit Rate

field_throws_missed = 0
field_throws_hit = 0`

# Creating the new Columns
df_curry["Field Goals Hit"] = 0
df_curry["Field Goals Missed"] = 0
df_curry["Field Goals Percentage"] = 0`


for row in range(len(df_curry["Golden State"])):
  if df_curry.iloc[row]["Golden State"].str.contains("misses 2|misses 3"): 
    field_throws_missed  = 1
    df_curry.iloc[row]["Field Goals Missed"] = field_throws_missed
  elif df_curry.iloc[row]["Golden State"].str.contains("makes 2|makes 3"): 
    field_throws_hit  = 1
    df_curry.iloc[row]["Field Goals Hit"] = field_throws_hit`

CodePudding user response:

No loops necessary here, for count Trues values use cumulative sum by Series.cumsum:

df_curry = df.loc[df["Golden State"].str.contains("Curry", na=False)].copy()

df_curry["Field Goals Hit"] = df_curry["Golden State"].str.contains("misses 2|misses 3").cumsum()
df_curry["Field Goals Missed"] = df_curry["Golden State"].str.contains("makes 2|makes 3").cumsum()

EDIT: If need add 1 in next row use:

df_curry["Field Goals Hit"] = df_curry["Golden State"].str.contains("misses 2|misses 3").shift(fill_value=0).cumsum()
df_curry["Field Goals Missed"] = df_curry["Golden State"].str.contains("makes 2|makes 3").shift(fill_value=0).cumsum()
  • Related