Creating a DataFrame that contains two specific years from an existing DataFrame

Time:02-28

I'm using pandas and I'm stuck on building a new DataFrame that contains only the years 2012 and 2015 and only the teams TOR and NYA. The data comes from a .csv file that I have already imported as df_baseball, and I want the rows for both years in a single DataFrame.

df_2012 = pd.DataFrame(df_baseball[(df_baseball['Year '] == 2012) & 
                                   (df_baseball['Year '] == 2015) & 
                                   (df_baseball['Team '] == 'TOR') & 
                                   (df_baseball['Team '] == 'NYA')], 
                       columns = ['Games_Won', 'Runs_Scored','At_Bats','Hits',
                                  'Doubles','Triples','Home_Runs','Walks', 
                                  'Runs_Against','Earned_Runs',
                                  'Earned_Run_Average','Complete_Games',
                                  'Shutout','Saves','Infield_Put_Outs',
                                  'Hits_Allowed','Home_Run_Allowed', 
                                  'Walks_Allowed','Strikeouts_Allowed',
                                  'Errors','Fielding_Percentage'])

Am I using the wrong operator, or is my syntax wrong? I would highly appreciate any responses!

CodePudding user response:

You want the OR operator (|) because a year cannot be 2012 and 2015 at the same time; similarly, a team cannot be TOR and NYA at the same time. With & between all four conditions, no row can ever match, so the result is empty. You could also use isin instead of writing OR between every condition.
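To illustrate the difference, here is a minimal sketch. The small df_baseball built here is a hypothetical stand-in for the CSV data (the values are made up); only the column names 'Year ' and 'Team ', with their trailing spaces, are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for the imported CSV; values are made-up placeholders.
# The trailing spaces in 'Year ' and 'Team ' mirror the question's column names.
df_baseball = pd.DataFrame({
    'Year ': [2012, 2012, 2015, 2015, 2013],
    'Team ': ['TOR', 'NYA', 'TOR', 'BOS', 'NYA'],
    'Games_Won': [1, 2, 3, 4, 5],
})

# AND between two equality tests on the same column can never be True,
# because a single row has only one year.
and_mask = (df_baseball['Year '] == 2012) & (df_baseball['Year '] == 2015)
print(and_mask.any())  # False

# OR within each column, AND between the two columns.
# Each comparison needs its own parentheses because | and & bind
# more tightly than ==.
year_mask = (df_baseball['Year '] == 2012) | (df_baseball['Year '] == 2015)
team_mask = (df_baseball['Team '] == 'TOR') | (df_baseball['Team '] == 'NYA')
df_or = df_baseball[year_mask & team_mask]
print(df_or)
```

Only the rows whose year is 2012 or 2015 and whose team is TOR or NYA survive the filter.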

Also, isin (or OR) produces a boolean mask that you can use to filter df_baseball directly. The filtered result is already a DataFrame, so you don't need to pass it to the DataFrame constructor. The following should suffice:

df_2012 = df_baseball[df_baseball['Year '].isin([2012, 2015]) & df_baseball['Team '].isin(['TOR','NYA'])]
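If you also want to keep only some of the columns, as the original attempt did with columns=, one option is to pass the mask and a column list to .loc in a single step. This is a sketch under assumptions: the sample frame and its values are hypothetical, and wanted_columns is shortened to two of the names from the question.

```python
import pandas as pd

# Hypothetical stand-in for the imported CSV; values are made-up placeholders.
df_baseball = pd.DataFrame({
    'Year ': [2012, 2012, 2015, 2015, 2013],
    'Team ': ['TOR', 'NYA', 'TOR', 'BOS', 'NYA'],
    'Games_Won': [1, 2, 3, 4, 5],
    'Hits': [10, 20, 30, 40, 50],
})

# Boolean mask: year is 2012 or 2015, and team is TOR or NYA.
mask = (df_baseball['Year '].isin([2012, 2015])
        & df_baseball['Team '].isin(['TOR', 'NYA']))

# Shortened column list for illustration; extend it with the other names.
wanted_columns = ['Games_Won', 'Hits']

# .loc filters rows by the mask and selects columns in one step.
df_2012 = df_baseball.loc[mask, wanted_columns]
print(df_2012)

# Optional: if the trailing spaces in 'Year ' and 'Team ' come from the CSV
# header, stripping them once makes later lookups less error-prone.
df_clean = df_baseball.copy()
df_clean.columns = df_clean.columns.str.strip()
print(list(df_clean.columns))
```

Whether to strip the column names depends on how the CSV header is actually written; the question's code suggests the spaces are present in the file.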