Pandas - Create dataframe

I'm trying to create a DataFrame from a CSV file that has multiple columns and rows. One of the columns contains either 'yes' or 'no', and I only want the DataFrame to include the rows that have 'yes'. Can someone show me how to write this code? Thanks in advance.

CodePudding user response：

You can read the file, then filter the DataFrame to keep only the "yes" rows. For example:

import pandas as pd
df = pd.read_csv("data.csv")
# "column" is a placeholder for the name of your yes/no column
df = df[df["column"] == 'yes']
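To try this without a file on disk, here is a minimal sketch that reads a small in-memory CSV instead. The column names name and choice and the sample rows are made up for illustration; swap in your own file path and column name.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for data.csv.
csv_text = """name,choice
Ann,yes
Bob,no
Cara,yes
"""

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# Boolean indexing: keep only the rows whose "choice" value equals "yes".
df_yes = df[df["choice"] == "yes"]

print(df_yes)
```

The comparison is exact, so values such as "Yes" or " yes" would not match unless you normalize the column first.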

CodePudding user response：

Here are a few ways to do this. Say your column is named choice and your DataFrame is named df:

df_new = df[df['choice'] == 'yes']

In this case, df_new is a DataFrame that contains only the rows where choice is 'yes'.

The code below does the same thing.

mask = df['choice'] == 'yes'
  
# new dataframe with selected rows
df_new = pd.DataFrame(df[mask])

You can also try this:

# build the mask from the column's .values attribute (the underlying array)
mask = df['choice'].values == 'yes'
  
# new dataframe
df_new = df[mask]
  
print(df_new)
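As a quick check that the three variants above select the same rows, here is a small sketch on made-up data. The DataFrame contents and the names a, b, c and same are only for illustration.

```python
import pandas as pd

# Hypothetical data; "choice" is the column name assumed in this answer.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cara"], "choice": ["yes", "no", "yes"]}
)

# Variant 1: filter directly with a boolean Series.
a = df[df["choice"] == "yes"]

# Variant 2: build the mask first, then wrap the result in pd.DataFrame.
mask = df["choice"] == "yes"
b = pd.DataFrame(df[mask])

# Variant 3: build the mask from the column's underlying array.
c = df[df["choice"].values == "yes"]

# DataFrame.equals compares values, index, and columns.
same = a.equals(b) and a.equals(c)
print(same)
```

Note that the filtered rows keep their original index labels (0 and 2 here). If you want a fresh 0..n-1 index, you can call reset_index(drop=True) on the result.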