How do I create my DataFrame to show only French movies in the 'Language' column of my dat

Time:10-29

How do I create my DataFrame to show only French movies in the 'Language' column of my dataset where there are multiple languages in the column?

Example:

Languages column:
French
English
German,French,Spanish
Spanish,English,French
French, English, German

What I have been trying only brings back the rows that have French as the only value in the language column. Please help!

I have tried:

df.loc[df['column_name'] == some_value]

but it only returns movies that are in the French language only, not those that are in French but also in other languages.

CodePudding user response:

Use str.contains with word boundaries (\b) to avoid matching substrings (e.g. 'Abc' matching 'Abcde'):

df.loc[df['column_name'].str.contains(r'\bFrench\b', case=False)]

If you are sure that no other value contains 'French' as a substring (plausible for language names), a plain substring match is enough:

df.loc[df['column_name'].str.contains('French', case=False)]
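As a rough illustration of the word-boundary approach, here is a minimal sketch run against the sample values from the question. The column names "Title" and "Language" and the one-letter titles are assumptions made for the example, and na=False is an optional extra that treats missing values as non-matches.

```python
import pandas as pd

# Sample data mirroring the question; the column names and titles are assumed.
df = pd.DataFrame({
    "Title": ["A", "B", "C", "D", "E"],
    "Language": [
        "French",
        "English",
        "German,French,Spanish",
        "Spanish,English,French",
        "French, English, German",
    ],
})

# \b marks a word boundary, so "French" is matched as a whole word only.
# na=False counts missing values as non-matches instead of propagating NaN.
mask = df["Language"].str.contains(r"\bFrench\b", case=False, na=False)

# Keep every row whose Language cell mentions French, alone or with others.
french_movies = df.loc[mask]
print(french_movies)
```

Every row except the English-only one is kept, regardless of where French appears in the list or whether a space follows the comma.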

CodePudding user response:

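The loc indexer selects rows by label or by a boolean mask. You can also filter with plain boolean indexing, although an equality test like this one only returns rows whose value is exactly 'value':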

df[df['column_name'] == 'value']
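To make the difference between an equality test and a substring test concrete, here is a small sketch on three of the sample values from the question. The variable names are hypothetical and chosen only for this example.

```python
import pandas as pd

# Three of the sample values from the question.
languages = pd.Series(["French", "English", "German,French,Spanish"])

# Equality is True only where the whole cell is exactly "French".
exact = languages == "French"

# str.contains is True wherever "French" appears anywhere in the cell.
partial = languages.str.contains("French")

print(exact.tolist())
print(partial.tolist())
```

The equality mask misses the multi-language row, which is the behaviour described in the question, while the str.contains mask includes it.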