Python pandas filter by word

Time:07-06

I have a CSV file:

df=pd.read_csv(Path(os.getcwd() + r'\all_files.csv'), sep=',', on_bad_lines='skip', index_col=False, dtype='unicode')

column:

column=input("Column:")

word:

word=input("Word:")

I want to filter the CSV file:

df2=df[(df[column].dropna().str.contains(word.lower()))]

But when I enter ЄДРПОУ(Гр.8) as the column,

I get an error:

Warning (from warnings module):
  File "C:\python\python\FilterExcelFiles.py", line 35
    df2=df[(df[column].dropna().str.contains(word.lower()))]
UserWarning: Boolean Series key will be reindexed to match DataFrame index.
Traceback (most recent call last):
  File "C:\python\python\FilterExcelFiles.py", line 51, in <module>
    s()
  File "C:\python\python\FilterExcelFiles.py", line 35, in s
    df2=df[(df[column].dropna().str.contains(word.lower()))]
  File "C:\Users\Станислав\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py", line 3496, in __getitem__
    return self._getitem_bool_array(key)
  File "C:\Users\Станислав\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py", line 3549, in _getitem_bool_array
    key = check_bool_indexer(self.index, key)
  File "C:\Users\Станислав\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 2383, in check_bool_indexer
    raise IndexingError(
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

I also want to lowercase df[column].

CodePudding user response:

You're dropping the NaNs in the indexer, so the boolean Series is likely shorter than the DataFrame and its index no longer matches df's index, which causes the error in boolean indexing.
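To see the mismatch concretely, here is a minimal sketch on a made-up three-row frame (the column name "name" and its values are assumptions for illustration, not taken from the question's file). It only compares the two indexes and does not attempt the failing lookup itself:

```python
import pandas as pd

# Hypothetical sample data: one missing value in the middle row.
df = pd.DataFrame({"name": ["Alpha", None, "beta"]})

# dropna() removes row 1, so the mask has fewer rows than df.
mask = df["name"].dropna().str.contains("a")

print(len(df), len(mask))  # 3 2

# Row labels present in df but absent from the mask.
missing_labels = df.index.difference(mask.index).tolist()
print(missing_labels)  # [1]
```

Because label 1 has no entry in the mask, pandas cannot align the mask with df, which is what the IndexingError reports.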

Don't call dropna. For missing values str.contains can return NaN rather than False, so pass na=False to treat them as non-matches:

df2 = df[df[column].str.contains(word.lower(), na=False)]

Alternatively, if an operation returns NaNs, you can fill them with False:

df2 = df[df[column].str.contains(word.lower()).fillna(False)]
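The question also asks about lowercasing df[column]. One possible way to combine that with the filter is sketched below. The helper name filter_by_word and the sample data are hypothetical, and regex=False is an assumption that the search word should be matched literally rather than as a regular expression:

```python
import pandas as pd

def filter_by_word(df, column, word):
    """Return rows whose `column` contains `word`, ignoring case."""
    mask = (
        df[column]
        .str.lower()  # lowercase the column values
        # na=False: missing values count as non-matches
        # regex=False: treat the word as plain text
        .str.contains(word.lower(), na=False, regex=False)
    )
    return df[mask]

# Hypothetical sample data with a missing value.
sample = pd.DataFrame({"name": ["Alpha Ltd", None, "BETA", "gamma"]})
result = filter_by_word(sample, "name", "AL")
print(result)
```

The mask keeps the same index as the DataFrame because no rows are dropped before the comparison, so boolean indexing aligns without error.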

CodePudding user response:

I searched around for an answer and came across a similar post that might solve your problem. Hope it helps.
