Pandas ValueError: Columns must be same length as key PyCharm

Time:01-11

I want to use a dictionary to add a column to a pandas DataFrame. I apply a function to each row using apply with a lambda, and I get 'ValueError: Columns must be same length as key'. I should be able to add a new column, but to simplify the example I already included the column to change in the df.

I don't see what I'm doing wrong.

import pandas as pd

court_dict = dict(zip(['INC:INC08 Pensions', 'TX:TX01 Federal Tax', 'HO:HO08 Rent'], [8, 8, 0]))
bank_info = {
        'Category':['INC:INC08 Pensions', 'TX:TX01 Federal Tax', 'HO:HO08 Rent'],
        'Amount':[1250.23, 300.0, 1000],
        'Paragraph': ['', '', '', ]
            }
bank2 = pd.DataFrame(bank_info)


def get_column_names(row: pd.core.series.Series, position: int) -> str:
    category = row['Category']
    result = court_dict.get(category, 'd')
    print(category, result)
    return result


if __name__=="__main__":
    bank2[['Paragraph']] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
    print(bank2)

Here's the output:

C:\Users\Steve\anaconda3\envs\AccountingPersonal\python.exe C:\Users\Steve\PycharmProjects\AccountingPersonal\src\get_simple.py 
INC:INC08 Pensions 8
TX:TX01 Federal Tax 8
HO:HO08 Rent 0
Traceback (most recent call last):
  File "C:\Users\Steve\PycharmProjects\AccountingPersonal\src\get_simple.py", line 20, in <module>
    bank2[['Paragraph']] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3643, in __setitem__
    self._setitem_array(key, value)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3702, in _setitem_array
    self._iset_not_inplace(key, value)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3721, in _iset_not_inplace
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

Process finished with exit code 1

CodePudding user response:

Everything looks fine except the brackets on the left-hand side. Use [] instead of [[]]:

bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)

bank2['Paragraph'] (single brackets) refers to a Series; bank2[['Paragraph']] (double brackets) refers to a DataFrame.
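To make the difference concrete, here is a minimal sketch. The frame `df` and its values are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical two-row frame, used only to show the two indexing forms
df = pd.DataFrame({"Category": ["a", "b"], "Amount": [1.0, 2.0]})

single = df["Category"]      # single brackets -> Series (1-D)
double = df[["Category"]]    # double brackets -> DataFrame (2-D, one column)

print(type(single).__name__, single.shape)  # Series (2,)
print(type(double).__name__, double.shape)  # DataFrame (2, 1)
```

The same distinction applies on the left-hand side of an assignment: a list key tells pandas to expect one column of values per name in the list.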

CodePudding user response:

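With bank2[['Paragraph']], the key is a list of column names, so pandas treats the target as a DataFrame and not a Series. The value on the right is a one-dimensional Series with three entries, and pandas compares those three entries against the single name in the key, which is why it raises "Columns must be same length as key". You need to use single square brackets [] instead.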

def get_column_names(row: pd.core.series.Series, position: int) -> str:
    category = row['Category']
    result = court_dict.get(category, 'd')
    print(category, result)
    return result

if __name__=="__main__":
    bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1) # <- line updated
    print(bank2)

By the way, you can get the same column with pandas.Series.map, without apply or a custom function.

if __name__=="__main__":
    bank2['Paragraph'] = bank2['Category'].map(court_dict)
    print(bank2)
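One difference worth knowing: the original function falls back to 'd' through court_dict.get(category, 'd'), whereas map with a dict yields NaN for categories that are not in the dictionary. A small sketch of both behaviours follows; the "XX:Unknown" category is invented here just to trigger the fallback.

```python
import pandas as pd

court_dict = {"INC:INC08 Pensions": 8, "TX:TX01 Federal Tax": 8, "HO:HO08 Rent": 0}

# "XX:Unknown" is a made-up category that is absent from court_dict
bank2 = pd.DataFrame(
    {"Category": ["INC:INC08 Pensions", "HO:HO08 Rent", "XX:Unknown"]}
)

# Dict lookup: categories missing from the dict become NaN
mapped = bank2["Category"].map(court_dict)

# Function lookup: keeps the 'd' fallback used in get_column_names
with_default = bank2["Category"].map(lambda c: court_dict.get(c, "d"))

print(mapped)
print(with_default)
```

If every category is guaranteed to be in the dictionary, as in the question's data, the plain map(court_dict) form is enough.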

CodePudding user response:

Use single brackets when you assign a single column. Double brackets refer to a DataFrame of the listed column(s), so the value you assign must provide one column per name in the list.

bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
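For reference, here is a rough sketch of when each form is accepted. The frame and the "Copy" column name are hypothetical, and the exact behaviour of the failing line may depend on the pandas version; the traceback in the question shows it raising in the asker's environment.

```python
import pandas as pd

# Hypothetical data standing in for the asker's frame
df = pd.DataFrame({"Category": ["a", "b", "c"]})
values = pd.Series([8, 8, 0])  # 1-D result, like the output of apply(..., axis=1)

try:
    # List key with one name, 1-D value with three entries: lengths disagree
    df[["Paragraph"]] = values
except ValueError as exc:
    print("ValueError:", exc)

# Single key: the Series is aligned on the index and stored as one column
df["Paragraph"] = values

# A list key is fine when the value is a DataFrame with one column per name
df[["Copy"]] = values.to_frame()

print(df)
```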