Setting Character Limit on Pandas DataFrame Column

Time:03-25

Background:
Given the following pandas df -

Holding Account                                                                                                        Model Type  Entity ID  Direct Owner ID
WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)          51364633    4564564    5646546
RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)  46256325    1645365    4926654

The ask:
What is the most Pythonic way to enforce an 80-character limit on the values of the Holding Account column (dtype = object)?

Context: I am writing df to a .csv and then uploading it to a system with an 80-character limit. The values of the Holding Account column are unique, so I am happy to simply drop whatever characters take a string over 80 characters.

My attempt:
This is what I attempted - df['column'] = df['column'].str[:80]

CodePudding user response:

Why not just use .str, like you were doing?

df['Holding Account'] = df['Holding Account'].str[:80]

Output:

>>> df
                                                                    Holding Account  Model Type  Entity ID  Direct Owner ID
0  WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Bas    51364633    4564564          5646546
1  RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring N    46256325    1645365          4926654
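As a rough, self-contained sketch of the slicing approach above: the two long strings come from the question, while the third row (a missing value) and the `MAX_LEN` name are assumptions added for illustration. It also checks that truncation has not made two previously distinct values collide, which matters if the column is expected to stay unique.

```python
import pandas as pd

MAX_LEN = 80  # limit of the target system, as stated in the question

df = pd.DataFrame({
    "Holding Account": [
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
        "RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)",
        None,  # hypothetical missing value, not in the original data
    ],
})

# .str[:MAX_LEN] slices each string and leaves missing values missing
truncated = df["Holding Account"].str[:MAX_LEN]

# Sanity checks before writing the CSV
longest = truncated.str.len().max()                # ignores missing values
still_unique = truncated.dropna().is_unique        # False would mean a collision

df["Holding Account"] = truncated
print(longest, still_unique)
```

If `still_unique` comes back False, plain truncation is probably not safe for that data and some other shortening scheme would be needed.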

CodePudding user response:

Slicing discards some information. I would suggest factorizing the column and keeping a mapping table instead. This also saves storage space on the server or in the database.

s = df['Holding Account'].factorize()[0]
# build the code -> original string mapping before overwriting the column
d = dict(zip(s, df['Holding Account']))
df['Holding Account'] = s

If you would like to get the original values back, just do

df['new'] = df['Holding Account'].map(d)
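The factorize idea can be sketched end to end as follows. This is only an illustration: the third row (a repeat of the first value) and the names `lookup` and `Holding Account ID` are assumptions, not part of the original answer. It uses `pd.factorize`, which returns the integer codes together with the unique values.

```python
import pandas as pd

df = pd.DataFrame({
    "Holding Account": [
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
        "RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)",
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
    ],
})

# codes: one integer per row; uniques: the distinct strings in order of first appearance
codes, uniques = pd.factorize(df["Holding Account"])

# Mapping table from integer code back to the full, untruncated string
lookup = dict(enumerate(uniques))

# Short integer IDs are what would be uploaded in place of the long strings
df["Holding Account ID"] = codes

# Round trip: recover the original strings from the IDs
restored = df["Holding Account ID"].map(lookup)
print(df["Holding Account ID"].tolist())
```

The mapping table has to be stored somewhere alongside the upload, otherwise the IDs cannot be translated back later.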