Setting Character Limit on Pandas DataFrame Column

Background:
Given the following pandas df -

Holding Account	Model Type	Entity ID	Direct Owner ID
WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)	51364633	4564564	5646546
RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)	46256325	1645365	4926654

The ask:
What is the most Pythonic way to enforce an 80-character limit on the values of the Holding Account column (dtype = object)?

Context: I am writing df to a .csv and then uploading it to a system with an 80-character limit. The values of the Holding Account column are unique, so I just want to sacrifice the characters that take a string over 80 characters.

My attempt:
This is what I attempted - df['column'] = df['column'].str[:80]

CodePudding user response：

Why not just use .str, like you were doing?

df['Holding Account'] = df['Holding Account'].str[:80]

Output:

>>> df
                                                                    Holding Account  Model Type  Entity ID  Direct Owner ID
0  WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Bas    51364633    4564564          5646546
1  RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring N    46256325    1645365          4926654
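
Since the question says the Holding Account values are unique, it may be worth checking that they are still unique after the cut. Below is a minimal sketch of that check, rebuilding the sample frame from the question; the names MAX_LEN, truncated, longest and has_collisions are made up for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "Holding Account": [
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
        "RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)",
    ],
    "Model Type": [51364633, 46256325],
    "Entity ID": [4564564, 1645365],
    "Direct Owner ID": [5646546, 4926654],
})

MAX_LEN = 80  # character limit of the target system (from the question)

# Keep at most the first MAX_LEN characters of every value
truncated = df["Holding Account"].str[:MAX_LEN]

# Sanity checks: no value exceeds the limit, and truncation created no duplicates
longest = truncated.str.len().max()
has_collisions = truncated.duplicated().any()

df["Holding Account"] = truncated
```

If has_collisions comes back True, two accounts share the same first 80 characters and plain slicing is no longer safe for that data.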

CodePudding user response：

Slicing will lose some information. I would suggest factorizing the column and keeping a mapping table from the integer codes back to the original values. This also saves storage space on the server or in the db.

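# factorize returns the integer codes and the unique original values
codes, uniques = df['Holding Account'].factorize()
# mapping table: code -> original string (build it before overwriting the column)
d = dict(enumerate(uniques))
df['Holding Account'] = codes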

If you would like to get the original values back, just do

df['new'] = df['Holding Account'].map(d)
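
For reference, here is a self-contained sketch of the whole round trip. The account names are hypothetical placeholders, and the column names "Holding Account Code" and "restored" are made up for this example.

```python
import pandas as pd

# Hypothetical sample data; real values would be the long account strings
df = pd.DataFrame({"Holding Account": ["Account A", "Account B", "Account A"]})

# codes: one integer per row; uniques: the distinct original strings
codes, uniques = pd.factorize(df["Holding Account"])

# Mapping table from integer code back to the original string
lookup = dict(enumerate(uniques))

# Store the short codes, then restore the originals from the mapping table
df["Holding Account Code"] = codes
df["restored"] = df["Holding Account Code"].map(lookup)
```

The mapping table has to be stored somewhere alongside the upload, otherwise the codes cannot be translated back later.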