Background:
Given the following pandas df:
Holding Account | Model Type | Entity ID | Direct Owner ID |
---|---|---|---|
WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515) | 51364633 | 4564564 | 5646546 |
RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218) | 46256325 | 1645365 | 4926654 |
The ask:
What is the most pythonic way to enforce an 80-character limit on the values of the Holding Account column (dtype = object)?
Context: I am writing df to a .csv and then uploading it to a system with an 80-character limit. The values of the Holding Account column are unique, so I just want to sacrifice the characters that take the string over 80 characters.
My attempt:
This is what I attempted - df['column'] = df['column'].str[:80]
CodePudding user response:
Why not just use .str, like you were doing?
df['Holding Account'] = df['Holding Account'].str[:80]
Output:
>>> df
Holding Account Model Type Entity ID Direct Owner ID
0 WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Bas 51364633 4564564 5646546
1 RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring N 46256325 1645365 4926654
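Since the question notes that the Holding Account values are unique, it may be worth checking that they are still unique after being cut to 80 characters. The following is a minimal sketch, not a definitive implementation: it rebuilds only the Holding Account column from the question, and the names MAX_LEN, truncated, max_len_after and still_unique are illustrative assumptions.

```python
import pandas as pd

# Only the Holding Account column from the question is reproduced here.
df = pd.DataFrame({
    "Holding Account": [
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
        "RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)",
    ],
})

MAX_LEN = 80  # hypothetical constant for the target system's limit

# .str.slice(0, MAX_LEN) is equivalent to .str[:MAX_LEN]
truncated = df["Holding Account"].str.slice(0, MAX_LEN)

# Sanity checks: no value exceeds the limit, and no two values collide
max_len_after = int(truncated.str.len().max())
still_unique = bool(truncated.is_unique)

df["Holding Account"] = truncated
```

If still_unique came back False, two accounts would share the same first 80 characters and would need a different shortening rule before upload.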
CodePudding user response:
Slicing will lose some information. I would suggest factorizing the column and keeping a mapping table instead. This also saves storage space on the server or db.
# factorize() returns (integer codes, unique values)
codes, uniques = df['Holding Account'].factorize()
# mapping table: integer code -> original string
d = dict(enumerate(uniques))
df['Holding Account'] = codes
If you would like to get the data back, just do
df['new'] = df['Holding Account'].map(d)
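To make the factorize round trip concrete, here is a small self-contained sketch. The sample values and the names names, code_to_name and restored are hypothetical and chosen only for illustration; the point is that the mapping table must go from integer code to original string for the values to be recoverable.

```python
import pandas as pd

# Hypothetical sample values (not from the question), with one repeat
names = pd.Series(
    ["alpha account", "beta account", "alpha account"],
    name="Holding Account",
)

# pd.factorize returns integer codes plus the unique values,
# in order of first appearance
codes, uniques = pd.factorize(names)

# Mapping table: integer code -> original string
code_to_name = dict(enumerate(uniques))

# Recover the original strings from the codes
restored = pd.Series(codes).map(code_to_name)
```

The integer codes are short enough for any length limit, but the receiving system then only sees numbers, so the mapping table has to be stored somewhere alongside the upload.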