I've searched many questions here but couldn't find an answer that fits my case, so please help me.
I have a string column in df:
Farms |
---|
Albatros |
Bali |
Casablanca |
Desired output:
Farms | ACR | sourcekey |
---|---|---|
Albatros | Alb | Db_Alb_key |
Bali | Bal | Db_Bal_key |
Casablanca | Cas | Db_Cas_key |
My main focus here is to have a unique source key, because afterwards I need to create those tables in the database.
So what is the best solution in terms of performance? Should I use a for-each loop? Should I create a separate ACR (acronym) table?
I am using Python 3.8.10.
If you need any more information, please let me know. I am just a noob, and it is really frustrating to get stuck.
Thank you so much!
CodePudding user response:
Simply use string slicing and vectorized string concatenation:
df['ACR'] = df['Farms'].str[:3]
df['sourcekey'] = 'Db_' + df['ACR'] + '_key'
output:
Farms ACR sourcekey
0 Albatros Alb Db_Alb_key
1 Bali Bal Db_Bal_key
2 Casablanca Cas Db_Cas_key
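Since the question stresses that the source key must be unique, note that the first three letters of a name are not guaranteed to be unique. A minimal sketch of how one might check for collisions before creating the tables; the extra farm "Balmoral" is a made-up row added only to force a clash with "Bali":

```python
import pandas as pd

# Hypothetical sample data: "Balmoral" is not in the question, it is added
# here only to produce a collision with "Bali".
df = pd.DataFrame({"Farms": ["Albatros", "Bali", "Casablanca", "Balmoral"]})

df["ACR"] = df["Farms"].str[:3]
df["sourcekey"] = "Db_" + df["ACR"] + "_key"

# keep=False flags every row whose key is shared with at least one other row.
collisions = df[df["sourcekey"].duplicated(keep=False)]

print(df["sourcekey"].is_unique)
print(collisions)
```

If `is_unique` is `False`, the rows in `collisions` are the ones that need a different acronym before they can be used as table names.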
CodePudding user response:
This should work for you:
df['ACR'] = df.Farms.apply(lambda x: x[:3])
df['sourcekey'] = df.ACR.apply(lambda x: 'Db_' + x + '_key')
Output:
>>> df
Farms ACR sourcekey
0 Albatros Alb Db_Alb_key
1 Bali Bal Db_Bal_key
2 Casablanca Cas Db_Cas_key
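If two farms happen to share the same first three letters, one possible way to keep the keys unique is to append a counter to the repeated acronyms. This is only a sketch under assumptions: the "Balmoral" row is invented to create a duplicate, and the numbering scheme (`Bal`, `Bal1`, ...) is an illustrative choice, not something the question asks for. It could still clash if a real farm name starts with, say, "Bal1".

```python
import pandas as pd

# Hypothetical sample data: "Balmoral" is added only to create a duplicate prefix.
df = pd.DataFrame({"Farms": ["Albatros", "Bali", "Casablanca", "Balmoral"]})

prefix = df["Farms"].str[:3]

# cumcount numbers the rows inside each prefix group: 0 for the first
# occurrence, 1 for the second, and so on.
n = prefix.groupby(prefix).cumcount()

# Keep the plain prefix for the first occurrence, append the counter otherwise.
df["ACR"] = prefix.where(n == 0, prefix + n.astype(str))
df["sourcekey"] = "Db_" + df["ACR"] + "_key"

print(df)
```

Everything here stays vectorized, so no explicit loop over the rows is needed.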