I am reading a .txt in Pandas. I want to create new column names based on values before '=>' and add values in that column, based on value after '=>'
Input:
other_tags
"access"=>"agricultural","agricultural"=>"no"
"access"=>"customers"
"access"=>"customers","amenity"=>"parking"
"description"=>"GRAND PERE DE MON AMI"
"description"=>"GRAND PERE"
"design"=>"asymmetric","power"=>"tower"
"design"=>"asymmetric","power"=>"tower"
code:
import pandas as pd
df1 = pd.read_csv('try.txt', sep ='\t')
df1['access_type'] = df1['other_tags'].get('access')
df1['amenity'] = df1['other_tags'].get('amenity')
print(df1)
expected_output:
other_tags access_type amenity
"access"=>"agricultural","agricultural"=>"no" agricultural
"access"=>"customers" customers
"access"=>"customers","amenity"=>"parking" customers parking
"description"=>"GRAND PERE DE MON AMI"
"description"=>"GRAND PERE"
"design"=>"asymmetric","power"=>"tower"
"design"=>"asymmetric","power"=>"tower"
CodePudding user response:
Explode and expand your column other_tags
then pivot it and finally join it to your original dataframe.
df = df.join(df['other_tags'].str.replace('"', '')
.str.split(',').explode().str.split('=>', expand=True)
.reset_index().pivot('index', 0, 1).fillna(''))
Output
>>> df
other_tags access agricultural amenity description design power
0 "access"=>"agricultural","agricultural"=>"no" agricultural no
1 "access"=>"customers" customers
2 "access"=>"customers","amenity"=>"parking" customers parking
3 "description"=>"GRAND PERE DE MON AMI" GRAND PERE DE MON AMI
4 "description"=>"GRAND PERE" GRAND PERE
5 "design"=>"asymmetric","power"=>"tower" asymmetric tower
6 "design"=>"asymmetric","power"=>"tower" asymmetric tower
If you want to keep only the 2 columns access
and amenity
:
COLS_TO_KEEP = ['access', 'amenity']
df = df.join(df['other_tags'].str.replace('"', '')
.str.split(',').explode().str.split('=>', expand=True)
.reset_index().pivot('index', 0, 1)[COLS_TO_KEEP].fillna(''))
Output:
>>> df
other_tags access amenity
0 "access"=>"agricultural","agricultural"=>"no" agricultural
1 "access"=>"customers" customers
2 "access"=>"customers","amenity"=>"parking" customers parking
3 "description"=>"GRAND PERE DE MON AMI"
4 "description"=>"GRAND PERE"
5 "design"=>"asymmetric","power"=>"tower"
6 "design"=>"asymmetric","power"=>"tower"