Home > Back-end >  Pandas Transform DF to a new format
Pandas Transform DF to a new format

Time:11-11

I need help on how to properly transform my df from this:

df = pd.DataFrame({'ID': ['ID no1', "ID no2", "ID no3"],
               'ValueM2': ["11998","11076", "12025"],
               'ValueSqFt': [129145.39718,119221.07178, 129.43600276]})

to this: --> also i need it to be outputted as double quote (") instead of single quote (')

dfnew = pd.DataFrame({'ID': ["ID no1", "ID no2", "ID no3"],
               'DataMetric': [{"ValueM2": "11998"}, {"ValueM2": "11076"}, {"ValueM2": "12025"}],
               'DataImperial': [{"ValueSqFt": "129145.39718"}, {"ValueSqFt": "119221.07178"}, {"ValueSqFt": "129.43600276"}]})

CodePudding user response:

If there are only 2 columns to be manipulated, it is best to adopt a manual approach as follows:

df['ValueM2'] = [{'ValueM2': x} for x in df['ValueM2'].values]
df['ValueSqFt'] = [{"ValueSqFt": x} for x in df['ValueSqFt'].values]
df = df.rename(columns={'ValueM2': 'DataMetric', 'ValueSqFt': 'DataImperial'})

If you want to have the output with double quotes, you can use json.dumps:

import json
df['DataMetric'] = df['DataMetric'].apply(lambda x: json.dumps(x))
df['DataImperial'] = df['DataImperial'].apply(lambda x: json.dumps(x))

or

df['DataMetric'] = df['DataMetric'].astype(str).apply(lambda x: x.replace("'", '"'))
df['DataImperial'] = df['DataImperial'].astype(str).apply(lambda x: x.replace("'", '"'))

but this will convert the date type to string!

  • Related