How do I convert this column into multiple columns?-CodePudding

My pandas DataFrame looks like this:

id	address
1	[{'city': 'MURFREESBORO', 'line': ['9999 Candy Cane Island'], 'postalCode': '39999', 'state': '56'}]
2	[{'city': 'LIKELAND', 'line': ['11111 WS 80RD ST'], 'postalCode': '71398', 'state': '99'}]
3	[{'city': 'CHASS', 'line': ['36 LONDON LN'], 'postalCode': '269235', 'state': '35'}]

How do I convert this column into multiple columns to look like this?

id	city	line	postalcode	state
1	MURFREESBORO	9999 Candy Cane Island	39999	56
2	LIKELAND	11111 WS 80RD ST	71398	99
3	CHASS	36 LONDON LN	269235	35

I have tried multiple different ways:

df = pd.json_normalize(newdf['address'])

# AND
newdf['address'] = newdf['address'].apply(lambda x: "'"   str(x)   "'")
newdf['address'] = newdf['address'].apply(str).str.replace('[', '').str.replace(']', '')

Data to construct the DataFrame:

{'id': [1, 2, 3],
 'address': [[{'city': 'MURFREESBORO', 'line': ['9999 Candy Cane Island'],
               'postalCode': '39999', 'state': '56'}],
             [{'city': 'LIKELAND', 'line': ['11111 WS 80RD ST'],
               'postalCode': '71398', 'state': '99'}],
             [{'city': 'CHASS','line': ['36 LONDON LN'],
               'postalCode': '269235', 'state': '35'}]]}

CodePudding user response：

You could explode "address" column, convert it to a list and construct a DataFrame with it; then join it back to df:

s = df['address'].explode()
out = df.join(pd.DataFrame(s.tolist(), index=s.index).explode('line')).drop(columns='address')

Output:

   id          city                    line postalCode state
0   1  MURFREESBORO  9999 Candy Cane Island      39999    56
1   2      LIKELAND        11111 WS 80RD ST      71398    99
2   3         CHASS            36 LONDON LN     269235    35

CodePudding user response：

First change 1-element list in each element of address column into the content of this element.

Then you can invoke json_normalize on the result of step 1 and save the result in a temporary DataFrame.

Both above operations can be performed as:

wrk = pd.json_normalize(df.address.str[0])

There is one more conversion to perform, i.e. change 1-element list in each element of line column into the content of this element:

wrk.line = wrk.line.str[0]

And the last step is to join id column with wrk:

result = df[['id']].join(wrk)

The result is:

   id          city                    line postalCode state
0   1  MURFREESBORO  9999 Candy Cane Island      39999    56
1   2      LIKELAND        11111 WS 80RD ST      71398    99
2   3         CHASS            36 LONDON LN     269235    35

The leftmost column is the index, which you didn't include in your post.

To easier comprehend how the above code works, print wrk after each step of my code.