please help me with the following conversion please. So I have a pandas dataframe in the following format:
id | location |
---|---|
{ "0": "5", | "0": "Charlotte, North Carolina", |
"1": "5", | "1": "N/A", |
"2": "5", | "2": "Portland, Oregon", |
"3": "5", | "3": "Jonesborough, Tennessee", |
"4": "5", | "4": "Rockville, Indiana", |
"5": "5",} | "5": "Dallas, Texas", |
and would like to convert this into the following format:
A header | Another header |
---|---|
"5" | "Charlotte, North Carolina" |
"5" | "N/A" |
"5" | "Portland, Oregon" |
"5" | "Jonesborough, Tennessee" |
"5" | "Rockville, Indiana" |
"5" | "Dallas, Texas" |
Please help
CodePudding user response:
try this, "column" being the name of the column you want to set as an index, this removes the index column and sets your desired column as the first one
df=df.set_index("column")
CodePudding user response:
You can try this.
import pandas as pd
import re
df = pd.DataFrame([['{ "0": "5",', '"0": "Charlotte, North Carolina",'], ['"1": "5",','"1": "N/A",']], columns=['id', 'location'])
#using regex to extract int values and selecting second int
df['id'] = df['id'].apply(lambda x: re.findall(r'\d ', x)[1])
#Split the string with : and select second value. And remove comma
df['location'] = df['location'].apply(lambda x: x.split(':')[1][:-1])
print(df)
Output:
id location
0 5 "Charlotte, North Carolina"
1 5 "N/A"