My pandas dict looks like this:
import pandas as pd
data = {'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
"1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
"William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
"215 S 11th ST"]
}
df = pd.DataFrame(data)
df_dict = df.to_dict('records')
for row in df_dict:
    add = row["address"]
    print(add.split("\\n"), len(add.split("\\n")))
As you can see, I need an if statement that drops leading elements from the split address: if len(add.split("\\n")) is 4, drop the first element, and if it is 5, drop the first two elements. The result should then be saved as a pandas DataFrame.
Your help will be greatly appreciated. I am stuck because when I write the if statement, I get an error saying that the pop operation cannot be applied to str objects.
Thanks
CodePudding user response:
import pandas as pd
data = {'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
"1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
"William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
"215 S 11th ST"]
}
df = pd.DataFrame(data)
df_dict = df.to_dict('records')
for row in df_dict:
    add = row["address"]
    # split once; pop works on the resulting list, not on the str
    target = add.split("\\n")
    if len(target) == 4:
        target.pop(0)  # drop the first element
    elif len(target) == 5:
        target.pop(0)  # drop the first element
        target.pop(0)  # the old second element is now at index 0
    target = '\\n'.join(target)
    print(target)
CodePudding user response:
The add variable is actually a str, and str has no pop method. pop only works on the list returned by add.split("\\n").
If you want to remove elements from the split list, you can do something like this:
import json
import pandas as pd
data = {'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
"1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
"William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
"215 S 11th ST"]
}
df = pd.DataFrame(data)
df_dict = df.to_dict('records')
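for row in df_dict:
    add = row["address"]
    parts = add.split("\\n")  # add is a str; split it into a list first
    if len(parts) == 5:
        parts = parts[2:]  # drop the first two elements
    elif len(parts) == 4:
        parts = parts[1:]  # drop the first element
    row['address'] = "\\n".join(parts)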
print(json.dumps(df_dict, indent=4))
dataframe = pd.DataFrame(df_dict)
print(dataframe)
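If you don't need the intermediate list of dicts, the same rule can probably be applied directly to the column with Series.apply. This is only a sketch: the helper name strip_leading_lines and the SEP constant are made up for illustration, and it assumes the separator is the same "\\n" string used in the question.

```python
import pandas as pd

SEP = "\\n"  # assumed separator, copied from the question's data

def strip_leading_lines(address):
    """Drop the first element of a 4-part address, the first two of a 5-part one."""
    parts = address.split(SEP)
    if len(parts) == 5:
        parts = parts[2:]
    elif len(parts) == 4:
        parts = parts[1:]
    # Any other length is returned unchanged.
    return SEP.join(parts)

data = {'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
                    "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
                    "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
                    "215 S 11th ST"]
        }
df = pd.DataFrame(data)

# Apply the helper to every value in the column and write the result back.
df["address"] = df["address"].apply(strip_leading_lines)
print(df)
```

Slicing avoids the index shifting you get with repeated pop calls, and the result stays in the DataFrame, so no conversion back from to_dict('records') is needed.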