I have a dataset like the one below:
import pandas as pd
data = {'id': ['001', '002', '003','004'],
'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
"1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
"William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
"215 S 11th ST"],
'locality': [None, None, None,'Laramie'],
'region': [None, None, None, 'WY'],
'Zipcode': [None, None, None, '87656'],
'Country': [None, None, None, 'US']
}
df = pd.DataFrame(data)
As you can see, the address in the 4th row doesn't include the locality, region, zipcode, or country; those values are in separate columns instead.
I am trying to do this with an if statement. I want a condition on the dataframe so that, if locality, region, Zipcode, and Country are not None, they are concatenated onto the address column with a '\\n' separator.
sample output:
address
290 Valley Dr.\\nCasper, WY 82604\\nUSA
1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA
145 S. Durbin\\nCasper, WY 82601\\nUSA
215 S 11th ST\\nLaramie, WY 87656\\nUS
I have been trying this since yesterday. I am not from a coding background, so any help will be greatly appreciated.
Thanks
Answer:
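The following will do the job. It keeps the address as-is where all four columns are null, and otherwise appends them:
df['address'] = df['address'].where(df[['locality', 'region', 'Zipcode', 'Country']].isnull().all(axis=1), df['address'] + '\\n' + df['locality'] + ', ' + df['region'] + ' ' + df['Zipcode'] + '\\n' + df['Country'])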
[Out]:
0 William J. Clare\n290 Valley Dr.\nCasper, WY 8...
1 1180 Shelard Tower\nMinneapolis, MN 55426\nUSA
2 William N. Barnard\n145 S. Durbin\nCasper, WY ...
3 215 S 11th ST\nLaramie, WY 87656\nUS
Notes:
- I've adjusted the separators to match the sample output in the OP's question more closely. If needed, one can replace the ', ' or ' ' with '\\n'.
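As an alternative sketch (not the approach used above), the same result can be reached with an explicit boolean mask and .loc, which only touches rows where all four columns are filled in. The names SEP, parts, complete and suffix are illustrative, the sample data is a shortened version of the question's, and it assumes the addresses contain real newline characters; if the data holds a literal backslash followed by n, set SEP to "\\n" instead.

```python
import pandas as pd

# Assumption: addresses use real newline characters as the line separator.
SEP = "\n"

# Shortened sample data modeled on the question.
df = pd.DataFrame({
    "id": ["001", "002"],
    "address": ["1180 Shelard Tower\nMinneapolis, MN 55426\nUSA", "215 S 11th ST"],
    "locality": [None, "Laramie"],
    "region": [None, "WY"],
    "Zipcode": [None, "87656"],
    "Country": [None, "US"],
})

parts = ["locality", "region", "Zipcode", "Country"]

# True only for rows where every component column has a value.
complete = df[parts].notna().all(axis=1)

# Build "<SEP>locality, region Zipcode<SEP>Country" for the complete rows only.
suffix = (
    SEP + df.loc[complete, "locality"] + ", " + df.loc[complete, "region"]
    + " " + df.loc[complete, "Zipcode"] + SEP + df.loc[complete, "Country"]
)

# Append the suffix; rows with any missing component are left untouched.
df.loc[complete, "address"] = df.loc[complete, "address"] + suffix

print(df["address"].tolist())
```

Unlike the where-based version, a row with only some of the four columns filled in keeps its original address here instead of becoming NaN.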