I have a dataframe with a single column, as follows:
|REGION/CATEGORY|
|---|
|NORTHERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|WESTERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|SOUTHERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|EASTERN REGION|
|THERMAL|
|HYDRO|
|NORTH EASTERN REGION|
|THERMAL|
|HYDRO|
|ALL INDIA REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
I want to split this column into two separate columns in the dataframe, named REGION and CATEGORY.
REGION = ['NORTHERN REGION','WESTERN REGION','SOUTHERN REGION','EASTERN REGION','NORTH EASTERN REGION']
CATEGORY = ['THERMAL','NUCLEAR','HYDRO']
How can I write an if/else statement to get the following desired output?
id | REGION | CATEGORY |
---|---|---|
11 | NORTHERN REGION | THERMAL |
12 | NORTHERN REGION | NUCLEAR |
13 | NORTHERN REGION | HYDRO |
14 | WESTERN REGION | THERMAL |
15 | WESTERN REGION | NUCLEAR |
16 | WESTERN REGION | HYDRO |
for df['REGION'] in df:
    if df['REGION'] == 'REGION':
        df['REGION'] = df['REGION'].append('REGION')
    elif df['CATEGORY'] == CATEGORY:
        df['CATEGORY'] = df['CATEGORY'].append('CATEGORY')
The code above is my attempt: I tried to append each value to the matching column after splitting.
CodePudding user response:
For the sake of simplicity I have used a list for the initial data.
import pandas as pd

data = ['NORTHERN REGION', 'THERMAL', 'NUCLEAR', 'HYDRO',
        'WESTERN REGION', 'THERMAL', 'NUCLEAR', 'HYDRO',
        'SOUTHERN REGION', 'THERMAL', 'NUCLEAR', 'HYDRO',
        'EASTERN REGION', 'THERMAL', 'HYDRO'
        ]
transformed = []
for elem in data:
    if elem.endswith('REGION'):
        # A region header: remember it for the categories that follow
        active_region = elem
    else:
        # A category: pair it with the most recent region header
        transformed.append((active_region, elem))
df = pd.DataFrame(transformed, columns=('REGION', 'CATEGORY'))
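If the data is already in a dataframe, the same loop can be run over the column values. This is only a minimal sketch: it assumes the column is named REGION/CATEGORY as in the question, and the helper name split_region_category and the small sample frame are made up for illustration.

```python
import pandas as pd


def split_region_category(df, column="REGION/CATEGORY"):
    """Pair each category with the most recent region header above it.

    Hypothetical helper; the default column name is taken from the question.
    """
    transformed = []
    active_region = None  # stays None until the first region header is seen
    for elem in df[column].tolist():
        if elem.endswith("REGION"):
            active_region = elem
        else:
            transformed.append((active_region, elem))
    return pd.DataFrame(transformed, columns=["REGION", "CATEGORY"])


# Small sample in the same shape as the question's column
sample = pd.DataFrame({"REGION/CATEGORY": [
    "NORTHERN REGION", "THERMAL", "HYDRO", "NUCLEAR",
    "EASTERN REGION", "THERMAL", "HYDRO",
]})
result = split_region_category(sample)
print(result)
```

Note that a category appearing before any region header would get None as its region in this sketch.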
CodePudding user response:
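I hope this helps. Collect the rows in a list and build the new dataframe at the end (DataFrame.append was removed in pandas 2.0):
import pandas as pd

# REGION is the list of region names defined in the question
rows = []
for index, row in df.iterrows():
    value = row['REGION/CATEGORY']
    if value in REGION:
        temp = value  # remember the most recent region header
    else:
        # anything not listed in REGION is treated as a category
        rows.append({"REGION": temp, "CATEGORY": value})
df1 = pd.DataFrame(rows, columns=["REGION", "CATEGORY"])
print(df1)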
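The same result can probably also be reached without an explicit loop. The following is a sketch of an alternative technique (a boolean mask plus forward fill), not the method used above. It assumes every region header ends with the word REGION and that the column is named REGION/CATEGORY as in the question; the sample frame and the names sample, is_region and out are illustrative only.

```python
import pandas as pd

# Small sample in the same shape as the question's column
sample = pd.DataFrame({"REGION/CATEGORY": [
    "NORTHERN REGION", "THERMAL", "HYDRO", "NUCLEAR",
    "EASTERN REGION", "THERMAL", "HYDRO",
]})

col = sample["REGION/CATEGORY"]

# True on the header rows (assumption: headers end with "REGION")
is_region = col.str.endswith("REGION")

# Keep the value only on header rows, then carry it down to the rows below
region = col.where(is_region).ffill()

# Drop the header rows themselves and renumber the remaining rows
out = pd.DataFrame({"REGION": region, "CATEGORY": col})[~is_region]
out = out.reset_index(drop=True)
print(out)
```

The forward fill is what replaces the temp variable from the loop: each category row inherits the last header seen above it.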