Splitting the columns and append the values in the dataframe

Time:11-13

I have a dataframe with a single column, as follows:

|REGION/CATEGORY|
|---|
|NORTHERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|WESTERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|SOUTHERN REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|
|EASTERN REGION|
|THERMAL|
|HYDRO|
|NORTH EASTERN REGION|
|THERMAL|
|HYDRO|
|ALL INDIA REGION|
|THERMAL|
|HYDRO|
|NUCLEAR|

I want to split this column into two separate columns in the dataframe, named REGION and CATEGORY.

REGION = ['NORTHERN REGION','WESTERN REGION','SOUTHERN REGION','EASTERN REGION','NORTH EASTERN REGION']
CATEGORY = ['THERMAL','NUCLEAR','HYDRO']

How can I write an if else statement so that I can get the following as desired output:

id REGION CATEGORY
11 NORTHERN REGION THERMAL
12 NORTHERN REGION NUCLEAR
13 NORTHERN REGION HYDRO
14 WESTERN REGION THERMAL
15 WESTERN REGION NUCLEAR
16 WESTERN REGION HYDRO

for df['REGION'] in df:
    if df['REGION'] == 'REGION':
        df['REGION'] = df['REGION'].append('REGION')
    elif df['CATEGORY'] == CATEGORY:
        df['CATEGORY'] = df['CATEGORY'].append('CATEGORY')

The loop above is what I tried: appending the values to the new columns after splitting.

CodePudding user response:

For the sake of simplicity I have used a list for the initial data.

import pandas as pd

data = ['NORTHERN REGION','THERMAL','NUCLEAR','HYDRO',
        'WESTERN REGION','THERMAL','NUCLEAR','HYDRO',
        'SOUTHERN REGION','THERMAL','NUCLEAR','HYDRO',
        'EASTERN REGION','THERMAL','HYDRO'
       ]

transformed = []
active_region = None  # most recently seen region header

for elem in data:
    if elem.endswith('REGION'):
        # A region header: remember it for the categories that follow
        active_region = elem
    else:
        # A category: pair it with the current region
        transformed.append((active_region, elem))

df = pd.DataFrame(transformed, columns=('REGION', 'CATEGORY'))
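The same idea can probably be applied directly to the dataframe column without an explicit loop. The following is only a sketch, assuming the column is named 'REGION/CATEGORY' as in the question and that every region name ends with 'REGION'; the sample `df` here is a shortened, made-up copy of the question's data.

```python
import pandas as pd

# Hypothetical input mirroring the question's single-column dataframe.
df = pd.DataFrame({'REGION/CATEGORY': [
    'NORTHERN REGION', 'THERMAL', 'HYDRO', 'NUCLEAR',
    'WESTERN REGION', 'THERMAL', 'HYDRO', 'NUCLEAR',
    'EASTERN REGION', 'THERMAL', 'HYDRO',
]})

col = df['REGION/CATEGORY']
is_region = col.str.endswith('REGION')  # True on region header rows

# Keep region names only on header rows, then carry them down to the rows below.
region = col.where(is_region).ffill()

# Drop the header rows; what remains are the category rows.
result = pd.DataFrame({
    'REGION': region[~is_region],
    'CATEGORY': col[~is_region],
}).reset_index(drop=True)

print(result)
```

If the region names did not share a common suffix, `col.isin(REGION)` with the question's REGION list could replace the `endswith` test.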

CodePudding user response:

I hope this helps you...

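import pandas as pd

# Assumes df has the single column 'REGION/CATEGORY' and REGION is the list from the question.
rows, temp = [], None
for index, row in df.iterrows():
    value = row['REGION/CATEGORY']
    if value in REGION:
        temp = value  # remember the current region
    else:
        # DataFrame.append was removed in pandas 2.0; collect rows in a list instead
        rows.append({"REGION": temp, "CATEGORY": value})
df1 = pd.DataFrame(rows, columns=["REGION", "CATEGORY"])
print(df1)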
