Home > OS >  How to create a single new column with numbers present in multiple columns in Excel using Excel or P
How to create a single new column with numbers present in multiple columns in Excel using Excel or P

Time:06-24

I have a file with several columns in which there are numbers. I would like these numbers to be in a single column. However, the title of the column and a line must also be present for each digit. In Excel it would be better, and otherwise with pandas in Python.

Here is an example:

original file

enter image description here

what I would like:

enter image description here

CodePudding user response:

With pandas you can try this:

import pandas as pd
vars = {
    'title_1': ['word_1', 'word_5'],
    'title_2': ['word_2', 'word_6'],
    'title_3': ['word_3', 'word_7'],
    'title_4': ['word_4', 'word_8'],
    'title_5': ['1', ''],
    'title_6': ['', '2'],
    'title_7': ['3', ''],
    'title_8': ['', '4'],
}

inital_df = pd.DataFrame(vars)

words_df = inital_df[['title_1', 'title_2', 'title_3', 'title_4']]
numbers_df = inital_df[['title_5', 'title_6', 'title_7', 'title_8']]
new_df = pd.DataFrame(columns = ['title_1', 'title_2', 'title_3', 'title_4', 'new_column_1', 'new_column_2'])

for row in range(numbers_df.shape[0]):
    for column in range(numbers_df.shape[1]):
        if numbers_df.iloc[row, column] != '':
            values =  words_df.iloc[row].to_list() 
            values.append(numbers_df.columns[column])
            values.append(numbers_df.iloc[row, column])
            new_df = new_df.append({new_df.columns[idx]:values[idx] for idx, value in enumerate(values)}, ignore_index=True)
print(new_df)

Maybe this is not the best solution considering computational cost, but it solves your problem.

  • Related