Home > database >  Python Pandas Data frame Pivoting
Python Pandas Data frame Pivoting

Time:12-04

I have such .txt file:

Field Value
First 1
Second alfa
First 23
Second beta
First 55
Second omega

I need to read and transform this file to get data like this:

First Second
1 alfa
23 beta
55 omega

I start with this:

file = './data.txt'
df = pd.read_csv(file, sep='\t',header=None, skiprows=89, skipfooter=11, engine='python')
df = df.pivot(values=1, columns=0)

but it looks as I need to generate some indexes otherwise my pivoted table looks not very well

First Second
1
alfa
23
beta
55
omega

Is any other solution hot to read that data and get the results that I need?

CodePudding user response:

in order to make your code work I had to modify the way you access the .csv file, as I don't have that many rows.

import pandas as pd

file = './data.txt'
df = pd.read_csv(file, sep='\t',header=0, engine='python')
df = df.pivot(values='Value', columns='Field')

# for each column on the dataframe, sort the value and ignore the index
for col in df.columns:
      df[col] = df[col].sort_values(ignore_index=True)

# drop NaN
df.dropna(axis=0, how='all', inplace=True)

# Show dataframe
print(df)

Here some more info about .sort_values:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html

Hope it can help :)

CodePudding user response:

The trick is you need to create common keys for the index. Using .assign create a column named CommonKeys which is the cumcount of grouping on the Fields column. Finally chain functions to pivot and clean up the df.

df = (
    df.assign(CommonKeys=df.groupby("Field").cumcount())
    .pivot(index="CommonKeys", columns="Field", values="Value")
    .reset_index(drop=True)
    .rename_axis(None, axis=1)
)

print(df)

Output:

  First Second
0     1   alfa
1    23   beta
2    55  omega
  • Related