I have a data frame containing names and fathers' names, as shown in the example below:
name 1
name 2
name 3
father's name 1
father's name 2
father's name 3
name 4
name 5
name 6
father's name 4
father's name 5
father's name 6
Now I want to create a data frame from this so that it appears as shown below:
name 1 father's name 1
name 2 father's name 2
name 3 father's name 3
name 4 father's name 4
name 5 father's name 5
name 6 father's name 6
I want to do this in pandas. I tried the approach from the linked question "Pandas every nth row", but it did not help much.
CodePudding user response:
You can build a boolean mask from the index of that column (rows 0-2 hold names, rows 3-5 hold fathers' names, and so on), use it to split the column into two parts, and then concatenate the parts row by row:
# True for rows 0-2, 6-8, ... (names); False for rows 3-5, 9-11, ... (fathers' names)
mask = df['col'].index.values // 3 % 2 == 0
out = df.loc[mask, ['col']].reset_index(drop=True) + ' ' + df.loc[~mask, ['col']].reset_index(drop=True)
print(out)
col
0 name 1 father's name 1
1 name 2 father's name 2
2 name 3 father's name 3
3 name 4 father's name 4
4 name 5 father's name 5
5 name 6 father's name 6
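For anyone who wants to run the mask approach end to end, here is a minimal self-contained sketch. It assumes the column is called `col` as in the answer above; the block size `n` and the variable names `names` and `fathers` are introduced here for illustration only. It builds the mask from row positions rather than index labels, so it should also work if the index is not a default 0..N-1 range.

```python
import numpy as np
import pandas as pd

# Example data from the question; the column name 'col' follows the answer above.
df = pd.DataFrame({'col': [
    'name 1', 'name 2', 'name 3',
    "father's name 1", "father's name 2", "father's name 3",
    'name 4', 'name 5', 'name 6',
    "father's name 4", "father's name 5", "father's name 6",
]})

n = 3  # assumed block size: n names followed by n fathers' names

# True for rows in even-numbered blocks (names),
# False for rows in odd-numbered blocks (fathers' names).
mask = (np.arange(len(df)) // n) % 2 == 0

# Split the column and renumber both halves so they align row by row.
names = df.loc[mask, 'col'].reset_index(drop=True)
fathers = df.loc[~mask, 'col'].reset_index(drop=True)

# Element-wise string concatenation, then back to a one-column frame.
out = (names + ' ' + fathers).to_frame('col')
print(out)
```

If the blocks in the real data are not all the same length, a positional mask like this will pair the wrong rows, so it is worth checking that `len(names) == len(fathers)` first.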
CodePudding user response:
Try extracting the trailing number from each value and grouping on it:
df
col1
0 name 1
1 name 2
2 name 3
3 father's name 1
4 father's name 2
5 father's name 3
6 name 4
7 name 5
8 name 6
9 father's name 4
10 father's name 5
11 father's name 6
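# Use the trailing number of each value as the grouping key
df['col2'] = df['col1'].str.split(r'\s').str[-1]
# Join each name with its father's name; rows within a group keep their original order
df1 = df.groupby('col2')['col1'].apply(lambda x: ' '.join(x)).reset_index()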
df1
col2 col1
0 1 name 1 father's name 1
1 2 name 2 father's name 2
2 3 name 3 father's name 3
3 4 name 4 father's name 4
4 5 name 5 father's name 5
5 6 name 6 father's name 6
df1 = df1.drop(columns='col2')
df1
col1
0 name 1 father's name 1
1 name 2 father's name 2
2 name 3 father's name 3
3 name 4 father's name 4
4 name 5 father's name 5
5 name 6 father's name 6
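One thing to watch with the grouping approach: the key in `col2` is a string, so with ten or more names the groups would be sorted as '1', '10', '2', ... rather than numerically. The sketch below is one possible variation, not part of the original answer. It assumes every value ends in a number, pulls that number out with `str.extract`, and converts it to an integer before grouping. The name `key` is hypothetical.

```python
import pandas as pd

# Example data from the question; the column name 'col1' follows the answer above.
df = pd.DataFrame({'col1': [
    'name 1', 'name 2', 'name 3',
    "father's name 1", "father's name 2", "father's name 3",
    'name 4', 'name 5', 'name 6',
    "father's name 4", "father's name 5", "father's name 6",
]})

# Trailing digits as an integer key, so that 10 sorts after 9.
# Assumes every value ends in a number; astype(int) fails otherwise.
key = df['col1'].str.extract(r'(\d+)$', expand=False).astype(int).rename('key')

# Join the rows of each group in their original order (name first, then father's name).
df1 = (
    df.groupby(key)['col1']
      .agg(' '.join)
      .reset_index(drop=True)
      .to_frame()
)
print(df1)
```

Unlike the positional mask, this version does not depend on the block size, only on the numbers matching between a name and its father's name.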