How to apply pivot to the data frame columns if there is specific pattern like numbers then letters?-CodePudding

I am trying to write a code inside Python3 to convert the first table into the expected table. Would you please help me to convert the one I have into the expected one at the bottom?

CURRENT ONE:

1104369592
MA01609Worcester
MA01604Worcester
MA01604Worcester
1104842379
MA01040Hampden
MA01040Hampden
1104896613
MA02745Bristol
MA02745Bristol

EXPECTED ONE:

1104469592 MA01609Worcester
1104469592 MA01609Worcester
1104469592 MA01609Worcester
1104842379 MA01040Hampden
1104842379 MA01040Hampden
1104896613 MA02745Bristol
1104896613 MA02745Bristol

I did not find any solution. I tried pivot, but I dont know how to do it.

CodePudding user response：

May not be the cleanest, but it could work:

import pandas as pd

current = pd.Series([
    "1104369592",
    "MA01609Worcester",
    "MA01604Worcester",
    "MA01604Worcester",
    "1104842379",
    "MA01040Hampden",
    "MA01040Hampden",
    "1104896613",
    "MA02745Bristol",
    "MA02745Bristol"
])

def get_subframe(x):
    return pd.DataFrame({
        "num": x.iloc[0],
        "val": x.iloc[1:]
    })

groups = s.str.isnumeric().cumsum()
out = s.groupby(groups).apply(get_subframe).reset_index(drop=True)

out:

          num               val
0  1104369592  MA01609Worcester
1  1104369592  MA01604Worcester
2  1104369592  MA01604Worcester
3  1104842379    MA01040Hampden
4  1104842379    MA01040Hampden
5  1104896613    MA02745Bristol
6  1104896613    MA02745Bristol

CodePudding user response：

Another method:

mask = df["Column1"].str.match(r"\d{9}")

df["Column2"] = df.loc[~mask, "Column1"]
df.loc[~mask, "Column1"] = np.nan
df["Column1"] = df["Column1"].ffill()
df = df.dropna()

print(df)

Prints:

      Column1           Column2
1  1104369592  MA01609Worcester
2  1104369592  MA01604Worcester
3  1104369592  MA01604Worcester
5  1104842379    MA01040Hampden
6  1104842379    MA01040Hampden
8  1104896613    MA02745Bristol
9  1104896613    MA02745Bristol

Input dataframe:

            Column1
0        1104369592
1  MA01609Worcester
2  MA01604Worcester
3  MA01604Worcester
4        1104842379
5    MA01040Hampden
6    MA01040Hampden
7        1104896613
8    MA02745Bristol
9    MA02745Bristol