Pandas - Propagate the string ending with certain substring down a column until a new one is listed-CodePudding

I've converted a pdf table into a csv and need to clean up two index columns below.

I'd like to propagate the string ending with "LB" down Column A until a new one is listed.

Trying to create two new columns, from this...:

A          B

30,000 LB  NaN
foo        high
bar        low
25,000 LB  NaN
baz        high
zoo        low

...to this.:

A            B

30,000 LB    high
30,000 LB    low
25,000 LB    high
25,000 LB    low

CodePudding user response：

You can check for column A ends with LB by str.endswith().

Then use .where() to mask other row entries not ends with LB to NaN.

Finally, forward fill the NaN values with .ffill() for column A

To further clean up, we remove the rows where column B have NaN values by .dropna().

df['A'] = df['A'].where(df['A'].str.endswith('LB')).ffill()

df = df.dropna(subset=['B'])

Result:

print(df)

           A     B
1  30,000 LB  high
2  30,000 LB   low
4  25,000 LB  high
5  25,000 LB   low

CodePudding user response：

It looks like something went wrong with converting PDF?

Anyhow, I would first put NaN in column A, where is not NaN in column B. Then use fillna from pandas to propagate down the values in column A. Finally, dropna since in column B there are still missing values.