My dataframe
is currently created to be wide
with many columns, after the for
statement below is executed. I want to stack multiple columns of data so that the dataframe
is long
and remove any blank rows from col4
before the output dataframe
is generated. The reason for the latter part (remove blanks before output is generated) is because the dataframe will be too large for any output to be created with the blank values included.
code:
# dataframe
df0 = pd.DataFrame(data ={'col1':[123,123,456,456],'col2':['one two three',
'green yellow','four five six','green yellow']})
# words to search for
search_words1 = ['one','three','four','six','green yellow']
# create columns for each search word and indicate if search word is found for each row
for n in search_words1:
df0[n] = np.where(df0['col2'].str.contains(n),n,'')
# stack all search word columns created and remove blank rows in col4 before output is generated
df0 = pd.concat([
df0[['col1']].melt(value_name='col3'),
df0[['one','three','four','six','green yellow']].melt(value_name='col4')],
axis=1)
df0.loc[:,['col3','col4']]
current output:
col3 col4
0 123.0 one
1 123.0
2 456.0
3 456.0
4 NaN three
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN four
11 NaN
12 NaN
13 NaN
14 NaN six
15 NaN
16 NaN
17 NaN green yellow
18 NaN
19 NaN green yellow
desired output:
col3 col4
0 123.0 one
1 123.0 three
2 123.0 green yellow
3 456.0 four
4 456.0 six
5 456.0 green yellow
CodePudding user response:
try this:
search_words1 = ['one','three','four','six','green yellow']
search_words1 = '|'.join(search_words1)
df0['col2'] = df0.col2.str.findall(search_words1)
df0.explode('col2')
>>>
col1 col2
0 123 one
0 123 three
1 123 green yellow
2 456 four
2 456 six
3 456 green yellow
CodePudding user response:
you can remove all nan
and blank spaces like such.
col3 = df0['col3']
col4 = df0['col4']
three = col3[col3.notna()]
four = col4[col4 != ""]
print(three, '\n', four)
out:
0 123.0
1 123.0
2 456.0
3 456.0
Name: col3, dtype: float64
0 one
4 three
10 four
14 six
17 green yellow
19 green yellow
Name: col4, dtype: object