Process strings in Pandas DataFrame rows into comma-delimited characters

Time:05-25

I have a dataframe in which each row holds a sequence string like this:

MKEYGEDLK

How can I process the sequence string in each row so that it ends up in the following format?

[M, K, E, Y, G, E, D, L, K]

I tried

arr = []  # list that collects the processed sequences
get_seq_str = ','.join(test_df.loc[0]['seq_1'])  # joins the characters into ONE string
arr.append(get_seq_str)

However, when I append it to the dataframe, there is a single quotation mark at the start and end of each string, which I do not want.

['M, K, E, Y, G, E, D, L, K']

How can I strip the single quotation marks?

CodePudding user response:

IIUC, you can apply list to each string value:

df['col_list'] = df['col'].apply(list)
print(df)

         col                     col_list
0  MKEYGEDLK  [M, K, E, Y, G, E, D, L, K]
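
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample row comes from the question, the column names follow the answer above, and the variable name first is only illustrative.

```python
import pandas as pd

# One sequence string per row (sample value from the question).
df = pd.DataFrame({"col": ["MKEYGEDLK"]})

# list("MKEYGEDLK") splits a string into its characters, so
# apply(list) turns every cell into a list of one-letter strings.
df["col_list"] = df["col"].apply(list)

# Each cell is now a real Python list, not a single joined string.
first = df.loc[0, "col_list"]
print(type(first).__name__, len(first))
```

Because every cell holds a list rather than one joined string, printing the dataframe shows the letters without surrounding quotes.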

CodePudding user response:

You can try unpacking the string into a list:

get_seq_str = [*test_df.loc[0]['seq_1']]
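
The difference from the attempt in the question can be seen without pandas. The sketch below assumes the quotes come from Python showing a single joined string inside a list, which is what the question's output suggests. The names seq, joined and chars are only illustrative.

```python
seq = "MKEYGEDLK"  # sample sequence from the question

joined = ",".join(seq)  # ONE string with commas between the letters
chars = [*seq]          # a list of nine one-character strings

arr = []
arr.append(joined)      # arr now holds a single string element
print(arr)              # the quotes belong to that one string

# Unpacking gives the same result as calling list() on the string.
print(chars == list(seq))
```

So there is nothing to strip: the quotes mark where a single string starts and ends, and they go away once each letter is its own list element.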

CodePudding user response:

You can use str.findall:

df['new'] = df['seq_1'].str.findall(r'[a-zA-Z]')

Example:

         seq_1                          new
0    MKEYGEDLK  [M, K, E, Y, G, E, D, L, K]
1  ?MKEY GEDLK  [M, K, E, Y, G, E, D, L, K]
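
A runnable sketch of the example above, using the same two sample rows. The variable name clean is an assumption for illustration. Note that the pattern keeps only ASCII letters, so digits, spaces and punctuation are dropped.

```python
import pandas as pd

# Two rows: a clean sequence and one with stray characters.
df = pd.DataFrame({"seq_1": ["MKEYGEDLK", "?MKEY GEDLK"]})

# str.findall returns, for each row, the list of all regex matches;
# [a-zA-Z] matches one letter at a time, so other characters are skipped.
df["new"] = df["seq_1"].str.findall(r"[a-zA-Z]")

clean = df["new"].tolist()
print(clean[0] == clean[1])
```

This variant is worth considering when the sequences may contain whitespace or other noise, whereas apply(list) would keep those characters as list elements.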