Home > OS >  Json data stored in a string type column, how to interact with it
Json data stored in a string type column, how to interact with it

Time:02-11

I have a column of type String where json data is stored, I am not able to interact with this column.

    id                          active  indicators
0   617f27ad9df9b10b633d5850    True    [{'_id': '617abc32fe562a3281b4448d', 'column': 'created_at', 'indicatorId': '61088233be3a2e04a2d3d8f8', 'value': '00-00-00', 'key': '****'}, {'_id': '617abc32fe562a4441b2268e', 'column': 'hash', 'indicatorId': '61088233be3a2e04a2d3d8fc', 'value': '14fsdfsd44*', 'key': '****'}

The column types

df_raw_nps.dtypes
id                               string             
active                           boolean                        
indicators                       string                         
dtype: object

I'm trying to access to create new columns according to the values ​​of the fields "column" and "value" of this json.

d2 = pd.DataFrame([{n['column']:n['value'] for n in i}for i in df['indicators']])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
----> 1 d2 = pd.DataFrame([{n['column']:n['value'] for n in i}for i in df['indicators']])

TypeError: string indices must be integers

The result was to be like this:

    id                          active  created_at  hash
0   617f27ad9df9b10b633d5850    True    00-00-00    14fsdfsd44*

Does anyone know what I'm doing wrong? I've tried to convert the column to a json object but without success.

CodePudding user response:

As you have shown that the columns contains a string (which BTW is not a valid Json because it uses simple quotes...), you must literal_eval it:

d2 = pd.DataFrame([{n['column']:n['value'] for n in ast.literal_eval(i)}
                   for i in df['indicators']], index = df.index)

From there you have just to horizontaly concat it:

pd.concat((df.drop(columns='indicators'), d2), axis = 1).to_string()

to get as expected:

                         id  active created_at         hash
0  617f27ad9df9b10b633d5850    True   00-00-00  14fsdfsd44*
  • Related