Saving an appended list/dictionary to a pandas DataFrame

Time:10-20

I am working on code like the one below, which slices the address column so that only the last three lines of each address are kept:

import pandas as pd

# one id per address; the column lists must have equal length
data = {'id':  ['001', '002', '003'],
        'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
                    "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
                    "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA"]
       }

df = pd.DataFrame(data)
df_dict = df.to_dict('records')

final = []
for row in df_dict:
    add = row["address"]
    # print(add.split("\\n") , len(add.split("\\n")))
    if len(add.split("\\n")) > 3:
        target = add.split("\\n")
        target = target[-3:]
        target = '\\n'.join(target)
    else:
        target = add.split("\\n")
        target = '\\n'.join(target)
    final.append(target)
    print(target)

After slicing, I append each result to final. However, final is a list. I want to convert it into a DataFrame along with the id.

Sample output:

id  address
1   290 Valley Dr.\\nCasper, WY 82604\\nUSA
2   1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA
3   145 S. Durbin\\nCasper, WY 82601\\nUSA

Your help will be greatly appreciated.

Thanks in advance

CodePudding user response:

What about this?

data = {'id':  ['001', '002', '003'],
        'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
                    "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
                    "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA"]
       }

data = pd.DataFrame(data)
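
To get from the final list in the question to a DataFrame that also carries the id, one option is to build a new frame from the id column and the list. The following is a minimal sketch, not the poster's code: the names SEP, trim_address and result are illustrative assumptions, and the separator is taken to be the literal two-character sequence backslash + n used in the question's data.

```python
import pandas as pd

# Literal backslash followed by "n", as in the question's data (assumption)
SEP = "\\n"

data = {
    "id": ["001", "002", "003"],
    "address": [
        "William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
        "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
        "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
    ],
}
df = pd.DataFrame(data)


def trim_address(address, keep=3):
    """Hypothetical helper: keep only the last `keep` segments of an address."""
    parts = address.split(SEP)
    return SEP.join(parts[-keep:])


# Same idea as the loop in the question: one trimmed string per row
final = [trim_address(address) for address in df["address"]]

# Pair the list with the ids; both have one entry per row, in the same order
result = pd.DataFrame({"id": df["id"], "address": final})
print(result)
```

Because final is built by iterating over df["address"] in order, its entries line up with df["id"] row by row, so assigning df["address"] = final on the existing frame would work equally well.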

CodePudding user response:

You can operate directly on your df, using str.rsplit to split each address and apply to re-join the last 3 segments:

import pandas as pd

data = {'id':  [1, 2, 3],
        'address': ["William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
                    "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
                    "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA"]
}

df = pd.DataFrame(data).set_index('id')
df['address'] = df['address'].str.rsplit('\\n', n=3).apply(lambda x: '\\n'.join(x[-3:]))
print(df)

Output:

                                           address
id
1            290 Valley Dr.\nCasper, WY 82604\nUSA
2   1180 Shelard Tower\nMinneapolis, MN 55426\nUSA
3             145 S. Durbin\nCasper, WY 82601\nUSA
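
The same result can probably be reached without apply by chaining string-accessor methods on the split lists. This is a sketch under the same assumptions as the answer above (the separator is a literal backslash + n, and SEP and parts are illustrative names). It sticks to str.rsplit, which always treats the separator as a literal string; str.split in recent pandas versions may interpret a multi-character pattern as a regular expression unless regex=False is passed.

```python
import pandas as pd

# Literal backslash followed by "n", as in the question's data (assumption)
SEP = "\\n"

data = {
    "id": [1, 2, 3],
    "address": [
        "William J. Clare\\n290 Valley Dr.\\nCasper, WY 82604\\nUSA",
        "1180 Shelard Tower\\nMinneapolis, MN 55426\\nUSA",
        "William N. Barnard\\n145 S. Durbin\\nCasper, WY 82601\\nUSA",
    ],
}
df = pd.DataFrame(data).set_index("id")

# Split from the right at most 3 times: each row becomes a list of segments
parts = df["address"].str.rsplit(SEP, n=3)

# Keep the last 3 segments of every list, then join them back into one string
df["address"] = parts.str[-3:].str.join(SEP)
print(df)
```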