How to combine parts of different columns in a DataFrame?

Time:02-21

Say I have a DataFrame:

     foo   bar
0    1998  abc
1    1999  xyz
2    2000  123

What is the best way to combine the first two (or any n) characters of foo with the last two of bar to make a third column containing 19bc, 19yz, 2023? Normally, to combine columns I'd simply do

df['foobar'] = df['foo'] + df['bar']

but I believe I can't do slicing on these objects.

CodePudding user response:

If you convert a column to strings with astype(str), you can use the .str accessor to slice its values:

df['foobar'] = df['foo'].astype(str).str[:2] + df['bar'].str[-2:]
print(df)

# Output
    foo  bar foobar
0  1998  abc   19bc
1  1999  xyz   19yz
2  2000  123   2023

Calling astype(str) is unnecessary when a column already holds strings (object dtype), as the bar column does here. It is only needed for the integer column foo.
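To illustrate the point about dtypes, here is a minimal sketch built on the question's sample data. It assumes foo is stored as integers and bar as strings; the variable names foo_is_numeric, str_accessor_failed and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"foo": [1998, 1999, 2000], "bar": ["abc", "xyz", "123"]})

# foo holds integers, so it is a numeric column
foo_is_numeric = pd.api.types.is_numeric_dtype(df["foo"])

# The .str accessor is only available on string data, so this raises
try:
    df["foo"].str[:2]
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True

# Convert only the numeric column; bar already contains strings
df["foobar"] = df["foo"].astype(str).str[:2] + df["bar"].str[-2:]
result = df["foobar"].tolist()
print(result)
```

If bar could contain non-string values in your real data, applying astype(str) to it as well is the safer choice.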

See also: the "Working with text data" section of the pandas user guide.

CodePudding user response:

Example:

import pandas as pd

df = pd.DataFrame({"foo": [1998, 1999, 2000], "bar": ["abc", "xyz", "123"]})
# Take the first two characters of foo and the last two of bar, row by row
df["foobar"] = df.apply(lambda x: str(x["foo"])[:2] + x["bar"][-2:], axis=1)
print(df)
#     foo  bar foobar
# 0  1998  abc   19bc
# 1  1999  xyz   19yz
# 2  2000  123   2023
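Since the question asks about "any n" characters, the slicing can be wrapped in a small helper. This is only a sketch: combine_parts and its parameters n_left and n_right are hypothetical names, not part of pandas, and it uses the vectorized .str accessor rather than apply.

```python
import pandas as pd


def combine_parts(left: pd.Series, right: pd.Series,
                  n_left: int = 2, n_right: int = 2) -> pd.Series:
    """Join the first n_left characters of left with the last n_right of right."""
    # A slice of [-0:] would return the whole string, so require n_right >= 1
    if n_left < 0 or n_right < 1:
        raise ValueError("n_left must be >= 0 and n_right must be >= 1")
    # Convert both columns so the helper also works on numeric input
    return left.astype(str).str[:n_left] + right.astype(str).str[-n_right:]


df = pd.DataFrame({"foo": [1998, 1999, 2000], "bar": ["abc", "xyz", "123"]})
df["foobar"] = combine_parts(df["foo"], df["bar"])
df["foobar_3_1"] = combine_parts(df["foo"], df["bar"], n_left=3, n_right=1)
print(df)
```

On large frames the vectorized form usually runs faster than apply with axis=1, because it avoids calling a Python function for every row.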