I'm well aware I've done an awful job with the title. However, I couldn't find a better choice of words to explain my problem.
Let's assume we've got 2 Pandas dataframes. One of them contains a column with simple string values in each row e.g. an url. The other, stores the same values in another column but within lists. Please see a simplified example below:
df1's output would look like this:
url name
x a
y b
df2's output would look like this:
url title
[x, y] a
[x, y] b
My intent is to extract the common values from df2 so that I can use them to vlookup another column via merge function and add the aforesaid column to df1.
For that, I would like to get the following structure:
df2:
url title
x a
x b
y a
y b
I've been going over different ways via for loops but I always come to an impasse because of the bloody list. Please note that both dataframes df1 and df2 have been built from 2 different lists of dictionaries.
Have you got any clue on how to work this out?
Thank you very much in advance!
CodePudding user response:
Try this:
exploded = df2.explode('url', ignore_index=True)
Output:
>>> exploded
url title
0 x a
1 y a
2 x b
3 y b
Note: if your Pandas version is lower than 1.1.0, ignore_index
for explode
is not available. Use this instead:
df2.explode('url').reset_index(drop=True)