Check a word in a list and create a new variable

Time:11-11

I want to extract the location mentioned in each value of obras['desc'], using a list of cities that I have.

I have the following dataframe

obras = pd.DataFrame([['1','Agua de Buenos Aires'],['2', 'Sistenas de carreteras Jujuy'],['3','Reasentamiento en Entre Ríos'], ['4','Rutas en Córdoba']],
columns = ['id', 'desc'])

And the list

list = ['Buenos Aires', 'Jujuy', 'Corrientes', 'Entre Ríos']

I tried this:

for s in obras["desc"]:
    if any(xs in s for xs in list):
        obras['Localidad'] = s

The expected result would be:

id  desc                          localidad
1   Agua de Buenos Aires          Buenos Aires
2   Sistenas de carreteras Jujuy  Jujuy
3   Reasentamiento en Entre Ríos  Entre Ríos
4   Rutas en Córdoba              NaN

But the result I get is:

id  desc                          localidad
1   Agua de Buenos Aires          Reasentamiento en Entre Ríos
2   Sistenas de carreteras Jujuy  Reasentamiento en Entre Ríos
3   Reasentamiento en Entre Ríos  Reasentamiento en Entre Ríos
4   Rutas en Córdoba              Reasentamiento en Entre Ríos

How can I solve this problem?

thanks!!!

CodePudding user response:

You can check whether a list item exists in the string using apply:

import pandas as pd

obras = pd.DataFrame([['1','Agua de Buenos Aires'],['2', 'Sistenas de carreteras Jujuy'],['3','Reasentamiento en Entre Ríos'], ['4','Rutas en Córdoba']],columns = ['id', 'desc'])

list_ = ['Buenos Aires', 'Jujuy', 'Corrientes', 'Entre Ríos']
obras['localidad'] = obras['desc'].apply(lambda x: next((i for i in list_ if i in x), None))

Note that, in line with the desired output, this returns only the first match (in list order) when a description contains more than one item from the list.

  id                          desc     localidad
0  1          Agua de Buenos Aires  Buenos Aires
1  2  Sistenas de carreteras Jujuy         Jujuy
2  3  Reasentamiento en Entre Ríos    Entre Ríos
3  4              Rutas en Córdoba          None
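
As a possible alternative, here is a minimal sketch that does the same lookup with pandas string methods instead of apply. It assumes the city names should be matched literally, so each one is passed through re.escape. The names cities, pattern and all_matches are only illustrative and do not come from the question. Unlike the apply version, rows without a match get NaN rather than None, and the match returned is the one that occurs first in the text, not the first one in list order.

```python
import re
import pandas as pd

obras = pd.DataFrame(
    [['1', 'Agua de Buenos Aires'],
     ['2', 'Sistenas de carreteras Jujuy'],
     ['3', 'Reasentamiento en Entre Ríos'],
     ['4', 'Rutas en Córdoba']],
    columns=['id', 'desc'],
)
cities = ['Buenos Aires', 'Jujuy', 'Corrientes', 'Entre Ríos']  # illustrative name

# One alternation pattern with a single capture group;
# re.escape protects against regex metacharacters in the names.
pattern = '(' + '|'.join(re.escape(c) for c in cities) + ')'

# expand=False returns a Series; rows without a match are left missing (NaN).
obras['localidad'] = obras['desc'].str.extract(pattern, expand=False)

# Optional: keep every match per row as a list (empty list when nothing matches).
obras['all_matches'] = obras['desc'].str.findall(pattern)

print(obras)
```

If only the first match is needed, the all_matches line can simply be dropped.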

P.S. Don't use list as a variable name: it shadows Python's built-in list type.
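
To illustrate why the name matters, here is a small sketch of what goes wrong once list is rebound. The function shadowing_demo is made up for this example, and the shadowing is kept inside it so that the built-in stays usable elsewhere.

```python
def shadowing_demo():
    # Inside this function, the name `list` now refers to a plain list object,
    # not to the built-in type.
    list = ['Buenos Aires', 'Jujuy']
    try:
        list('abc')  # would normally build ['a', 'b', 'c']
    except TypeError as exc:
        return str(exc)
    return None


message = shadowing_demo()
print(message)  # 'list' object is not callable

# Outside the function the built-in is untouched.
print(list('abc'))
```

A name such as list_ or cities avoids the problem.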
