I have a df like this:
Casa | Name | Clase_jfs | Categoria |
---|---|---|---|
Just_For_Sports | mochila reebok active | ACCESORIOS | mochila |
Just_For_Sports | tubo lejopi de pelotas softee | ACCESORIOS | tubo |
Just_For_Sports | pack de medias puma x2 | ACCESORIOS | pack |
Just_For_Sports | gorro adidas de natación 3 rayas | ACCESORIOS | natacion |
And 27 different Lists like these:
MODA=['mochila','wear', 'urban', 'pack']
TENIS=['tubo', 'raqueta','red']
NATACION=['natacion', 'pileta','tapon']
On the other hand, I have an empty list:
intermedia1=[]
This is my current script:
for element in df_JFS['Categoria']:
    if element in VOLEY:
        intermedia1.append('VOLEY')
    elif element in UNIFORMES:
        intermedia1.append('UNIFORMES')
    elif element in TREKKING_OUTDOOR_ADVENTURE:
        intermedia1.append('TREKKING_OUTDOOR_ADVENTURE')
    elif element in TRAINING:
        intermedia1.append('TRAINING')
    elif element in TENIS:
        intermedia1.append('TENIS')
    elif element in SURF:
        intermedia1.append('SURF')
    elif element in SQUASH:
        intermedia1.append('SQUASH')
    elif element in SKATEBOARD:
        intermedia1.append('SKATEBOARD')
    elif element in RUNNING:
        intermedia1.append('RUNNING')
    elif element in RUGBY:
        intermedia1.append('RUGBY')
    elif element in PING_PONG:
        intermedia1.append('PING_PONG')
    elif element in PESAS:
        intermedia1.append('PESAS')
    elif element in PADDLE:
        intermedia1.append('PADDLE')
    elif element in NATACION:
        intermedia1.append('NATACION')
    elif element in MODA:
        intermedia1.append('MODA')
    elif element in INFANTIL:
        intermedia1.append('INFANTIL')
    elif element in HOCKEY:
        intermedia1.append('HOCKEY')
    elif element in HANDBALL:
        intermedia1.append('HANDBALL')
    elif element in GOLF:
        intermedia1.append('GOLF')
    elif element in FUTBOL:
        intermedia1.append('FUTBOL')
    elif element in FRONTON:
        intermedia1.append('FRONTON')
    elif element in CICLISMO:
        intermedia1.append('CICLISMO')
    elif element in BASQUET:
        intermedia1.append('BASQUET')
    elif element in BASICOS:
        intermedia1.append('BASICOS')
    elif element in BASEBALL_SOFTBALL:
        intermedia1.append('BASEBALL_SOFTBALL')
    elif element in ARTES_MARCIALES_Y_BOX:
        intermedia1.append('ARTES_MARCIALES_Y_BOX')
    elif element in AEROBICS_Y_FITNESS:
        intermedia1.append('AEROBICS_Y_FITNESS')
    else:
        intermedia1.append('OTROS')
df_JFS['Categoria'] = intermedia1
How can it be done efficiently?
The output should look like this:
Casa | Name | Clase_jfs | Categoria |
---|---|---|---|
Just_For_Sports | mochila reebok active | ACCESORIOS | MODA |
Just_For_Sports | tubo lejopi de pelotas softee | ACCESORIOS | TENIS |
Just_For_Sports | pack de medias puma x2 | ACCESORIOS | MODA |
Just_For_Sports | gorro adidas de natación 3 rayas | ACCESORIOS | NATACION |
The value of df['Categoria'] should be the name of the list in which the word was found.
Thanks!
CodePudding user response:
I'm not sure about the time efficiency, but if you want to avoid the boilerplate code, you can use the apply function along with a few other steps:
import pandas as pd

# Define the lists of data here (rest of the code)
# .
# .
# Map each category name to its list of keywords
myDict = {'MODA': MODA, 'TENIS': TENIS, 'NATACION': NATACION}

def search(valueToSearch):
    # Return the name of the first list that contains the value
    for key, valuesList in myDict.items():
        if valueToSearch in valuesList:
            return key
    return "Not Found"

df["Categoria"] = df["Categoria"].apply(search)
df
Output
| | Casa | Name | Clase_jfs | Categoria |
---|---|---|---|---|
0 | Just_For_Sports | mochila reebok active | ACCESORIOS | MODA |
1 | Just_For_Sports | tubo lejopi de pelotas softee | ACCESORIOS | TENIS |
2 | Just_For_Sports | pack de medias puma x2 | ACCESORIOS | MODA |
3 | Just_For_Sports | gorro adidas de natación 3 rayas | ACCESORIOS | NATACION |
Note that you should define myDict as shown above. If you have any other lists, add them to the myDict variable in the same way.
CodePudding user response:
There are a few approaches I would suggest.
Approach 1
The complexity of finding something in a list is O(n). To optimise that, you can use a set instead, where a membership check is O(1) on average.
MODA = set(['mochila', 'wear', 'urban', 'pack'])
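As a minimal sketch of Approach 1, the lists from the question can be converted to sets once, before any lookups; the `in` check itself stays the same. The names `categories` and `classify` are hypothetical helpers introduced for illustration, and 'OTROS' is the fallback label taken from the question's script.

```python
# Keyword collections from the question, written as sets instead of lists.
MODA = {'mochila', 'wear', 'urban', 'pack'}
TENIS = {'tubo', 'raqueta', 'red'}
NATACION = {'natacion', 'pileta', 'tapon'}

# Hypothetical helper structure: category label -> set of keywords.
categories = {'MODA': MODA, 'TENIS': TENIS, 'NATACION': NATACION}

def classify(word, default='OTROS'):
    """Return the label of the first set that contains word, else default."""
    for label, keywords in categories.items():
        if word in keywords:  # set membership: O(1) on average
            return label
    return default

labels = [classify(w) for w in ['mochila', 'tubo', 'pack', 'natacion', 'gorro']]
print(labels)
```

This still loops over every category for each value, so it mainly helps when the individual lists are long.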
Approach 2
If all the values across all the lists are unique, you can create a dict that maps each value to its key (the name of its list).
You can write a loop to build that mapping; the result should look like this:
{
    'mochila': "MODA",
    'wear': "MODA",
    'urban': "MODA",
    'pack': "MODA",
    ...
}
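A possible way to build and use that mapping, sketched under a few assumptions: `lists_by_label` and `word_to_label` are hypothetical names, only three of the 27 lists are shown, and 'OTROS' is used as the fallback as in the question's script. The lookup uses pandas' `Series.map`, which accepts a dict and yields NaN for values that have no key, so `fillna` supplies the default.

```python
import pandas as pd

# Keyword lists from the question (only three of the 27 shown here).
MODA = ['mochila', 'wear', 'urban', 'pack']
TENIS = ['tubo', 'raqueta', 'red']
NATACION = ['natacion', 'pileta', 'tapon']

# Hypothetical helper: category label -> list of keywords.
lists_by_label = {'MODA': MODA, 'TENIS': TENIS, 'NATACION': NATACION}

# Invert it into keyword -> label. If a keyword appeared in several
# lists, the last list processed would win, so this assumes uniqueness.
word_to_label = {
    word: label
    for label, words in lists_by_label.items()
    for word in words
}

# Small stand-in for the question's DataFrame.
df = pd.DataFrame({'Categoria': ['mochila', 'tubo', 'pack', 'natacion', 'gorro']})

# One dict lookup per row; unmatched values become NaN, then 'OTROS'.
df['Categoria'] = df['Categoria'].map(word_to_label).fillna('OTROS')
print(df['Categoria'].tolist())
```

Compared with the if/elif chain, each row costs a single dict lookup regardless of how many lists there are. Note that the chain returns the first matching list, so the two approaches agree only when no keyword is shared between lists.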