Extracting colors in French

Time:01-29

I have a CSV file with information about product colors. Because the color field sometimes contains extra details, I would like to extract just the color name. I found some libraries for this, but my data is in French, so they don't fit. I am trying to do this in Python.

For example, from "transparent blue" I want to keep just "blue".

The table looks like this:

Product ref   Color              Sales quantity
F33           Bleu transparent   2
K367          Ecaille Marron     1

I want to extract "Bleu" (blue) and "Marron" (brown) to see which colors sell the most.

CodePudding user response:

You could create a translator function and then apply this to the column.

Here is an example (using the data in the question):

import pandas as pd

# original dataframe
data = {'Product ref': ['F33', 'K367'],
        'Color': ['Bleu transparent', 'Ecaille Marron'],
        'Sales quantity': [2, 1]}

df = pd.DataFrame(data)


def translate(french):
    ''' translating function '''
    if 'Bleu' in french:
        return 'blue'
    
    if 'Marron' in french:
        return 'brown'
    
    return '-'

# apply the translation to the Color column and store the result
df['english'] = df['Color'].apply(translate)
print(df)

This is the result:

  Product ref             Color  Sales quantity english
0         F33  Bleu transparent               2    blue
1        K367    Ecaille Marron               1   brown

Note: You could use a much more sophisticated translating and matching function (for example, googletrans). The example above is a minimal working example.
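
If the list of colors grows, a chain of if statements gets hard to maintain. Below is a minimal sketch of a dictionary-based variant of the same idea, which also sums the sales per color as the question asks. The COLOR_MAP contents and the extract_color name are assumptions made for illustration, not part of the original answer, and the matching is a simple case-insensitive whole-word lookup, so inflected forms such as "bleue" would need their own entries.

```python
from collections import Counter

# Hypothetical French -> English color mapping; extend it to cover your data.
COLOR_MAP = {
    "bleu": "blue",
    "marron": "brown",
    "rouge": "red",
    "vert": "green",
}


def extract_color(description, default="-"):
    """Return the English name of the first known color word, else default."""
    for word in description.lower().split():
        if word in COLOR_MAP:
            return COLOR_MAP[word]
    return default


# Rows from the question: (product ref, color description, sales quantity)
rows = [
    ("F33", "Bleu transparent", 2),
    ("K367", "Ecaille Marron", 1),
]

# Sum the sales quantity per extracted color
sales_by_color = Counter()
for _ref, description, quantity in rows:
    sales_by_color[extract_color(description)] += quantity

# Colors ordered from most to least sold
print(sales_by_color.most_common())  # [('blue', 2), ('brown', 1)]
```

With the dataframe from the answer, the same function could be used as df['Color'].apply(extract_color), followed by a groupby on the new column to sum 'Sales quantity'.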
