I have this dataset:
ID L a b
1 29.13 15.82 5.40
2 25.23 17.89 7.32
3 28.10 18.23 7.40
4 26.39 17.18 7.32
5 26.08 14.20 5.59
6 29.62 18.23 6.73
7 32.96 20.82 9.56
L
, a
, and b
are measures to describe Cielab color Standard, and I would like to compare each ID with another set (B1 to B8) of colors using Delta E
(https://www.viewsonic.com/library/creative-work/what-is-delta-e-and-why-is-it-important-for-color-accuracy/). So, I want my final dataset this way:
ID L a b CS
1 29.13 15.82 5.40 B1
2 25.23 17.89 7.32 B2
3 28.10 18.23 7.40 B3
4 26.39 17.18 7.32 B4
5 26.08 14.20 5.59 B7
6 29.62 18.23 6.73 B8
7 32.96 20.82 9.56 B1
OBS= for CS I used random values, I did not do the calculation. And CS
is related to the lowest value of Delta E
among B1 to B8.
To do this I tryed this code:
import pandas as pd
# load excel with its path
filename = ("my_file.xlsx")
sh = pd.read_excel(filename)
#The set of color that I want to compare
b = []
b.extend([[61.3989, 29.7278, 19.1271]]) #B1
b.extend([[50.1579, 44.9554, 28.1963]]) #B2
b.extend([[42.9522, 54.1075, 38.8624]]) #B3
b.extend([[36.6442, 60.9148, 51.0431]]) #B4
b.extend([[31.2828, 54.7112, 45.1422]]) #B5
b.extend([[25.2966, 47.7847, 37.7562]]) #B6
b.extend([[19.3373, 40.9029, 29.2594]]) #B7
b.extend([[13.0943, 33.665, 20.5282]]) #B8
### Delta E formule
DEb = []
for x in b:
DEb.extend([((x[0]-sh['L'])**2 (x[1]-sh['a'])**2 (x[2]-sh['b'])**2)**0.5])
#### This will select the lowest value of the set of values (B1 to B8)
sh['CS'] = str('B') str(DEb.index(min(DEb)) 1)
print(sh['CS'])
Using isolated ID
values work well, using this code:
import pandas as pd
filename = ("my_file.xlsx")
sh = pd.read_excel(filename)
print (sh['L'])
def meat_color (Lab):
### Padrão CieLab BBQ
b = []
b.extend([[61.3989, 29.7278, 19.1271]])
b.extend([[50.1579, 44.9554, 28.1963]])
b.extend([[42.9522, 54.1075, 38.8624]])
b.extend([[36.6442, 60.9148, 51.0431]])
b.extend([[31.2828, 54.7112, 45.1422]])
b.extend([[25.2966, 47.7847, 37.7562]])
b.extend([[19.3373, 40.9029, 29.2594]])
b.extend([[13.0943, 33.665, 20.5282]])
#print('lista: ',b)
#print(b[0][0])
### Criando Lista com valores Delta 1 para grau
DEb = []
for x in b:
DEb.extend([((x[0]-Lab[0])**2 (x[1]-Lab[1])**2 (x[2]-Lab[2])**2)**0.5])
print(DEb)
meat_color = str('B') str(DEb.index(min(DEb)) 1)
print('Cor de carne:', meat_color)
#Valor
#print('DE: ', nsmallest(1, DEb))
return meat_color
meat_color([29.13,15.82 ,5.40 ])
but when I use the complete dataset I got this error:
Traceback (most recent call last):
File "d:\Análises\labels_nome\Programas\import_openpyxl.py", line 27, in <module>
sh['cor_carne'] = str('B') str(DEb.index(min(DEb)) 1)
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\generic.py", line 1442, in __nonzero__
raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
I still can't follow the logic of this, but since you said, it works for isolated ID's, here is one way how you could do it. Your whole calculation part will be put in a function which is applied rowwise.
def calculations(row):
DEb = []
for x in b:
DEb.append(((x[0]-row['L'])**2 (x[1]-row['a'])**2 (x[2]-row['b'])**2)**0.5)
print(DEb)
return str('B') str(DEb.index(min(DEb)) 1)
sh['CS'] = sh.apply(calculations, axis=1)
print(sh)
Output:
# list "DEb" for each row:
[37.72455543091264, 42.552501750895914, 52.694573561705575, 64.60105855394322, 55.64712432929486, 45.6438460264908, 35.97665651585761, 28.36283016079319]
[39.8463354207134, 42.30550764687737, 51.192838140212544, 62.39492647876108, 53.131496968559055, 42.66215880719118, 32.33658534941499, 23.88686640666791]
[37.12870884450467, 40.413909131139484, 49.976649195899476, 61.64183386540345, 52.587848692640016, 42.459842464262636, 32.690651172162354, 25.21427954017326]
[39.01894378965171, 42.09757495699722, 51.311496985081234, 62.68644847405219, 53.507469367556524, 43.17644561667855, 33.07346117448248, 24.959715758197248]
[40.88753289769389, 45.12964322105815, 54.62904255842308, 66.02922959031099, 56.855984540943446, 46.51032371517102, 36.31458891492509, 27.760843804358682]
[35.99702735588037, 39.960660370669544, 49.97433033918514, 61.927234463441046, 53.00132041864617, 43.06735615393636, 33.576303817722405, 26.490154939712976]
[31.299350415943135, 35.00852717924591, 45.45912848757662, 57.810153972204574, 49.168276298035906, 39.75988480981805, 31.2564743382871, 26.075687387487985]
As you can see, the lowest value is always the last value (index 7). So your final result looks like this:
ID L a b CS
0 1 29.13 15.82 5.40 B8
1 2 25.23 17.89 7.32 B8
2 3 28.10 18.23 7.40 B8
3 4 26.39 17.18 7.32 B8
4 5 26.08 14.20 5.59 B8
5 6 29.62 18.23 6.73 B8
6 7 32.96 20.82 9.56 B8