I am using imagededup in Python to produce embeddings for the images in a folder, and I want to take those embeddings and convert them back into the original .jpg images.
I encode each image with the CNN method (a convolutional neural network trained on ImageNet) that the package provides.
The resulting encodings are of type numpy.ndarray, like so:
{'IMG-7817.jpg': array([0. , 0.8666797 , 0.6738928 , ..., 0.19499177, 0.19915162,
0.11766607], dtype=float32)}
To persist them, I use numpy.ndarray.tolist() to convert the ndarray values into a list of floats, then save each one as a new document in MongoDB.
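For context, a minimal sketch of that pipeline (the folder path, database, and collection names are illustrative, not from my actual code):

import datetime
from imagededup.methods import CNN
from pymongo import MongoClient

# encode_images returns {filename: numpy.ndarray} for every image in the folder
cnn = CNN()
encodings = cnn.encode_images(image_dir='images/')

collection = MongoClient()['mydb']['embeddings']  # illustrative names
now = datetime.datetime.utcnow()
for filename, encoding in encodings.items():
    collection.insert_one({
        'filename': filename,
        'encoding': encoding.tolist(),  # ndarray -> plain list of floats
        'createdAt': now,
    })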
Here is an example of one document, showing the floats:
{'_id': ObjectId('62de157cd66e524858266e56'),
'filename': 'image_0.jpg',
'createdAt': datetime.datetime(2022, 7, 24, 21, 0, 58, 297000),
'encoding': [0.34322163462638855,
0.509546160697937,
0.5979495048522949,
0.0,
0.9418766498565674,
0.062201134860515594,
...],
'updatedAt': datetime.datetime(2022, 7, 25, 0, 33, 50, 36000)}
So, I want to take those saved embeddings from MongoDB and convert them back into the original .jpg images.
Can this be done?
Here is what I tried unsuccessfully before asking:
- Took the embeddings list, passed it to np.array (converting it back into an ndarray), then tried Image.fromarray to generate the image (illustrated below). Got a
ValueError: not enough image data
code:
import numpy as np
from PIL import Image

na = np.array(embeddings_doc['encoding'])  # stored list of floats -> 1-D ndarray
image = Image.fromarray(na.astype('uint8'), 'RGB')  # raises the ValueError
image
- Tried to use PIL's Image to render it anyway, but it ended up displaying an image consisting of a single line of pixels (the embeddings, no doubt, but a single line is not what I am after).
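To illustrate the shape problem with a stand-in vector (a random 1024-dim array here; the actual length of imagededup's CNN encodings may differ):

import numpy as np
from PIL import Image

na = np.random.rand(1024).astype(np.float32)  # stand-in for an embedding
print(na.shape)  # (1024,) -- no (height, width, 3) pixel structure

# The failing call: a flat vector is not enough data for an RGB image
# Image.fromarray(na.astype('uint8'), 'RGB')  # ValueError: not enough image data

# Forcing a 2-D shape "works", but it only renders feature values as grey noise:
forced = Image.fromarray((na.reshape(32, 32) * 255).astype('uint8'), 'L')
forced.save('features_as_pixels.png')  # noise, not the original photo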
Before trying Keras, perhaps someone can advise a better way to regenerate the original image from the encodings?
I'm not saving the image shape. Should I? Will that be needed, or can Python figure it out somehow?
CodePudding user response:
Embeddings mean the following:
Assume there is a neural network that outputs a 3-component vector like [1,0,0]:
[1,0,0] - there is only a cat in the image.
[0,1,0] - there is only a dog in the image.
[0,0,1] - there is only a human in the image.
Now assume we have the vector [0.7,0.2,0.5]. This means:
a 70% chance that there is a cat in the image, a 20% chance there is a dog, and a 50% chance there is a human.
Since we know such vectors for our images, we can compare the images in these terms.
WARNING!!! THE FOLLOWING METHOD IS GIVEN ONLY TO EXPLAIN WHY IT IS NOT A GOOD IDEA TO USE EMBEDDINGS FOR IMAGE STORAGE. TREAT "BLUR" AS AN OPERATION THAT PRODUCES THE GIVEN PROBABILITY OF OBJECT DETECTION.
Reconstruction of an image from this vector can be done in the following way:
- Create a dataset of 3 images: cat, dog, and human.
- Blur each image by some amount to produce the corresponding probability.
- Place each blurred image at its predefined position in the resulting image.
By constructing the image this way, it is possible to produce an image for every possible embedding; a toy sketch follows.
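A toy illustration (assuming three reference images cat.jpg, dog.jpg, and human.jpg exist on disk; file names and panel size are arbitrary):

from PIL import Image, ImageFilter

def toy_reconstruct(embedding, filenames=('cat.jpg', 'dog.jpg', 'human.jpg')):
    # Lower probability -> stronger blur; p == 1.0 means no blur at all
    panels = []
    for p, fname in zip(embedding, filenames):
        img = Image.open(fname).convert('RGB').resize((128, 128))
        panels.append(img.filter(ImageFilter.GaussianBlur(radius=10 * (1 - p))))
    # Paste each blurred object at its predefined position on the canvas
    canvas = Image.new('RGB', (128 * len(panels), 128))
    for i, panel in enumerate(panels):
        canvas.paste(panel, (128 * i, 0))
    return canvas

toy_reconstruct([0.7, 0.2, 0.5]).save('reconstruction.jpg')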
Now change the images in the dataset: for the same embedding there will be a different picture. Add some background, and we get yet another picture for the same embedding. So for one embedding vector there is an infinite variety of different images.
This method solves the reconstruction problem, but the result is useless in most cases. It shows that embeddings cannot store image data.
Networks like imagededup work the same way, but they have more features, and the features mean something else.
The embedding vector is, in a way, similar to a verbal description of the image.
There are networks like DALL-E that generate images from verbal descriptions, so theoretically it is possible to generate an image from embeddings. But I have not seen any claim that DALL-E can be used for image compression.
An image can be encoded as jpg and represented as bytes; it can be decoded later with cv2.imdecode:

import cv2
from bson.binary import Binary

{'IMG-7817.jpg': Binary(cv2.imencode('.jpg', img)[1].tobytes())}
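A minimal round trip, assuming img was read with cv2.imread and images is a pymongo collection (both names are illustrative):

import cv2
import numpy as np
from bson.binary import Binary

img = cv2.imread('IMG-7817.jpg')
# Encode: compress to JPEG bytes and wrap them for MongoDB storage
doc = {'filename': 'IMG-7817.jpg',
       'data': Binary(cv2.imencode('.jpg', img)[1].tobytes())}
# images.insert_one(doc)

# Decode: bytes -> uint8 buffer -> pixel array -> image file
buf = np.frombuffer(doc['data'], dtype=np.uint8)
restored = cv2.imdecode(buf, cv2.IMREAD_COLOR)
cv2.imwrite('IMG-7817_restored.jpg', restored)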
CodePudding user response:
TL;DR: You can use an Autoencoder (AE) or a Variational Autoencoder (VAE) to get the embedding.
Image compression using neural networks is a broader research topic that I am not familiar with. If you are interested, you can visit Papers with Code and read some survey papers, like this. Perhaps you can try some SOTA method.
However, since you want an encoder for the images in your dataset rather than a universal encoder that applies to arbitrary images, you can "overfit" on your dataset (it seems you don't care about generalizability) and get a much shorter embedding than standard image compression methods produce.
Ideally, since overfitting is acceptable, you could assign a random embedding vector to each image and force a network (decoder) to reconstruct the corresponding images, i.e., decode the random embeddings. This is like forcing the network to "remember" all of the images.
This method can be improved: use an encoder to encode images to vectors and a decoder to decode the vectors back to images. This way the embedding is "learnable", and the network (encoder) may learn to store useful information about the original images in the embedding, which makes reconstruction easier. For example, all images of faces may end up close to each other in the embedding space. Equipped with a learnable encoder, the decoder network can be more lightweight (the encoder faces an easier task!).
So in conclusion, any encoder-decoder style network with a reconstruction loss is fine. The simplest solution is an Autoencoder (AE) or a Variational Autoencoder (VAE); a minimal sketch follows. For the decoder you could even use any kind of generative model: GANs, flow-based models, diffusion models (though diffusion models may be unsuitable for compression).
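For example, a minimal Keras autoencoder sketch (layer sizes are illustrative; assumes images resized to 64x64 RGB and scaled to [0, 1]):

import tensorflow as tf
from tensorflow.keras import layers, models

# Encoder: 64x64x3 image -> 128-dim embedding
encoder = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(32, 3, strides=2, padding='same', activation='relu'),
    layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),
    layers.Flatten(),
    layers.Dense(128),
])

# Decoder: 128-dim embedding -> reconstructed 64x64x3 image
decoder = models.Sequential([
    layers.Input(shape=(128,)),
    layers.Dense(16 * 16 * 64, activation='relu'),
    layers.Reshape((16, 16, 64)),
    layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),
    layers.Conv2DTranspose(3, 3, strides=2, padding='same', activation='sigmoid'),
])

autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='mse')  # reconstruction loss
# autoencoder.fit(images, images, ...)             # input == target
# Store encoder.predict(images) as the embeddings; decoder.predict(embeddings)
# gives the (approximate) reconstructions.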
Why your original attempt fails:
You may be confused about the meaning of "encode"/"decode" in neural networks. It is different from "encode a string to bytes and decode the bytes back to a string". It is also not the same as "convert an image to a numpy array and convert the numpy array back to an image".
You may also misunderstand the meaning of the "embedding" produced by the neural network. "Embedding" is deep-learning terminology: generally, you feed an input to the network, and any "feature map" (intermediate result produced by the network) can be considered an embedding.
Usually, the embedding you get depends on the task the network was designed for. For example, an embedding from a CNN trained on ImageNet is good for classification (all images of faces may have similar embeddings, and all images of flowers have similar embeddings that differ from the face embeddings). But these embeddings are not suitable for reconstruction: they may lose the detailed information and keep only high-level semantic information like "what object is in the image". If you want an embedding that preserves some property, choose a network that does a related task and take the embeddings from it. Since you want a compression embedding that you can reconstruct images from, choose a network trained with a reconstruction loss (the CNN you used was trained with a classification loss)!