in my database I have an id (docdb_family_id
) and a list of ids (cited_docdb_list
) as follows:
{'docdb_family_id': {0: 3498148,
1: 3512921,
2: 3525647,
3: 3636418,
4: 3673165,
5: 3680127,
6: 3688953,
7: 3689983,
8: 3700898,
9: 3768731,
10: 3770463,
11: 3771404,
12: 3771425,
13: 3771495,
14: 3771604,
15: 3772274,
16: 3772510,
17: 3772940,
18: 3775109,
19: 3779413,
20: 3783583,
21: 3784332,
22: 3784469,
23: 3787179,
24: 3787982,
25: 3790639,
26: 3790670,
27: 3792458,
28: 3795015,
29: 3799670,
30: 3800683,
31: 3802132,
32: 3802281,
33: 3803326,
34: 3803728,
35: 3808684,
36: 3809416,
37: 3810114,
38: 3811389,
39: 3812435,
40: 3813073,
41: 3813312,
42: 3815934,
43: 3816821,
44: 3816927,
45: 3817424,
46: 3818542,
47: 3818766,
48: 3819057,
49: 3819335,
50: 3820633,
51: 3820694,
52: 3821540,
53: 3821838,
54: 3822049,
55: 3822089,
56: 3823057,
57: 3823114,
58: 3824187,
59: 3824375,
60: 3825785,
61: 3826171,
62: 3826211,
63: 3827560,
64: 3828464,
65: 3829519,
66: 3829990,
67: 3831455,
68: 3831510,
69: 3831784,
70: 3831999,
71: 3832248,
72: 3832987,
73: 3834046,
74: 3834444,
75: 3835251,
76: 3886195,
77: 3887480,
78: 3890389,
79: 3892024,
80: 3944218},
'cited_docdb_list': {0: '[3454392.0, 3489764.0, 3492286.0, 3802281.0, 3944218.0, 4161113.0, 6055754.0, 4167218.0, 6245259.0, 6310327.0, 6339325.0, 7865817.0, 10818295.0, 21820994.0, 25257112.0, 25333370.0, 25421470.0]',
1: '[22785397.0, 3800683.0]',
2: '[3508710.0, 3832248.0, 6015961.0, 9173676.0, 22615010.0]',
3: '[3482303.0, 3518675.0, 3688207.0, 3688953.0, 7856041.0, 9893906.0, 9911676.0, 21740142.0, 22095959.0, 22224845.0, 22455261.0, 22522023.0, 23039462.0, 23149018.0, 23248627.0, 25608484.0, 26145960.0, 26246393.0, 27122358.0, 27215945.0, 27267946.0, 27368911.0, 27535943.0, 27569239.0, 27759996.0, 34107815.0, 35219296.0, 46248356.0]',
4: '[7917626.0, 13587294.0, 15860525.0, 16099836.0, 18349663.0, 18831836.0, 24223941.0, 26558149.0]',
5: '[3680147.0, 3680169.0, 6442447.0, 8168860.0, 8170479.0, 8178540.0, 8178541.0, 10655404.0, 10764890.0, 10765687.0, 11600956.0, 14593411.0, 22296890.0, 22471622.0, 24169239.0, 24966171.0, 25033444.0, 25166841.0, 25372199.0, 25459000.0, 25533862.0, 25918313.0, 26371384.0, 26439834.0, 27274967.0, 27294655.0, 27523014.0]',
6: '[5459370.0, 16645542.0, 17462457.0, 21959571.0, 22010115.0, 22296144.0, 26927437.0, 33041169.0, 33101777.0, 34066530.0]',
7: '[7650618.0, 7806400.0, 7835575.0, 7857812.0, 8210353.0, 8232323.0, 8239494.0, 10024300.0, 11566936.0, 11637978.0, 11942149.0, 12192469.0, 12437164.0, 12474858.0, 12862377.0, 13357403.0, 13391145.0, 13884195.0, 14268316.0, 14780600.0, 14837681.0, 14959673.0, 15493334.0, 15660109.0, 15690908.0, 15706187.0, 15740492.0, 16185014.0, 16286275.0, 16301821.0, 16400795.0, 16599264.0, 16867936.0, 17017842.0, 17303135.0, 18156945.0, 18168645.0, 18351330.0, 18357701.0, 18361853.0, 18553020.0, 18665747.0, 22042028.0, 22509938.0, 22752953.0, 22752985.0, 22955054.0, 23605846.0, 23635250.0, 24042617.0, 24281660.0, 24426092.0, 24470177.0, 25217414.0, 25342266.0, 25399276.0, 25481652.0, 26026958.0, 26034429.0, 26150729.0, 26427482.0, 26488815.0, 26500234.0, 26537700.0, 26644976.0, 26692209.0, 26785282.0, 27339916.0, 27370666.0, 27372394.0, 27524906.0, 27563165.0, 29229947.0, 49274340.0]',
8: '[3764296.0, 3770459.0, 3773222.0, 3811210.0, 3825785.0, 6119308.0, 6262275.0, 6409776.0, 6450504.0, 6484157.0, 7640046.0, 7646955.0, 7762359.0, 7813503.0, 7823236.0, 7886063.0, 8103745.0, 10347742.0, 10563528.0, 11894269.0, 12556976.0, 12589238.0, 12666170.0, 12673679.0, 12702964.0, 13630878.0, 14026520.0, 14271281.0, 14325872.0, 14416179.0, 15383496.0, 15479503.0, 15920227.0, 16127226.0, 16222285.0, 16339588.0, 16871054.0, 16912938.0, 16912954.0, 16913656.0, 17401011.0, 17461197.0, 17474177.0, 17663812.0, 17724327.0, 18063449.0, 18227455.0, 18250669.0, 18386252.0, 18426307.0, 18587018.0, 18654484.0, 19300409.0, 19312456.0, 19372912.0, 19550439.0, 19638358.0, 19704233.0, 21801532.0, 21877403.0, 21974791.0, 22002267.0, 22067617.0, 22089128.0, 22098429.0, 22223747.0, 22276463.0, 22298327.0, 22341037.0, 22385483.0, 22395684.0, 22676560.0, 22731313.0, 22904054.0, 22918676.0, 23080548.0, 23084056.0, 23402016.0, 23516757.0, 23601888.0, 23628604.0, 23848237.0, 24030077.0, 24083853.0, 24132340.0, 24248118.0, 24251602.0, 24295241.0, 24316904.0, 24422851.0, 24429865.0, 24443752.0, 24547890.0, 24589548.0, 24632640.0, 24770649.0, 24785182.0, 24839047.0, 24962082.0, 25028009.0, 25378809.0, 25397848.0, 25410040.0, 25434196.0, 25449992.0, 25470970.0, 25494098.0, 25514405.0, 25525923.0, 25540364.0, 26040210.0, 26438189.0, 26450647.0, 26486031.0, 26707770.0, 26723069.0, 26723453.0, 26748272.0, 26870598.0, 26889379.0, 26889380.0, 26901249.0, 26985941.0, 26990011.0, 27000869.0, 27018916.0, 27025822.0, 27060755.0, 27060756.0, 27311622.0, 27315336.0, 27340467.0, 27569697.0, 37944191.0, 46149961.0, 46255262.0]',
9: '[8583594.0, 9119276.0, 21793982.0, 22133036.0, 24149220.0, 25776190.0, 26736757.0]',
10: '[10568655.0, 13302684.0, 19844775.0, 22493955.0, 26714695.0, 26997884.0]',
11: '[4344006.0, 24838031.0, 25098959.0, 25395637.0, 27025593.0]',
12: '[25642630.0, 25642846.0, 25642930.0, 26279148.0, 26287348.0]',
13: '[10451245.0, 10564358.0, 22491246.0, 24064440.0, 24279325.0, 24519613.0, 24651262.0, 25072503.0, 26461666.0, 26692304.0]',
14: '[4351264.0, 4384434.0, 6117960.0, 9116940.0, 10999954.0, 22148709.0, 22562211.0, 23862977.0, 24037344.0, 24361917.0, 24432647.0, 25076138.0, 26840072.0, 27429215.0]',
15: '[3692248.0, 6053171.0, 6226485.0, 12362875.0, 27371744.0]',
16: '[5933264.0, 6125219.0, 6247996.0, 10521070.0, 13063586.0, 15774983.0, 16803481.0, 16904934.0, 22065174.0, 27127184.0, 27496706.0, 27624793.0]',
17: '[3526456.0, 6170998.0, 6335295.0, 10505184.0, 11549684.0, 14422646.0, 15088415.0, 17645959.0, 22169836.0, 22901756.0, 22994874.0, 22994878.0, 23172874.0, 23925148.0, 25244507.0, 27389063.0]',
18: '[6350760.0, 20369026.0, 24216636.0, 26762272.0, 26927655.0, 27126594.0, 27371255.0]',
19: '[3775878.0, 6008063.0, 12812693.0, 13575794.0, 14790639.0, 22013262.0, 24622370.0, 26901485.0, 26985941.0, 27076644.0, 27112632.0]',
20: '[3775488.0, 10948289.0, 10952971.0, 10952974.0, 11367322.0, 12710129.0, 15469131.0, 22577881.0, 25644554.0, 26467182.0, 26933783.0, 27401801.0]',
21: '[6134715.0, 6350620.0, 15983939.0, 16269143.0, 17680987.0, 23994234.0, 24992672.0, 26268730.0, 26367621.0, 26629308.0, 26787837.0, 26988835.0, 27365620.0, 27455735.0, 27476152.0, 41508342.0]',
22: '[3690998.0, 3779413.0, 8103745.0, 10528617.0, 10533016.0, 14026520.0, 17474177.0, 21959397.0, 22069056.0, 23038428.0, 23077293.0, 24078130.0, 24160889.0, 25618055.0, 26462451.0, 27407332.0, 27569697.0]',
23: '[6512805.0, 8105738.0, 10680104.0, 10719170.0, 18290174.0, 22237701.0, 22290947.0, 23695912.0, 23765282.0, 24565635.0, 26289399.0, 27358491.0, 27420192.0]',
24: '[6462400.0, 16101703.0, 24045826.0, 25612324.0, 26283893.0, 26434155.0]',
25: '[8208100.0, 23566456.0, 23702554.0, 25266985.0, 26142859.0]',
26: '[3771632.0, 14240231.0, 15623240.0, 22486268.0, 23605938.0, 27170740.0]',
27: '[3798105.0, 46299235.0, 46299236.0, 46299237.0, 46299238.0, 46299740.0, 46299800.0]',
28: '[2631556.0, 2944019.0, 10790311.0, 13793711.0, 18470587.0, 21851951.0, 21924559.0, 23889759.0, 23927439.0, 23963011.0, 24766696.0, 26713651.0, 26990589.0, 27287227.0]',
29: '[3796218.0, 24589826.0, 25624390.0, 25765848.0]',
30: '[3772972.0, 6025591.0, 7764892.0, 12805981.0, 15547363.0, 16262273.0, 21905352.0, 22082762.0, 23922610.0, 23984212.0, 24257317.0, 25315731.0, 25402356.0, 25518280.0, 26719186.0, 26734227.0, 26940453.0, 26979759.0, 27025821.0, 27025822.0]',
31: '[3779080.0, 3794389.0, 9425562.0, 10768435.0, 22860582.0, 25471727.0, 25617513.0, 25620315.0, 25644721.0, 26092132.0, 27153345.0]',
32: '[2634854.0, 3700806.0, 3802276.0, 3802292.0, 3802325.0, 3802326.0, 3802327.0, 3802332.0, 3802333.0, 3802334.0, 3802337.0, 3802338.0, 3802339.0, 3802354.0, 3802356.0, 3805158.0, 3805178.0, 3805242.0, 3806854.0, 3808228.0, 3808232.0, 3808236.0, 3810760.0, 4258298.0, 6062612.0, 6161522.0, 6180029.0, 6243195.0, 6328004.0, 6352957.0, 6397822.0, 6415485.0, 6456158.0, 6476429.0, 6495895.0, 7588639.0, 9099878.0, 9119945.0, 9447476.0, 9454581.0, 9460842.0, 10036436.0, 10089783.0, 10642403.0, 10676758.0, 10702950.0, 10729821.0, 10746269.0, 11194385.0, 11411510.0, 11592343.0, 12638122.0, 12808119.0, 13792188.0, 13869248.0, 13880272.0, 14224791.0, 14363363.0, 14475114.0, 14555145.0, 14654996.0, 14659718.0, 14905880.0, 15009474.0, 15208979.0, 15365386.0, 15418108.0, 15427440.0, 15532726.0, 15759142.0, 15839949.0, 16148732.0, 16454470.0, 16472116.0, 16557241.0, 16567151.0, 16574330.0, 16670501.0, 16826733.0, 16866056.0, 16917358.0, 16952937.0, 17009237.0, 17042089.0, 17152410.0, 17167043.0, 17167057.0, 17176980.0, 17177751.0, 17203313.0, 17214040.0, 17359106.0, 17384372.0, 17390431.0, 17398779.0, 17419690.0, 17521757.0, 17541035.0, 17548222.0, 17692283.0, 17709222.0, 17752106.0, 17787836.0, 17980830.0, 18032898.0, 18091978.0, 18108188.0, 18157469.0, 18183177.0, 18202974.0, 18210551.0, 18356218.0, 18513671.0, 20358277.0, 21694022.0, 21760302.0, 21839477.0, 22005188.0, 22196129.0, 22231670.0, 22241704.0, 22321076.0, 22407725.0, 22574957.0, 22624317.0, 22688378.0, 22819977.0, 22837041.0, 22856540.0, 22891528.0, 22899520.0, 22911089.0, 22957363.0, 22978599.0, 23009341.0, 23016791.0, 23017033.0, 23194812.0, 23238114.0, 23242315.0, 23372955.0, 23403394.0, 23583171.0, 23717292.0, 23818247.0, 23822065.0, 23967128.0, 24023429.0, 24035021.0, 24041033.0, 24056428.0, 24092174.0, 24102216.0, 24115524.0, 24258574.0, 24305268.0, 24384033.0, 24407235.0, 24437414.0, 24440441.0, 24511068.0, 24607773.0, 24618564.0, 24640870.0, 24695776.0, 24712750.0, 24771021.0, 24777130.0, 24782249.0, 24802597.0, 24824797.0, 24857748.0, 24902244.0, 24921608.0, 24928011.0, 24981047.0, 24992362.0, 25006081.0, 25056097.0, 25079341.0, 25079896.0, 25098400.0, 25128528.0, 25157096.0, 25175720.0, 25184562.0, 25211651.0, 25273616.0, 25325219.0, 25395409.0, 25430909.0, 25431399.0, 25441093.0, 25458263.0, 25459754.0, 25478333.0, 25511171.0, 25540179.0, 25644902.0, 25645209.0, 25645479.0, 25645484.0, 25645485.0, 25645493.0, 25645494.0, 25645495.0, 25645496.0, 25645497.0, 25645498.0, 25645507.0, 25645510.0, 25645511.0, 25645513.0, 25645515.0, 25645516.0, 25645517.0, 25645524.0, 25645526.0, 25645531.0, 25645539.0, 25645541.0, 25645542.0, 25645545.0, 25645546.0, 25645548.0, 26138811.0, 26227739.0, 26352404.0, 26435079.0, 26437848.0, 26443181.0, 26495400.0, 26535740.0, 26564673.0, 26687357.0, 26688326.0, 26719816.0, 26767248.0, 26792309.0, 26883761.0, 27005599.0, 27048622.0, 27054476.0, 27157854.0, 27158025.0, 27204375.0, 27278808.0, 27279445.0, 27288524.0, 27308865.0, 27324977.0, 27325474.0, 27329746.0, 27339109.0, 27376149.0, 27467592.0, 27522909.0, 27526374.0, 27530134.0, 27530208.0, 27542962.0, 27550891.0, 27551736.0, 27552627.0, 27554184.0, 27554356.0, 27557355.0, 27577752.0, 27578291.0, 28455314.0, 29248275.0, 29999199.0, 31994773.0, 32302805.0, 32324813.0, 34221894.0, 34753905.0, 36808782.0, 36954792.0, 38226628.0, 38622001.0, 38622009.0, 46253718.0, 46302623.0, 46302626.0, 46330220.0, 56289952.0]',
33: '[9193201.0, 9690456.0, 11262890.0, 11857463.0, 20399558.0, 22182248.0, 23000715.0, 23242310.0, 23324343.0, 23849738.0, 24920698.0, 26305246.0]',
34: '[2103060.0, 3773965.0, 3774544.0, 3775695.0, 3775872.0, 3776256.0, 3786612.0, 3791581.0, 5870313.0, 5916275.0, 6021141.0, 6199234.0, 6245542.0, 6295893.0, 6295894.0, 6295895.0, 6296520.0, 6365302.0, 6421653.0, 6453213.0, 6470668.0, 6470669.0, 6505848.0, 7762300.0, 7996364.0, 8204435.0, 8504791.0, 8516769.0, 8537466.0, 9978587.0, 10525500.0, 10532630.0, 11421697.0, 11861168.0, 11938229.0, 12631519.0, 14831183.0, 15028144.0, 19729781.0, 19865575.0, 20357413.0, 21762166.0, 21916786.0, 22585241.0, 22736795.0, 22800842.0, 22821355.0, 23120569.0, 23397799.0, 23436004.0, 23481575.0, 23518025.0, 23722477.0, 23740173.0, 23790685.0, 23790691.0, 23790693.0, 23844609.0, 23967824.0, 24169834.0, 24225931.0, 24575089.0, 24686268.0, 24701256.0, 24701581.0, 24738797.0, 24962380.0, 25062108.0, 25145546.0, 25220031.0, 25326521.0, 25341958.0, 25350944.0, 25375270.0, 25532312.0, 25636025.0, 25671453.0, 25782505.0, 25782589.0, 26158327.0, 26516437.0, 26877119.0, 26950677.0, 27100111.0, 27157416.0, 27167473.0, 27286248.0, 27339086.0, 27339905.0, 27356707.0, 27404057.0, 27414896.0, 27461178.0, 27462950.0, 27464289.0, 27477792.0, 27490121.0, 41667474.0]',
35: '[3775287.0, 24656178.0, 25590998.0, 26752872.0, 27104052.0, 27111638.0, 27154855.0, 27449240.0, 27505577.0]',
36: '[10966704.0, 14429073.0, 14796404.0, 24388079.0, 25634499.0, 55024694.0]',
37: '[3810974.0, 6485046.0, 8220639.0, 10710317.0, 24372965.0, 25336013.0, 26139248.0, 30115768.0, 31188433.0, 34102684.0, 35502814.0, 41505355.0, 44170427.0, 46325309.0]',
38: '[3087303.0, 4124422.0, 20979317.0, 21870465.0, 23941444.0, 25013107.0, 25326934.0, 25638943.0, 26674623.0, 27041345.0, 27357929.0, 27505577.0]',
39: '[3796218.0, 3799670.0, 13202074.0, 16015369.0, 18376479.0, 21761811.0, 22420460.0, 25064869.0, 25362187.0, 25420991.0, 25645622.0]',
40: '[6383399.0, 11571184.0, 16203469.0, 19328209.0, 19338037.0, 23609959.0, 23669719.0, 24172105.0, 24533474.0, 25545404.0, 27031913.0, 27475424.0]',
41: '[3790030.0, 10844179.0, 17904788.0, 25518619.0, 25644273.0, 26230725.0, 27107515.0, 27358315.0]',
42: '[3765777.0, 5219438.0, 6509530.0, 9401909.0, 10606015.0, 11550806.0, 12762794.0, 13827315.0, 14042779.0, 15264928.0, 15458075.0, 15925094.0, 16128449.0, 17054858.0, 18055051.0, 18471454.0, 21862046.0, 22293413.0, 22679682.0, 24127226.0, 24176606.0, 24248291.0, 24679083.0, 25083983.0, 25400937.0, 26366826.0, 26985312.0]',
43: '[3775287.0, 3776915.0, 21721135.0, 22104735.0, 22570362.0, 25326934.0, 25584184.0, 25586333.0, 25638943.0, 26759870.0]',
44: '[4173143.0, 7807763.0, 13522010.0, 13654473.0, 13927771.0, 15719616.0, 16249907.0, 16525019.0, 21694632.0, 22093627.0, 22464844.0, 22964817.0, 23061734.0, 23211210.0, 23691361.0, 23831988.0, 23938149.0, 24244391.0, 24684633.0, 25241119.0, 25530551.0, 26801599.0, 27370214.0, 27539801.0, 27556890.0, 46249740.0]',
45: '[7830781.0, 7843024.0, 7852695.0, 8237386.0, 9444575.0, 10762585.0, 21739343.0, 21899596.0, 22200593.0, 23421862.0, 24138149.0, 25127817.0, 26792398.0, 33378328.0]',
46: '[24175325.0, 26752769.0, 26865384.0]',
47: '[3808127.0, 22989911.0, 22991587.0, 24661354.0, 25009434.0]',
48: '[3801540.0, 5986989.0, 7758470.0, 13433718.0, 13869888.0, 13870030.0, 13870091.0, 15253727.0, 15460683.0, 15581976.0, 15640684.0, 16014121.0, 17269442.0, 17330959.0, 18272758.0, 18289278.0, 19819299.0, 22635021.0, 22763032.0, 24234146.0, 25270151.0, 25330011.0, 26481016.0, 26873860.0, 30798811.0]',
49: '[4161793.0, 21787085.0, 22034688.0, 23282114.0, 24428824.0, 25016295.0]',
50: '[3793081.0, 3803264.0, 4207952.0, 11470889.0, 11669056.0, 12523378.0, 12636851.0, 12730154.0, 15584724.0, 16344287.0, 17109625.0, 17721742.0, 17745772.0, 17910462.0, 18186065.0, 18210837.0, 18223914.0, 21639272.0, 22927223.0, 26708844.0, 27047225.0, 27290433.0, 27308607.0, 27314463.0, 27584488.0, 60520854.0]',
51: '[26150927.0, 27292634.0]',
52: '[2092705.0, 2855690.0, 3448135.0, 3808851.0, 4531792.0, 7778731.0, 12783185.0, 17298876.0, 20135092.0, 20175428.0, 20913824.0, 21599292.0, 22046526.0, 22607332.0, 22691016.0, 22787233.0, 22930717.0, 23249413.0, 23308386.0, 23380573.0, 23824923.0, 23929977.0, 23970974.0, 24197297.0, 24485989.0, 25130652.0, 26732210.0, 26735928.0, 26743678.0, 26786285.0, 27584461.0, 29547928.0, 31990350.0, 78669067.0]',
53: '[22113349.0, 26695070.0, 27119373.0, 27493256.0]',
54: '[3777847.0, 3790007.0, 21871161.0, 22030506.0, 22031745.0, 22176213.0, 22401126.0, 23088391.0, 25613851.0, 25646253.0, 26671540.0, 26863907.0, 26903057.0, 27397174.0, 39338541.0]',
55: '[3802927.0, 3823288.0, 4984890.0, 4989432.0, 5073611.0, 5082137.0, 6061217.0, 6348178.0, 6423623.0, 10965588.0, 15797375.0, 18127308.0, 18175653.0, 18289498.0, 18849747.0, 21800742.0, 22397195.0, 23221251.0, 23468869.0, 23690813.0, 24191813.0, 24284509.0, 24708045.0, 24855719.0, 25014176.0, 25360346.0, 26846684.0, 27033183.0, 27275736.0, 27331606.0, 27490188.0, 27535521.0, 27568184.0, 27574439.0, 27578281.0, 27578284.0, 27650233.0, 34549244.0, 34746656.0, 35542271.0, 35736297.0, 36587440.0, 37433822.0, 37967362.0, 38022911.0, 38066849.0, 39925109.0, 46251516.0, 46252778.0, 46252929.0]',
56: '[1343281.0, 1345715.0, 3512210.0, 3783167.0, 4382571.0, 5813114.0, 7093752.0, 8235578.0, 8518638.0, 8783563.0, 8850107.0, 9121566.0, 9923753.0, 9955607.0, 10692798.0, 12383956.0, 12776229.0, 12886199.0, 12969910.0, 14707530.0, 14889080.0, 15072156.0, 19041276.0, 20298361.0, 21688702.0, 21900949.0, 21937269.0, 22104118.0, 22153767.0, 22186346.0, 22826706.0, 22855741.0, 22953235.0, 23004360.0, 23134063.0, 23354534.0, 23591524.0, 24305737.0, 24462242.0, 24489942.0, 24592901.0, 24641378.0, 25198004.0, 25253475.0, 25275454.0, 25432521.0, 25488956.0, 25643518.0, 26068855.0, 26166520.0, 26320235.0, 26328728.0, 26331139.0, 26428311.0, 26693295.0, 26791936.0, 26793455.0, 26961378.0, 26972264.0, 27059428.0, 27157985.0, 27313342.0, 27379089.0, 27395407.0, 27399829.0, 27424041.0, 27424409.0, 27517571.0, 27547373.0, 27584206.0, 28676052.0, 29709654.0, 29765036.0, 30774464.0, 32030450.0, 33159613.0, 33476757.0, 34135377.0, 34193337.0, 34958524.0, 36144355.0, 36567630.0, 36950563.0, 36971922.0, 37494273.0, 37855421.0, 37911312.0, 37989420.0, 38051788.0, 38218330.0, 38345747.0, 38420621.0, 38624732.0, 38823526.0, 38876900.0, 38962587.0, 39101659.0, 39226884.0, 39271180.0, 39387557.0, 39439714.0, 39561752.0, 39643971.0, 39673143.0, 39688790.0, 39748498.0, 39758481.0, 39789493.0, 39832372.0, 40003041.0, 40227969.0, 40380014.0, 40511531.0, 40565551.0, 40567797.0, 40624345.0, 40667466.0, 40824391.0, 40944227.0, 41129307.0, 41210096.0, 41277879.0, 41398494.0, 42073897.0, 42310155.0, 42546349.0, 42727821.0, 42826416.0, 42993293.0, 43014521.0, 43062470.0, 43220481.0, 43223027.0, 43301173.0, 43357321.0, 43478228.0, 43823348.0, 43876770.0, 44319684.0, 44369791.0, 44486085.0, 44531864.0, 45035300.0, 45066335.0, 45493803.0, 45495953.0, 45559863.0, 45925310.0, 45927155.0, 46300493.0, 46328187.0, 46798573.0, 46928022.0, 47018208.0, 47219813.0, 47296708.0, 47882498.0, 48535050.0, 48613244.0, 48692634.0, 49624233.0, 50184623.0, 50773492.0, 50775319.0, 51263061.0, 51581192.0, 51581222.0, 51842981.0, 52278746.0, 52280706.0, 52466999.0, 52544493.0, 52779208.0, 53403873.0, 54287989.0, 54782889.0, 54929359.0, 55019821.0, 55646830.0, 57249384.0, 57249913.0, 57325991.0, 59743243.0]',
57: '[3796196.0, 21858396.0, 25495565.0]',
58: '[3813145.0, 4154951.0, 6018005.0, 6040632.0, 6179742.0, 6395409.0, 6481277.0, 9158815.0, 9288505.0, 10699030.0, 13165538.0, 13755942.0, 14985984.0, 15515377.0, 15951653.0, 21965800.0, 22532548.0, 23301780.0, 23973288.0, 24550262.0, 24731087.0, 24876009.0, 25480283.0, 25489069.0, 26724897.0, 27296379.0, 27358904.0, 27410676.0, 46252098.0]',
59: '[3781927.0, 3789640.0, 3813305.0, 10731687.0, 11027021.0, 20414469.0, 23714925.0, 32595626.0, 33029875.0]',
60: '[3700898.0, 3764296.0, 3770459.0, 3773222.0, 3811210.0, 5987130.0, 6119308.0, 6262275.0, 6409776.0, 6450504.0, 6484157.0, 7640046.0, 7646955.0, 7762359.0, 7812486.0, 7813503.0, 7823236.0, 7886063.0, 8103745.0, 10347742.0, 10563528.0, 11004384.0, 11509383.0, 12543065.0, 12556976.0, 12589238.0, 12653339.0, 12666170.0, 12673679.0, 12702964.0, 14026520.0, 14266412.0, 14271281.0, 14325872.0, 14416179.0, 14516479.0, 14785130.0, 15044247.0, 15383496.0, 16127226.0, 16222285.0, 16960430.0, 17266862.0, 17401011.0, 17461197.0, 17474177.0, 17724327.0, 18063449.0, 18250669.0, 18265166.0, 18426307.0, 19300409.0, 19312456.0, 19372912.0, 19550439.0, 19638358.0, 19704233.0, 21801532.0, 21877403.0, 21974791.0, 22002267.0, 22026693.0, 22067617.0, 22089128.0, 22098429.0, 22164670.0, 22223747.0, 22244680.0, 22276463.0, 22298327.0, 22341037.0, 22385483.0, 22395684.0, 22439618.0, 22676560.0, 22718956.0, 22731313.0, 22904054.0, 22918676.0, 23080548.0, 23084056.0, 23218996.0, 23402016.0, 23423296.0, 23516757.0, 23601888.0, 23628604.0, 23848237.0, 23994110.0, 24030077.0, 24083853.0, 24132340.0, 24248118.0, 24295241.0, 24316904.0, 24422851.0, 24429865.0, 24443752.0, 24547890.0, 24589548.0, 24632640.0, 24741062.0, 24770649.0, 24785182.0, 24828348.0, 24839047.0, 24962082.0, 25028009.0, 25031599.0, 25341468.0, 25342918.0, 25378809.0, 25397848.0, 25410040.0, 25434196.0, 25449992.0, 25470970.0, 25494098.0, 25501373.0, 25514405.0, 25525923.0, 25540364.0, 26040210.0, 26228525.0, 26438189.0, 26450647.0, 26451566.0, 26470665.0, 26486031.0, 26707770.0, 26723069.0, 26723453.0, 26735162.0, 26748272.0, 26754314.0, 26870598.0, 26889379.0, 26889380.0, 26901249.0, 26985941.0, 26989589.0, 27000869.0, 27018916.0, 27025822.0, 27060755.0, 27060756.0, 27218208.0, 27293276.0, 27311622.0, 27316775.0, 27340467.0, 27569697.0, 31501140.0, 34800104.0, 37944191.0, 46149961.0, 46255262.0]',
61: '[3815245.0, 3817049.0, 4133414.0, 4237390.0, 6139410.0, 6302055.0, 6327475.0, 6359463.0, 7761745.0, 10634188.0, 10656776.0, 10799990.0, 11834232.0, 16311228.0, 16686050.0, 17340430.0, 21736076.0, 21792800.0, 22060322.0, 22083057.0, 22105805.0, 22177967.0, 22267098.0, 22415413.0, 22587189.0, 22605414.0, 22605428.0, 22626741.0, 22915051.0, 22915132.0, 22915137.0, 22916043.0, 23096413.0, 23212725.0, 23567105.0, 23567123.0, 23591762.0, 23793319.0, 23812585.0, 24102064.0, 24464348.0, 24622307.0, 25253365.0, 25352342.0, 25353269.0, 25427184.0, 25545290.0, 25671035.0, 26295357.0, 26368255.0, 26595469.0, 26726319.0, 26743135.0, 26822697.0, 26997208.0, 26997210.0, 27015502.0, 27015504.0, 27035582.0, 27038209.0, 27056966.0, 27062452.0, 27081705.0, 27383119.0, 27494547.0, 27547324.0]',
62: '[8193903.0, 8212273.0, 9247849.0, 9463029.0, 10512343.0, 11040434.0, 19848880.0, 21871975.0, 22614354.0, 25182231.0, 25355514.0, 27116547.0]',
63: '[3814657.0, 3816821.0, 3818830.0, 9372780.0, 22791620.0, 22805152.0, 23283422.0, 25248920.0, 25586333.0, 27020756.0, 27125092.0, 27145399.0, 27435241.0, 27449240.0, 27582841.0]',
64: '[3775561.0, 3778209.0, 3780242.0, 3783251.0, 3784665.0, 3788774.0, 3798212.0, 3811858.0, 3812283.0, 21830878.0, 21921748.0, 21993829.0, 22457245.0, 22460889.0, 23262728.0, 23400964.0, 23566456.0, 24092138.0, 24403780.0, 25289929.0, 25369658.0, 25618677.0, 25619320.0, 25629177.0, 25634619.0, 25645458.0, 26901477.0, 27038338.0, 27156461.0, 27158001.0, 27372667.0, 27391046.0, 27503418.0, 27537075.0]',
65: '[3769083.0, 3838826.0, 6518919.0, 7655380.0, 7671393.0, 9161974.0, 11933062.0, 12421582.0, 14111284.0, 15041555.0, 17038380.0, 17934524.0, 17951479.0, 17951704.0, 18736765.0, 21855631.0, 22254687.0, 22522730.0, 22525819.0, 22654614.0, 23072375.0, 23161341.0, 23682934.0, 23928270.0, 24002481.0, 25012845.0, 25464571.0, 25530090.0, 25936857.0, 26407346.0, 26861077.0, 41210539.0]',
66: '[3771617.0, 3807056.0, 8167498.0, 9489516.0, 13059819.0, 15236705.0, 17288890.0, 18106562.0, 18243976.0, 19449212.0, 19549705.0, 20360746.0, 21950670.0, 22523056.0, 22590937.0, 22822082.0, 22985088.0, 23085669.0, 23264894.0, 23454885.0, 23791789.0, 24158232.0, 24239892.0, 24257894.0, 24280874.0, 24434788.0, 24953310.0, 24990933.0, 25037706.0, 26312302.0, 26461656.0, 26569604.0, 26755930.0, 26802300.0, 26860472.0, 26891244.0, 26998345.0, 27036330.0, 27157297.0, 27377463.0]',
67: '[8223754.0, 21700957.0, 22248239.0, 24188773.0, 25199790.0, 25489601.0, 27370550.0]',
68: '[3824061.0, 10778962.0, 27157905.0]',
69: '[3885448.0, 4265687.0, 6453737.0, 15055174.0, 21588115.0, 22803210.0, 22810531.0, 22830406.0, 23778134.0, 23779509.0, 26598222.0, 27395145.0, 27536489.0]',
70: '[3817251.0, 3824297.0, 11604215.0, 13348182.0, 15295862.0, 17007082.0, 19729972.0, 19731450.0, 22867664.0, 23356034.0, 24169834.0, 25375270.0, 26970267.0, 27553681.0, 31500731.0, 31500732.0, 35705261.0]',
71: '[5931149.0, 19811894.0, 19812444.0, 22378265.0, 22409405.0, 23400964.0, 24164668.0, 25377816.0, 25484442.0, 26737825.0, 27395052.0, 27403058.0, 27517636.0]',
72: '[3772180.0, 4094759.0, 4099701.0, 4109923.0, 21758734.0, 22489510.0, 22802791.0, 23109074.0, 23332890.0, 23945495.0, 25404671.0, 26988331.0]',
73: '[22556333.0, 23537378.0, 23653584.0, 26050881.0, 26840895.0, 26877180.0, 27462050.0, 27463470.0]',
74: '[3775845.0, 24206625.0]',
75: '[4064369.0, 4172630.0, 8512849.0, 8513675.0, 10827902.0, 22681078.0, 24186095.0, 24990003.0, 26677157.0]',
76: '[4215108.0, 5754390.0, 6381956.0, 9309964.0, 13707851.0, 22117877.0]',
77: '[10969359.0, 11059344.0, 17714515.0, 19284446.0, 22690303.0, 26320567.0, 26415947.0]',
78: '[3888446.0, 3888996.0, 14727195.0, 22113364.0, 22782837.0, 25044309.0, 25167905.0, 26670443.0]',
79: '[3887054.0, 3889614.0, 3890522.0, 9303701.0, 9484895.0, 11363415.0, 14241244.0, 15291648.0, 16966026.0, 23250732.0, 24016081.0, 24393431.0, 24563127.0, 24788233.0, 25941613.0, 26366102.0, 27392409.0]',
80: '[27415886.0]'}}
within the list cited_docdb_list
, however, there are ids that do not appear id docdb_family_id
. What I would like to do is to detect the number of ids within cited_docdb_list
which also appear in docdb_family_id
. Is there a way to do so? My df is very large actually (almost 700000 observations). Please notice that the type of docdb_family_id
and cited_docdb_list
differs in the data.
The expected outcome, for instance for the first couple of docdb_family_ids
should be:
docdb_family_id nb_included
3498148, 2
3512921, 1
...
where 3498148, 2
comes from the fact that the cited_docdb_list
related to 3498148 cites 2 indices that appear in docdb_family_id
, namely 3802281 and 3944218. In the same fashion, 3512921 cites 3800683 within cited_docdb_list
.
Thank you
CodePudding user response:
First idea is test intersection of set
s with converted lists of strings to list of integers and get length of sets for nb_included
:
import ast
df['cited_docdb_list'] = df['cited_docdb_list'].apply(ast.literal_eval)
sets = set(df['docdb_family_id'])
df['nb_included']=[len(set(map(int,x)).intersection(sets)) for x in df['cited_docdb_list']]
print (df)
docdb_family_id cited_docdb_list \
0 3498148 [3454392.0, 3489764.0, 3492286.0, 3802281.0, 3...
1 3512921 [22785397.0, 3800683.0]
2 3525647 [3508710.0, 3832248.0, 6015961.0, 9173676.0, 2...
3 3636418 [3482303.0, 3518675.0, 3688207.0, 3688953.0, 7...
4 3673165 [7917626.0, 13587294.0, 15860525.0, 16099836.0...
.. ... ...
76 3886195 [4215108.0, 5754390.0, 6381956.0, 9309964.0, 1...
77 3887480 [10969359.0, 11059344.0, 17714515.0, 19284446....
78 3890389 [3888446.0, 3888996.0, 14727195.0, 22113364.0,...
79 3892024 [3887054.0, 3889614.0, 3890522.0, 9303701.0, 9...
80 3944218 [27415886.0]
nb_included
0 2
1 1
2 1
3 1
4 0
.. ...
76 0
77 0
78 0
79 0
80 0
[81 rows x 3 columns]
Pandas solution with DataFrame.explode
and Series.isin
for test membership, last for count True
s aggregate sum
:
df = (df.assign(cited_docdb_list = df['cited_docdb_list'].apply(ast.literal_eval))
.explode('cited_docdb_list')
.astype({'cited_docdb_list':int})
.assign(nb_included=lambda x: x['cited_docdb_list'].isin(x['docdb_family_id']))
.groupby('docdb_family_id', as_index=False)['nb_included']
.sum())
print (df)
docdb_family_id nb_included
0 3498148 2
1 3512921 1
2 3525647 1
3 3636418 1
4 3673165 0
.. ... ...
76 3886195 0
77 3887480 0
78 3890389 0
79 3892024 0
80 3944218 0
[81 rows x 2 columns]