I have a list like the one below. Each element is a cluster made up of one withdrawal and the deposits under it. I want to filter the deposits under each withdrawal so that a deposit that already appears in one withdrawal cluster does not appear in any other cluster. The list currently contains 2 withdrawal clusters, but that number can vary. I tried various lambda functions over the deposit id, but I could not get the desired output. How can I achieve this?
exampleList = [
{
"withdrawal": {
"amount": 250,
"id": 70916631583,
"date": "31-05-22 - 16:14:08",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71018974368,
"amount": 120,
"date": "01-06-22 - 14:27:26",
"paytype": "deposit"
},
{
"id": 71018971332,
"amount": 100,
"date": "01-06-22 - 14:27:23",
"paytype": "deposit"
}
]
},
{
"withdrawal": {
"amount": 220,
"id": 71019072820,
"date": "01-06-22 - 14:28:40",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71033338591,
"amount": 100,
"date": "01-06-22 - 17:03:19",
"paytype": "deposit"
},
{
"id": 71033144597,
"amount": 250,
"date": "01-06-22 - 17:01:20",
"paytype": "deposit"
},
{
"id": 71018974368,
"amount": 120,
"date": "01-06-22 - 14:27:26",
"paytype": "deposit"
},
{
"id": 71018971332,
"amount": 100,
"date": "01-06-22 - 14:27:23",
"paytype": "deposit"
}
]
}
]
Example Output:
exampleOutputList = [
{
"withdrawal": {
"amount": 250,
"id": 70916631583,
"date": "31-05-22 - 16:14:08",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71018974368,
"amount": 120,
"date": "01-06-22 - 14:27:26",
"paytype": "deposit"
},
{
"id": 71018971332,
"amount": 100,
"date": "01-06-22 - 14:27:23",
"paytype": "deposit"
}
]
},
{
"withdrawal": {
"amount": 220,
"id": 71019072820,
"date": "01-06-22 - 14:28:40",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71033338591,
"amount": 100,
"date": "01-06-22 - 17:03:19",
"paytype": "deposit"
},
{
"id": 71033144597,
"amount": 250,
"date": "01-06-22 - 17:01:20",
"paytype": "deposit"
}
]
}
]
The deposits with id 71018974368 and 71018971332 do not appear in the second cluster of the example output because they already appeared in the previous withdrawal cluster. This is exactly what I want. There can be more than 2 withdrawal clusters, so indexing specific elements will not solve my problem.
I tried something like this. I expected it to collect the ids into an empty list and filter against that list inside the loop, but the output did not change.
listLen = len(exampleList)
testList = []
if(listLen > 0):
    while listLen > 0:
        listLen -= 1
        deposits = exampleList[listLen]['deposit']
        withDrawal = exampleList[listLen]['withdrawal']
        idList = [x['id'] for x in deposits]
        filterFromList = list(filter(lambda x:x['id'] not in testList, deposits))
        testList.append({"withdrawal" : withDrawal,"deposit" : filterFromList})
print(testList)
Output
[{'withdrawal': {'amount': 220, 'id': 71019072820, 'date': '01-06-22 - 14:28:40', 'paytype': 'withdrawal'}, 'deposit': [{'id': 71033338591, 'amount': 100, 'date': '01-06-22 - 17:03:19', 'paytype': 'deposit'}, {'id': 71033144597, 'amount': 250, 'date': '01-06-22 - 17:01:20', 'paytype': 'deposit'}, {'id': 71018974368, 'amount': 120, 'date': '01-06-22 - 14:27:26', 'paytype': 'deposit'}, {'id': 71018971332, 'amount': 100, 'date': '01-06-22 - 14:27:23', 'paytype': 'deposit'}]}, {'withdrawal': {'amount': 250, 'id': 70916631583, 'date': '31-05-22 - 16:14:08', 'paytype': 'withdrawal'}, 'deposit': [{'id': 71018974368, 'amount': 120, 'date': '01-06-22 - 14:27:26', 'paytype': 'deposit'}, {'id': 71018971332, 'amount': 100, 'date': '01-06-22 - 14:27:23', 'paytype': 'deposit'}]}]
As the output shows, the repeated deposit ids and elements are still there.
CodePudding user response:
List = [
{
"withdrawal": {
"amount": 250,
"id": 70916631583,
"date": "31-05-22 - 16:14:08",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71018974368,
"amount": 120,
"date": "01-06-22 - 14:27:26",
"paytype": "deposit"
},
{
"id": 71018971332,
"amount": 100,
"date": "01-06-22 - 14:27:23",
"paytype": "deposit"
}
]
},
{
"withdrawal": {
"amount": 220,
"id": 71019072820,
"date": "01-06-22 - 14:28:40",
"paytype": "withdrawal"
},
"deposit": [
{
"id": 71033338591,
"amount": 100,
"date": "01-06-22 - 17:03:19",
"paytype": "deposit"
},
{
"id": 71033144597,
"amount": 250,
"date": "01-06-22 - 17:01:20",
"paytype": "deposit"
},
{
"id": 71018974368,
"amount": 120,
"date": "01-06-22 - 14:27:26",
"paytype": "deposit"
},
{
"id": 71018971332,
"amount": 100,
"date": "01-06-22 - 14:27:23",
"paytype": "deposit"
}
]
}
]
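def func(d):
    # Walk the nested structure and drop list items whose id is in the tuple below.
    # Note: the ids are hard-coded, and matches are removed from every deposit
    # list, including the first cluster.
    if type(d)==list:
        # Iterate backwards so that pop(i) does not shift the items still to visit.
        for i in reversed(range(len(d))):
            v=d[i]
            if v.get('id') in (71018974368,
                               71018971332):
                d.pop(i)
            else:
                func(v)
    elif type(d)==dict:
        for k,v in d.items():
            func(v)
func(List)
print(List)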
CodePudding user response:
You could keep a set of already-seen ids as you traverse the data. For each cluster, build a side list of the deposits whose ids have not been seen yet, and replace the "deposit" list with it before advancing to the next cluster. This is a lot easier than trying to track indexes of the nested collections.
seen = set()
for cluster in exampleList:
    filtered = []
    for deposit in cluster["deposit"]:
        if deposit["id"] not in seen:
            seen.add(deposit["id"])
            filtered.append(deposit)
    # Slice assignment replaces the contents of the existing list in place.
    cluster["deposit"][:] = filtered