Find matching emails and add a column

I have two files. The first one contains emails and credentials:

[email protected]:Xavier
[email protected]:vocojydu
[email protected]:voluzigy
[email protected]:Pussycat5
[email protected]:xrhj1971
[email protected]:xrhj1971

The second one contains emails and locations:

[email protected]:BOSNIA
[email protected]:ROMANIA

Whenever an email from the first file is found in the second file, I want the row replaced by EMAIL:CREDENTIAL:LOCATION, and when it is not found, by EMAIL:CREDENTIAL:BLANK

So the final file should look like this:

[email protected]:Xavier:BOSNIA
[email protected]:vocojydu:BLANK
[email protected]:voluzigy:ROMANIA
[email protected]:Pussycat5:BLANK
[email protected]:xrhj1971:BLANK
[email protected]:xrhj1971:BLANK

I have made several attempts in Python, but they are not worth posting because I am not really close to a solution.

Regards!

EDIT:

This is what I tried:

import os
import sys


with open("test.txt", "r") as a_file:

    for line_a in a_file:

        stripped_email_a = line_a.strip().split(':')[0]

        with open("location.txt", "r") as b_file:

            for line_b in b_file:

                stripped_email_b = line_b.strip().split(':')[0]
                location = line_b.strip().split(':')[1]

                if stripped_email_a == stripped_email_b:
                    a = line_a + ":" + location
                    print(a.replace("\n",""))
                else:
                    b = line_a + ":BLANK"
                    print(b.replace("\n",""))

This is the result I get:

[email protected]:Xavier:BOSNIA
[email protected]:Xavier:BLANK
[email protected]:voluzigy:BLANK
[email protected]:voluzigy:ROMANIA
[email protected]:vocojydu:BLANK
[email protected]:vocojydu:BLANK
[email protected]:Pussycat5:BLANK
[email protected]:Pussycat5:BLANK
[email protected]:xrhj1971:BLANK
[email protected]:xrhj1971:BLANK
[email protected]:xrhj1971:BLANK
[email protected]:xrhj1971:BLANK

I am very close but I get duplicates ;)

Regards

CodePudding user response:

The duplicates come from reading the two files in a nested way: for every line of test.txt you reopen location.txt and print one result for each of its lines. With two lines in location.txt, each line of test.txt is therefore printed twice, once per comparison, whether or not it matches.
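To make the effect concrete, here is a minimal sketch that uses in-memory lists in place of the two files. The addresses and passwords are made-up placeholders, not the ones from the question. It only illustrates that the nested loop produces one output line per pair of rows.

```python
# Hypothetical sample rows standing in for test.txt and location.txt.
creds = ["a@example.com:pw1", "b@example.com:pw2", "c@example.com:pw3"]
locations = ["a@example.com:BOSNIA", "c@example.com:ROMANIA"]

nested_output = []
for line_a in creds:
    email_a = line_a.split(":")[0]
    for line_b in locations:
        email_b, location = line_b.split(":")
        # A result is emitted for every (line_a, line_b) pair,
        # so each credential row shows up once per location row.
        if email_a == email_b:
            nested_output.append(line_a + ":" + location)
        else:
            nested_output.append(line_a + ":BLANK")

# Number of output lines equals len(creds) * len(locations).
print(len(nested_output))
```

With 3 credential rows and 2 location rows this yields 6 lines instead of the 3 that are wanted.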

Instead, first read everything you need from location.txt into a dictionary, then look each email up in it while reading test.txt:

# Build an email -> location lookup from location.txt (read only once).
email_loc_dict = {}
with open("location.txt", "r") as b_file:
    for line_b in b_file:
        splits = line_b.strip().split(':')
        email_loc_dict[splits[0]] = splits[1]

# Go through test.txt once and append the location, or BLANK if unknown.
with open("test.txt", "r") as a_file:
    for line_a in a_file:
        line_a = line_a.strip()
        stripped_email_a = line_a.split(':')[0]
        if stripped_email_a in email_loc_dict:
            a = line_a + ":" + email_loc_dict[stripped_email_a]
            print(a)
        else:
            b = line_a + ":BLANK"
            print(b)

Output:

[email protected]:Xavier:BOSNIA
[email protected]:vocojydu:BLANK
[email protected]:voluzigy:ROMANIA
[email protected]:Pussycat5:BLANK
[email protected]:xrhj1971:BLANK
[email protected]:xrhj1971:BLANK
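
If a reusable version is preferred, the same dictionary idea can be sketched as a small function. This is only one possible variant under some assumptions: the function name merge_locations and the sample rows are hypothetical, blank lines are skipped, and rows are split at the first colon only, so a password containing ":" does not change the email key.

```python
def merge_locations(cred_lines, loc_lines, default="BLANK"):
    """Append each email's location (or `default`) to its credential row."""
    email_to_location = {}
    for line in loc_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # partition splits at the first ":" only
        email, _, location = line.partition(":")
        email_to_location[email] = location

    merged = []
    for line in cred_lines:
        line = line.strip()
        if not line:
            continue
        email = line.partition(":")[0]
        # dict.get returns `default` when the email has no known location
        merged.append(line + ":" + email_to_location.get(email, default))
    return merged


# Hypothetical sample rows; real code would pass the two open files instead.
creds = ["a@example.com:pw1\n", "b@example.com:pw2\n", "c@example.com:pw3\n"]
locations = ["a@example.com:BOSNIA\n", "c@example.com:ROMANIA\n"]

result = merge_locations(creds, locations)
print("\n".join(result))
```

Since open file objects are iterable line by line, they can be passed in place of the lists, and the returned rows can then be written to an output file to get the final file asked for in the question.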