How can I filter a file based on my input file?

Time:03-06

I have a list of phone numbers (25 million) that I want to use as an input file. Say I have an email-phone database and want to extract only the phone numbers that appear in the input file (25 million). How can I do that in EmEditor, or with any large file?

CodePudding user response:

To extract all matched strings

Suppose you have a 25 million phone number list (file A) and a phone-email database file (file B).

  1. Open file B and use a regular expression to extract the phone numbers only. To do this, press Ctrl+F to bring up the Find dialog box, set the Regular Expressions option, and, depending on the phone number format, enter [0-9]{3}-[0-9]{3}-[0-9]{4} or \([0-9]{3}\)[0-9]{3}-[0-9]{4} in the Find box. Click the Extract button, and save the resulting phone-number-only list as a new file (file C).
  2. Open file A and select Tab (or any CSV format) in the CSV toolbar (or select the Edit menu - CSV - Tab separated).
  3. Open file C and select the same CSV format as in step 2.
  4. Click the Join CSV button on the Sort toolbar (or select Edit menu - CSV - Join) to bring up the Join CSV dialog box.
  5. Select A.txt and set the Unique Key option as CSV Document 1, and select C.txt and set the Unique Key option as CSV Document 2.
  6. Select Whole strings match as the Conditions, and set the Match Case option.
  7. Deselect A.txt from the list box, and ensure C.txt is selected.
  8. Click Join Now. A new document will be created with all matched strings. Save this file as file D.
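
Outside EmEditor, the same extract-and-match idea can be sketched in a few lines of Python. This is only an illustration, not what Join CSV does internally: it assumes file A holds one phone number per line, written in the same format as in file B, and the names load_numbers and matched_numbers are made up for this example.

```python
import re

# Same two formats as the Find patterns in step 1:
# 123-456-7890 or (123)456-7890.
PHONE_RE = re.compile(
    r"[0-9]{3}-[0-9]{3}-[0-9]{4}|\([0-9]{3}\)[0-9]{3}-[0-9]{4}"
)


def load_numbers(lines):
    """Build a lookup set from file A (assumed: one phone number per line)."""
    return {line.strip() for line in lines if line.strip()}


def matched_numbers(db_lines, wanted):
    """Return the phone numbers in file B that also appear in file A.

    The comparison is an exact, case-sensitive string match, similar in
    spirit to "Whole strings match" with Match Case in step 6.
    """
    found = set()
    for line in db_lines:
        for number in PHONE_RE.findall(line):
            if number in wanted:
                found.add(number)
    return found
```

Both functions accept any iterable of lines, so open file objects can be passed directly and file B is read one line at a time. The set built from file A stays in memory, which for 25 million numbers is likely to be on the order of gigabytes.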

To extract all matched lines

If file D is small enough, you can use Advanced Filter to filter file B with the contents of file D.

  1. Copy the contents of file D to the Clipboard. (To do this, open file D in EmEditor and press Ctrl+A, then Ctrl+C.)
  2. Open file B with EmEditor, click Advanced Filter on the Filter toolbar.
  3. Right-click on the list box, and paste the Clipboard contents.
  4. While all items in the list box are selected, make sure the Logical Disjunction (OR) to the Previous Condition option is set.
  5. Click the Filter button, and click the Close button if necessary.
  6. Click the Extract All button to extract all matched lines to a new document.
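
If the list of matched numbers is too large for Advanced Filter, a script is one possible alternative for pulling out whole lines. The sketch below is a different technique from the filter steps above: it extracts the phone numbers from each line of file B and checks them against a set, rather than applying OR-ed filter conditions. The function name matching_lines is hypothetical, and the phone number formats are assumed to be the two used in the Find patterns earlier.

```python
import re

# Assumed formats: 123-456-7890 or (123)456-7890.
PHONE_RE = re.compile(
    r"[0-9]{3}-[0-9]{3}-[0-9]{4}|\([0-9]{3}\)[0-9]{3}-[0-9]{4}"
)


def matching_lines(db_lines, wanted):
    """Yield each line of file B that contains at least one wanted number.

    `wanted` is a set of phone number strings (for example, the contents
    of file A or file D); lines are yielded unchanged, in input order.
    """
    for line in db_lines:
        if any(number in wanted for number in PHONE_RE.findall(line)):
            yield line
```

Because matching_lines is a generator, the matched lines can be written to an output file as they are found, so file B never has to fit in memory; only the set of wanted numbers does.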