Is there a way to count the different items in a text file for every matched string?

The text file looks like this:

data/File_10265.data:

Apple:

Banana:

data/File_10276.data:

Apple:

Banana:

data/File_10278.data:

Apple:

Banana:

The code is as follows:

import re
f = open("Samplefruit.txt", "r")
lines = f.readlines()
Apple_count=0
Banana_count=0
File_count=0
Filename_list=[]
Apple_list=[]
Banana_list=[]
for line in lines:
    match1=re.findall('data/(?P<File>[^\/]+(?=\..*data))',line)
    if match1:
        Filename_list.append(match1[0])
        print('Match found:',match1)           
    if line.startswith("Apple"):
        Apple_count += 1
    elif line.startswith("Banana"):
        Banana_count += 1
    Apple_list.append(Apple_count)
    Banana_list.append(Banana_count)

The desired output:

Filename:File_10265

Apple:2

Banana:2

Filename:File_10276

Apple:3

Banana:4

Filename:File_10278

Apple:1

Banana:4

CodePudding user response：

Maybe there is a more efficient way to do this, but here's one solution:

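with open('filetest.txt') as f:
    lines = f.readlines()

# dict.fromkeys keeps one copy of each line, in first-seen order
unique_lines = list(dict.fromkeys(lines))

# open the output file once, in append mode, instead of once per line
with open('file.txt', 'a') as f1:
    for line in unique_lines:
        # each line still carries its own trailing newline from readlines()
        result = line + str(lines.count(line))
        print(result)
        f1.write(result)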

You simply open the file, read all lines into a list, and remove the duplicates. Then you loop through the de-duplicated list and use the list's .count method to get the number of occurrences of each unique line in the original list.
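
Note that this tallies each line across the whole file rather than per data/File_... section. To make the de-duplicate-then-count step concrete, here is a minimal sketch on a small in-memory list. The sample lines are hypothetical, and collections.Counter is shown as a swapped-in alternative that produces the same tally in a single pass; it is not part of the answer above.

```python
from collections import Counter

# Hypothetical sample standing in for the result of f.readlines()
lines = ["Apple:\n", "Banana:\n", "Apple:\n", "Banana:\n", "Apple:\n"]

# Approach from the answer: de-duplicate (order preserved), then list.count
unique_lines = list(dict.fromkeys(lines))
counts_via_count = {line.strip(): lines.count(line) for line in unique_lines}

# Alternative: Counter tallies every line in one pass over the list
counts_via_counter = Counter(line.strip() for line in lines)

print(counts_via_count)
print(dict(counts_via_counter))
```

Both dictionaries hold the same counts. list.count rescans the whole list for every unique line, so Counter is likely the better fit for large files.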

CodePudding user response：

Try this,

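import itertools
import re

pattern = re.compile(r"data/File_\d+\.data:")

# read the whole input; the file name is the one used in the question
with open("Samplefruit.txt") as f:
    text = f.read()

lines = text.split("\n")
# consecutive non-header lines form one group (key True); header lines get key False
files = itertools.groupby(lines, lambda line: pattern.search(line) is None)

for k, content in files:
  if k:
    content = list(content)
    all_words = list(set(content))
    counts = {word: content.count(word) for word in all_words if word != ""}
    print(counts)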

Output -

{'Banana:': 2, 'Apple:': 2}
{'Banana:': 4, 'Apple:': 3}
{'Banana:': 4, 'Apple:': 1}
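
The dictionaries above no longer say which file they belong to. As a rough sketch of one way to keep that link, the variant below groups on "is this a header line" and remembers the most recent header name. The sample text, the File_1/File_2 names, and the capture group in the pattern are assumptions made for illustration.

```python
import itertools
import re

# Hypothetical in-memory sample in the same layout as the question's file
text = """data/File_1.data:
Apple:
Apple:
Banana:
data/File_2.data:
Banana:
"""

# Assumed header layout; the group captures the bare file name
pattern = re.compile(r"data/(File_\d+)\.data:")

results = {}
current = None
grouped = itertools.groupby(
    text.split("\n"), lambda line: pattern.search(line) is not None
)
for is_header, group in grouped:
    if is_header:
        # a run of header lines: register each, remember the last one
        for line in group:
            current = pattern.search(line).group(1)
            results[current] = {}
    elif current is not None:
        # a run of item lines belonging to the most recent header
        for word in group:
            if word != "":
                results[current][word] = results[current].get(word, 0) + 1

print(results)
```

groupby only merges consecutive lines with the same key, which is why header runs and item runs alternate and each item run can be attached to the header seen just before it.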

CodePudding user response：

Try this:

text = {}

with open("items.txt", "r") as f1:
    for line in f1:
        if ("data" in  line):
            temp_key = line
            k = {}
            text[temp_key] = k
        elif (line.strip() != ""):  
            temp_word = line.strip()
            if temp_word in text[temp_key]:
                text[temp_key][temp_word] += 1
            else:
                text[temp_key][temp_word] = 1
            
final_text = ""

for main_key in text:
    final_text += main_key + "\n"
    for sub_key in text[main_key]:
        final_text += sub_key + " " + str(text[main_key][sub_key]) + "\n\n"

print(final_text)  # print the output (e.g. in the IDLE shell)

with open("new_items.txt", "w") as f2:
    f2.write(final_text)  # write the output to a new file

Output:

data/File_10265.data:

Apple: 2

Banana: 2

data/File_10276.data:

Apple: 3

Banana: 4

data/File_10278.data:

Apple: 1

Banana: 4
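
The output above keeps the full header line and puts a space before each count, whereas the question asks for "Filename:File_10265" and "Apple:2". A minimal sketch of one way to get that format, assuming headers always look like data/<name>.data: and every other non-blank line is an item label. The helper names count_items and format_counts and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumed header layout, based on the sample in the question
HEADER = re.compile(r"data/(?P<name>[^/]+)\.data:")


def count_items(lines):
    """Return {file name: Counter of item labels} for lines in that layout."""
    counts = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = HEADER.match(line)
        if match:
            current = match.group("name")
            counts[current] = Counter()
        elif current is not None:
            # drop the trailing colon so it is not doubled in the output
            counts[current][line.rstrip(":")] += 1
    return counts


def format_counts(counts):
    """Render the counts as 'Filename:<name>' followed by '<item>:<n>' lines."""
    parts = []
    for name, counter in counts.items():
        parts.append(f"Filename:{name}")
        parts.extend(f"{item}:{n}" for item, n in counter.items())
    return "\n".join(parts)


# Hypothetical sample lines; a real run would pass an open file object instead
sample = [
    "data/File_10265.data:\n", "Apple:\n", "\n", "Apple:\n", "Banana:\n",
    "data/File_10276.data:\n", "Banana:\n",
]
report = format_counts(count_items(sample))
print(report)
```

Because count_items only iterates over its argument, it can be fed a file directly, for example with open("Samplefruit.txt") as f: print(format_counts(count_items(f))).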