The text file looks like this:
data/File_10265.data:
Apple:
Apple:
Banana:
Banana:
data/File_10276.data:
Apple:
Apple:
Apple:
Banana:
Banana:
Banana:
Banana:
data/File_10278.data:
Apple:
Banana:
Banana:
Banana:
Banana:
The code is as follows:
import re

f = open("Samplefruit.txt", "r")
lines = f.readlines()
Apple_count = 0
Banana_count = 0
File_count = 0
Filename_list = []
Apple_list = []
Banana_list = []
for line in lines:
    # capture the file name between "data/" and ".data"
    match1 = re.findall(r'data/(?P<File>[^\/]+(?=\..*data))', line)
    if match1:
        Filename_list.append(match1[0])
        print('Match found:', match1)
    if line.startswith("Apple"):
        Apple_count += 1
    elif line.startswith("Banana"):
        Banana_count += 1
Apple_list.append(Apple_count)
Banana_list.append(Banana_count)
The desired output:
Filename:File_10265
Apple:2
Banana:2
Filename:File_10276
Apple:3
Banana:4
Filename:File_10278
Apple:1
Banana:4
CodePudding user response:
Maybe there is a more efficient way to do this, but here's one solution:
with open('filetest.txt') as f:
    lines = f.readlines()
# dict.fromkeys keeps the first occurrence of each line, in order
unique_lines = list(dict.fromkeys(lines))
with open('file.txt', 'a') as f1:
    for line in unique_lines:
        # each line still ends with '\n', so strip it before appending the count
        entry = line.strip() + str(lines.count(line))
        print(entry)
        f1.write(entry + '\n')
You simply open the file, read all lines into a list, and remove the duplicates. Then you loop through the deduplicated list and use the list's .count() method to get the number of occurrences of each unique line. Note that this counts across the whole file, so the totals are not split per data file.
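Since the answer notes that a more efficient way may exist, here is a minimal sketch of the same counting idea using collections.Counter, which tallies every line in a single pass instead of calling .count() once per unique line. The function name count_lines and the sample list are hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_lines(lines):
    """Return {line: occurrences} for non-empty lines, in first-seen order."""
    # Strip the trailing newline and skip blank lines before counting.
    stripped = [line.strip() for line in lines if line.strip()]
    # Counter is a dict subclass, so it keeps insertion order (Python 3.7+).
    return dict(Counter(stripped))


# Hypothetical sample; in practice this would be f.readlines().
sample = ["Apple:\n", "Apple:\n", "Banana:\n", "Apple:\n"]
for item, n in count_lines(sample).items():
    print(item + str(n))
```

Like the answer it follows, this counts over the whole list, so the input would still need to be split per data file first.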
CodePudding user response:
Try this,
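import itertools
import re

# text is assumed to hold the contents of the input file as one string
pattern = re.compile(r"data/File_[\d]+\.data:")
lines = text.split("\n")
# consecutive lines are grouped by whether they are NOT a file header
files = itertools.groupby(lines, lambda line: pattern.search(line) is None)
for k, content in files:
    if k:
        content = list(content)
        all_words = list(set(content))
        counts = {word: content.count(word) for word in all_words if word != ""}
        print(counts)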
Output -
{'Banana:': 2, 'Apple:': 2}
{'Banana:': 4, 'Apple:': 3}
{'Banana:': 4, 'Apple:': 1}
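The output above no longer shows which file each dictionary belongs to. As a rough sketch, the same itertools.groupby idea can keep the header of each run and use it as a key. The names HEADER_RE and counts_by_file are hypothetical, and the header pattern is assumed from the sample input.

```python
import itertools
import re

# Assumed header shape, based on the sample input: data/File_<digits>.data:
HEADER_RE = re.compile(r"data/(File_\d+)\.data:")


def counts_by_file(text):
    """Map each file name to a {line: count} dict for the lines under it."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    result = {}
    current = None
    # groupby yields alternating runs of header lines and content lines.
    for is_header, run in itertools.groupby(
        lines, key=lambda ln: HEADER_RE.search(ln) is not None
    ):
        if is_header:
            # Register every header in the run; the last one owns what follows.
            for ln in run:
                current = HEADER_RE.search(ln).group(1)
                result.setdefault(current, {})
        elif current is not None:
            for word in run:
                result[current][word] = result[current].get(word, 0) + 1
    return result


sample_text = "data/File_10265.data:\nApple:\nApple:\nBanana:\nBanana:\n"
print(counts_by_file(sample_text))
```

Content lines that appear before any header are ignored in this sketch, since there is no file to attribute them to.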
CodePudding user response:
Try this:
text = {}
with open("items.txt", "r") as f1:
    for line in f1:
        if ("data" in line):
            # a header line starts a new group of counts
            temp_key = line
            k = {}
            text[temp_key] = k
        elif (line.strip() != ""):
            temp_word = line.strip()
            if temp_word in text[temp_key]:
                text[temp_key][temp_word] += 1
            else:
                text[temp_key][temp_word] = 1
final_text = ""
for main_key in text:
    final_text += main_key + "\n"
    for sub_key in text[main_key]:
        final_text += sub_key + " " + str(text[main_key][sub_key]) + "\n\n"
print(final_text)  # print the output (e.g. in IDLE)
with open("new_items.txt", "w") as f2:
    f2.write(final_text)  # write the output to a new file
Output:
data/File_10265.data:
Apple: 2
Banana: 2
data/File_10276.data:
Apple: 3
Banana: 4
data/File_10278.data:
Apple: 1
Banana: 4
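The output above keeps the full header line, while the question asks for "Filename:File_10265" followed by "Apple:2" style lines. A minimal sketch of that formatting step follows, assuming headers always look like data/<name>.data: and that item lines end with a colon. The names NAME_RE and format_report are hypothetical.

```python
import re

# Assumed header shape from the sample: data/<name>.data:
NAME_RE = re.compile(r"data/(?P<name>[^/]+)\.data:")


def format_report(lines):
    """Build the 'Filename:... / Apple:N / Banana:N' report as one string."""
    report = {}  # file name -> {label: count}, in order of appearance
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = NAME_RE.match(line)
        if header:
            current = header.group("name")
            report[current] = {}
        elif current is not None:
            label = line.rstrip(":")  # "Apple:" -> "Apple"
            report[current][label] = report[current].get(label, 0) + 1
    out = []
    for name, counts in report.items():
        out.append(f"Filename:{name}")
        out.extend(f"{label}:{n}" for label, n in counts.items())
    return "\n".join(out)


sample = ["data/File_10265.data:\n", "Apple:\n", "Apple:\n", "Banana:\n", "Banana:\n"]
print(format_report(sample))
```

Because the counts are stored per label rather than in fixed Apple and Banana variables, other item names in the file would be reported without changing the code.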