Home > Enterprise >  How to write a text file to CSV based on appearance of keyword
How to write a text file to CSV based on appearance of keyword

Time:09-02

My text file looks like this. It is o/p from some program

**********End Transcription Grammar correction log***********
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.29k/1.29k [00:00<00:00, 480kB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 850M/850M [00:30<00:00, 29.0MB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 773k/773k [00:00<00:00, 3.66MB/s]
Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.32M/1.32M [00:00<00:00, 7.99MB/s]
Line # 1
I settled the first question: "I have fit you n if you could choose one word to describe your experience of being pregnant, what would it be otherwise it would be lucky."
Line # 2
A because I had several pregnancy losses I'm prior to conceiving my daughter, and then this permetsii and I'm thinking about you, how do you think that things will change once you?
Line # 3
become apparent again Amen I (Thinka) having two children is going to be a lot more busy I'm going to have to really focus on staying organized a because it's hard enough.
Line # 4
With a toddler, O and my husband and I are going to really have to work as a team we already do, but it's going to take it to a different level.
Line # 5
SM O Andte Have you thought about things like breast eating, sleeping arrangements or child cae, yes, and a if you could elaborate on is O sura so I planned to you.
Line # 6
Nursing and My Daughter: My nurse to my my M toddler until she was about nine months UM and I plan to try to go as long as I can with this.

what I was trying to do is saving the this content in a csv file based on condation that it should look lile this....

num           text

Line #1       I settled the first question: "I have fit you n if you could choose one word to 
              describe your experience of being pregnant, what would it be otherwise it would be lucky."


Line #2      A because I had several pregnancy losses I'm prior to conceiving my daughter, 
                  and then this permetsii and I'm thinking about you, how do you think that 
              things will change once you?

''''........................
Line n       some text

code I've tried

content = ''
with open(r"output.txt", 'r') as txt_file:
  content = txt_file.read()

blocks = content.split('Line #')

csv_content = ''
for block in blocks:
  if block != '':
    csv_content  = 'Line # %s\n' % ' | '.join(block.splitlines())
    print(csv_content)

with open('csv_file.csv', 'w') as csv_file:
  csv_file.write(csv_content)

output I got

enter image description here

output not even splitting for keyword I want Line#. Am i missing anything here?. Any suggestions would be helpful

CodePudding user response:

You could use a regex to match your file contents, using re.findall to then generate a list of tuples of line number and contents. This can then easily be written to a csv file:

import re
import csv

with open(r"output.txt", 'r') as txt_file:
    content = txt_file.read()

blocks = re.findall(r'(Line #\s*\d )\n(.*?)(?=\nLine #\s*\d|$)', content, re.DOTALL)

with open('csv_file.csv', 'w') as csv_file:
    writer = csv.writer(csv_file, lineterminator='\n')
    writer.writerows(blocks)

Output:

Line # 1,"I settled the first question: ""I have fit you n if you could choose one word to describe your experience of being pregnant, what would it be otherwise it would be lucky."""
Line # 2,"A because I had several pregnancy losses I'm prior to conceiving my daughter, and then this permetsii and I'm thinking about you, how do you think that things will change once you?"
Line # 3,become apparent again Amen I (Thinka) having two children is going to be a lot more busy I'm going to have to really focus on staying organized a because it's hard enough.
Line # 4,"With a toddler, O and my husband and I are going to really have to work as a team we already do, but it's going to take it to a different level."
Line # 5,"SM O Andte Have you thought about things like breast eating, sleeping arrangements or child cae, yes, and a if you could elaborate on is O sura so I planned to you."
Line # 6,Nursing and My Daughter: My nurse to my my M toddler until she was about nine months UM and I plan to try to go as long as I can with this.

CodePudding user response:

"csv" means comma separated values. When you open your file in Excel, it is splitting at commas, which is why it starts a new cell after 'pregnant'.

It looks like you want to separate with the pipe character |. You can do this, but you need to specify it when opening the file in Excel by going to Data ribbon > Get Data > From File > From Text/CSV and setting | as a custom delimiter.

Also, you'll need newlines in your file to indicate new rows. Try the following:

content = ''
with open(r"output.txt", 'r') as txt_file:
content = txt_file.read()

lines = content.splitlines()

csv_content = ''
for i in range(0, len(lines), 2):
    csv_content = csv_content   lines[i]   ' | '   lines[i 1]   '\n'

with open('csv_file.csv', 'w') as csv_file:
csv_file.write(csv_content)

Running this and then opening in Excel as described above should get the output you're expecting for your example text, but if you actually want csv you should look at the csv module which will handle proper escaping of commas that appear within the text.

  • Related