Manipulating files using Python

Time:09-16

An old friend and I have recently been trying to save our chat histories for nostalgia's sake. Google saves chat history in newest-to-oldest order. I'd like to reverse it to oldest-to-newest and also change the format of the text. Any idea how I can implement this in Python?
For reference, this is what the Hangouts file looks like.

From 1597166228247121622@xxx Sat Dec 30 18:33:39 +0000 2017
X-GM-THRID: 1597166193327506679 
X-Gmail-Labels: chat 
From: Nash  MIME-Version: 1.0 
Content-Type: text/plain

I understand, don't worry

From 1597166202534663022@xxx Sat Dec 30 18:33:06 +0000 2017
X-GM-THRID: 1597166193327506679 
X-Gmail-Labels: Chat 
From: Nash MIME-Version: 1.0 
Content-Type: text/plain

Have a safe trip

From 1588224320874515054@xxx Sat Dec 30 15:45:43 +0000 2017
X-GM-THRID: 1588205400363537982 
X-Gmail-Labels: Chat 
From: Sash  MIME-Version: 1.0 
Content-Type: text/plain

I FEEL YA

From 1588224307132362082@xxx Sat Dec 30 15:45:30 +0000 2017
X-GM-THRID: 1588205400363537982 
X-Gmail-Labels: Chat 
From: Sash  MIME-Version: 1.0 
Content-Type: text/plain

HOW IN THE WORLD ARE ALL OF YOU SHARING THE SAME HOTEL ROOM ??

And this is what I want it to look like.

[25/04/2018, 3:11:11 PM] Sash: pigeons ! 
[25/04/2018, 3:11:24 PM] Nash: pls no 
[25/04/2018, 3:11:55 PM] Nash: dont need em gutur guturs 
[25/04/2018, 3:13:13 PM] Sash: turn it up beetches

CodePudding user response:

You can use the re and datetime modules.

Example for your text:

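import re
import datetime

# `text` is assumed to already hold the full contents of the Hangouts file.
# Blank lines separate header blocks from message bodies, so the pieces
# alternate: headers, message, headers, message, ...
text = text.split('\n\n')

datetime_pattern = r'\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4} \d{4}'
name_pattern = r'From:\s*(\w+)\s+MIME-Version:'

for i in range(0, len(text) - 1, 2):
    # Parse the timestamp from the first header line
    date = datetime.datetime.strptime(re.search(datetime_pattern, text[i]).group(), '%a %b %d %H:%M:%S %z %Y')
    name = re.search(name_pattern, text[i]).group(1)
    message = text[i + 1].strip()
    print('[{0}] {1}: {2}'.format(date.strftime('%d/%m/%Y, %I:%M:%S %p'), name, message))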

The result:

[30/12/2017, 06:33:39 PM] Nash: I understand, don't worry
[30/12/2017, 06:33:06 PM] Nash: Have a safe trip
[30/12/2017, 03:45:43 PM] Sash: I FEEL YA
[30/12/2017, 03:45:30 PM] Sash: HOW IN THE WORLD ARE ALL OF YOU SHARING THE SAME HOTEL ROOM ??
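The result above is still newest-first and the hours are zero-padded, while the question asks for oldest-first and times like 3:11:11 PM. Here is a minimal sketch of one way to cover both. It assumes the export keeps the +0000 offset, that each message is a header block followed by a blank line and a single-paragraph body, and that sender names are a single word. The name format_chat is made up for illustration.

```python
import re
from datetime import datetime

HEADER_DATE = re.compile(r'\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4} \d{4}')
SENDER = re.compile(r'^From:\s*(\w+)', re.MULTILINE)


def format_chat(text):
    """Return the formatted chat lines, oldest first (hypothetical helper)."""
    blocks = [b for b in text.strip().split('\n\n') if b.strip()]
    entries = []
    # Blocks alternate: headers, body, headers, body, ...
    for header, body in zip(blocks[0::2], blocks[1::2]):
        when = datetime.strptime(HEADER_DATE.search(header).group(),
                                 '%a %b %d %H:%M:%S %z %Y')
        name = SENDER.search(header).group(1)
        entries.append((when, name, body.strip()))
    entries.sort(key=lambda entry: entry[0])  # oldest message first
    lines = []
    for when, name, body in entries:
        hour = when.hour % 12 or 12  # 12-hour clock without a leading zero
        stamp = '{:%d/%m/%Y}, {}:{:%M:%S %p}'.format(when, hour, when)
        lines.append('[{}] {}: {}'.format(stamp, name, body))
    return lines
```

Calling print('\n'.join(format_chat(text))) would then print the whole conversation in chronological order. The hour is computed by hand because the strftime flag for unpadded hours differs between platforms.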

CodePudding user response:

You could use Python's re module to extract the information from your chat log and store it in variables. After that, you just need to write the output strings to a new file.

To do that, I would read the whole log and work through it line by line.
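A rough sketch of that line-by-line idea, in case it helps. It assumes each message starts with a "From <id> <date>" line, that the headers end at the first blank line, and that sender names are a single word. The names parse_messages and convert and the file paths passed to them are placeholders, not anything from the export itself.

```python
import re
from datetime import datetime

# Envelope line that starts every message, e.g.
# "From 1597...@xxx Sat Dec 30 18:33:39 +0000 2017"
ENVELOPE = re.compile(
    r'^From \S+ (\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} [+-]\d{4} \d{4})')
SENDER = re.compile(r'^From:\s*(\w+)')


def parse_messages(lines):
    """Collect one dict per message while walking through the lines."""
    messages = []
    current = None  # the message currently being read
    for raw in lines:
        line = raw.rstrip()
        envelope = ENVELOPE.match(line)
        if envelope:  # a new message starts here
            when = datetime.strptime(envelope.group(1),
                                     '%a %b %d %H:%M:%S %z %Y')
            current = {'when': when, 'sender': '', 'body': [],
                       'in_body': False}
            messages.append(current)
        elif current is None:
            continue  # ignore anything before the first envelope line
        elif current['in_body']:
            if line:
                current['body'].append(line)
        elif not line:
            current['in_body'] = True  # first blank line ends the headers
        else:
            sender = SENDER.match(line)
            if sender:
                current['sender'] = sender.group(1)
    return messages


def convert(in_path, out_path):
    """Read the export, sort oldest-first, and write the new file."""
    with open(in_path, encoding='utf-8') as src:
        messages = parse_messages(src)
    messages.sort(key=lambda m: m['when'])
    with open(out_path, 'w', encoding='utf-8') as dst:
        for m in messages:
            dst.write('[{:%d/%m/%Y, %I:%M:%S %p}] {}: {}\n'.format(
                m['when'], m['sender'], ' '.join(m['body'])))
```

Something like convert('hangouts.txt', 'chat.txt') would then produce the reformatted file. Both file names are only examples.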

CodePudding user response:

Here is an example. It can be improved a lot, but it works.

chat_log = r"""From 1597166228247121622@xxx Sat Dec 30 18:33:39 +0000 2017
X-GM-THRID: 1597166193327506679
X-Gmail-Labels: chat
From: Nash  MIME-Version: 1.0
Content-Type: text/plain

I understand, don't worry

From 1597166202534663022@xxx Sat Dec 30 18:33:06 +0000 2017
X-GM-THRID: 1597166193327506679
X-Gmail-Labels: Chat
From: Nash MIME-Version: 1.0
Content-Type: text/plain

Have a safe trip

From 1588224320874515054@xxx Sat Dec 30 15:45:43 +0000 2017
X-GM-THRID: 1588205400363537982
X-Gmail-Labels: Chat
From: Sash  MIME-Version: 1.0
Content-Type: text/plain

I FEEL YA

From 1588224307132362082@xxx Sat Dec 30 15:45:30 +0000 2017
X-GM-THRID: 1588205400363537982
X-Gmail-Labels: Chat
From: Sash  MIME-Version: 1.0
Content-Type: text/plain

HOW IN THE WORLD ARE ALL OF YOU SHARING THE SAME HOTEL ROOM ??"""

# The chat export has to be stored in the variable chat_log
# (str would shadow the built-in type).
# Every message starts with a "From <id>" line, so split on that marker.
messages = chat_log.split("From ")

for message in messages:
    if message != "":
        message_lines = message.split("\n")
        first_line = message_lines[0]
        # parts: id, weekday, month, day, time, UTC offset, year
        parts = first_line.split()
        date = "[" + parts[3] + "/" + parts[2] + "/" + parts[6] + " " + parts[4] + "]"
        # Index 6 holds the message text: five header lines, then a blank line
        message_text = message_lines[6]
        print(date + " " + message_text)
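One possible improvement, sketched with string methods only: add the sender, turn the month name into a number, and reverse the order. It relies on the export being strictly newest-first, so reversed() is enough and no dates need to be parsed. It also assumes that no message body has a line starting with "From ". The names MONTHS and split_messages are made up here, and the time stays in 24-hour form.

```python
MONTHS = {name: number for number, name in enumerate(
    ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], start=1)}


def split_messages(chat_log):
    """Yield formatted lines, oldest first, without regex or datetime."""
    # Splitting on "\nFrom " (note the space) leaves "From:" headers intact.
    chunks = ('\n' + chat_log.strip()).split('\nFrom ')[1:]
    for chunk in reversed(chunks):  # the export is newest-first
        headers, _, body = chunk.partition('\n\n')
        header_lines = headers.split('\n')
        # Envelope line: id, weekday, month, day, time, UTC offset, year
        _, _, month, day, clock, _, year = header_lines[0].split()
        sender = ''
        for line in header_lines[1:]:
            if line.startswith('From:'):
                # First word after "From:" is taken as the sender name
                sender = line[len('From:'):].split()[0]
        yield '[{:02d}/{:02d}/{}, {}] {}: {}'.format(
            int(day), MONTHS[month], year, clock, sender, body.strip())
```

Looping over split_messages(chat_log) and printing each item gives one line per message, for example with print('\n'.join(split_messages(chat_log))).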
    