Home > Blockchain >  Python Pandas - Splitting a column
Python Pandas - Splitting a column

Time:01-27

I am trying to split a column from a CSV file. The first column contains a date (YYmmdd) and then time (HHmmss) so the string looks like 20221001131245. I want to split this so it reads 2022 10 01 in one column and then 13:12:45 in another.

I have tried the str.split but I recognise my data isn't in a string so this isn't working.

Here is my code so far:

import pandas as pd
 
CSVPath = "/Desktop/Test Data.csv"
data = pd.read_csv(CSVPath)
 
print(data)

CodePudding user response:

To answer the question from your comment:

You can use df.drop(['COLUMN_1', 'COLUMN_2'], axis=1) to drop unwanted columns.

I am guessing you want to write the data back to a .csv file? Use the following snippet to only write specific columns:

df[['COLUMN_1', 'COLUMN_2']].to_csv("/Desktop/Test Data Edited.csv") 

CodePudding user response:

Use to_datetime combined with strftime:

# convert to datetime
s = pd.to_datetime(df['col'], format='%Y%m%d%H%M%S')
# or if integer as input
# s = pd.to_datetime(df['col'].astype(str), format='%Y%m%d%H%M%S')

# format strings
df['date'] = s.dt.strftime('%Y %m %d')
df['time'] = s.dt.strftime('%H:%M:%S')

Output:

              col        date      time
0  20221001131245  2022 10 01  13:12:45

alternative

using string slicing and concatenation

s = df['col'].str
df['date'] = s[:4] ' ' s[4:6] ' ' s[6:8]
df['time'] = s[8:10] ':' s[10:12] ':' s[12:]
  • Related