I am trying to split a column from a CSV file. The first column contains a date (YYmmdd) and then time (HHmmss) so the string looks like 20221001131245. I want to split this so it reads 2022 10 01 in one column and then 13:12:45 in another.
I have tried the str.split but I recognise my data isn't in a string so this isn't working.
Here is my code so far:
import pandas as pd
CSVPath = "/Desktop/Test Data.csv"
data = pd.read_csv(CSVPath)
print(data)
CodePudding user response:
To answer the question from your comment:
You can use df.drop(['COLUMN_1', 'COLUMN_2'], axis=1)
to drop unwanted columns.
I am guessing you want to write the data back to a .csv file? Use the following snippet to only write specific columns:
df[['COLUMN_1', 'COLUMN_2']].to_csv("/Desktop/Test Data Edited.csv")
CodePudding user response:
Use to_datetime
combined with strftime
:
# convert to datetime
s = pd.to_datetime(df['col'], format='%Y%m%d%H%M%S')
# or if integer as input
# s = pd.to_datetime(df['col'].astype(str), format='%Y%m%d%H%M%S')
# format strings
df['date'] = s.dt.strftime('%Y %m %d')
df['time'] = s.dt.strftime('%H:%M:%S')
Output:
col date time
0 20221001131245 2022 10 01 13:12:45
alternative
using string slicing and concatenation
s = df['col'].str
df['date'] = s[:4] ' ' s[4:6] ' ' s[6:8]
df['time'] = s[8:10] ':' s[10:12] ':' s[12:]