check if column is blank in pandas dataframe

Time:10-14

I have the following CSV file:

A|B|C
1100|8718|2021-11-21
1104|21|

I want to create a dataframe that gives me the date output as follows:

            A       B                C
0        1100    8718   20211121000000
1        1104      21   ""

This means

if C is empty:
    put doublequotes
else:
    format date to yyyymmddhhmmss (adding 0s to hhmmss)

My code:

df['C'] = np.where(df['C'].empty, df['C'].str.replace('', '""'), df['C'] + '000000')

but it gives me the following:

      A      B            C
0  1100   8718   2021-11-21
1  1104     21            0

I have tried another piece of code:

if df['C'].empty:
    df['C'] = df['C'].str.replace('', '""')
else:
    df['C'] = df['C'].str.replace('-', '') + '000000'

OUTPUT:

      A      B                 C
0  1100   8718    20211121000000
1  1104     21           0000000
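
Both attempts test df['C'].empty, which is a single boolean for the whole Series (True only when the Series has no elements), not a per-row check. The following is a minimal sketch of an element-wise version of the same idea. It assumes the column is read as plain strings, so the blank field stays as '' rather than becoming NaN; the in-memory CSV_TEXT and the is_blank name are illustrative and not part of the original code.

```python
import io

import numpy as np
import pandas as pd

# Sample data from the question, held in memory (illustrative stand-in for the file)
CSV_TEXT = "A|B|C\n1100|8718|2021-11-21\n1104|21|\n"

# keep_default_na=False keeps the blank field as '' instead of NaN,
# and dtype=str for C keeps the dates as text
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    sep="|",
    dtype={"C": str},
    keep_default_na=False,
)

# Element-wise mask: True for rows where C is blank
is_blank = df["C"] == ""

# Blank rows get the two-quote placeholder; other rows lose the dashes and gain hhmmss zeros
df["C"] = np.where(
    is_blank,
    '""',
    df["C"].str.replace("-", "", regex=False) + "000000",
)
print(df)
```

The mask is computed before the column is overwritten, so each row is decided by its original value.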

CodePudding user response:

Use dt.strftime:

df = pd.read_csv('data.csv', sep='|', parse_dates=['C'])
df['C'] = df['C'].dt.strftime('%Y%m%d%H%M%S').fillna('""')
print(df)

# Output:
      A     B               C
0  1100  8718  20211121000000
1  1104    21              ""
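
For anyone who wants to try this without a data.csv on disk, here is a self-contained sketch of the same approach that reads the question's sample from an in-memory string. The CSV_TEXT constant and the load_and_format helper are hypothetical names introduced only for this example.

```python
import io

import pandas as pd

# Sample data from the question (illustrative stand-in for data.csv)
CSV_TEXT = "A|B|C\n1100|8718|2021-11-21\n1104|21|\n"


def load_and_format(text):
    """Read pipe-separated text and format column C as yyyymmddhhmmss."""
    # parse_dates converts C to datetime; the blank field becomes NaT
    df = pd.read_csv(io.StringIO(text), sep="|", parse_dates=["C"])
    # strftime leaves NaT as a missing value, which fillna then replaces
    df["C"] = df["C"].dt.strftime("%Y%m%d%H%M%S").fillna('""')
    return df


result = load_and_format(CSV_TEXT)
print(result)
```

Because the dates carry no time component, the %H%M%S part comes out as 000000, which supplies the padding the question asks for.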

CodePudding user response:

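A good way would be to convert the column to datetime using pd.to_datetime with the parameter errors='coerce', then drop the resulting NaT values. Note that an integer is interpreted as nanoseconds since the Unix epoch, which is why 'one' comes out as a 1970 timestamp below.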

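import pandas as pd

# One-row frame: an integer, an unparseable string, and a date string
x = pd.DataFrame({
    'one': 20211121000000,
    'two': 'not true',
    'three': '20211230'
}, index=[1])

# Coerce each column to datetime; unparseable values become NaT and are dropped
x.apply(lambda col: pd.to_datetime(col, errors='coerce')).T.dropna()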

# Output: 

                            1
one   1970-01-01 05:36:51.121
three 2021-12-30 00:00:00.000
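
Applied to the question's column, the same errors='coerce' idea might look like the sketch below. It builds the frame inline instead of reading the file, and it fills the missing values with the two-quote placeholder rather than dropping them, since the question wants to keep those rows. One caveat under this approach: any non-blank value that fails to parse would also end up as "".

```python
import pandas as pd

# The question's data built inline; the blank date is an empty string
df = pd.DataFrame({
    "A": [1100, 1104],
    "B": [8718, 21],
    "C": ["2021-11-21", ""],
})

# errors='coerce' turns anything unparseable, including '', into NaT
parsed = pd.to_datetime(df["C"], errors="coerce")

# Format valid dates as yyyymmddhhmmss and replace missing ones with ""
df["C"] = parsed.dt.strftime("%Y%m%d%H%M%S").fillna('""')
print(df)
```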