How to specify default value when constructing Pandas Dataframe from two series (index and columns)?

Time:08-25

I'm trying to construct a boolean 2D DataFrame with every value initialized to False. The following code fills it with True instead:

import pandas as pd
from datetime import date

date_start = date(2022, 1, 1)
date_end = date(2022, 8, 24)
valid_dates = pd.bdate_range(date_start, date_end)
cols = range(0,4)
df = pd.DataFrame(index=valid_dates, columns=cols, dtype='bool')

I know I can do the following to replace the values with False, but it takes significantly longer:

df = df.replace(df, False)

My actual DataFrame is much wider, e.g. ~500 columns. Is there a way to just initialize the DataFrame to False?

Thank you to @ivanp.

This is a working version that sets the DataFrame to False, using my previous example and @ivanp's solution:

import pandas as pd
import numpy as np 
from datetime import date

date_start = date(2022, 1, 2)
date_end = date(2022, 8, 24)
valid_dates = pd.bdate_range(date_start, date_end)
cols = range(0, 500)
df = pd.DataFrame(data=np.full((len(valid_dates), len(cols)), False), index=valid_dates, columns=cols)
print(df)
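
A possible shortcut, offered as a sketch rather than as part of @ivanp's solution: the DataFrame constructor also accepts a scalar as data when both index and columns are given, and broadcasts it to every cell. The names df_scalar and df_full are only illustrative. The example reuses the dates and columns from above and compares the result with the np.full version:

```python
import numpy as np
import pandas as pd
from datetime import date

valid_dates = pd.bdate_range(date(2022, 1, 1), date(2022, 8, 24))
cols = range(0, 500)

# Scalar data is broadcast to the full (index x columns) shape.
df_scalar = pd.DataFrame(False, index=valid_dates, columns=cols)

# The same frame built from an explicit NumPy array, for comparison.
df_full = pd.DataFrame(np.full((len(valid_dates), len(cols)), False),
                       index=valid_dates, columns=cols)

# Check that both approaches produce an identical all-False boolean frame.
print(df_scalar.to_numpy().dtype)
print(df_scalar.equals(df_full))
```

Either form avoids the second pass that replace needs, since the values are False from the moment the frame is created.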

CodePudding user response:

import pandas as pd
import numpy as np 

def makefalse_numpy():
    return pd.DataFrame(np.full((500, 500), False))

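%timeit makefalse_numpy()

Note that the call needs its parentheses. The timing originally posted came from "%timeit makefalse_numpy" (no parentheses), which only evaluates the function name and never builds the DataFrame:

10.8 ns ± 0.0466 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

So that figure is the cost of a name lookup, not of creating the 500 x 500 frame.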
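
Outside IPython, where %timeit is not available, the standard-library timeit module can time the actual function call. This is a minimal sketch; the run count of 20 and the names n_runs, total and per_call are arbitrary choices for illustration:

```python
import timeit

import numpy as np
import pandas as pd

def makefalse_numpy():
    # Build a 500 x 500 DataFrame in which every cell is False.
    return pd.DataFrame(np.full((500, 500), False))

# Pass the callable itself; timeit invokes it n_runs times.
n_runs = 20
total = timeit.timeit(makefalse_numpy, number=n_runs)
per_call = total / n_runs

print(f"{per_call * 1e6:.1f} microseconds per call")
```

The printed figure depends on the machine and on the pandas and NumPy versions, so it is best compared only against other timings taken in the same session.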