Replacing None with NULL or blank for VARCHAR and 0 or blank for INT fields from pandas dataframe

Time:10-20

I am trying to fetch data from a database, put it in a dataframe, and load it to AWS S3.

For fields that contain None values, the values get loaded to S3 as None. I would like None in VARCHAR fields to become NULL or blank, and None in INT fields to become 0 or blank.

import gzip
from io import BytesIO, TextIOWrapper

import boto3
import pandas as pd
import pyodbc

connection = pyodbc.connect(conn)
sql = 'SELECT id, name from table_a'
df = pd.read_sql_query(sql, connection)
df = df.applymap(str)  # cast every cell to str
csv_buffer = BytesIO()
s3 = boto3.resource('s3')
with gzip.GzipFile(mode='w', fileobj=csv_buffer) as zipped_file:
    df.to_csv(TextIOWrapper(zipped_file, 'utf8'), index=False)
s3.Object(bucket[env], dest_filename).put(Body=csv_buffer.getvalue())

The output I get from print(df) is:

ID NAME
1 None
None John

I would like the dataframe to be loaded to S3 as follows:

ID NAME
1
   John

How can I handle this situation?

CodePudding user response:

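To replace None with an empty value in Python, fill the missing values before converting anything to strings. Once df.applymap(str) has run, None has already become the text 'None' and fillna no longer treats it as missing. You can try: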

for col in df.columns:
    df[col] = df[col].fillna(str())

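str() evaluates to the empty string, so empty quotes ('') work equally well. int(), which evaluates to 0, can be substituted for columns that should get 0 instead.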
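
If the columns need different replacements (0 for INT, blank for VARCHAR), one possible sketch is to pass a per-column dict to fillna. The small dataframe below is a hypothetical stand-in for the result of pd.read_sql_query, and the column names id and name are taken from the query in the question. The second variant assumes you want blanks everywhere, and relies on to_csv writing missing values as empty fields by default.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe returned by pd.read_sql_query(...)
df = pd.DataFrame({"id": [1, None], "name": [None, "John"]})

# Variant 1: 0 for the integer column, empty string for the text column.
# Fill first, and skip df.applymap(str), so None is never turned into 'None'.
filled = df.fillna({"id": 0, "name": ""})
filled["id"] = filled["id"].astype(int)  # the missing value had forced a float column
csv_text = filled.to_csv(index=False)

# Variant 2: blank for every missing value.
# The nullable Int64 dtype keeps 1 as an integer while leaving the gap missing;
# to_csv then writes missing values as empty fields (na_rep defaults to '').
blank_csv = df.astype({"id": "Int64"}).to_csv(index=False)

print(csv_text)
print(blank_csv)
```

Either string can then be written to the gzip buffer in place of the original df.to_csv call.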
