I'm using pandas to read data from a CSV file that contains 3 columns: offer_id, sms_limit, sms_price. I want to add validation:
- offer_id - only positive integers
- sms_limit - only positive integers
- sms_price - positive float number.
I've tried to write my own validator, something like this:
def int_validator(x):
    if str(x).isdigit():
        return x
    raise ValueError('Invalid choice please use positive integer number')

pd.read_csv(
    converters={'offer_id': int_validator, 'sms_limit': int, 'sms_price': int},
    encoding='utf-8',
    engine='python',
)
but it doesn't work at all :(
It only works if I use int:
pd.read_csv(
    converters={'offer_id': int, 'sms_limit': int, 'sms_price': int},
    encoding='utf-8',
    engine='python',
)
but that's not what I'm looking for. Also, it only works for the offer_id column: if I type a string into sms_limit or sms_price, there is no validation. Can somebody explain how to write my own validators, and why only the first column accepts the int conversion?
CodePudding user response:
Here's a solution that correctly checks if the first two columns contain positive integers and if the last column contains positive floats.
# This uses a try-except block to see if the given value is an integer,
# and an if check to see if the value is >= 0.
# Change the sign to > 0 if you want strictly positive values.
def int_validator(x):
    try:
        # A funny little quirk of Python: for x = "7.0", int(x) raises a ValueError
        # even though int(float(x)) does not. Note that int(float(x)) also
        # truncates fractional input, so "7.5" becomes 7.
        value = int(float(x))
    except (TypeError, ValueError, OverflowError):
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    if value >= 0:
        return value
    raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))

# This does something similar to the int_validator, but checks if it's a float instead.
def float_validator(x):
    try:
        value = float(x)
    except (TypeError, ValueError):
        raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))
    if value >= 0:
        return value
    raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))

# Now we apply the validators to all the columns.
import pandas as pd

df = pd.read_csv(
    "example.csv",
    converters={'offer_id': int_validator, 'sms_limit': int_validator, 'sms_price': float_validator},
    encoding='utf-8',
    engine='python',
)
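If you want to try this without a file on disk, here is a minimal sketch that feeds read_csv an in-memory CSV. The sample rows, the CONVERTERS name and the good_csv / bad_csv variables are made up for illustration; the validators are the same idea as above, repeated so the snippet runs on its own.

```python
import io

import pandas as pd


def int_validator(x):
    """Return x as an int >= 0, or raise ValueError."""
    try:
        value = int(float(x))
    except (TypeError, ValueError, OverflowError):
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    if value >= 0:
        return value
    raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))


def float_validator(x):
    """Return x as a float >= 0, or raise ValueError."""
    try:
        value = float(x)
    except (TypeError, ValueError):
        raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))
    if value >= 0:
        return value
    raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))


CONVERTERS = {
    'offer_id': int_validator,
    'sms_limit': int_validator,
    'sms_price': float_validator,
}

# Hypothetical sample data standing in for example.csv.
good_csv = "offer_id,sms_limit,sms_price\n1,100,0.5\n2,250,1.25\n"
bad_csv = "offer_id,sms_limit,sms_price\n1,abc,0.5\n"

# Every cell in a listed column is passed to its converter as a string.
df = pd.read_csv(io.StringIO(good_csv), converters=CONVERTERS)
print(df)

# A value that fails validation makes read_csv raise ValueError.
rejected = None
try:
    pd.read_csv(io.StringIO(bad_csv), converters=CONVERTERS)
except ValueError as exc:
    rejected = str(exc)
    print("Rejected:", rejected)
```

Note that the first invalid cell aborts the whole read, so you only see one error at a time.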
Let me know if you have questions!
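One more thing: int(float(x)) quietly turns "7.5" into 7. If fractional input should be rejected rather than truncated, a stricter variant could look like the sketch below. The name strict_int_validator is made up, and it assumes the IDs and limits are small enough to be represented exactly as a float.

```python
def strict_int_validator(x):
    """Return x as an int >= 0; reject fractional values such as '7.5'."""
    try:
        number = float(x)
    except (TypeError, ValueError):
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    # is_integer() is False for 7.5 and also for nan and inf.
    if not number.is_integer() or number < 0:
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    return int(number)


# Quick check: whole numbers pass, everything else is rejected.
for raw in ["7", "7.0", "7.5", "-1", "abc"]:
    try:
        print(repr(raw), "->", strict_int_validator(raw))
    except ValueError as exc:
        print(repr(raw), "rejected:", exc)
```

It can be dropped into the converters dict in place of int_validator.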