I need help truncating columns of decimals within a dataset. This ties in with other calculations to output a 1 or 0 binary column called "scenario1".
I basically have two columns (ColA and ColB) of decimals along a time-based index. The columns may have varying numbers of decimal places.
e.g. ColA can be 4 decimals, ColB can be 6 decimals.
ColA | ColB |
---|---|
0.9954 | 0.995642 |
0.9854 | 0.997450 |
If the values of ColA and ColB are close enough, I want to output TRUE. To do that, I have to somehow truncate both ColA and ColB to one decimal place without any rounding up or down, so that they become the following:
ColA | ColB |
---|---|
0.9 | 0.9 |
0.9 | 0.9 |
I need this truncation to happen within a function "scenario1", and I'd like the code to be as efficient as possible. My current failed attempts are the lines with math.trunc(1000 * df.tenkan)/1000.
CodePudding user response:
You can use numpy trunc:
import numpy as np
import pandas as pd
df = pd.DataFrame({'ColA': [0.9954, 0.9854], 'ColB': [0.995642, 0.99745]})
# Scale up, drop the fractional part, then scale back down
np.trunc(10 * df) / 10
Result:
   ColA  ColB
0   0.9   0.9
1   0.9   0.9
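To tie this back to the question, here is a minimal sketch of how the truncation might sit inside a "scenario1" function that returns the 1/0 column. The parameter names (col_a, col_b, decimals) and the third data row are assumptions added for illustration, and "close enough" is taken to mean "equal after truncation", which is how the question describes it.

```python
import numpy as np
import pandas as pd


def scenario1(df, col_a="ColA", col_b="ColB", decimals=1):
    """Return a 0/1 Series: 1 where both columns agree after truncation."""
    factor = 10 ** decimals
    # np.trunc discards the fractional part (toward zero), so nothing is rounded
    a = np.trunc(df[col_a] * factor)
    b = np.trunc(df[col_b] * factor)
    # Compare the truncated values and convert True/False to 1/0
    return (a == b).astype(int)


# The first two rows come from the question; the third is a made-up mismatch
df = pd.DataFrame({
    "ColA": [0.9954, 0.9854, 0.8954],
    "ColB": [0.995642, 0.997450, 0.995642],
})
df["scenario1"] = scenario1(df)
print(df)
```

Because np.trunc works on the whole Series at once, this avoids a Python-level loop, which is probably the main efficiency gain over calling math.trunc on each value.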
CodePudding user response:
I think the easiest way to do this would be to use applymap paired with the trunc function of the math module. Here is an example:
import math
# Scale up, drop the fractional part, then scale back down
trunc = lambda x: math.trunc(10 * x) / 10
df.applymap(trunc)
You'll need to apply this over your columns of interest, but I tested it on a few arbitrary examples and it worked well. Hope that helps! I can expand on the details if necessary.
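As a sketch of what "apply this over your columns of interest" could look like, the example below restricts the truncation to ColA and ColB and leaves everything else alone. The "other" column and the helper name trunc1 are hypothetical. It uses Series.map per column rather than applymap, since recent pandas releases (2.1 and later) deprecate DataFrame.applymap in favour of DataFrame.map.

```python
import math

import pandas as pd


def trunc1(x):
    # Keep one decimal place without rounding
    return math.trunc(10 * x) / 10


df = pd.DataFrame({
    "ColA": [0.9954, 0.9854],
    "ColB": [0.995642, 0.997450],
    "other": [1.2345, 2.3456],  # hypothetical column that should stay untouched
})

cols = ["ColA", "ColB"]
# Apply the element-wise function to each selected column only
df[cols] = df[cols].apply(lambda s: s.map(trunc1))
print(df)
```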
CodePudding user response:
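You could also make use of a regular expression:
import re
import pandas as pd
df = pd.DataFrame({'ColA': [0.99989, 0.986767], 'ColB': [0.9890, 0.9588]})
# Match the integer part, the decimal point and the first decimal digit, then convert back to float
func = lambda x: float(re.match(r'\d+\.\d{1}', str(x)).group(0))
df.applymap(func)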
Alternatively, here is a less elegant approach: convert the number to a string, split it into its integer and fractional parts, and then convert the reassembled string back to a float (yes, this is not efficient):
def func(x):
    # Convert the number to a string and split it at the decimal point
    digits = str(x).split(".")
    # Put the number back together, keeping only the first decimal digit
    digit = digits[0] + "." + digits[1][:1]
    return float(digit)
df.applymap(func)
Results:
ColA | ColB |
---|---|
0.9 | 0.9 |
0.9 | 0.9 |
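One caveat with the string-based approaches: they assume str(x) always contains a decimal point followed by digits, but very small floats print in scientific notation (str(0.00001) gives '1e-05'), so the split or regex would fail on them. A possible workaround, sketched below, is to go through the standard-library decimal module with ROUND_DOWN. The helper name truncate_decimal and the third data row are assumptions added for illustration, and this is likely slower than the numpy version.

```python
from decimal import Decimal, ROUND_DOWN

import pandas as pd


def truncate_decimal(x, places="0.1"):
    # Decimal(str(x)) also parses scientific notation such as '1e-05'
    d = Decimal(str(x)).quantize(Decimal(places), rounding=ROUND_DOWN)
    return float(d)


# The first two rows are from the answer above; the third is a made-up edge case
df = pd.DataFrame({
    "ColA": [0.99989, 0.986767, 0.00001],
    "ColB": [0.9890, 0.9588, 0.5],
})
result = df.apply(lambda s: s.map(truncate_decimal))
print(result)
```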