I am trying to speed up some Python code using Rust bindings with PyO3.
I have implemented the following function in both Python and Rust:
def _play_action(state, action):
    temp = state.copy()
    i1, j1, i2, j2 = action
    h1 = abs(temp[i1][j1])
    h2 = abs(temp[i2][j2])
    if temp[i1][j1] < 0:
        temp[i2][j2] = -(h1 + h2)
    else:
        temp[i2][j2] = h1 + h2
    temp[i1][j1] = 0
    return temp
#[pyfunction]
fn play_action(state: [[i32; 9]; 9], action: [usize; 4]) -> [[i32; 9]; 9] {
    let mut s = state.clone();
    let h1 = s[action[0]][action[1]];
    let h2 = s[action[2]][action[3]];
    s[action[0]][action[1]] = 0;
    s[action[2]][action[3]] = h1.signum() * (h1 + h2).abs();
    s
}
And to my great surprise, the Python version is faster... Any idea why?
CodePudding user response:
This is probably caused by the overhead of the communication between Python and Rust. The data you're passing is small, so I assume you're calling play_action many times. A better approach would be to batch your calls:
#[pyfunction]
fn play_actions(data: Vec<([[i32; 9]; 9], [usize; 4])>) -> Vec<[[i32; 9]; 9]> {
    data.into_iter()
        .map(|(state, action)| play_action(state, action))
        .collect::<Vec<_>>()
}
fn play_action(state: [[i32; 9]; 9], action: [usize; 4]) -> [[i32; 9]; 9] {
    let mut s = state.clone();
    let h1 = s[action[0]][action[1]];
    let h2 = s[action[2]][action[3]];
    s[action[0]][action[1]] = 0;
    s[action[2]][action[3]] = h1.signum() * (h1 + h2).abs();
    s
}
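On the Python side, the batching pattern looks like this (a pure-Python stand-in with the same call shape as the Rust play_actions, so you can test the batching logic before wiring up the extension; with the compiled module, the single batched call replaces many Python-to-Rust boundary crossings):

```python
def play_action(state, action):
    # Same logic as the original Python version, on a list-of-lists board
    temp = [row[:] for row in state]
    i1, j1, i2, j2 = action
    h1, h2 = abs(temp[i1][j1]), abs(temp[i2][j2])
    temp[i2][j2] = -(h1 + h2) if temp[i1][j1] < 0 else h1 + h2
    temp[i1][j1] = 0
    return temp

def play_actions(batch):
    # One call for the whole batch: with the Rust extension this is a
    # single Python<->Rust conversion instead of len(batch) conversions.
    return [play_action(state, action) for state, action in batch]

board = [[0] * 9 for _ in range(9)]
board[0][0], board[1][1] = -2, 1
results = play_actions([(board, (0, 0, 1, 1))])
```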
CodePudding user response:
If you are calling the function written in Rust from Python, there has to be a conversion from Python objects to Rust data structures. The time that this takes is overhead.
Since your function seems pretty small, it could easily be that the overhead overwhelms the runtime of the function.
I would encourage you to profile your Python code (using the cProfile module) before trying to make it faster. Profiling, and the insight into the behavior of your code that it provides, can enable significant performance gains.
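A minimal cProfile run looks like this (hot_loop is just a placeholder workload standing in for your game logic):

```python
import cProfile
import io
import pstats

def hot_loop():
    # Placeholder workload: replace with the code you want to measure
    total = 0
    for i in range(100_000):
        total += i * i
    return total

buf = io.StringIO()
profiler = cProfile.Profile()
profiler.enable()
hot_loop()
profiler.disable()
# Show the five most expensive entries by cumulative time
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```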
Here is a link to the first of a series of articles that I've written about Python profiling.
If you do a lot of number crunching, see if your problem is a good fit for numpy.
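For this particular function, a vectorized numpy sketch might look like the following (an assumption on my part: states stacked in an (n, 9, 9) array and actions in an (n, 4) array, mirroring the sign handling of the Python version), processing a whole batch with no Python-level loop:

```python
import numpy as np

def play_actions_np(states, actions):
    # states: (n, 9, 9) int array; actions: (n, 4) array of indices
    out = states.copy()
    n = np.arange(len(out))
    i1, j1, i2, j2 = actions.T
    src = out[n, i1, j1]                       # read before mutating
    h1, h2 = np.abs(src), np.abs(out[n, i2, j2])
    out[n, i2, j2] = np.where(src < 0, -(h1 + h2), h1 + h2)
    out[n, i1, j1] = 0
    return out
```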
If a relatively small function takes up a lot of the execution time because it is called very often, try using the functools.cache decorator.
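For example (note that functools.cache requires hashable arguments, so a list-based board would need converting to a tuple of tuples before it could be cached):

```python
from functools import cache

@cache
def fib(n):
    # Each distinct n is computed once; repeated calls hit the cache
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))
```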
Keep in mind that a better algorithm generally beats optimizations.