I am currently trying to understand the following example of the boost library that uses parallel processing with thrust.
struct lorenz_system
{
struct lorenz_functor
{
template< class T >
__host__ __device__
void operator()( T t ) const
{
// unpack the parameter we want to vary and the Lorenz variables
value_type R = thrust::get< 3 >( t );
value_type x = thrust::get< 0 >( t );
value_type y = thrust::get< 1 >( t );
value_type z = thrust::get< 2 >( t );
thrust::get< 4 >( t ) = sigma * ( y - x );
thrust::get< 5 >( t ) = R * x - y - x * z;
thrust::get< 6 >( t ) = -b * z x * y ;
}
};
lorenz_system( size_t N , const state_type &beta )
: m_N( N ) , m_beta( beta ) { }
template< class State , class Deriv >
void operator()( const State &x , Deriv &dxdt , value_type t ) const
{
thrust::for_each(
thrust::make_zip_iterator( thrust::make_tuple(
boost::begin( x ) ,
boost::begin( x ) m_N ,
boost::begin( x ) 2 * m_N ,
m_beta.begin() ,
boost::begin( dxdt ) ,
boost::begin( dxdt ) m_N ,
boost::begin( dxdt ) 2 * m_N ) ) ,
thrust::make_zip_iterator( thrust::make_tuple(
boost::begin( x ) m_N ,
boost::begin( x ) 2 * m_N ,
boost::begin( x ) 3 * m_N ,
m_beta.begin() ,
boost::begin( dxdt ) m_N ,
boost::begin( dxdt ) 2 * m_N ,
boost::begin( dxdt ) 3 * m_N ) ) ,
lorenz_functor() );
}
size_t m_N;
const state_type &m_beta;
};
This example can be fully explored at:
https://github.com/headmyshoulder/odeint-v2/blob/master/examples/thrust/lorenz_parameters.cu
I am still pretty new to C , but have done introductory courses in C, and also learned programming in Java. So, I am mostly familiar with the concepts.
My question would be in the second operator overwrite. As far as I understand, thrust::for_each()
takes two iterators, one where it starts and one where it ends, as well as a functor. When thrust::make_zip_iterator()
gets called, it creates a zip_iterator
. As far as I understand, zip_iterator
creates a moving reference for the entire bundle. For this, it requires that the length are all the same.
However m_beta.begin()
appears in both iterators with no change whatsoever. Shouldn't this mean that m_beta
consists of only one value (which is 0, if you check the full code on GitHub) and therefore the zip_iterator
fails?
I ran the code and it works beautifully though, so what am I not understanding about zip_iterator
s, or specifically thrust's?
CodePudding user response:
In this case (probably in all parallel cases) Thrust only needs the "end" iterator to compute the distance to the "begin" iterator. For the zip iterator this is implemented at the moment as
__thrust_exec_check_disable__
template<typename IteratorTuple>
template <typename OtherIteratorTuple>
__host__ __device__
typename zip_iterator<IteratorTuple>::super_t::difference_type
zip_iterator<IteratorTuple>
::distance_to(const zip_iterator<OtherIteratorTuple> &other) const
{
return get<0>(other.get_iterator_tuple()) - get<0>(get_iterator_tuple());
} // end zip_iterator::distance_to()
I.e. only the first "end" iterator is used, all others are ignored. This is an implementation detail and should therefore not be relied upon. One could argue that at least in a debug build the difference should be computed for all pairs of iterators and an exception should be thrown if any pair produces something different from the others.
In short the code in the question works (for now), but it should be fixed to avoid relying on implementation details and to make it clearer to the reader that m_beta
is iterated over and not just the first element used, as a beginner might think at first glance.
Shouldn't this mean that
m_beta
consists of only one value [...]?
If it would work that way it would mean that m_beta
had zero elements because the end iterator is one past the last element.