parallel_deterministic_reduce

parallel_deterministic_reduce#

[algorithms.parallel_deterministic_reduce]

Function template that computes reduction over a range, with deterministic split/join behavior.

// Defined in header <oneapi/tbb/parallel_reduce.h>

namespace oneapi {
    namespace tbb {

        template<typename Range, typename Value, typename Func, typename Reduction>
        Value parallel_deterministic_reduce( const Range& range, const Value& identity, const Func& func, const Reduction& reduction, /* see-below */ partitioner, task_group_context& context);
        template<typename Range, typename Value, typename Func, typename Reduction>
        Value parallel_deterministic_reduce( const Range& range, const Value& identity, const Func& func, const Reduction& reduction, /* see-below */ partitioner);
        template<typename Range, typename Value, typename Func, typename Reduction>
        Value parallel_deterministic_reduce( const Range& range, const Value& identity, const Func& func, const Reduction& reduction, task_group_context& context);
        template<typename Range, typename Value, typename Func, typename Reduction>
        Value parallel_deterministic_reduce( const Range& range, const Value& identity, const Func& func, const Reduction& reduction);

        template<typename Range, typename Body>
        void parallel_deterministic_reduce( const Range& range, Body& body, /* see-below */ partitioner, task_group_context& context);
        template<typename Range, typename Body>
        void parallel_deterministic_reduce( const Range& range, Body& body, /* see-below */ partitioner);
        template<typename Range, typename Body>
        void parallel_deterministic_reduce( const Range& range, Body& body, task_group_context& context);
        template<typename Range, typename Body>
        void parallel_deterministic_reduce( const Range& range, Body& body);

    } // namespace tbb
} // namespace oneapi

A partitioner type may be one of the following entities:

  • const simple_partitioner&

  • const static_partitioner&

The function template parallel_deterministic_reduce is very similar to the parallel_reduce template. It also has the functional and imperative forms and has similar requirements.

Unlike parallel_reduce, parallel_deterministic_reduce has deterministic behavior with regard to splits of both Body and Range and joins of the bodies. For the functional form, Func is applied to a deterministic set of Ranges, and Reduction merges partial results in a deterministic order. To achieve that, parallel_deterministic_reduce uses a simple_partitioner or a static_partitioner only because other partitioners react to random work stealing behavior.

Caution

Since simple_partitioner does not automatically coarsen ranges, make sure to specify an appropriate grain size. See Partitioners section for more information.

parallel_deterministic_reduce always invokes the Body splitting constructor for each range split.

As a result, parallel_deterministic_reduce recursively splits a range until it is no longer divisible, and creates a new body (by calling the Body splitting constructor) for each new subrange. Like parallel_reduce, for each body split the method join is invoked in order to merge the results from the bodies.

Therefore, for given arguments, parallel_deterministic_reduce executes the same set of split and join operations no matter how many threads participate in execution and how tasks are mapped to the threads. If the user-provided functions are also deterministic (that is, different runs with the same input result in the same output), multiple calls to parallel_deterministic_reduce produce the same result. Note however that the result might differ from that obtained with an equivalent sequential (linear) algorithm.

Complexity

If the range and body take O(1) space, and the range splits into nearly equal pieces, the space complexity is O(P log(N)), where N is the size of the range and P is the number of threads.

See also: