2 files changed +12 -1 lines changed
cpp/include/rapidsmpf/streaming/cudf
python/rapidsmpf/rapidsmpf/streaming/cudf
@@ -20,12 +20,18 @@ namespace rapidsmpf::streaming::node {
 /**
  * @brief Asynchronously read parquet files into an output channel.
  *
+ * @note This is a collective operation, all ranks named by the execution context's
+ * communicator will participate. All ranks must specify the same set of options.
+ * Behaviour is undefined if a `read_parquet` node appears only on a subset of the ranks
+ * named by the communicator, or the options differ between ranks.
+ *
  * @param ctx The execution context to use.
  * @param ch_out Channel to which `TableChunk`s are sent.
  * @param max_tickets Maximum number of tickets to throttle production of chunks. Up to
  * this many tasks can start producing data simultaneously.
  * @param options Template reader options. The files within will be picked apart and used
- * to reconstruct new options for each read chunk.
+ * to reconstruct new options for each read chunk. The options should therefore specify
+ * the read options "as-if" one were reading the whole input in one go.
  * @param num_rows_per_chunk Target (maximum) number of rows any sent `TableChunk` should
  * have.
  *
@@ -44,6 +44,11 @@ def read_parquet(
         Maximum number of tasks that may be suspended having read a chunk.
     options
         Reader options
+
+    Notes
+    -----
+    This is a collective operation, all ranks participating via the
+    execution context's communicator must call it with the same options.
     """
     cdef cpp_Node _ret
     with nogil:
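
The new notes say every rank must pass identical options. The toy sketch below shows why that matters in a collective read. It is not rapidsmpf's actual partitioning scheme: `plan_chunks`, `chunks_for_rank`, `rows_read`, and the round-robin assignment are hypothetical helpers invented for illustration. The sketch assumes each rank independently derives the same global chunk list from the "as-if whole input" options and then reads only its own share.

```python
def plan_chunks(total_rows, num_rows_per_chunk):
    """Split [0, total_rows) into (skip_rows, num_rows) pairs (hypothetical helper)."""
    if num_rows_per_chunk <= 0:
        raise ValueError("num_rows_per_chunk must be positive")
    return [
        (start, min(num_rows_per_chunk, total_rows - start))
        for start in range(0, total_rows, num_rows_per_chunk)
    ]


def chunks_for_rank(chunks, rank, nranks):
    """Assumed round-robin assignment of the global chunk list to one rank."""
    return chunks[rank::nranks]


def rows_read(per_rank_chunks):
    """Sorted list of every row index read by any rank, duplicates included."""
    rows = []
    for chunks in per_rank_chunks:
        for skip, count in chunks:
            rows.extend(range(skip, skip + count))
    return sorted(rows)


# Both ranks plan from the same options: each row is read exactly once.
consistent = [chunks_for_rank(plan_chunks(10, 4), rank, 2) for rank in range(2)]

# Rank 1 plans with a different chunk size: the per-rank plans no longer tile the input.
mismatched = [
    chunks_for_rank(plan_chunks(10, 4), 0, 2),
    chunks_for_rank(plan_chunks(10, 5), 1, 2),
]
```

In this toy model, `rows_read(consistent)` covers rows 0 through 9 once each. `rows_read(mismatched)` skips row 4 and reads rows 8 and 9 twice. This is the kind of silent inconsistency the "behaviour is undefined" wording warns about.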