Tutorial 5: Sampler Pipelines
Sampler pipelines compile Builder steps into a sampler that returns a SubgraphArray.
They let you prepare dataset-ready subgraphs without leaving Rust execution.
This tutorial shows how to:
- generate a subgraph per node
- expand each to a neighborhood
- map a sampling sub-pipeline over each neighborhood
- emit the final SubgraphArray
What You’ll Build
- A 1-hop neighborhood for every node
- A sample of 100 nodes within each neighborhood
- One emitted subgraph per neighborhood
Step 1: Create the Builder
Step 2: Expand Neighborhoods
This produces a SubgraphArray with one neighborhood per node. Each element is a full Subgraph view.
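To make the shape of this result concrete, here is a minimal plain-Python sketch of "one 1-hop neighborhood per node". It does not use the library's Builder API: the adjacency dictionary, the helper name `one_hop_neighborhoods`, and the choice to include the seed node in its own neighborhood are all assumptions made for illustration.

```python
# Plain-Python illustration only; this is not the library's API.

def one_hop_neighborhoods(adjacency):
    """Map each seed node to the set containing the seed and its direct neighbors.

    Including the seed itself is an assumption of this sketch.
    """
    return {seed: {seed, *neighbors} for seed, neighbors in adjacency.items()}


# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

neighborhoods = one_hop_neighborhoods(adjacency)
print(len(neighborhoods))        # one neighborhood per node
print(sorted(neighborhoods[1]))  # node 1 together with its neighbors 0 and 2
```

The result has exactly as many entries as the graph has nodes, which mirrors the "one neighborhood per node" structure described above.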
If you want fewer seeds, you can sample first:
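As a library-independent illustration of sampling seeds before expanding, the sketch below picks a few seed nodes at random and builds neighborhoods only for those. The helper `sample_seeds`, the toy graph, and the rule of capping the sample at the number of available nodes are assumptions, not the library's behavior.

```python
# Plain-Python illustration only; this is not the library's API.
import random


def sample_seeds(nodes, k, rng):
    """Pick at most k distinct seed nodes uniformly at random."""
    nodes = sorted(nodes)
    return rng.sample(nodes, min(k, len(nodes)))


# Hypothetical toy graph: a 5-node cycle.
adjacency = {0: [4, 1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}

seeds = sample_seeds(adjacency, 2, random.Random(0))

# Expand only the sampled seeds, so the result has one neighborhood per seed.
neighborhoods = {s: {s, *adjacency[s]} for s in seeds}
print(len(neighborhoods))  # 2
```

Sampling first shrinks the number of neighborhoods that every later step has to process.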
Step 3: Map a Sampler Over Each Neighborhood
The map(...) call runs a sub-pipeline for each neighborhood.
The lambda receives a builder instance scoped to that one subgraph.
Return a node/edge selection and it will be emitted as a subgraph automatically.
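The idea of running the same sampling step once per neighborhood can be sketched in plain Python. Nothing here is the library's API: `map_over` and `sample_nodes` are hypothetical helpers, and returning a small neighborhood whole when it has fewer than `k` nodes is an assumption of this sketch.

```python
# Plain-Python illustration only; this is not the library's API.
import random


def map_over(items, fn):
    """Apply fn to every item and collect the results in order."""
    return [fn(item) for item in items]


def sample_nodes(neighborhood, k, rng):
    """Keep at most k nodes; neighborhoods with fewer than k nodes are kept whole."""
    nodes = sorted(neighborhood)
    return set(rng.sample(nodes, min(k, len(nodes))))


rng = random.Random(0)
neighborhoods = [{0, 1}, {0, 1, 2}, {1, 2, 3, 4, 5}]

# The lambda plays the role of the per-neighborhood sub-pipeline.
sampled = map_over(neighborhoods, lambda nb: sample_nodes(nb, 3, rng))
print([len(s) for s in sampled])  # [2, 3, 3]
```

Each output is a subset of the neighborhood it came from, and the outputs stay in the same order as the inputs.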
You can also sample edges instead:
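Conceptually, edge sampling selects from the edges inside a neighborhood rather than from its nodes. The sketch below shows that in plain Python; `induced_edges` and `sample_edges` are hypothetical helpers, and restricting the candidates to edges with both endpoints in the neighborhood is an assumption, not documented library behavior.

```python
# Plain-Python illustration only; this is not the library's API.
import random


def induced_edges(edges, nodes):
    """Edges whose two endpoints both lie inside the given node set."""
    return [(u, v) for u, v in edges if u in nodes and v in nodes]


def sample_edges(edges, k, rng):
    """Keep at most k edges chosen uniformly at random without replacement."""
    return rng.sample(edges, min(k, len(edges)))


# Hypothetical toy graph and one neighborhood taken from it.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
neighborhood = {1, 2, 3}

candidates = induced_edges(edges, neighborhood)
picked = sample_edges(candidates, 2, random.Random(0))
print(len(candidates), len(picked))  # 3 2
```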
Step 4: Emit Output Subgraphs
Modes:
- per_item: one output per input subgraph
- unified: a single subgraph from all outputs
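The difference between the two modes can be illustrated with node sets in plain Python. This is a sketch under assumptions: the `emit` helper is hypothetical, "unified" is modeled as the set union of all selections, and the single merged result is wrapped in a one-element list so both modes return the same type.

```python
# Plain-Python illustration only; this is not the library's API.

def emit(selections, mode="per_item"):
    """Model the two output modes on plain node sets."""
    if mode == "per_item":
        # One output per input selection, order preserved.
        return [set(s) for s in selections]
    if mode == "unified":
        # A single output that merges every selection (union is an assumption).
        return [set().union(*selections)]
    raise ValueError(f"unknown mode: {mode!r}")


selections = [{0, 1}, {1, 2}, {4}]
print(len(emit(selections, mode="per_item")))  # 3
print(len(emit(selections, mode="unified")))   # 1
```

With per_item the output count matches the input count, which suits datasets built from one sample per node; with unified the overlap between selections disappears into one merged result.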
Step 5: Run the Sampler
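# Build a small example graph.
graph = gr.generators.karate_club()
# Compile the builder `b` from the previous steps into a sampler and run it on a view of the graph.
samples = graph.view().sample(b.build_sampler())
# Number of emitted subgraphs.
print(len(samples))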
Each element in samples is a Subgraph representing a sampled neighborhood.
Key Takeaways
- Samplers return SubgraphArray.
- iterate_nodes() + neighbors() creates per-node neighborhoods.
- map(...) runs a sub-pipeline per subgraph.
- emit_subgraphs(..., mode="per_item") keeps one output per input.
Common Patterns
- Uniform sample → neighborhood → per-item sample
- Per-item sampling with map(...) to avoid Python loops
- Switch to mode="unified" to collapse outputs into a single subgraph
Next: Explore the Builder API reference in docs/builder/api/.