Appendix A: Glossary¶

Complete reference of Groggy terminology and concepts

Core Concepts¶

Graph¶

The main data structure in Groggy representing a network of nodes and edges. A Graph maintains both the graph structure (topology) and graph signal (attributes) in a columnar storage system. Implemented as a thin Python wrapper over Rust core data structures.

See: Graph API, Graph Core Guide

Node¶

An entity in the graph, also called a vertex. Nodes can have arbitrary attributes stored in the columnar pool. Identified by a unique NodeId.

Related: Edge, NodeId, Attribute

Edge¶

A connection between two nodes. Edges can be directed or undirected and can have arbitrary attributes. Identified by a unique EdgeId.

Related: Node, EdgeId, Directed Graph, Undirected Graph

NodeId¶

Integer identifier for a node. Used to reference nodes when creating edges and querying the graph.

Type: Integer (Rust: u32 or u64)

EdgeId¶

Integer identifier for an edge. Used to reference edges in queries and operations.

Type: Integer (Rust: u32 or u64)

Graph Types¶

Directed Graph¶

A graph where edges have direction - they go from a source node to a target node. The default in Groggy.

Example: Social network following relationships, web page links, dependency graphs

Undirected Graph¶

A graph where edges have no direction - the connection is bidirectional.

Example: Friendships, physical proximity, collaboration networks

Weighted Graph¶

A graph where edges (or nodes) have numerical weights representing strength, cost, distance, or other metrics.

Related: Attribute, NumArray

Multigraph¶

A graph that allows multiple edges between the same pair of nodes. Supported through edge attributes in Groggy.

Views and Substructures¶

Subgraph¶

A view into a Graph containing a subset of nodes and/or edges. Subgraphs are immutable views that reference the parent graph without copying data. Created through filtering operations.

See: Subgraph API, Subgraphs Guide

Related: View, Induced Subgraph, Filter

Induced Subgraph¶

A subgraph containing a specified set of nodes and all edges between those nodes in the parent graph.

Example:

nodes_to_keep = [0, 1, 2, 5]
induced = g.nodes[nodes_to_keep]  # Subgraph with all edges between these nodes

View¶

An immutable, lightweight reference to graph data that doesn't copy the underlying structure. Most Groggy operations return views rather than copies for performance.

Examples: Subgraph, Array slice, Table view

Related: Materialization, Lazy Evaluation

SubgraphArray¶

A collection of Subgraph objects, typically the result of graph algorithms like connected components or community detection. Supports delegation chains for batch operations.

See: SubgraphArray API, Subgraph Arrays Guide

Accessors¶

NodesAccessor¶

The g.nodes accessor providing node-level operations and queries. Enables filtering nodes, accessing attributes, and creating node-based subgraphs.

Access Pattern: g.nodes[condition] or g.nodes["attribute"]

See: NodesAccessor API, Accessors Guide

EdgesAccessor¶

The g.edges accessor providing edge-level operations and queries. Enables filtering edges, accessing attributes, and creating edge-based subgraphs.

Access Pattern: g.edges[condition] or g.edges["attribute"]

See: EdgesAccessor API, Accessors Guide

Data Structures¶

Attribute¶

Data attached to nodes or edges. Attributes are stored separately from the graph structure in a columnar pool for efficient bulk operations.

Key Concept: Nodes and edges only point to attributes; they never store them directly.

Example:

g.add_node(name="Alice", age=29, role="Engineer")
# name, age, role are attributes

Columnar Storage¶

Storage architecture where each attribute is stored as a separate column (array) rather than row-wise. Enables efficient bulk operations and SIMD optimization.

Benefits: - Cache-friendly access patterns - Efficient filtering and aggregation - Optimal for machine learning workflows

Related: GraphPool, Attribute

GraphSpace¶

The active state of the graph - which nodes and edges are currently alive. Part of the core Rust implementation.

Related: GraphPool, HistoryForest

GraphPool¶

The flyweight pool containing all node and edge attributes in columnar format. Part of the core Rust implementation.

Related: GraphSpace, Columnar Storage

HistoryForest¶

Git-like version control system for graphs, enabling time-travel queries and branching. Part of the core Rust implementation.

Related: State, Branch, Version Control

Tables¶

GraphTable¶

Unified tabular view of the entire graph containing both node and edge data. Can be split into NodesTable and EdgesTable.

See: GraphTable API, Tables Guide

NodesTable¶

Tabular view of node data with columns for node attributes. Each row represents a node.

See: NodesTable API

EdgesTable¶

Tabular view of edge data with columns for edge attributes. Each row represents an edge.

See: EdgesTable API

BaseTable¶

Base table type providing common table operations. Other table types (GraphTable, NodesTable, EdgesTable) inherit from this.

See: BaseTable API

Arrays¶

BaseArray¶

Base array type for attribute data. Provides common array operations and can be specialized to NumArray for numeric data.

Related: NumArray, Columnar Storage

NumArray¶

Specialized array for numeric data with statistical and mathematical operations (mean, sum, min, max, etc.).

See: NumArray API, Arrays Guide

NodesArray¶

Array of node IDs, typically returned from node queries. Supports node-specific operations.

See: NodesArray API

EdgesArray¶

Array of edge IDs, typically returned from edge queries. Supports edge-specific operations.

See: EdgesArray API

Matrices¶

GraphMatrix¶

Matrix representation of graph data, including adjacency matrices, Laplacian matrices, and embeddings.

See: GraphMatrix API, Matrices Guide

Adjacency Matrix¶

Matrix A where A[i,j] = 1 (or edge weight) if there's an edge from node i to node j, 0 otherwise.

Example:

A = g.adjacency_matrix()

Laplacian Matrix¶

Matrix L = D - A where D is the degree matrix and A is the adjacency matrix. Used in spectral graph theory.

Types: - Graph Laplacian: L = D - A - Normalized Laplacian: L_norm = I - D^(-1/2) A D^(-1/2) - Random Walk Laplacian: L_rw = I - D^(-1) A

Example:

L = g.laplacian_matrix()

Degree Matrix¶

Diagonal matrix D where D[i,i] equals the degree (number of connections) of node i.

Spectral Embedding¶

Low-dimensional representation of graph structure derived from eigenvectors of the Laplacian matrix.

Example:

embedding = g.spectral().compute(dims=128)

Related: Laplacian Matrix, Embedding

Algorithms¶

Connected Components¶

Maximal subgraphs where every node is reachable from every other node. For directed graphs, can be strongly or weakly connected.

Example:

components = g.connected_components()
# Returns SubgraphArray

See: Algorithms Guide

Shortest Path¶

Minimum-length path between two nodes, measured by edge count or edge weights.

Algorithms: - Dijkstra's algorithm (weighted) - Breadth-first search (unweighted)

Centrality¶

Measure of a node's importance in the network.

Types: - Degree Centrality: Number of connections - Betweenness Centrality: Number of shortest paths through node - Closeness Centrality: Average distance to all other nodes - Eigenvector Centrality: Importance based on neighbor importance

Community Detection¶

Algorithms for finding groups of densely connected nodes.

Algorithms: - Modularity optimization - Label propagation - Louvain method

Operations¶

Delegation¶

Pattern where objects forward method calls to related objects, enabling chainable operations.

Example:

result = (g.connected_components()
          .sample(5)
          .neighborhood(depth=2)
          .table()
          .agg({"weight": "mean"}))

See: Connected Views

Chaining¶

Calling multiple methods in sequence where each returns an object supporting the next method. Made possible by delegation.

Related: Delegation, Method Forwarding

Filtering¶

Selecting a subset of nodes or edges based on conditions. Returns a Subgraph view.

Example:

young = g.nodes[g.nodes["age"] < 30]
heavy_edges = g.edges[g.edges["weight"] > 5.0]

Materialization¶

Converting a lazy view into concrete data. Most operations in Groggy are lazy until materialized.

Example:

df = subgraph.table().to_pandas()  # Materializes as DataFrame

Related: View, Lazy Evaluation

Aggregation¶

Computing summary statistics across groups or the entire graph.

Example:

result = g.table().agg({"weight": ["mean", "sum", "count"]})

Architecture Terms¶

Three-Tier Architecture¶

Groggy's architectural layers: 1. Rust Core: High-performance algorithms and storage 2. FFI Bridge: PyO3 bindings (pure translation, no logic) 3. Python API: User-facing interface

See: Architecture

FFI (Foreign Function Interface)¶

The PyO3-based bridge between Rust and Python. Contains pure translation code with no business logic.

Related: PyO3, Three-Tier Architecture

PyO3¶

Rust framework for creating Python extensions. Used to expose Rust core to Python.

Method Forwarding¶

Technique where one object type delegates method calls to another type it can transform into.

Example: SubgraphArray forwards table() method which returns GraphTable

State and Versioning¶

State¶

A snapshot of the graph at a point in time, including which nodes/edges are alive and their attributes.

Related: GraphSpace, StateId

StateId¶

Identifier for a specific graph state in the history system.

Branch¶

Named sequence of graph states, similar to Git branches. Enables parallel development of graph versions.

Related: HistoryForest, Version Control

BranchName¶

String identifier for a branch in the history system.

Commit¶

Saving the current graph state to history. Creates a new StateId.

Inplace Operation¶

Operation that modifies the graph in place rather than returning a new object.

Example:

g.connected_components(inplace=True, label="component")
# Writes 'component' attribute to nodes

Data Interchange¶

Bundle¶

Serialized graph format containing both structure and attributes. Used for save/load operations.

Example:

g.save_bundle("graph.bundle")
g2 = gr.GraphTable.load_bundle("graph.bundle")

Related: GraphTable, Serialization

Parquet¶

Apache Parquet format for efficient columnar storage. Used for table export/import.

Example:

g.nodes.table().to_parquet("nodes.parquet")

Performance Terms¶

Amortized Complexity¶

Average time complexity over a sequence of operations, accounting for occasional expensive operations.

Example: Graph node insertion is O(1) amortized

Columnar Operation¶

Bulk operation on an entire attribute column, typically SIMD-optimized for performance.

Related: Columnar Storage

Sparse Matrix¶

Matrix where most entries are zero. Stored efficiently using sparse formats (CSR, COO).

Related: GraphMatrix, Adjacency Matrix

Dense Matrix¶

Matrix stored as a complete 2D array. More memory but faster access for dense data.

Integration Terms¶

NetworkX Compatibility¶

Ability to convert between Groggy and NetworkX graph formats.

See: Integration Guide

Pandas Integration¶

Converting Groggy tables to/from pandas DataFrames.

Example:

df = g.nodes.table().to_pandas()

NumPy Interop¶

Converting Groggy arrays and matrices to/from NumPy arrays.

Example:

np_array = num_array.to_numpy()

Common Abbreviations¶

Abbreviation	Full Term
API	Application Programming Interface
FFI	Foreign Function Interface
CSR	Compressed Sparse Row (matrix format)
COO	Coordinate List (matrix format)
SIMD	Single Instruction Multiple Data
GIL	Global Interpreter Lock (Python)
ADR	Architectural Decision Record
BFS	Breadth-First Search
DFS	Depth-First Search

Quick Reference¶

Common Object Transformations¶

Graph → Subgraph → SubgraphArray → GraphTable → NodesTable/EdgesTable
      → BaseArray → NumArray
      → GraphMatrix

Access Patterns¶

g.nodes[condition]          # Filter → Subgraph
g.nodes["attribute"]        # Column → BaseArray
g.edges[condition]          # Filter → Subgraph
g["attribute"]              # Alias for g.nodes["attribute"]
g.table()                   # Graph → GraphTable
g.adjacency_matrix()        # Graph → GraphMatrix