About Groggy¶

What is Groggy?¶

Groggy is a high-performance graph analytics library that bridges the gap between graph theory and practical data science. It combines:

Graph topology: Nodes, edges, and the relationships between them
Tabular data: Columnar attribute storage for efficient bulk operations
Rust performance: Memory-safe, high-speed core implementation
Python ergonomics: Intuitive, chainable API that feels natural

Who is Groggy For?¶

Data Scientists¶

Work with graph data using familiar pandas-like operations
Query, filter, and aggregate without leaving your comfort zone
Seamless integration with NumPy, pandas, and the PyData ecosystem

ML Engineers¶

Build graph neural networks with automatic differentiation
Efficient feature engineering on graph-structured data
High-performance embeddings and spectral analysis

Network Analysts¶

Analyze social networks, knowledge graphs, and complex systems
Run classic graph algorithms (connected components, centrality, etc.)
Visualize and explore graph structure interactively

Researchers¶

Git-like version control for reproducible graph experiments
Time-travel queries to analyze graph evolution
Extensible architecture for custom algorithms

What Makes Groggy Different?¶

1. Everything is a View¶

In Groggy, when you work with a graph, you're typically working with immutable views: - Subgraphs are views into the main graph - Tables are snapshots of graph state - Arrays are columnar views of attributes - Matrices represent graph structure or embeddings

This design enables powerful composition without unnecessary copying.

2. Delegation Chains¶

Groggy's signature feature: objects know how to transform into other objects.

result = (
    g.connected_components()    # → SubgraphArray
     .sample(5)                 # → SubgraphArray (filtered)
     .neighborhood(depth=2)     # → SubgraphArray (expanded)
     .table()                   # → GraphTable
     .agg({"weight": "mean"})   # → AggregationResult
)

Once you learn the transformation patterns, the entire API becomes intuitive.

3. Columnar Storage¶

Attributes are stored separately from graph structure: - GraphSpace: Which nodes/edges are alive (topology) - GraphPool: Where attributes are stored (columnar data)

This separation enables: - Efficient bulk attribute operations - Time-series tracking of graph changes - Memory-efficient storage and versioning

4. Three-Tier Architecture¶

┌──────────────────────────────────────┐
│        Python API Layer              │  Intuitive, chainable
│  (Graph, Table, Array, Matrix)       │
├──────────────────────────────────────┤
│          FFI Bridge                  │  Pure translation
│         (PyO3 bindings)              │
├──────────────────────────────────────┤
│         Rust Core                    │  High-performance
│  (Storage, State, Algorithms)        │  algorithms
└──────────────────────────────────────┘

Rust Core: All algorithms, storage, and performance-critical code
FFI Bridge: Pure translation layer, no business logic
Python API: User-facing interface optimized for developer experience

Design Philosophy¶

"Everything is a Graph"¶

Even Groggy itself is a graph: - Nodes = Object types (Graph, Subgraph, Table, Array, Matrix) - Edges = Methods that transform one type into another

This conceptual model makes the library easier to learn and use.

Performance First, Ergonomics Close Second¶

All core operations meet O(1) amortized complexity targets
Memory usage scales linearly with data size
FFI overhead <100ns per call for simple operations
But never at the expense of a confusing API

Test-Driven Documentation¶

Every documented feature has a working test that validates it. If it's in the docs, it works. If it works, it's in the docs.

Columnar Thinking¶

Optimize for bulk operations over single-item loops: - Process entire attribute columns at once - Cache-friendly data access patterns - Leverage SIMD and parallelization where possible

Project Goals¶

Short Term¶

Comprehensive API coverage for core graph operations
Solid foundation of graph algorithms
Excellent documentation with real-world examples
Robust testing and benchmarking

Medium Term¶

Advanced graph neural network support
Integration with PyTorch Geometric and DGL
Distributed graph processing capabilities
Rich visualization ecosystem

Long Term¶

Industry-standard graph analytics platform
Reference implementation for graph data structures
Foundation for graph machine learning research
Community-driven algorithm library

Project Status¶

Groggy is under active development. The core architecture is stable, but the API is still evolving based on user feedback and real-world usage patterns.

Current version: 0.5.1

See the changelog for recent updates and the roadmap for planned features.

Community¶

Groggy is open source (MIT License) and welcomes contributions:

Code: Bug fixes, new features, performance improvements
Documentation: Tutorials, examples, typo fixes
Testing: Edge cases, performance benchmarks, real-world use cases
Ideas: Feature requests, API design discussions, architectural feedback

Join the community:

GitHub: rollingstorms/groggy
Issues: Bug reports and feature requests
Discussions: Questions and ideas

License¶

Groggy is released under the MIT License.

See LICENSE for full text.

Ready to get started? Check out the Installation Guide or jump straight to the Quickstart.