GraphForge: 3D Knowledge Graph
December 22, 2025
An AI-powered, interactive 3D knowledge graph search engine. I extracted 11k entities and relationships from Wikipedia using Groq and Gemini, stored them in Neo4j, and render them as a dynamic 3D visualization.
What is it?
I built this as a 3D knowledge graph that you can explore in the browser. The user asks a question in plain English, the backend turns that question into a Cypher query, Neo4j runs it, and the frontend renders the result as an interactive graph with nodes and edges you can click around.
What I liked here was that it made graph databases feel visual instead of abstract. It was not just a backend experiment. I wanted the result to feel like something you could actually explore, not just inspect in a terminal.
How it evolved: from hand-crafted seeds to AI extraction
The first version was tiny and mostly manual. I wrote seed JSON files myself just to get the graph shape working. After that, I started generating synthetic data, then moved to a larger pipeline that fetched articles and used LLMs to extract (subject, relation, object) triples.
That sounds clean on paper, but in practice the extracted data was messy. Entity names were inconsistent. Some relations were vague. Some targets were broken. So a lot of the real engineering work was not extraction itself, but cleanup scripts, repair passes, deduping, and data validation. This project taught me that LLM-generated data pipelines almost always need a second layer of boring cleanup work.
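To make the cleanup layer concrete, here is a minimal sketch of one repair pass over extracted triples. It is not the project's actual script: the function names (`entity_key`, `normalize_relation`, `clean_triples`) and the specific rules (case-insensitive matching, UPPER_SNAKE_CASE relation labels, dropping triples with an empty field) are assumptions chosen for illustration.

```python
import re
from typing import Iterable

Triple = tuple[str, str, str]


def entity_key(name: str) -> str:
    """Dedup key: collapse whitespace and ignore case."""
    return re.sub(r"\s+", " ", name).strip().casefold()


def normalize_relation(relation: str) -> str:
    """Turn free-form relation text into an UPPER_SNAKE_CASE label."""
    return re.sub(r"[^A-Za-z0-9]+", "_", relation).strip("_").upper()


def clean_triples(raw: Iterable[Triple]) -> list[Triple]:
    display: dict[str, str] = {}  # entity key -> first spelling seen
    seen: set[Triple] = set()     # normalized triples already emitted
    cleaned: list[Triple] = []
    for subject, relation, obj in raw:
        key = (entity_key(subject), normalize_relation(relation), entity_key(obj))
        # Validation: skip triples with an empty subject, relation, or target.
        if not all(key):
            continue
        # Dedup: skip triples that differ only in casing, spacing, or punctuation.
        if key in seen:
            continue
        seen.add(key)
        # Reuse the first spelling seen so one entity maps to one node name.
        s = display.setdefault(key[0], " ".join(subject.split()))
        o = display.setdefault(key[2], " ".join(obj.split()))
        cleaned.append((s, key[1], o))
    return cleaned
```

For example, `clean_triples([("Albert Einstein", "born in", "Ulm"), ("albert  einstein", "Born-In", "ulm "), ("Ulm", "located in", "")])` returns `[("Albert Einstein", "BORN_IN", "Ulm")]`: the second triple is a duplicate after normalization and the third has an empty target.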
Under the hood: Neo4j and Cypher
I liked Neo4j for this because the data model matched the product. Entities become nodes, relationships become edges, and traversing the graph feels natural. That is much cleaner than trying to fake graph-style traversal with a bunch of SQL joins.
The interesting piece was query generation. Users never write Cypher directly. Instead, I pass the natural-language question and a description of the graph schema to the model, and the model produces a Cypher query. Then the backend runs that query and returns structured results for the frontend. If the generated query fails or comes back with weak results, I fall back to simpler keyword-style search logic so the user still gets something useful.
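The flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the schema string, the `Entity` label, the prompt wording, and the "longest word" keyword heuristic are all hypothetical, and `generate_cypher` and `run_query` are caller-supplied stand-ins for the LLM client and the Neo4j session. "Weak" is simplified here to "returned no rows".

```python
import re
from typing import Callable

# Hypothetical schema description; the real graph schema is not shown in this post.
SCHEMA_DESCRIPTION = "(:Entity {name})-[relationship]->(:Entity {name})"

# Keyword-style fallback: match entities whose name contains the search term.
FALLBACK_CYPHER = (
    "MATCH (a:Entity)-[r]->(b:Entity) "
    "WHERE toLower(a.name) CONTAINS $term OR toLower(b.name) CONTAINS $term "
    "RETURN a, r, b LIMIT 50"
)


def build_prompt(question: str) -> str:
    """Combine the schema description and the user's question into one prompt."""
    return (
        "Translate the question into a single read-only Cypher query.\n"
        f"Graph schema: {SCHEMA_DESCRIPTION}\n"
        f"Question: {question}\n"
        "Return only the Cypher query."
    )


def answer_question(
    question: str,
    generate_cypher: Callable[[str], str],
    run_query: Callable[[str, dict], list],
) -> list:
    try:
        cypher = generate_cypher(build_prompt(question))
        rows = run_query(cypher, {})
        if rows:
            return rows
    except Exception:
        pass  # Generation or execution failed: fall through to keyword search.
    # Assumed heuristic: use the longest word in the question as the search term.
    words = re.findall(r"\w+", question.lower())
    term = max(words, key=len, default="")
    return run_query(FALLBACK_CYPHER, {"term": term})
```

Passing the two dependencies in as callables keeps the fallback logic testable without a live model or database. In a real deployment the generated query should also be restricted to read-only operations before it is executed.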
LLM rotation for rate limit resilience
I did not want the whole app to freeze just because one model started rate-limiting me. So I built the backend to rotate through multiple Groq-hosted models. If one returns a 429 or otherwise fails, I try the next model in priority order.
That sounds like a small feature, but it changes reliability a lot. Instead of one brittle dependency, I get a model pool. The quality is not always identical across models, but uptime improves because the system can keep moving even when one model is temporarily rate-limited or failing.
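A minimal sketch of that rotation loop, assuming a caller-supplied `call_model(model, prompt)` function that raises on a 429 or any other failure. The `RateLimited` exception and the model names in the usage note are placeholders, not real Groq identifiers or SDK types.

```python
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical error the caller's client raises on an HTTP 429."""


def complete_with_rotation(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in priority order; return (model, response) from the first success."""
    errors: list[str] = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:
            # A 429 or any other failure: record it and move to the next model.
            errors.append(f"{model}: {exc!r}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

With `models=["model-a", "model-b"]`, a rate limit on `model-a` means the request is retried on `model-b`. Returning the model name alongside the response makes it easy to log which model actually served the request, which helps when output quality differs across the pool.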
Deployment: Cloud Functions Gen2
The backend runs on Cloud Functions Gen2 with a FastAPI app adapted to work in that environment. I used the `agraffe` adapter to bridge the ASGI app into the Cloud Functions model. The frontend is hosted on Netlify and uses `react-force-graph-3d` to render the graph in the browser.
I also tried to keep the loading behavior practical. Instead of pulling the whole graph at once, I load an initial chunk and expand it as users interact. That keeps the UI more responsive and stops the browser from getting punished by a graph that is bigger than it needs to be on first render.
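One way to picture the chunked loading is as two parameterized queries plus a merge step. The sketch below is an assumption about how this could look, not the project's implementation: the `Entity` label, the `name` property, and the row shape are hypothetical, and the merge is written in Python for consistency even though in the real app that step would happen in the React frontend.

```python
# Initial view: a bounded slice of the graph instead of everything at once.
INITIAL_CHUNK = (
    "MATCH (a:Entity)-[r]->(b:Entity) "
    "RETURN a.name AS source, type(r) AS relation, b.name AS target "
    "LIMIT $limit"
)

# Expansion: the neighbors of the node the user just clicked, in either direction.
EXPAND_NODE = (
    "MATCH (n:Entity {name: $name})-[r]-(:Entity) "
    "RETURN startNode(r).name AS source, type(r) AS relation, endNode(r).name AS target "
    "LIMIT $limit"
)


def empty_graph() -> dict:
    """Graph state as sets so repeated rows never create duplicates."""
    return {"nodes": set(), "links": set()}


def merge_rows(graph: dict, rows: list) -> dict:
    """Add query rows to the graph already on screen, skipping what is present."""
    for row in rows:
        graph["nodes"].update((row["source"], row["target"]))
        graph["links"].add((row["source"], row["relation"], row["target"]))
    return graph
```

The first render runs `INITIAL_CHUNK` with a small `$limit`; each click runs `EXPAND_NODE` for that node and merges the rows in. Because nodes and links are stored as sets, expanding the same node twice leaves the graph unchanged.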
Key takeaways
- Neo4j AuraDB: cloud graph DB setup, Cypher query language, LLM-generated queries
- LLM-to-Cypher: prompt engineering to turn natural language into graph traversal queries
- Multi-model rotation on Groq: rate limit resilience through priority ordering and 429 handling
- Cloud Functions Gen2 with ASGI: agraffe adapter for FastAPI inside Cloud Functions
- Iterative data quality: why graph data needs repair scripts (broken targets, duplicate entities)