What if you could regenerate your entire EVE Frontier map database from scratch after a major universe update—in under an hour, with full verification, without breaking anything? That's the challenge we faced: EVE Frontier's universe will change in 1-2 months, and when it does, EF-Map needs to rebuild its core routing database from fresh game data. But here's the catch—the rebuild needs to be automated and LLM-friendly, because a future AI agent may need to execute the entire workflow independently.
This article chronicles our journey from fragmented scripts and tribal knowledge to a battle-tested dual database pipeline with comprehensive documentation—ensuring that when the universe changes, we're ready.
The Challenge: Two Databases, Two Purposes
EF-Map relies on two distinct databases, each serving a different purpose in our data architecture:
Production Database (VULTUR → map_data_v2.db)
This 30 MB SQLite database powers the live interactive map. It contains:
- 24,426 star systems with 3D coordinates
- 284 regions and 2,213 constellations
- Stargate connections defining routing adjacency
- Station locations (optional, from game client files)
- System/region labels for search and display
Users depend on this database for pathfinding, navigation, and multi-waypoint route optimization. If it's wrong or missing after a universe update, the entire map breaks.
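To make the routing role concrete, here is a minimal sketch of how a consumer might load the stargate adjacency out of this database. The table and column names (stargates, from_system_id, to_system_id) are illustrative assumptions for this sketch, not the documented map_data_v2.db schema.

```python
import sqlite3
from collections import defaultdict

def load_adjacency(db_path: str = "map_data_v2.db") -> dict[int, list[int]]:
    """Build a stargate adjacency list suitable for pathfinding.

    Table/column names are assumptions; adjust to the real schema.
    """
    graph: dict[int, list[int]] = defaultdict(list)
    with sqlite3.connect(db_path) as conn:
        for src, dst in conn.execute(
            "SELECT from_system_id, to_system_id FROM stargates"
        ):
            graph[src].append(dst)
            graph[dst].append(src)  # stargate links are bidirectional
    return dict(graph)

if __name__ == "__main__":
    adjacency = load_adjacency()
    print(f"{len(adjacency)} systems have at least one stargate connection")
```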
Exploratory Database (Phobos → complete_solarsystems.db)
This 70 MB database enables deeper analysis and future features. It includes everything from the production DB, plus:
- 416,780 Lagrange Points (L1-L5 gravitational anchors)
- Star statistics (16 columns: luminosity, radius, temperature, etc.)
- Celestial bodies (planets, moons, asteroid belts)
- System bounds (min/max X/Y/Z coordinates)
- Ship data and blueprints (for future manufacturing tools)
This database isn't user-facing yet, but it's critical for prototyping new features like mining optimization, jump range visualization, and celestial navigation aids.
The Problem: Missing Documentation, Hidden Steps
When we audited our database regeneration workflows, we discovered a critical gap: the documentation was incomplete. A fresh LLM agent—tasked with rebuilding the databases after a universe update—would have failed within minutes.
The VULTUR production pipeline requires a preprocessing step (process_labels.py) that combines three separate label files into a single labels.json file. This step was completely undocumented. Without it, the database builder would fail with FileNotFoundError: labels.json not found—and a fresh LLM would have no way to diagnose or fix it.
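For context, the label-combining step is conceptually simple. The sketch below assumes the three inputs are JSON dictionaries keyed by object ID; the file names and merge logic are illustrative only, not the actual process_labels.py.

```python
import json
from pathlib import Path

# Hypothetical input names - the real script's file names may differ.
LABEL_FILES = ["system_labels.json", "region_labels.json", "constellation_labels.json"]

def combine_labels(output: str = "labels.json") -> None:
    combined: dict = {}
    for name in LABEL_FILES:
        path = Path(name)
        if not path.exists():
            raise FileNotFoundError(f"{name} not found - run the extraction step first")
        # Assumes each file is a dict keyed by object ID; later files win on conflicts.
        combined.update(json.loads(path.read_text(encoding="utf-8")))
    Path(output).write_text(json.dumps(combined, indent=2), encoding="utf-8")
    print(f"Wrote {output} with {len(combined)} entries")

if __name__ == "__main__":
    combine_labels()
```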
Other gaps included:
- No verification scripts to catch extraction errors early
- No clear guide on where to find game client files (like mapobjects.db)
- Fragmented instructions across multiple files
- No time estimates or performance benchmarks
- Missing error handling and troubleshooting guidance
Our confidence assessment: VULTUR pipeline 60%, Phobos pipeline 95%. The Phobos workflow already had a comprehensive 600-line guide; VULTUR needed urgent attention.
The Solution: Comprehensive Documentation + Verification Scripts
We tackled this systematically, following the vibe coding methodology that powers all EF-Map development: describe the goal, let the LLM implement, verify rigorously.
1. Created VULTUR Setup Guide (1,000+ Lines)
We built a complete step-by-step guide covering:
1. Prerequisites: Python 3.x, Git, an EVE Frontier client installation, and 5 GB of disk space.
2. Clone from our backup fork (Diabolacal/eve-frontier-tools) to eliminate external dependency risk.
3. Execute export_static_data.py to generate 6 JSON files from the game client.
4. Move the extraction outputs to the EF-Map repo root. Optional: copy mapobjects.db from C:\CCP\EVE Frontier\ResFiles\.
5. Run python verify_vultur_extraction.py to check file structure, counts, and JSON validity.
6. Run python process_labels.py to combine three label files into labels.json. This step was missing from all previous documentation.
7. Execute python create_map_data.py to generate map_data_v2.db. Takes ~5 minutes.
8. Run python verify_map_database.py to validate table structure, row counts, and stargate connectivity.
9. Launch npm run dev and verify that map rendering, search, and routing all function correctly.
10. Update the decision log with the extraction date, file sizes, and any anomalies discovered.
The guide includes time estimates (~25 minutes total), troubleshooting for 10 common issues, and a quick-reference command sequence for copy-paste execution.
2. Built Verification Scripts (Tested & Working)
Automated validation catches errors before they propagate:
verify_vultur_extraction.py (120 lines, sketched below):
- Validates presence of 4 required files (stellar_systems.json, stellar_regions.json, stellar_constellations.json, labels.json)
- Checks JSON structure (dict vs list) and minimum entry counts
- Verifies optional mapobjects.db if present
- Returns exit code 0 (success) or 1 (failure) for automation
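A condensed sketch of those checks, using the file names and approximate counts quoted in this article; the real 120-line script does more, so treat this as an outline rather than the script itself.

```python
import json
import sys
from pathlib import Path

# Minimum entry counts are rough floors derived from the figures in this article.
REQUIRED = {
    "stellar_systems.json": 20_000,        # expect ~24,426 systems
    "stellar_regions.json": 200,           # expect 284 regions
    "stellar_constellations.json": 2_000,  # expect 2,213 constellations
    "labels.json": 1,
}

def main() -> int:
    ok = True
    for name, minimum in REQUIRED.items():
        path = Path(name)
        if not path.exists():
            print(f"FAIL: missing {name}")
            ok = False
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        count = len(data)  # works for both dict- and list-shaped files
        if count < minimum:
            print(f"FAIL: {name} has {count} entries, expected >= {minimum}")
            ok = False
        else:
            print(f"OK:   {name} ({count} entries)")
    if Path("mapobjects.db").exists():
        print("OK:   optional mapobjects.db present")
    return 0 if ok else 1  # exit code usable by automation

if __name__ == "__main__":
    sys.exit(main())
```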
verify_map_database.py (180 lines, sketched below):
- Validates the map_data_v2.db schema (7 tables)
- Checks row counts against minimum thresholds
- Samples a known system (Jita) to verify data integrity
- Validates stargate connectivity (no orphan gates)
- Provides actionable next steps on failure
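A condensed sketch of those checks as well. Table and column names here are illustrative assumptions, since the actual map_data_v2.db schema isn't reproduced in this article.

```python
import sqlite3
import sys

DB_PATH = "map_data_v2.db"
# Hypothetical table names with rough minimum row counts from the figures above.
MIN_ROWS = {"solar_systems": 20_000, "regions": 200, "constellations": 2_000}

def main() -> int:
    conn = sqlite3.connect(DB_PATH)
    ok = True
    for table, minimum in MIN_ROWS.items():
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        passed = count >= minimum
        ok = ok and passed
        print(f"{'OK' if passed else 'FAIL'}: {table} has {count} rows (expected >= {minimum})")

    # Orphan-gate check: every stargate endpoint must reference a known system.
    (orphans,) = conn.execute(
        """
        SELECT COUNT(*) FROM stargates g
        WHERE g.from_system_id NOT IN (SELECT id FROM solar_systems)
           OR g.to_system_id   NOT IN (SELECT id FROM solar_systems)
        """
    ).fetchone()
    if orphans:
        print(f"FAIL: {orphans} stargates reference unknown systems")
        ok = False
    conn.close()
    return 0 if ok else 1

if __name__ == "__main__":
    sys.exit(main())
```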
Both scripts were tested end-to-end and confirmed working with our existing production data.
3. Updated All Cross-References
Documentation is only useful if it's discoverable. We updated:
- README.md section 6 with clear entry points to both pipelines
- AGENTS.md with setup guide references and verification script paths
- UNIVERSE_DATA_PIPELINE.md with the missing process_labels.py step
- Decision log with discovery details and testing results
The Phobos Pipeline: Already Battle-Tested
While VULTUR needed urgent documentation, the Phobos exploratory pipeline was already in excellent shape. Built over multiple iterations, it includes:
- 600+ line setup guide with 5 phases (extraction, conversion, database build, verification, querying)
- 3 verification scripts (verify_extraction.py, verify_ndjson_conversion.py, verify_database.py)
- PowerShell automation (extract_game_data.ps1) for one-command extraction
- DuckDB query interface for exploratory SQL analysis (example query below)
The Phobos workflow demonstrates what comprehensive documentation looks like: a fresh LLM can execute the entire pipeline from scratch with 95% confidence.
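As an example of that DuckDB interface, the snippet below attaches the exploratory database and lists its tables. It assumes complete_solarsystems.db is a SQLite file readable through DuckDB's sqlite extension; if the file is a native DuckDB database, connect to it directly instead.

```python
import duckdb

con = duckdb.connect()
# The first run downloads the sqlite extension from DuckDB's extension repo.
con.execute("INSTALL sqlite; LOAD sqlite;")
con.execute("ATTACH 'complete_solarsystems.db' AS uni (TYPE sqlite);")

# Inspect what the exploratory database actually exposes before querying it.
for row in con.execute("SHOW ALL TABLES").fetchall():
    print(row)
```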
Dual Pipeline Comparison
| Aspect | VULTUR (Production) | Phobos (Exploratory) |
|---|---|---|
| Purpose | Routing database for live map | Analysis DB with extended data |
| Output Size | ~30 MB | ~70 MB |
| Time Required | ~25 minutes | ~40 minutes |
| External Tool | eve-frontier-tools (backup fork) | Phobos (branch: fsdbinary-t1) |
| Python Requirement | Any Python 3.x | Exactly Python 3.12 |
| Critical Data | Systems, regions, stargates | + Lagrange Points, star stats |
| User-Facing | Yes (production map) | No (prototyping only) |
| Risk Level | ⭐ HIGH (map breaks if wrong) | 🔬 LOW (exploratory only) |
Results: From 60% to 95% Confidence
After a full day of documentation work, testing, and verification, we achieved:
- VULTUR confidence: 60% → 95% (comprehensive guide, missing step documented, verification scripts working)
- Phobos confidence: 95% → 95% (already excellent, enhanced with backup repo links)
- Overall system: 95%+ (both pipelines fully documented and tested)
A fresh LLM agent starting from scratch can now:
- Find clear entry point in README.md section 6
- Follow 600+ line step-by-step guides for both pipelines
- Use backup repositories under our control (eliminates external dependency risk)
- Verify success at each checkpoint with automated scripts
- Successfully regenerate both databases within ~1 hour total
The critical process_labels.py discovery was the game-changer. Without documenting that step, database regeneration would have failed at Step 7, and a fresh LLM would have had no way to diagnose the issue. Now it's safely captured in Part 6 of the VULTUR guide, with clear explanations of what it does and why it's required.
External Dependencies: Controlled and Mitigated
One risk with data pipelines is external tool maintenance. If VULTUR's original maintainer abandons the project, we're blocked. We mitigated this by:
- Forking both external tools under our control:
- VULTUR: Diabolacal/eve-frontier-tools
- Phobos: Diabolacal/Phobos (branch: fsdbinary-t1)
- Documenting exact clone URLs in setup guides
- Version-pinning Python dependencies (especially Phobos requiring Python 3.12)
- Storing sample outputs for validation testing
If upstream repositories disappear, we have stable forks ready to continue without interruption.
Future-Proofing: What's Next
With comprehensive documentation in place, we're ready for the upcoming universe update. But documentation is a living artifact. Future improvements include:
Phase 1: Universe Update (1-2 Months)
- Execute VULTUR pipeline with new game client data
- Execute Phobos pipeline for updated exploratory DB
- Verify routing integrity (no broken stargate links)
- Deploy updated map_data_v2.db to production
Phase 2: Data Consolidation (Future)
- Investigate merging VULTUR and Phobos workflows (single extraction, dual outputs)
- Compare stargate data from both sources (validate equivalence)
- Prototype unified extraction script (if proven safe)
Phase 3: Lagrange Point Integration (Future)
- Surface Lagrange Points in web app UI (currently in exploratory DB only)
- Add jump range visualization using L-point coordinates
- Enable celestial navigation mode (orbit L-points instead of stars)
Lessons Learned
Document before you need it. We caught the missing process_labels.py step during a routine audit, not during a crisis. If we'd discovered this gap during a universe update with a deadline, the pressure would have been intense.
Verification scripts save hours. Automated checks catch errors at the earliest possible moment. Without verify_vultur_extraction.py, we'd only discover file issues when the database builder fails—wasting time backtracking.
External dependencies are risks. Forking and documenting exact tool versions eliminates "it worked yesterday" surprises when maintainers change APIs or abandon projects.
LLM-friendly documentation has structure. Comprehensive guides aren't wall-of-text essays. They're numbered steps, command sequences, troubleshooting sections, and clear success criteria. A future LLM agent needs actionable instructions, not high-level architecture descriptions.
Try It Yourself
Curious about EVE Frontier's universe structure? The Phobos exploratory database is yours to query. Clone the repo, follow tools/data-query/UNIVERSE_CHANGE_PLAN.md, and start exploring:
- Which systems have the most Lagrange Points?
- What's the temperature distribution of stars in high-security regions?
- How far are asteroid belts from their parent stars?
The data is open, the tools are documented, and the queries are waiting.
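For instance, the first question could be answered with something like the query below. The table and column names (lagrange_points, solar_systems, solar_system_id) are hypothetical placeholders; check the real schema first (for example with SHOW ALL TABLES), and adjust the ATTACH if the file is a native DuckDB database rather than SQLite.

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL sqlite; LOAD sqlite;")
con.execute("ATTACH 'complete_solarsystems.db' AS uni (TYPE sqlite);")

# Top 10 systems by Lagrange Point count - table/column names are assumptions.
rows = con.execute("""
    SELECT s.name, COUNT(*) AS lpoint_count
    FROM uni.lagrange_points lp
    JOIN uni.solar_systems s ON s.id = lp.solar_system_id
    GROUP BY s.name
    ORDER BY lpoint_count DESC
    LIMIT 10
""").fetchall()

for name, count in rows:
    print(f"{name}: {count} Lagrange Points")
```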
Conclusion
Database regeneration isn't glamorous work. There's no flashy UI, no user-facing feature announcement. But it's foundational—the kind of infrastructure work that ensures EF-Map remains reliable when EVE Frontier's universe evolves.
By investing a day in comprehensive documentation, verification scripts, and workflow testing, we've transformed database regeneration from a risky manual process into a battle-tested, LLM-executable pipeline. When the universe changes in 1-2 months, we'll be ready.
And when a fresh LLM agent needs to rebuild the databases years from now? The documentation will still be there, waiting.