Deep Space Research Platform
System Design & Technical Architecture

Vision to Execution

Prabhu Sadasivam
ORCID: 0009-0008-0069-693X
Classification: Public

1. Vision & Strategic Intent

1.1 Mission

Build a unified, publicly accessible platform for deep-space scientific research that consolidates real-time telemetry from NASA missions, interstellar object observations, and theoretical cosmology simulations into an integrated analytical and visualization system — while maintaining near-zero operational cost and production-grade reliability.

1.2 Strategic Objectives

Objective	Outcome
Scientific accessibility	Make NASA mission data explorable by anyone through a web interface — no specialized tools required
Cross-domain correlation	Enable analytical connections across Voyager 1 telemetry, interstellar object tracking, space weather, and black hole cosmology
Reproducible research	Persist every data ingestion with provenance tracking so past analyses can be reproduced even after upstream APIs change
Operational simplicity	Run the entire platform on a single EC2 instance with automated recovery — no Kubernetes, no managed services, no vendor lock-in beyond commodity compute
Architectural extensibility	Design every component — schema, API, templates, ingestion — for additive extension without modifying existing contracts

1.3 Why This Architecture

The deep-space research domain has a distinctive constraint profile:

Data is authoritative but unreliable in delivery. NASA APIs return high-quality scientific data but can be intermittent, rate-limited, or structurally changed without notice.
Computation is bursty, not sustained. Trajectory calculations and plot rendering spike on page load, then idle.
Users are readers, not writers. The web interface is read-only; data ingestion is operator-initiated.
Budget is minimal. A research platform should cost less than a coffee subscription to run.

These constraints drove every major architectural decision: server-side rendering over SPA frameworks, SQLite over managed databases, synthetic fallback over hard failures, and a single-instance deployment over distributed systems.

2. Platform Overview

The Deep Space Research Platform comprises five interconnected projects published under a single domain:

┌──────────────────────────────────────────────────────────────────────────┐
│                     prabhusadasivam.com                                  │
│                                                                          │
│  ┌──────────────────────────────────────────────────────────────────┐   │
│  │                  Deep Space Portal                                │   │
│  │   Flask web app · 15 HTML templates · 9 API endpoints            │   │
│  │   Nginx reverse proxy · Gunicorn WSGI · AWS deployment           │   │
│  └─────────┬────────────────────┬───────────────────────┬───────────┘   │
│            │                    │                       │               │
│  ┌─────────┴───────────┐ ┌─────┴───────────────┐ ┌────┴────────────┐   │
│  │   Voyager 1 Suite   │ │  3I/ATLAS Research   │ │  Black Hole     │   │
│  │   4 analysis modules│ │  Jupyter pipeline    │ │  Simulation     │   │
│  │   Trajectory, PWS,  │ │  Ephemerides + MAST  │ │  Bouncing       │   │
│  │   Density, Magneto  │ │  Orbital elements    │ │  cosmology      │   │
│  └─────────┬───────────┘ └──────────┬───────────┘ └───────┬────────┘   │
│            │                         │                       │           │
│            └─────────────┬───────────┘───────────────────────┘           │
│                          ▼                                               │
│              ┌───────────────────────┐                                   │
│              │    Unified Analytics  │                                   │
│              │    Database           │                                   │
│              │    (deep_space_db)    │                                   │
│              └───────────────────────┘                                   │
└──────────────────────────────────────────────────────────────────────────┘

Project	Repository	Purpose
Deep Space Portal	PSadasivam/deep-space-portal	Flask web application, HTML templates, nginx config, deployment infrastructure — presentation layer for all research projects
Voyager 1 Analysis	PSadasivam/voyager1-analysis	Scientific analysis modules: trajectory, plasma waves, electron density, magnetometer — pure computation, no web dependencies
3I/ATLAS Research	PSadasivam/3I-ATLAS-research	Jupyter pipeline for interstellar comet C/2025 N1 (ATLAS): ephemerides, MAST archive queries, orbital elements
Black Hole Simulation	PSadasivam/universe-inside-blackhole	Bouncing cosmology: Schwarzschild radius of total universe mass using Planck 2018 parameters
Unified Analytics DB	PSadasivam/deep-space-db	SQLite database consolidating all research data with S3 backup and audit logging

3. System Context & Integration Landscape

3.1 External System Dependencies

                              ┌─────────────────────┐
                              │    prabhusadasivam   │
                              │        .com          │
                              └──────────┬───────────┘
                                         │
         ┌──────────┬──────────┬─────────┼─────────┬───────────┬──────────┐
         ▼          ▼          ▼         ▼         ▼           ▼          ▼
    ┌─────────┐ ┌────────┐ ┌────────┐ ┌──────┐ ┌───────┐ ┌────────┐ ┌──────┐
    │   JPL   │ │  NASA  │ │  NASA  │ │ NASA │ │ NASA  │ │  NOAA  │ │ MAST │
    │HORIZONS │ │  SPDF  │ │  PDS   │ │ NeoWs│ │ DONKI │ │  SWPC  │ │(STScI│
    │         │ │        │ │  PPI   │ │      │ │       │ │        │ │)     │
    └─────────┘ └────────┘ └────────┘ └──────┘ └───────┘ └────────┘ └──────┘
    Trajectory   Magneto-   Plasma     Near-    Solar     Kp Index   HST/JWST
    Ephemerides  meter CDF  Wave CDF   Earth    Flares    Forecast   Archive
                                       Objects  CMEs/GST

Upstream System	Protocol	Auth	Failure Mode	Fallback Strategy
JPL HORIZONS	`astroquery` (HTTP)	None	Timeout / 5xx	Synthetic trajectory model
NASA SPDF	HTTPS file download	None	404 / timeout	Synthetic magnetometer data
NASA PDS PPI	HTTPS file download	None	404 / timeout	Synthetic plasma wave generator
NASA NeoWs	REST JSON	API key (env var)	Rate limit / 5xx	15-min stale cache
NASA DONKI	REST JSON	API key (env var)	Rate limit / 5xx	15-min stale cache
NOAA SWPC	REST JSON	None	Timeout	15-min stale cache
MAST (STScI)	`astroquery` (HTTP)	None	Timeout	Cached CSVs
MPC	Static files	None	N/A	Local JSON/obs files

Key design decision: Every external dependency has an explicit fallback path. The platform never returns an error page due to upstream API failure.

3.2 Downstream Consumers

Consumer	Interface	Usage
Web browsers	HTTPS (HTML + embedded base64 images)	Primary end-user interface
Search engines	`/sitemap.xml`, `/robots.txt`, SEO meta tags	Discovery and indexing
LLM/AI crawlers	`/ai-index` structured knowledge page	Machine-readable research index
SQLite clients	File-based `.db` access	Ad-hoc research queries
Jupyter notebooks	`pandas.read_sql()` via `sqlite3`	Analytical workflows

4. Architecture Principles

These principles govern all design decisions across the platform:

#	Principle	Rationale	Application
1	Graceful degradation over hard failure	NASA APIs are unreliable; pages must always render	Synthetic fallback generators for every data source
2	Server-side rendering over client-side frameworks	Eliminates JavaScript build toolchain, reduces attack surface, ensures SEO	Flask + Jinja2 templates with base64-embedded plots
3	Zero third-party runtime dependencies where possible	Minimizes supply chain risk and dependency management	`deep_space_db` uses Python stdlib only; web app pins exact versions
4	Additive extension, not modification	New research projects should not require changes to existing code	New tables, new ingestion functions, new Flask routes — each additive
5	Data provenance on every record	Research integrity requires knowing where data came from	`source` column on every row; `ingestion_log` table
6	Cost proportional to value	Research infrastructure should not cost more than the research itself	SQLite (free), single EC2 (~$15/mo), S3 (< $0.01/mo)
7	Operational simplicity over architectural elegance	One person operates the entire platform	Single process, single instance, systemd restart, git-pull deploys

5. Technology Stack & Decision Rationale

5.1 Stack Overview

Layer	Technology	Version
Language	Python	3.11+ (EC2), 3.13 (local)
Web framework	Flask	Latest stable
WSGI server	Gunicorn	2 workers
Reverse proxy	Nginx	Amazon Linux default
TLS	Let's Encrypt / Certbot	Auto-renewing
Compute	AWS EC2 t3.small	2 vCPU, 2 GB RAM
Static IP	AWS Elastic IP	Associated to EC2
DNS	GoDaddy	A + CNAME records
Database	SQLite 3 (WAL)	Bundled with Python
Backup storage	AWS S3	Versioned, private
Process management	systemd	`Restart=always`
Scientific computing	NumPy, SciPy, Matplotlib, Astropy, Astroquery	Pinned in requirements.txt

5.2 Key Technology Decisions

Why Flask over Django/FastAPI: Django's ORM, admin panel, and auth middleware are unnecessary for a read-only research site. FastAPI's async model adds complexity with no benefit when every request is a synchronous NASA API call followed by matplotlib rendering. Flask provides the minimal surface needed: routing, Jinja2 templates, and request handling.

Why server-side matplotlib over D3.js/Plotly: Scientific plots require precise control over axes, annotations, colormaps, and dual-axis layouts that matplotlib provides natively. Embedding plots as base64 PNGs in JSON responses eliminates JavaScript rendering, client-side library loading, and cross-browser compatibility issues. The trade-off (no client-side interactivity) is acceptable for a research presentation platform.

Why SQLite over PostgreSQL/DynamoDB: The workload is single-writer, read-heavy, and local. SQLite requires no server process, no connection pooling, no credential management, and no monthly cost. WAL mode enables concurrent reads during ingestion. The 281 TB theoretical limit far exceeds projected needs. If multi-user access becomes necessary, the schema is standard SQL and can migrate to PostgreSQL without DDL changes.

Why AWS CLI over boto3 for S3: The backup script needs exactly three operations: upload, list, download. The AWS CLI handles these with subprocess calls, avoiding a boto3 dependency, its transitive dependency tree, and the associated supply chain surface. For a script that runs once per day, the subprocess overhead is negligible.

Why Gunicorn with 2 workers: Each worker handles one request at a time. Matplotlib is not thread-safe, so preload_app=True with sync workers is the correct model. Two workers provide basic concurrency (one can render a plot while the other serves a cached page) without exceeding the 2 GB RAM of a t3.small.

6. Infrastructure Architecture

6.1 Deployment Topology

                              Internet
                                 │
                            ┌────┴────┐
                            │ GoDaddy │
                            │   DNS   │  A: prabhusadasivam.com → Elastic IP
                            └────┬────┘  CNAME: www → prabhusadasivam.com
                                 │
                    ┌────────────┴────────────┐
                    │    AWS EC2 t3.small      │
                    │    Amazon Linux 2023     │
                    │    Elastic IP attached   │
                    │                          │
                    │  ┌────────────────────┐  │
                    │  │      Nginx         │  │
                    │  │  :80 → 301 HTTPS   │  │
                    │  │  :443 → TLS term   │  │
                    │  │  proxy_pass :8000   │  │
                    │  │  security headers   │  │
                    │  │  dotfile deny       │  │
                    │  └─────────┬──────────┘  │
                    │            │              │
                    │  ┌─────────┴──────────┐  │
                    │  │    Gunicorn        │  │
                    │  │  127.0.0.1:8000    │  │
                    │  │  2 sync workers    │  │
                    │  │  systemd managed   │  │
                    │  └─────────┬──────────┘  │
                    │            │              │
                    │  ┌─────────┴──────────┐  │
                    │  │  Flask Application │  │
                    │  │  (deep_space_portal)│  │
                    │  │  15 page routes    │  │
                    │  │  9 API endpoints   │  │
                    │  │  4 utility routes  │  │
                    │  └────────────────────┘  │
                    │                          │
                    └──────────┬───────────────┘
                               │
                    ┌──────────┴───────────────┐
                    │        AWS S3             │
                    │  db-backups/ (versioned)  │
                    └──────────────────────────┘

6.2 Network Security

Port	Protocol	Source	Purpose
22	SSH	Operator IP only	Deployment and maintenance
80	HTTP	0.0.0.0/0	Redirect to HTTPS
443	HTTPS	0.0.0.0/0	Application traffic
8000	HTTP	127.0.0.1 only	Gunicorn (not internet-exposed)

6.3 Nginx Hardening

# Security headers on every response
X-Frame-Options:           DENY
X-Content-Type-Options:    nosniff
Referrer-Policy:           strict-origin-when-cross-origin

# Deny access to dotfiles (.git, .env, etc.)
location ~ /\. { deny all; }

# Proxy configuration — loopback only
proxy_pass http://127.0.0.1:8000;

6.4 Process Lifecycle

                    Boot
                      │
                      ▼
              systemd starts
         deep_space_portal.service
                      │
                      ▼
              Gunicorn spawns
              2 Flask workers
                      │
              ┌───────┴───────┐
              │               │
           Worker 1        Worker 2
              │               │
              ▼               ▼
         Serve requests  Serve requests
              │
              │ ← Process crash
              ▼
        systemd detects
        exit code ≠ 0
              │
              ▼
        Restart=always
        RestartSec=3
              │
              ▼
        Gunicorn respawns

7. Application Architecture

7.1 Request Flow

Browser GET /trajectory
         │
         ▼
     Nginx (TLS termination, security headers)
         │
         ▼
     Gunicorn → Flask route handler
         │
         ├── render_template("trajectory.html")
         │       ├── nav_caret=True (dropdown nav config)
         │       ├── page-specific meta tags (SEO)
         │       └── Template extends base layout
         │
         └── Template includes <script> that calls /api/trajectory
                    │
                    ▼
              Flask API handler
                    │
                    ├── Try: JPL HORIZONS query (real data)
                    │         │
                    │         ├── Success → matplotlib plot → base64 PNG
                    │         └── Failure ──┐
                    │                       │
                    ├── Fallback: synthetic trajectory model
                    │         │
                    │         └── matplotlib plot → base64 PNG
                    │
                    └── Return JSON:
                          {
                            "plot": "data:image/png;base64,...",
                            "events": [...],
                            "galactic_coords": {...},
                            "data_source": "jpl_horizons" | "synthetic"
                          }

7.2 Rendering Strategy

All visualization follows a consistent server-side rendering pipeline:

Data acquisition — Call external API or generate synthetic data
Scientific computation — NumPy/SciPy processing (filtering, FFT, ridge detection)
Plot generation — Matplotlib with Agg backend (no display server required)
Encoding — io.BytesIO → base64 string
Transport — JSON response with embedded base64 PNG
Display — Jinja2 template sets <img src=""> or JavaScript img.src = data.plot

This approach guarantees:

Identical rendering across all browsers and devices
No client-side JavaScript libraries for charting
SEO-friendly — content is in the initial HTML response
Cacheable — API responses can be cached at any layer

7.3 Caching Architecture

Layer	Strategy	TTL	Scope
Space Intelligence APIs	In-memory Python dict	15 minutes	Per-worker process
Stale cache fallback	Return last successful response on API failure	Until next success	Per-worker process
Browser	Standard HTTP caching headers via Nginx	Nginx defaults	Per-client

Design decision: No external cache (Redis, Memcached) is used. The in-memory approach is sufficient for single-instance deployment and eliminates an infrastructure dependency. If scaling to multiple instances, a shared cache layer would be introduced.

7.4 Template Architecture

All 15 page templates plus 2 shared partials (_footer.html, _scroll_top.html) share:

Dark mission-control theme (#0c0c0c → #1a1a2e → #16213e)
3D glassmorphic navigation bar with split-button dropdowns (Voyager 1 · Space Intelligence · Deep Research)
SEO meta tags (title, description, Open Graph, Twitter Cards, JSON-LD where applicable)
Consistent <header> → <main> → journey navigation → <footer> structure

Voyager 1 analytical pages are connected through a sequential journey navigation:

Facts → Trajectory → Plasma Waves → Density → Magnetometer → 3I/ATLAS

Each analytical page includes "Previous" / "Next" links, creating a guided research narrative through the data. The home page exposes every section via the hero CTA dropdowns (canonical navigation) and a curated "Where To Go Next" prose pointer rather than a sitemap-style card grid.

7.5 Shared Scientific Models (ADR-002)

A small set of scientific constants — Voyager 1's heliopause anchor (121.0 AU on 2012-08-25), post-heliopause drift rate (3.6 AU/yr), and J2000 pointing direction (RA 17.22 h, Dec 12.08°) — are exposed by a single Python module, voyager1_project/voyager1_position_model.py. Every consumer routes through it.

Consumer	Use
`app.py:_voyager1_live_stats()`	Distance, light-time, mission-age on `/facts` and `/` (date-keyed `lru_cache`)
`templates/home.html` (server-rendered)	Hero distance and "X-year journey" line
`voyager1_outbound_trajectory.fetch_trajectory_synthetic()`	Synthetic trajectory points and direction vector for `/trajectory`
`verify_voyager_position.py`	Annual reconciliation against JPL Horizons

Why it matters. Consistency of message is a product principle and an architectural one. Two implementations of the same constant inevitably drift; a visitor who notices the inconsistency loses trust in every number on the site. The single-module approach makes drift impossible — a re-anchor against JPL Horizons updates every public surface in one commit. The pattern (constants in one pure-arithmetic module; all consumers import) generalises to any future scientific model the platform exposes.

Validation cadence.

Layer	Cadence	Mechanism
Bound check	Every CI run	`tests/test_facts.py` asserts `150 ≤ AU ≤ 250` for any date 2025–2035
End-point agreement	Every CI run	`tests/test_outbound_trajectory_position.py` asserts the synthetic-trajectory endpoint equals `voyager1_distance_au()` exactly when the step grid lands on `end_date`, and within rounding otherwise
Annual reconciliation	Once per year (next: May 2027)	`verify_voyager_position.py` against JPL Horizons; re-anchor if \|Δ\| > 1.0 AU
Public-facing transparency	Always	`/facts` page footer discloses the model and reconciliation schedule

7.6 Architectural Decision Records

ADRs are captured inline in the ticket that introduced them rather than in a dedicated docs/adr/ directory. The current set originated from the dynamic-/facts work and now governs every page that displays a Voyager 1 number.

ADR	Decision	Rationale (one line)
001	Compute distance; do not compute speed	Compute only values whose change exceeds display precision within the page's refresh cadence. Distance changes ~0.01 AU/day (visible); speed is constant at this resolution.
002	Single source of truth for the Voyager 1 position model	Two implementations of the same constant inevitably drift. One module, one number, every page. See §7.5.
003	No live JPL Horizons / network calls on the request path of storytelling pages (`/`, `/facts`)	Horizons is a validation concern, not a serving concern. Real-time queries would couple every page render to an external service with no value to the visitor.
004	Server-rendered values; no client-side ticking counters	Spinning digits trivialise the achievement. A single rendered number reads as fact, not as decoration.

ADRs 001 and 004 are editorial decisions encoded in code; 002 and 003 are structural and enforced by the test suite (§7.5). When a future decision rises to the level of “we will not revisit this without a written reason”, append it to this table.

8. Data Architecture

8.1 Data Domains

Domain	Data Nature	Volume Characteristic	Primary Source
Voyager 1 Telemetry	Time-series instrument readings	Sparse (daily/hourly resolution from ~170 AU heliocentric)	NASA SPDF, PDS, JPL HORIZONS
3I/ATLAS Tracking	Positional ephemerides + archival observations	Batch (periodic notebook runs)	JPL HORIZONS, MAST, MPC
Black Hole Simulation	Physical constants + derived quantities	Static (recomputed on demand)	Planck 2018 parameters
Space Intelligence	NEO approaches + solar activity	Streaming-like (15-min refresh)	NASA NeoWs, DONKI, NOAA SWPC
Research Insights	Cross-project findings and hypotheses	Append-only (human-authored)	Manual entry

8.2 Unified Database Schema

┌──────────────────────────────────────────────────────────────────┐
│                     deep_space_research.db                        │
│                                                                   │
│  VOYAGER 1 (5 tables)          3I/ATLAS (4 tables)               │
│  ┌────────────────────┐        ┌────────────────────┐            │
│  │ magnetic_field     │        │ ephemerides        │            │
│  │ plasma_wave        │        │ mast_observations  │            │
│  │ electron_density   │        │ orbital_elements   │            │
│  │ trajectory         │        │ datasets           │            │
│  │ events             │        └────────────────────┘            │
│  └────────────────────┘                                          │
│                                                                   │
│  BLACK HOLE (1 table)          SPACE INTEL (2 tables)            │
│  ┌────────────────────┐        ┌────────────────────┐            │
│  │ simulations        │        │ neos               │            │
│  └────────────────────┘        │ solar              │            │
│                                └────────────────────┘            │
│                                                                   │
│  METADATA (3 tables)                                             │
│  ┌────────────────────┐                                          │
│  │ research_insights  │  ← cross-project analytical findings     │
│  │ ingestion_log      │  ← every data load audited               │
│  │ s3_backup_log      │  ← every backup audited                  │
│  └────────────────────┘                                          │
└──────────────────────────────────────────────────────────────────┘

8.3 Schema Design Invariants

Invariant	Implementation	Purpose
Temporal key	`timestamp_utc TEXT NOT NULL` (ISO 8601)	Range queries via string comparison
Provenance	`source TEXT DEFAULT '<origin>'`	Distinguish real vs. synthetic vs. derived data
Audit timestamp	`ingested_at TEXT DEFAULT (datetime('now'))`	Track when data entered the system, independent of observation time
Indexed access	B-tree index on primary timestamp column	Sub-millisecond range scans on time-series data
Parameterized writes	All SQL uses `?` placeholders	Prevent injection regardless of data content

8.4 Data Lineage

NASA SPDF → voyager1_magneticfield_nTS_analysis.py → /api/magnetometer → JSON
                                                   ↘
                                                    → init_db.py → voyager1_magnetic_field table
                                                                              ↓
                                                                   s3_backup.py → S3

JPL HORIZONS → astroquery → /api/trajectory → JSON
             → astroquery → 3I_ATLAS_research_notebook.ipynb → ephemerides.csv
                                                              → init_db.py → atlas_3i_ephemerides table

Every data point can be traced from its NASA/JPL origin through the ingestion pipeline to the database row, identified by the source column.

9. API Design & Contract

9.1 API Inventory

Endpoint	Method	Parameters	Response Shape
`/api/trajectory`	GET	—	`{ plot, events[], galactic_coords, data_source }`
`/api/position`	GET	—	`{ plot, distance_au, coordinates, data_source }`
`/api/magnetometer`	GET	`?days=`	`{ plot, statistics{}, data_source }`
`/api/plasma`	GET	`?hours=&freq_min=&freq_max=`	`{ spectrogram, spectrum, time_series, statistics }`
`/api/density`	GET	`?hours=`	`{ process_plot, nasa_plot, statistics }`
`/api/space-intelligence`	GET	—	`{ neos[], flares[], cmes[], storms[], kp_index, forecast[], highlights[] }`
`/api/status`	GET	—	`{ status, cdflib, data_sources{} }`

9.2 Response Contract

All plot-bearing endpoints follow a consistent contract:

{
  "plot": "data:image/png;base64,...",     // Always present
  "data_source": "jpl_horizons|synthetic", // Always present — transparency
  "statistics": { ... },                    // Domain-specific metrics
  "events": [ ... ]                         // Optional: notable data points
}

The data_source field is critical to architectural integrity: it tells the consumer whether they are viewing real NASA data or a synthetic approximation. This transparency extends the provenance principle from the database layer to the API layer.

9.3 Error Handling Contract

API endpoints never return HTTP 5xx. The degradation hierarchy is:

Real data (preferred) — external API succeeded
Cached data — external API failed, in-memory cache is fresh (< 15 min)
Stale cache — external API failed, cache is stale but usable
Synthetic data — no cache available; generate mathematically plausible data
Empty response with explanation — only if generation itself fails (extremely rare)

10. Cross-Project Integration Strategy

10.1 Integration Architecture

┌──────────────┐    File System     ┌──────────────┐
│  3I/ATLAS    │ ──── CSVs/JSONs ──►│  deep_space  │
│  Research    │                     │      _db     │
└──────┬───────┘                     └──────────────┘
       │                                    ▲
       │ PNGs to Images/                    │ CSVs/JSONs
       ▼                                    │
┌──────────────┐                     ┌──────┴───────┐
│  Voyager 1   │ ◄── imports ───────│  Deep Space   │
│  (science)   │                     │  Portal       │
└──────────────┘                     └──────────────┘
       │                                    │
       │                                    │ Serves all pages under one domain
       ▼                                    ▼
  Analysis outputs              prabhusadasivam.com

Mechanism	Pattern	Example
File-system sharing	Sibling directories	`init_db.py` reads `../3I-Atlas-Research/ephemerides.csv`
Python path import	Portal adds `../voyager1_project` to `sys.path`	`from voyager1_magneticfield_nTS_analysis import fetch_ephemeris`
Presentation integration	Portal Flask renders templates for all projects	`/atlas`, `/blackhole`, `/mars` pages in portal app
Analytical integration	`research_insights` table links findings across domains	"Voyager 1 measures the interstellar medium that 3I/ATLAS traveled through"

10.2 Integration Design Rationale

Projects are deliberately not microservices. A monolithic Flask application (the portal) serves all pages because:

There is one operator, one deployment target, one domain
Cross-project pages share navigation, styling, and SEO configuration
The complexity cost of inter-service communication exceeds the benefit at this scale
Adding a new research project is a single Flask route + template — no new infrastructure

The key architectural improvement is that science modules are now decoupled from presentation. Voyager 1 analysis scripts can be used as CLI tools, imported into Jupyter notebooks, or called by the portal — without carrying Flask/Gunicorn dependencies.

This will be revisited if the platform requires multi-team development or independent scaling (see §17).

11. Resilience & Graceful Degradation

11.1 Failure Mode Analysis

Failure	Detection	Recovery	User Impact
NASA API timeout	`requests.Timeout` / `astroquery` exception	Synthetic data generator	Page renders with approximated data; `data_source: synthetic`
NASA API rate limit	HTTP 429	15-min in-memory cache serves stale data	Transparent — data is at most 15 min old
Flask worker crash	systemd detects exit code ≠ 0	`Restart=always`, `RestartSec=3`	< 3 seconds of unavailability
EC2 instance reboot	systemd `WantedBy=multi-user.target`	Auto-start on boot	Minutes of unavailability
Database corruption	`PRAGMA integrity_check` failure	`python s3_backup.py restore`	Database queries fail until restored (web app is unaffected)
S3 backup deletion	Missing object on `aws s3 ls`	S3 versioning recovers previous version	No user impact
Let's Encrypt renewal failure	Certbot logs / browser TLS error	Manual `certbot renew`	HTTPS certificate warning
Disk full	Write failure	WAL checkpoint + cleanup	Ingestion/backup fails; web app continues serving

11.2 Synthetic Data Architecture

The synthetic data generators are not test mocks — they are mathematically grounded models:

Generator	Model	Accuracy
Trajectory / position	Heliopause-anchored linear drift (`121.0 AU @ 2012-08-25 + 3.6 AU/yr`) sourced from `voyager1_position_model.py`; J2000 pointing applied for 3-D positions	< 0.3 AU vs. JPL Horizons at year-out distances; reconciled annually (see §7.5)
Magnetometer	Gaussian noise around 0.1 nT baseline (interstellar medium conditions)	Physically plausible range
Plasma wave	Multi-frequency synthetic spectrogram with realistic power-law spectrum	Structurally accurate; not observation data
Electron density	Derived from synthetic plasma frequency	Formula-exact; input is synthetic

11.3 Resilience Principle

The platform renders every page, every time. No combination of external failures produces a user-facing error. The quality of data may degrade (real → cached → synthetic), but the experience never breaks.

12. Security Architecture

12.1 Defense in Depth

Layer 1: Network
├── Security group: SSH restricted to operator IP
├── HTTP/HTTPS open (required for public website)
└── Gunicorn bound to 127.0.0.1 only

Layer 2: TLS
├── Let's Encrypt certificate (auto-renewal)
├── HTTP → HTTPS redirect (Nginx)
└── HSTS implied by redirect

Layer 3: Nginx
├── X-Frame-Options: DENY
├── X-Content-Type-Options: nosniff
├── Referrer-Policy: strict-origin-when-cross-origin
├── Dotfile access denied (blocks .git, .env)
└── /api/ disallowed in robots.txt

Layer 4: Application
├── Flask debug mode off by default (env-var gated)
├── SECRET_KEY via environment variable (secure random fallback)
├── flask-limiter: 10 req/min on all /api/* endpoints
├── Path traversal blocked: rejects '..' in image filenames
├── SRI integrity hashes on all CDN resources (KaTeX)
├── No user input to SQL queries (read-only web interface)
├── Parameterized SQL in all ingestion scripts
└── NASA API key via environment variable

Layer 5: Data
├── SQLite WAL mode (crash consistency)
├── Source provenance on every row
├── .gitignore excludes .db files and .env
└── No PII in any data table

Layer 6: Cloud
├── S3 bucket: all public access blocked
├── S3 versioning: protects against deletion
├── S3 SSE-S3: encryption at rest
├── IAM: scoped credentials recommended
└── AWS CLI: HTTPS enforced for all S3 operations

12.2 Secrets Inventory

Secret	Location	In Source Control?
AWS Access Key / Secret Key	`~/.aws/credentials`	No
Flask SECRET_KEY	Environment variable on EC2	No
NASA API Key	`NASA_API_KEY` env var	No
SSH private key	`~/voyager1-deploy.pem` (local)	No
S3 bucket name	`S3_BACKUP_BUCKET` env var	No

No secrets exist in any committed file across all four repositories.

12.3 STRIDE Threat Model

A complete STRIDE analysis is documented separately, covering 14 identified threats across Spoofing, Tampering, Repudiation, Information Disclosure, DoS, and Elevation of Privilege — with trust boundary diagrams, asset inventory with CIA ratings, attack surface assessment, and incident response playbooks.

13. Scalability & Growth Strategy

13.1 Current Capacity

Dimension	Current	Comfortable Ceiling
Concurrent users	~5–10 (2 Gunicorn workers)	~50 with 4 workers
Database rows	86	10 million+ (SQLite with indexes)
Database size	116 KB	1 GB+
S3 backup cost	< $0.01/mo	< $1/mo at 10 GB
Page load time	2–5 sec (API-dependent)	Cacheable to < 1 sec

13.2 Scaling Triggers & Responses

Trigger	Threshold	Response
Concurrent users > 50	Gunicorn worker saturation	Increase workers (up to CPU count); add response caching
Concurrent users > 200	EC2 t3.small limit	Upgrade instance type; add CloudFront CDN for static assets
Concurrent users > 1000	Single-instance limit	Add ALB + multiple EC2 instances; externalize cache to ElastiCache
Database > 1 GB	SQLite performance risk	Evaluate DuckDB (analytical) or PostgreSQL RDS (multi-user)
Multiple contributors	Concurrent write contention	Migrate to PostgreSQL; add application-level auth
Multiple research projects > 6	Template/route sprawl	Extract shared framework; consider project-specific Flask Blueprints

13.3 What Will NOT Change

These are architectural invariants that hold regardless of scale:

Server-side plot rendering — matplotlib's scientific fidelity is not replaceable by JavaScript charting
Data provenance on every row — source and ingested_at columns are non-negotiable
Graceful degradation — synthetic fallback remains the failure strategy
Standard SQL — no ORM, no SQLite-specific syntax, migration-ready at all times
Single source of truth for shared scientific constants (ADR-002) — Voyager 1's heliopause anchor, drift rate, and pointing direction live in exactly one Python module; every consumer imports. The pattern generalises to any future shared scientific model.

14. Operational Excellence

14.1 Deployment Model

Aspect	Current State	Target State
Deployment method	`git pull` + `systemctl restart` via SSH	GitHub Actions → EC2 deploy
Rollback	`git revert` + `systemctl restart`	Automated rollback on health check failure
Health check	Manual `curl` / browser	`/api/status` endpoint (exists); automated monitoring pending
Log access	`journalctl -u voyager1` on EC2	CloudWatch Logs agent
Uptime monitoring	Manual	External ping service (UptimeRobot / AWS Route 53 health check)

14.2 Database Operations

Operation	Command	Frequency
Full init	`python init_db.py`	On schema change
Re-ingest	`python init_db.py --ingest-only`	When source data updates
Schema update	`python init_db.py --schema-only`	On table additions
Backup	`python s3_backup.py backup`	After each ingestion (manual; daily cron recommended)
Restore	`python s3_backup.py restore`	On corruption or data loss
Audit query	`SELECT * FROM ingestion_log ORDER BY ingested_at DESC`	Ad-hoc

14.3 Observability

Signal	Current	Gap
Application logs	stdout → journalctl	No centralized log aggregation
Error tracking	Log inspection	No alerting on exceptions
API latency	Not measured	Add request timing middleware
Uptime	Not monitored	Add external health check
Data freshness	`ingested_at` column queryable	No automated staleness alerts

15. Risks, Constraints & Trade-Offs

15.1 Architectural Trade-Offs

Decision	What We Gain	What We Accept
Server-side rendering	SEO, simplicity, no JS build chain	No client-side interactivity on plots
Single EC2 instance	Simplicity, low cost	Single point of failure for compute
SQLite over PostgreSQL	Zero cost, zero ops, portable	Single writer, no concurrent access
Monolithic Flask app	One deploy, shared navigation	All projects coupled in one process
Synthetic fallback	Pages always render	Users may see approximated data
Manual deployment	No CI/CD infrastructure to maintain	Human error risk; slower deploy cycle
2 Gunicorn workers	Fits t3.small memory	Limited concurrent request handling

15.2 Technical Debt Register

Item	Severity	Effort	Impact if Unaddressed
No CI/CD pipeline	Medium	Medium	Manual deploys remain error-prone
No uptime monitoring	Medium	Low	Outages detected by manual observation
No centralized logging	Low	Medium	Debugging requires SSH to EC2
Root AWS credentials in use	High	Low	Over-privileged access; security risk
No database encryption at rest	Low	Low	Acceptable — no PII or classified data

Resolved (recently): A 72-test pytest regression suite (deep_space_portal/tests/) now covers route smoke tests, voice/structure integrity for the home page, the Voyager 1 position model bound check, and endpoint agreement between the synthetic trajectory and the shared position helper. Run on every commit before deploy.

15.3 Constraints

Constraint	Source	Impact
NASA API rate limits	External policy	Drives caching strategy (15-min TTL)
Matplotlib not thread-safe	Library limitation	Mandates sync Gunicorn workers
SQLite single-writer	Engine limitation	Prevents concurrent ingestion processes
EC2 t3.small 2 GB RAM	Instance type	Limits Gunicorn workers and in-memory data
No PII in scope	Data classification	Simplifies security requirements significantly

16. Roadmap & Evolution Path

Phase 1 — Foundation (Complete)

Flask web application — 15 page routes, 9 API endpoints, 4 utility routes
Real-time NASA/JPL data integration with synthetic fallback on every external source
Voyager 1 analytical suite: facts, trajectory, plasma waves, electron density, magnetometer
Voyager Story long-form narrative page (/voyager-story)
Single source of truth for Voyager 1 position (ADR-002) — voyager1_position_model.py consumed by /facts, /, /trajectory, and verify_voyager_position.py
Dynamic /facts page with calibrated linear position model and annual JPL Horizons reconciliation
Space Intelligence (NEOs + space weather), Orbital Density, Live Orbit 3D
3I/ATLAS Jupyter research pipeline
Black hole bouncing cosmology simulation
Mars 1993 mission page; AI-index structured knowledge page; this Architecture page
72-test pytest regression suite (routes, voice, position model, end-point agreement)
Unified SQLite analytics database (15 tables)
S3 backup with versioning and audit logging
EC2 deployment with Nginx, Gunicorn, HTTPS, systemd
Security threat model and architecture documentation

Phase 2 — Operational Maturity

Automated backup schedule (cron/Task Scheduler)
CI/CD pipeline (GitHub Actions → EC2)
Uptime monitoring (Route 53 health check or UptimeRobot)
Replace root AWS credentials with scoped IAM user
Quarterly drift-telemetry job: append (date, horizons_au, model_au, Δ) to a CSV (currently manual; see §7.5)

Phase 3 — Data Pipeline Automation

Scheduled ingestion from Flask /api/* endpoints into database
Space intelligence data persistence (NEOs, solar events)
Staleness detection and alerting on aged data
Data quality dashboards in Jupyter

Phase 4 — Analytics & Insights

Parquet export for S3/Athena serverless queries
Cross-project anomaly detection (magnetic field × solar activity)
Research insight timeline visualization
Streamlit or Jupyter dashboard for interactive analysis

Phase 5 — Scale (If Warranted)

CloudFront CDN for static assets and plot caching
PostgreSQL migration for multi-user access
Flask Blueprints for project-specific modules
Multi-AZ deployment for high availability

17. Appendix — Reference

A. Repository Map

Repository	URL	Branch
Deep Space Portal	github.com/PSadasivam/deep-space-portal	`main`
Voyager 1 Analysis	github.com/PSadasivam/voyager1-analysis	`main`
3I/ATLAS Research	github.com/PSadasivam/3I-ATLAS-research	`main`
Black Hole Simulation	github.com/PSadasivam/universe-inside-blackhole	`main`
Unified Analytics DB	github.com/PSadasivam/deep-space-db	`main`

B. External API Reference

API	Endpoint	Data	Auth
JPL HORIZONS	`astroquery.jplhorizons`	Positions, velocities, ephemerides	None
NASA SPDF	`spdf.gsfc.nasa.gov`	Magnetometer CDF/CSV files	None
NASA PDS PPI	`pds-ppi.igpp.ucla.edu`	Plasma wave CDF files	None
NASA NeoWs	`api.nasa.gov/neo/rest/v1/feed`	Near-Earth objects	API key
NASA DONKI	`api.nasa.gov/DONKI/{FLR,CME,GST}`	Solar flares, CMEs, storms	API key
NOAA SWPC	`services.swpc.noaa.gov/products/`	Kp index, forecast	None
MAST (STScI)	`astroquery.mast.Observations`	HST/JWST archive metadata	None
MPC	Local `3I_mpc_orb.json`	Orbital elements	N/A

C. Key Configuration Files

File	Location	Purpose
`deep_space_portal.nginx.conf`	deep_space_portal/	Nginx reverse proxy + security headers
`voyager1.service`	/etc/systemd/system/ (EC2)	systemd service definition
`requirements.txt`	deep_space_portal/	Web + science dependencies
`requirements.txt`	voyager1_project/	Science-only dependencies
`schema.sql`	deep_space_db/	Database DDL
`.gitignore`	deep_space_db/	Excludes .db, WAL, .env

D. Related Documentation

Document	Location	Scope
Database Architecture	deep_space_db/docs/database-architecture.md	Schema design, ingestion, queries, scalability
Security Threat Model	deep_space_portal/docs/security-threat-model.md	STRIDE analysis, controls, incident response
AWS Deployment Guide	deep_space_portal/docs/aws-deployment.md	EC2 setup, Nginx, Certbot, systemd
Getting Started	voyager1_project/docs/getting-started.md	Local development setup

Software Architecture

Deep Space Research PlatformSystem Design & Technical Architecture