Cybercastor
Cybercastor is the Riverscapes Consortium's cloud-based batch processing engine — purpose-built to run geospatial models at scale, without you having to think about servers, infrastructure, or queuing systems. Submit a job, walk away, and come back to find hundreds or thousands of completed model outputs waiting for you in the Riverscapes Data Exchange.
Originally designed to run the Riverscapes network models (VBET, BRAT, RSContext, and more), Cybercastor has grown into a general-purpose cloud execution platform. If your model is Riverscapes-compliant and produces a project.rs.xml output file, Cybercastor can run it.
What Cybercastor Has Already Done
The proof is in the data. The Riverscapes Consortium has used Cybercastor to run production-grade network models across more than 99% of the continental United States — approximately 15,000 HUC10 watersheds. This work has generated over 115,000 projects now freely available in the Riverscapes Data Exchange, representing 54 million individual riverscapes spanning the conterminous U.S.
That data can be explored, filtered, and exported through the Riverscapes Reports platform, which lets anyone draw an area of interest and immediately pull curated summaries or raw data exports from the full national dataset — all powered by model runs that Cybercastor completed.
Status map showing completed Riverscapes Metric Engine runs across CONUS, generated by Cybercastor.
The most recent national run, funded by the Bureau of Land Management, processed the full Riverscapes network model waterfall across ~15,000 watersheds. This is the third successive continental-scale model run produced using Cybercastor. Learn more about the 2025 CONUS run →
Why Cybercastor?
Running geospatial models across hundreds or thousands of watersheds is beyond what a local workstation and a manual workflow can handle. The computational demands are too high, the human cost too steep, and the coordination too complex. Cybercastor is built to solve this.
| | Local Install / Codespaces | Cybercastor |
|---|---|---|
| Scale | One watershed at a time | Hundreds running in parallel |
| Speed | Days to weeks per large dataset | Entire regions completed in days |
| Human effort | High — configure each run manually | Low — submit once, monitor a dashboard |
| Infrastructure | You manage your compute | Fully managed cloud infrastructure |
| Data integration | Manual upload/download | Tight integration with the Data Exchange |
| Cost | Your time + compute | Pay only for what you run |
The compute costs of running models through Cybercastor are typically minimal compared to the human labour costs of configuring and managing equivalent runs locally or via GitHub Codespaces.
How It Works
Cybercastor runs as a serverless application on Amazon Web Services (AWS), built and maintained by North Arrow Research Ltd. Here's the basic workflow:
- Inputs from the Data Exchange — Cybercastor pulls upstream project outputs directly from the Riverscapes Data Exchange, eliminating the need to stage input data manually.
- Parameterized job submission — Administrators define jobs using a wizard interface, specifying the model, the set of watersheds (or other units) to run, and any configuration parameters.
- Parallel cloud execution — Jobs are distributed across AWS infrastructure and run in parallel. The same software environment used in GitHub Codespaces is used here, ensuring consistency between development and production runs.
- Outputs back to the Data Exchange — Upon completion, each model's outputs are automatically uploaded to the Data Exchange as a fully compliant Riverscapes project, complete with metadata, provenance tracking, and QAQC records.
- Monitor and manage — A web-based dashboard gives administrators full visibility into in-flight jobs. Individual tasks can be monitored, paused, or restarted as needed.
The Cybercastor administrator dashboard provides real-time visibility into running jobs.
The job creation wizard makes it straightforward to configure and launch new batch runs.
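The parallel fan-out in the workflow above can be illustrated with a small local sketch. This is not Cybercastor's code: `run_model`, `run_batch`, `TaskResult`, and the HUC10 codes are hypothetical names and values chosen for illustration, and a thread pool stands in for the AWS infrastructure. It only shows the general pattern of one task per watershed, with a failed task recorded rather than stopping the batch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class TaskResult:
    huc10: str
    status: str  # "succeeded" or "failed"
    detail: str = ""


def run_model(model: str, huc10: str) -> str:
    """Hypothetical stand-in for one containerized model run.

    A real task would fetch inputs, run the model, and upload the
    resulting project. Here we only validate the watershed code.
    """
    if len(huc10) != 10 or not huc10.isdigit():
        raise ValueError(f"not a 10-digit HUC code: {huc10!r}")
    return f"{model} project for {huc10}"


def run_batch(model: str, hucs: list[str], max_workers: int = 4) -> list[TaskResult]:
    """Run one task per watershed in parallel and collect per-task status."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_model, model, h): h for h in hucs}
        for future in as_completed(futures):
            huc = futures[future]
            try:
                results.append(TaskResult(huc, "succeeded", future.result()))
            except Exception as exc:  # a failed task is recorded, not fatal
                results.append(TaskResult(huc, "failed", str(exc)))
    # Sort so the report order does not depend on completion order.
    return sorted(results, key=lambda r: r.huc10)


if __name__ == "__main__":
    for result in run_batch("vbet", ["1701020501", "1701020502", "bad-id"]):
        print(result.huc10, result.status, result.detail)
```

Recording each task's status separately is what makes it possible to retry only the failed watersheds instead of resubmitting the whole batch.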
Key Features
- Unlimited parallelism — Run hundreds of model instances simultaneously, constrained only by your budget and input data availability.
- Deep Data Exchange integration — Projects flow in and out of the Data Exchange automatically, with proper ownership, metadata, and versioning.
- Provenance and auditability — Every project produced by Cybercastor records how it was generated. The project.rs.xml metadata captures the runner (Cybercastor), model version, run date, and QAQC status, making outputs fully traceable.
- Consistent environments — The Docker-based execution environment is identical to the one used in development Codespaces, so what works in testing works in production.
- Any compliant model — If your model writes Riverscapes-compliant output, it can be run in Cybercastor. The platform is not limited to the Riverscapes network models.
- Dashboard monitoring — Real-time administrative control over every job in flight, with the ability to inspect logs, stop runaway tasks, and retry failed ones.
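To make the provenance feature above more concrete, here is a minimal sketch of reading run metadata out of a project file with Python's standard library. The XML layout is an assumption made for illustration: the element names (`Project`, `MetaData`, `Meta`) and the keys (`Runner`, `ModelVersion`, `DateCreated`, `QAQCStatus`) are hypothetical, as are the sample values, and a real project.rs.xml may be organized differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real project.rs.xml schema may differ.
SAMPLE = """<Project>
  <Name>Example project</Name>
  <MetaData>
    <Meta name="Runner">Cybercastor</Meta>
    <Meta name="ModelVersion">1.2.3</Meta>
    <Meta name="DateCreated">2025-01-15T12:00:00</Meta>
    <Meta name="QAQCStatus">passed</Meta>
  </MetaData>
</Project>"""


def read_provenance(xml_text: str) -> dict[str, str]:
    """Collect name/value pairs from the assumed MetaData/Meta layout."""
    root = ET.fromstring(xml_text)
    return {
        meta.get("name"): (meta.text or "").strip()
        for meta in root.findall("./MetaData/Meta")
        if meta.get("name")
    }


if __name__ == "__main__":
    for key, value in read_provenance(SAMPLE).items():
        print(f"{key}: {value}")
```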
Who Is Cybercastor For?
Organizations Needing Regional or National Coverage
If you need model outputs for an entire state, region, river basin, or country, Cybercastor is the only practical way to do it. The alternative — running models one watershed at a time — would take years of human effort. Cybercastor can complete large-scale runs in days.
Federal and state agencies, watershed councils, and conservation organizations have all benefited from data produced by Cybercastor-powered runs.
Tool Developers Ready to Scale Up
If you have a production-grade Riverscapes-compliant model and want to run it at scale, Cybercastor provides the infrastructure. Rather than building and maintaining your own cloud pipeline, you can plug into an existing, battle-tested platform.
Research Teams Needing Reproducibility
Cybercastor's consistent execution environment and automatic provenance recording make it well-suited for research projects where reproducibility matters. Every run is logged, versioned, and traceable — and the outputs are immediately available to collaborators through the Data Exchange.
Running Your Model in Cybercastor
The primary prerequisite for running a model in Cybercastor is Riverscapes compliance. Specifically, your model must:
- Be containerized and runnable from a command-line interface
- Produce outputs that include a valid project.rs.xml metadata file
- Write outputs to a consistent, predictable folder structure
If your model already meets these requirements, the onboarding process is straightforward. If not, the Riverscapes team can advise on what changes are needed.
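Before getting in touch, a quick local pre-flight check along these lines may help. It is only a sketch: `check_output_folder` is a hypothetical helper, and it verifies just that a project.rs.xml file exists at the top of the output folder and parses as well-formed XML. It does not validate the file against the Riverscapes project schema, so passing it is not proof of compliance.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def check_output_folder(folder) -> list[str]:
    """Return a list of problems found; an empty list means none found."""
    project_file = Path(folder) / "project.rs.xml"
    if not project_file.is_file():
        return ["missing project.rs.xml"]
    try:
        ET.parse(project_file)  # well-formedness only, not schema validity
    except ET.ParseError as exc:
        return [f"project.rs.xml is not well-formed XML: {exc}"]
    return []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(check_output_folder(tmp))  # no project file written yet
        (Path(tmp) / "project.rs.xml").write_text("<Project/>")
        print(check_output_folder(tmp))
```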
Contact support@riverscapes.freshdesk.com to discuss running your models using Cybercastor. We can advise on compliance requirements, pricing, and timelines. Compute costs are typically a fraction of what you'd spend on equivalent human effort.
Cybercastor in the Riverscapes Ecosystem
Cybercastor is the production execution layer that ties together the rest of the Riverscapes platform:
Riverscapes Network Models
The production-grade geospatial models (RSContext, VBET, BRAT, RCAT, Metric Engine, and more) that Cybercastor runs at scale across thousands of watersheds.
Riverscapes Data Exchange
The cloud data warehouse where Cybercastor retrieves input projects and deposits completed model outputs. Over 115,000 projects generated by Cybercastor are available here.
Riverscapes Reports
Extract metrics and generate reports for any area of interest, drawing on 54 million riverscapes produced by Cybercastor-powered CONUS model runs.
For a broader comparison of options for running Riverscapes network models — including local install, GitHub Codespaces, and Cybercastor — see the How to Run Network Models page.
Technical Overview
Cybercastor is built and maintained by North Arrow Research Ltd and runs on AWS. Key architectural details:
- Serverless Node.js application — scales elastically with workload, no fixed infrastructure costs
- AWS-native — uses EC2, Lambda, S3 and related services for execution, storage, and orchestration
- Docker-based model execution — each model runs in an isolated container, ensuring environment consistency
- GraphQL API — the same API that powers the Data Exchange is used for project retrieval and upload
- Automated QAQC — model outputs are checked against quality thresholds before being finalized in the Data Exchange
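As a rough illustration of the Docker-based execution point above, the sketch below assembles a `docker run` command for a single watershed task. It builds the argument list without executing it. The image name, the `/data/output` mount point, the positional arguments, and the `docker_command` helper are all hypothetical; only the `docker run` flags themselves (`--rm`, `-e`, `-v`) are standard Docker CLI options. How Cybercastor actually launches containers is not described in this page.

```python
import shlex


def docker_command(image: str, huc10: str, output_dir: str, env=None) -> list[str]:
    """Build (but do not run) a docker command for one model task."""
    cmd = ["docker", "run", "--rm"]  # --rm removes the container on exit
    for key, value in sorted((env or {}).items()):
        cmd += ["-e", f"{key}={value}"]  # pass configuration as env vars
    # Mount the host output folder at a fixed path inside the container.
    cmd += ["-v", f"{output_dir}:/data/output"]
    # Hypothetical CLI convention: <image> <huc10> <output folder>
    cmd += [image, huc10, "/data/output"]
    return cmd


if __name__ == "__main__":
    cmd = docker_command("example/model:latest", "1701020501", "/tmp/out",
                         env={"LOG_LEVEL": "info"})
    print(shlex.join(cmd))
```

Returning the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.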
