
Architecture

Overview

Arch combines best-in-class open source tools with our custom Next.js web application. It's built for the way engineers work: code lives in your git repositories but is exposed in a web application for ease of operation. All entities are designed to be components within a template, which supports a "build once, instantiate many times" workflow.

Overview Diagram

1. Extract & Load

Arch is built by the team behind Meltano, Meltano Singer SDK, and MeltanoHub. Data movement is one of our core competencies.

Supported Sources

The platform supports best-in-class open source connectors for 600+ sources, maintained by both the Meltano and Airbyte communities. With large communities of contributors behind them, bugs are fixed quickly and new features are added daily.

Although the Arch team is the primary backer of Meltano and the Singer community, we evaluate all available open source connectors and choose the best option for our customers, whether that's Meltano/Singer, Airbyte, or something else.

Custom Sources

Custom sources are also supported through the Meltano Singer SDK. Once a connector has been developed locally, it can be registered and executed by Arch as a custom source. All custom sources are private and owned by you.

See the sources and source configuration docs for more details.
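As a rough illustration of what a custom source produces, the sketch below emits messages in the Singer format (SCHEMA, RECORD, STATE) using only the Python standard library. A real connector would normally be built on the Meltano Singer SDK instead. The stream name, fields, and bookmark layout are assumptions made for illustration, not part of Arch.

```python
import json
from typing import Iterable, Iterator


def singer_messages(stream: str, rows: Iterable[dict], key: str) -> Iterator[str]:
    """Yield Singer-style JSON lines: one SCHEMA, one RECORD per row, then one STATE."""
    # Hypothetical schema for the example rows below.
    schema = {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    }
    yield json.dumps(
        {"type": "SCHEMA", "stream": stream, "schema": schema, "key_properties": [key]}
    )
    last_key = None
    for row in rows:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": row})
        last_key = row[key]
    # The shape of "value" is up to the connector; this bookmark layout is illustrative.
    yield json.dumps({"type": "STATE", "value": {"bookmarks": {stream: {key: last_key}}}})


if __name__ == "__main__":
    rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
    for line in singer_messages("customers", rows, key="id"):
        print(line)
```

Each line is a standalone JSON document, which is what lets a loader consume the output as a stream and resume from the last STATE message.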

2. Data Warehousing & Storage

Arch offers two storage options: Arch's managed Postgres with the Hydra columnar extension, or your own cloud data warehouse. You can also mix and match the two on a per-tenant basis.

Arch Managed Storage

This is the default storage option in Arch. All compute and storage costs are included in the price of the Arch platform. This option comes with many benefits including auto provisioning, isolation, a rich library of Postgres extensions, and cost savings compared to expensive cloud data warehouses.

Why Postgres?

At Arch we have a strong conviction that most data use cases are best served by a data warehouse built on Postgres. This conviction rests on several factors, including the experience of our founder and CEO, Taylor Murphy, who used Postgres as the primary data warehouse for the data team at GitLab for several years.

With the Hydra columnar extension, Postgres gains large performance improvements for heavy analytics workloads. However, many customers don't have data at that scale, and Arch can support them with a standard Postgres instance as well.

Postgres is a trusted, open source, and scalable database. Using Postgres can avoid the exorbitant costs and complexity of other cloud data warehouses while delivering similar performance.

Arch is partnered with Tembo as the database provider for all Postgres instances.

Postgres Extensions

Arch Postgres instances are provisioned with a default set of extensions, and new extensions can be added as needed.

Multi-Tenant Isolation

For multi-tenant use cases, Arch can spin up a new Postgres instance per tenant or share a single instance with database-level and schema-level isolation. If data needs to be blended across tenants, it's usually best to use schema isolation rather than instance-level isolation.
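To make the schema-isolation option concrete, here is a minimal sketch that generates standard Postgres statements for one schema per tenant, plus a query that blends one table across tenants. The `tenant_` prefix, the reader role, and the helper names are hypothetical; this is not how Arch provisions tenants internally.

```python
import re

# Only allow simple lowercase identifiers, since they are interpolated into SQL.
_IDENT = re.compile(r"[a-z][a-z0-9_]*")


def _check(identifier: str) -> str:
    if not _IDENT.fullmatch(identifier):
        raise ValueError(f"unsafe identifier: {identifier!r}")
    return identifier


def schema_name(tenant_id: str) -> str:
    """Map a tenant id to its schema name (hypothetical naming convention)."""
    return f"tenant_{_check(tenant_id)}"


def provision_sql(tenant_id: str) -> list:
    """Statements that create a tenant schema and a read-only role for it."""
    schema = schema_name(tenant_id)
    role = f"{schema}_reader"
    return [
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
        f"CREATE ROLE {role} NOLOGIN;",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]


def blend_sql(tenant_ids: list, table: str) -> str:
    """One query over the same table in every tenant schema, tagged by tenant."""
    table = _check(table)
    selects = [
        f"SELECT '{_check(t)}' AS tenant_id, * FROM {schema_name(t)}.{table}"
        for t in tenant_ids
    ]
    return "\nUNION ALL\n".join(selects) + ";"


if __name__ == "__main__":
    print("\n".join(provision_sql("acme")))
    print(blend_sql(["acme", "globex"], "orders"))
```

Because all tenant schemas live in one database, the blended query is a plain `UNION ALL`. With instance-level isolation, the same blend would require moving data between servers first.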

Bring Your Own Warehouse

In addition to Arch managed storage, you have the option to bring your own data warehouse. You get all the benefits of the Arch multi-tenant platform while still using your existing warehouse of choice. Warehouse compute costs are not included in the Arch platform pricing.

Snowflake is currently supported, and other major data warehouses (e.g. BigQuery) are on the roadmap. Reach out for details on when those will be released.

3. Transformation

Arch uses dbt-core as its SQL transformation layer. dbt has become the transformation tool of choice for data engineers.

You develop your dbt models and register the git repository with Arch. When changes are merged into your repository, they are automatically pulled into Arch. Your dbt transforms are then executed on the platform according to the commands and schedules you provide.

See the transformation docs for more details.
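The pairing of commands and schedules can be pictured with a small sketch like the one below. The `TransformJob` class, its fields, and the `tag:nightly` selector are hypothetical and do not reflect Arch's actual configuration format. Only the dbt commands themselves (`dbt deps`, `dbt build --select ...`) are standard dbt-core CLI usage.

```python
import shlex
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TransformJob:
    """A named set of dbt commands that run on a cron schedule (illustrative only)."""

    name: str
    cron: str  # five-field cron expression, e.g. "0 2 * * *" for 02:00 daily
    commands: Tuple[str, ...]

    def argv(self) -> List[List[str]]:
        """Split each command into an argument list suitable for subprocess.run."""
        return [shlex.split(command) for command in self.commands]


nightly = TransformJob(
    name="nightly_build",
    cron="0 2 * * *",
    commands=("dbt deps", "dbt build --select tag:nightly"),
)

if __name__ == "__main__":
    for args in nightly.argv():
        print(args)
```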

4. Dashboarding

The dashboarding features of Arch serve embedded analytics. Admins can build their charts and dashboards directly in supported BI tools like Metabase or Evidence (reach out to request your preferred BI tool) and expose them in the Arch web app.

Dashboards can be exposed in a static or interactive way, depending on the tool of choice. Static dashboards can still have filters and drill downs but only admins can create, edit, or publish dashboards.

Arch can also be used in a headless manner where an external application consumes the data.

Templates and Components

A core feature of Arch is the ability to build a data platform once and instantiate it many times, once for each of your tenants (clients, companies, etc.). Because tenants sometimes have slightly different needs, you can componentize the building blocks and create each tenant from pre-built components (sources, transformations, schedules, configurations, etc.).
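The "build once, instantiate many times" idea can be sketched as a template holding reusable components, with per-tenant overrides applied at instantiation time. The class names, fields, and the override mechanism below are assumptions for illustration and are not Arch's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Component:
    """A reusable building block, e.g. a source, transformation, or schedule."""

    kind: str
    name: str
    config: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Template:
    """A named collection of components that can be stamped out per tenant."""

    name: str
    components: Tuple[Component, ...]

    def instantiate(self, tenant: str, overrides: Optional[Dict[str, dict]] = None) -> dict:
        """Resolve the template for one tenant; overrides are keyed by component name."""
        overrides = overrides or {}
        resolved = []
        for component in self.components:
            # Merge into a new dict so the template's defaults are never mutated.
            config = {**component.config, **overrides.get(component.name, {})}
            resolved.append({"kind": component.kind, "name": component.name, "config": config})
        return {"tenant": tenant, "template": self.name, "components": resolved}


template = Template(
    name="saas_analytics",
    components=(
        Component("source", "tap-postgres", {"schema": "public"}),
        Component("schedule", "nightly", {"cron": "0 2 * * *"}),
    ),
)

# Two tenants from one template; only "globex" changes the schedule.
acme = template.instantiate("acme")
globex = template.instantiate("globex", {"nightly": {"cron": "0 6 * * *"}})
```

Keeping overrides separate from the template means a change to the template's defaults reaches every tenant that has not explicitly overridden that setting.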