Spectre

NY
< 10 employees
< 10 engineers
$500k - $1m funding
Pre-Series A

Spectre aims to be the primary system to discover, exchange, and manage data between organizations. By launching an integrated data platform that facilitates the entire data transaction process from discovery to delivery, Spectre will provide users with a seamless experience for procuring external data and will be well positioned to become the center of the data ecosystem.


Why join us?

  • We are exceptionally passionate about data. Data is the lifeblood of the modern-day organization and, as such, must be used strategically and with careful deliberation. Unfortunately, this is not how things work today.

    As internal data management improves, organizations are turning to new, external sources of data to continue building their data foundation. What rarely gets enough attention is the effort required to find that data and put it to use. Data procurement and exchange is a multi-billion-dollar industry today, yet it remains a largely manual and painstaking process. While tools for setting up automated data pipelines, otherwise known as ETL, exist, they are inadequate at best and require significant operational monitoring to ensure reliability and consistency over time.

    We know how to fix this. Instead of having every data buyer create and monitor data pipelines with each individual vendor, there is tremendous value in adding a distribution layer to facilitate the infrastructure for this ongoing data discovery and exchange. Enter Spectre.

  • We are a team of 5 consisting of UC Berkeley and Columbia University alumni with strong connections in the Silicon Valley and New York technology scenes. We have significant early-stage startup experience as well as an affiliation with Y Combinator.

  • We are advised by world-class operators in the data industry who bring tremendous experience from the field. Hailing from both the business and academic worlds, our team of advisors will be assisting us every step of the way - from early product development to corporate and go-to-market strategy.


Engineering at Spectre

Engineering team and processes

We are currently an engineering team of 3 (founder included), split out primarily along two main themes - the Spectre Portal and the Spectre Platform.

The Spectre Portal is primarily focused on building out features for our website that interact with the underlying data platform and manage accounts and subscriptions. This work is full stack in nature and is similar to the work required of a traditional enterprise SaaS application.

The Spectre Platform consists of all interaction with the underlying data, including data validation, data transfers, and data quality monitoring. This work utilizes a combination of scalable systems and data engineering principles to automate much of the traditionally labor intensive ETL work that most companies deal with today.

As we are a lean team looking to punch above our weight, we have made significant investments in DevOps to streamline our engineering process. Deployments to our cloud development and production environments are fully automated using a combination of GitLab, Docker, and Terraform. Additionally, our local development environment provides a similar experience to our cloud environments, which reduces deployment errors and friction and increases developer productivity. We run unit tests on every build and integration tests on every deploy to relieve ourselves of manual testing.

We have a small product team, including a product manager and designer, that provides the necessary context and designs for new feature requests to the engineering team. From there, engineering discusses implementation details, occasionally creating a design document if deemed appropriate, before commencing development. We use Notion to track all progress across the product and engineering teams.

Technical Challenges

We are building an enterprise-grade data exchange that must scale to thousands of datasets and companies. From this come many challenges.

Data must be transferred in a secure and timely fashion to multiple destinations upon an update from a data vendor. Metadata must be searchable in a scalable fashion, accounting for various security constraints. Data quality must be monitored across all datasets to provide high quality assurance guarantees. Administrative workflows must be integrated into the platform to facilitate data product evaluation and subscriptions.
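
As a rough illustration of the data quality monitoring described above, a minimal check over an incoming dataset update might look like the sketch below. The function name, the thresholds, and the list-of-dicts row format are all hypothetical and are not taken from the Spectre Platform itself.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence


@dataclass
class QualityReport:
    row_count: int
    null_rates: dict[str, float]  # column name -> fraction of rows missing a value
    failures: list[str]           # human-readable descriptions of failed checks


def check_quality(
    rows: Sequence[Mapping[str, Any]],
    required_columns: Sequence[str],
    min_rows: int = 1,
    max_null_rate: float = 0.1,
) -> QualityReport:
    """Run two simple checks: a minimum row count and a per-column null rate."""
    failures: list[str] = []
    row_count = len(rows)
    if row_count < min_rows:
        failures.append(f"expected at least {min_rows} rows, got {row_count}")

    null_rates: dict[str, float] = {}
    for column in required_columns:
        # A value counts as missing if the key is absent or explicitly None.
        missing = sum(1 for row in rows if row.get(column) is None)
        rate = missing / row_count if row_count else 1.0
        null_rates[column] = rate
        if rate > max_null_rate:
            failures.append(
                f"column {column!r} null rate {rate:.2f} exceeds {max_null_rate:.2f}"
            )

    return QualityReport(row_count, null_rates, failures)
```

A real monitor would likely track these metrics over time and alert on drift. This sketch only shows the shape of a single check.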

We take pride in our ability to solve these problems so that our customers do not need to worry about them.

Projects you might work on
  • You would get to work on our data transfer architecture that handles copying data between cloud accounts and data systems. Various optimizations including caching, polling frequency, and error monitoring are in scope. [Spectre Platform]

  • You would develop our keyword and thematic search for scalable data exploration. This entails synchronizing our metadata catalog with a search index in Elasticsearch or Algolia and creating a Query API for retrieving results. This also includes developing the front end components for search on our web application. [Spectre Portal]

  • You would build our consumer Python API from scratch. This includes creating API keys from Auth0 for authentication along with interaction with our various REST APIs to retrieve and store data. [Spectre Platform]
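
To give a flavor of the polling-frequency and error-monitoring work mentioned in the data transfer project above, here is a minimal sketch of polling a vendor for an update with exponential backoff. The `check` callable, the interval values, and the function name are hypothetical stand-ins, not part of any actual Spectre API.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def poll_for_update(
    check: Callable[[], Optional[str]],
    base_interval: float = 1.0,
    max_interval: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call check() until it returns a new version, doubling the wait each time.

    check() is assumed to return a version string when an update is available
    and None otherwise. Returns None if no update appears within max_attempts.
    """
    interval = base_interval
    for attempt in range(1, max_attempts + 1):
        try:
            version = check()
        except Exception as exc:
            # Log the failure and treat it as "no update yet" so polling continues.
            logger.warning("attempt %d: check failed: %s", attempt, exc)
            version = None
        if version is not None:
            return version
        if attempt < max_attempts:
            sleep(interval)
            interval = min(interval * 2, max_interval)
    return None
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. A production version would more likely emit metrics than log lines.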

Tech stack
Python
React
AWS
PostgreSQL
Docker
Google Cloud Platform
GitLab
