Engineering Blog

WePay's Logging Infrastructure

Introduction: Without logs, all of us would be stumbling in the dark. We would know that something is wrong, but be unable to figure out exactly what. This article is going to talk about how WePay’s logging infrastructure is set...

Choosing an Apprenticeship at WePay

As one of five apprentices in WePay’s pilot apprenticeship program, I have been working for the last six months as an entry-level software engineer. As a nontraditional junior developer with a bootcamp background, I want to share what WePay’s apprenticeship...

An Offline to Online Data Pipeline at WePay

Background: Historically, our approach to fraud detection primarily relied upon the data available in production. This meant that all attributes required for making a decision on whether each payment transaction is legitimate or fraudulent relied solely upon the availability of...

A Purposeful ProtoBuf Repository

In 2018, the Platform team at WePay integrated and released the first version of WePay’s internal gRPC ecosystem and toolchain to the WePay developers for a smooth migration from RESTful APIs to gRPC. Shifting, adopting, and migrating large portions of...

Logging Machine Learning Data at WePay

Our Goal: At WePay, we use machine learning models for risk analysis and fraud detection associated with payments. Each model is optimized for different fraud patterns and checkpoints associated with the timeline of the payment. Across all models, there is...

Effective Software Design Documents

Introduction: At WePay, we have a well-defined process and structure to document software design. This post describes our design process and software design template. Background: Every early-stage software company has one goal: to ship software ASAP. This attitude...

Balancing the Books at Scale

As a payment processing company, reconciliation, or making sure all of our payments add up, is one of WePay’s foremost concerns. We need to know where all the money is, all the time. This is more easily said than done;...

Autoscaling CI/CD on Google Cloud: Part 2

In the last post we drew a picture of what was detailed as a distributed autoscaling system using Google Cloud Platform’s Compute Engine resources, specifically using Google Compute Managed Instance Groups. A few limitations and challenges with the distributed approach...

Autoscaling CI/CD on Google Cloud: Part 1

At WePay we make heavy use of our Continuous Integration/Continuous Delivery (CI/CD) system to provide specialized and automated pipelines to all of our internal development teams, making it easier for the teams to build, test, verify, and ship their software...

Waltz: A Distributed Write-Ahead Log

We are happy to announce the open source release of Waltz. Waltz is a distributed write-ahead log. It was initially designed to be the ledger of money transactions on the WePay system and was generalized for broader use cases of...

Streaming Cassandra at WePay - Part 2

In the first half of this blog post series, we explained our decision-making process of designing a streaming data pipeline for Cassandra at WePay. In this post, we will break down the pipeline into three sections and discuss each of...

Streaming Cassandra at WePay - Part 1

Historically, MySQL had been the de facto database of choice for microservices at WePay. As WePay scaled, the sheer volume of data written into some of our microservice databases required us to make a scaling decision between sharded MySQL (i.e. Vitess)...

Migrating APIs from REST to gRPC at WePay

In the previous posts in our service mesh series, we talked about how we’ve set up our service mesh infrastructure to modernize our microservice and load balancing architecture, and how we ensure the service mesh infrastructure is highly available so...

Preparing for Top-Tier Engineering Interviews

Interviewing as a software engineer in a hypercompetitive environment like Silicon Valley can be stressful or even terrifying. Many technology companies follow a similar format that includes technical interviews over the phone and on the Internet before more thorough...

Highly Available MySQL Clusters at WePay

This post describes WePay’s highly available MySQL architecture, and how we achieve short outage times during failures. WePay uses a variety of relational database management systems (RDBMS) and NoSQL databases. MySQL remains the main RDBMS that completes all critical operations....

A Highly Available Service Mesh at WePay

In the last two posts in this series, Using Linkerd as a Service Mesh Proxy at WePay and Sidecar and DaemonSet: Battle of containerization patterns, we had some fun digging deep into some of the more specific parts of a...

Sidecars and DaemonSets: Battle of containerization patterns

In our recent post, using Linkerd as a service mesh proxy, we kickstarted a series documenting WePay Engineering’s look at technologies and patterns for introducing service mesh and gRPC to our infrastructure. For the second part of the series, we’re...

Improving Airflow UI Security

Airflow is a platform to programmatically author, schedule and monitor workflows (called directed acyclic graphs, or DAGs, in Airflow). When we first adopted Airflow in late 2015, there were very limited security features. This meant that any user who gained access to the...

Using Linkerd as a Service Mesh Proxy at WePay

In the upcoming months, we are going to write a series of posts documenting WePay Engineering’s journey from traditional load balancers to a service mesh on top of Google’s Kubernetes Engine (GKE). In this first part of the series, we...

Configuration Management Framework at WePay

Configuration management is a common problem that needs to be solved by every company dealing with large-scale deployments across many environments. Provisioning environments, deploying applications, and managing infrastructure service(s) are dependent on ensuring the correct configuration is available at...

WeTools: An Elixir Command Line Tool

Within WePay’s tools team, we are always looking for ways to make our engineers more productive. One of the challenging things we are trying to do is distribute a series of automated tasks to our developers, such as a way...

Data Backfilling at WePay

A perennial issue in the payments world is fraud. At WePay, we have our own fraud-detection system to deal with a variety of fraud issues. For example, when a merchant receives a payment of $50 from a credit card, our...

Splitting traffic with SplitIO

At WePay we are constantly evolving our core infrastructure to better meet the needs of our customers, whether that’s broadening the feature set for our end users, or delivering faster transactions for our payment processors. As part of scaling, WePay...

Simple Developer Docs using Jekyll, GitHub Pages, and GitHub Enterprise

I am a fan of simple solutions. In this post, I explain how we used Jekyll and GitHub Pages to solve our need for a new developer docs system. Like most startups, our first developer docs were just another controller...

Streaming databases in realtime with MySQL, Debezium, and Kafka

Change data capture has been around for a while, but some recent developments in technology have given it new life. Notably, using Kafka as a backbone to stream your database data in realtime has become increasingly common. If you’re wondering...

Sensu at WePay

My first project here at WePay was to replace our legacy monitoring system, Check_MK, with something that we can easily configure and scale to meet the needs of our infrastructure and application services. Our monitoring needs have outgrown traditional tools...

Supporting Chip Cards at WePay

You have probably noticed that your credit cards now have little chips in them. You may have already used them at stores by inserting (or “dipping”) them into the card reader instead of swiping. While waiting for the transaction to...

Kafka Avro client

WePay uses Apache Kafka as its real time message broker service to publish and consume realtime events. Messages published to Kafka topics can adhere to a specific schema. To manage the schemas in a centralized location, we use Confluent’s schema-registry,...

Interviewing at WePay - The Why

In my last post, I consumed your valuable screen and brain space discussing what our interviewing process looks like - what we do, and what we are trying to achieve. This post (taking up more of your valuable screen real...

Training machine learning models with Airflow and BigQuery

WePay uses various machine-learning models to detect fraudulent payments and manage risk for payers, merchants and their platforms. The Problem: In a previous blog post, our Data Science team described how we use a Random Forest algorithm to achieve an...

Loading data from Kafka into BigQuery with Kafka Connect

WePay recently released an open-source Kafka-BigQuery Connector on GitHub. We’ve decided to celebrate with a blog post detailing what exactly a Kafka Connector is, how we implemented ours, and why we needed one in the first place. Enjoy! Tracking events...

Building WePay's Webhook delivery system with Google Cloud Pub/Sub

Introduction: Webhooks are user-defined HTTP callbacks. At WePay, we make use of webhooks we call Instant Payment Notifications, or IPNs, to update our partners on the status of transactions happening in our system. IPNs allow our partners to receive notifications...

Software Engineering Interviewing at WePay

What is this post about interviewing doing taking up valuable screen real estate on an engineering blog? Shouldn’t it be filled with discussions of cool new technology, code snippets, or coding how-to guides? Interviewing, like many things, is a problem...

BigQuery at WePay

Our previous posts provided an overview of our data warehouse, and discussed how we use Airflow to schedule our ETL pipeline. In this post, we’ll focus on how we’ve set up BigQuery as the database that powers our data warehouse....

Airflow at WePay

NOTE: We recently gave an Airflow at WePay talk to the Bay Area Airflow meetup group. The video and slides are both available. Our last post provided an overview of WePay’s data warehouse. In this post, we’ll be diving into...

Building WePay's data warehouse using BigQuery and Airflow

Over the coming weeks, we’ll be writing a series of posts describing how we’ve built and run WePay’s data warehouse. This series will cover our usage of Google Cloud Platform, BigQuery, and Apache Airflow (incubating), as well as how we...

How WePay mocks SFTP payment processor backends

In payments, as the volume of transactions increases, the number of ways things can go wrong also increases. This is particularly true at the point where a processor interacts with its bank acquirer. Here, declines can happen for a whole...