EP102: Encoding vs Encryption vs Tokenization

This week’s system design refresher:

  1. Caching Pitfalls Every Developer Should Know

  2. Encoding vs Encryption vs Tokenization

  3. Kubernetes Tools Stack Wheel

  4. Fixing Bugs Automatically at Meta Scale

  5. The One-Line Change That Reduced Pinterest’s Clone Times by 99%

Register for POST/CON 24 | April 30 - May 1 (Sponsored)

Postman’s annual user conference will be one of 2024’s top developer events and an unforgettable experience! Join the wider API community in San Francisco and be the first to learn about the latest Postman product advancements, elevate your skills in hands-on workshops with Postman experts, and hear from industry leaders.

See the full agenda and register now to get a 30% Early Adopter discount!

Bonus: Did we mention there's an awesome after-party with a special celebrity guest?

Register Now


Caching Pitfalls Every Developer Should Know


Encoding vs Encryption vs Tokenization

Encoding, encryption, and tokenization are three distinct processes that handle data in different ways for various purposes, including data transmission, security, and compliance.

Encoding converts data into a different format using a publicly known scheme (such as Base64 or URL encoding) so it can be transmitted or stored reliably; it is trivially reversible and provides no security. Encryption converts data into ciphertext that can only be reversed with the correct key, protecting confidentiality. Tokenization replaces sensitive data with a randomly generated token and keeps the real value in a secure vault, a common approach for meeting compliance requirements such as PCI DSS.

In system design, we need to select the right approach for handling sensitive information.
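To make the contrast concrete, here is a minimal Python sketch of all three. It is illustrative only: the AES-GCM part assumes the third-party cryptography package is installed, and the in-memory dict standing in for a token vault is a deliberate simplification.

    import base64
    import secrets

    # Third-party dependency, assumed installed: pip install cryptography
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    card = b"4111 1111 1111 1111"

    # Encoding: a public, keyless transformation; anyone can reverse it.
    encoded = base64.b64encode(card).decode()
    decoded = base64.b64decode(encoded)  # no secret needed

    # Encryption: reversible only with the key.
    key = AESGCM.generate_key(bit_length=256)
    nonce = secrets.token_bytes(12)  # never reuse a nonce with the same key
    ciphertext = AESGCM(key).encrypt(nonce, card, None)
    plaintext = AESGCM(key).decrypt(nonce, ciphertext, None)  # needs the key

    # Tokenization: substitution through a lookup table (the "vault").
    # The token has no mathematical relationship to the original value.
    vault: dict[str, bytes] = {}  # a real vault is a hardened, audited service

    def tokenize(value: bytes) -> str:
        token = "tok_" + secrets.token_hex(8)
        vault[token] = value  # only the vault can map the token back
        return token

    token = tokenize(card)
    original = vault[token]  # detokenization requires vault access

The key property of each: decoding needs no secret, decryption needs the key, and detokenization needs access to the vault.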


Latest articles

If you’re not a paid subscriber, here’s what you missed this month.

  1. The Top 3 Resume Mistakes Costing You the Job

  2. How Video Recommendations Work - Part 1

  3. How to Design a Good API?

  4. How do We Design for High Availability?

  5. Good Code vs. Bad Code

To receive all the full articles and support ByteByteGo, consider subscribing:

Subscribe now


Kubernetes Tools Stack Wheel

Kubernetes tools continually evolve, offering enhanced capabilities and simplifying container orchestration. The sheer number of tools speaks to the vastness of this dynamic ecosystem, which caters to diverse needs in the world of containerization.


In fact, just getting to know the existing tools can be a significant endeavor. With new tools and updates introduced regularly, staying informed about their features, compatibility, and best practices is essential for Kubernetes practitioners, so they can make informed decisions and adapt effectively to the ever-changing landscape.

This tool stack streamlines the decision-making process and keeps pace with that evolution, ultimately helping you choose the right combination of tools for your use cases.

Over to you: I am sure a few awesome tools are missing here. Which one would you add?


Fixing bugs automatically at Meta Scale

Wouldn’t it be nice if a system could automatically detect and fix bugs for us?

Meta released a paper about how they automated end-to-end bug repair at Facebook scale. Let’s take a closer look.


The goal of a tool called SapFix is to simplify debugging by automatically generating fixes for specific issues.


Here’s how SapFix actually works:

  1. Developers submit changes for review using Phabricator (Facebook’s code review system)

  2. SapFix selects appropriate test cases from Sapienz (Facebook’s automated test case design system) and executes them on the Diff submitted for review

  3. When SapFix detects a crash caused by the Diff, it tries to generate potential fixes. There are four types of fixes: template, mutation, full revert, and partial revert.

  4. To generate a fix, SapFix runs tests on the patched builds and checks which candidates work. Think of it like solving a puzzle by trying out different pieces (see the sketch after this list).

  5. Once the patches are tested, SapFix selects a candidate patch and sends it to a human reviewer for review through Phabricator.

  6. The primary reviewer is the developer who authored the change that caused the crash and therefore often has the best technical context. Other engineers are also subscribed to the proposed Diff.

  7. The developer can accept the patch proposed by SapFix, or reject the fix and discard it.
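Steps 3 and 4 amount to a generate-and-validate loop. Below is a hypothetical Python sketch of that loop; the function names, the string-based patch kinds, and the toy test runner are invented for illustration and are not code from the paper.

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Patch:
        kind: str   # "template" | "mutation" | "partial_revert" | "full_revert"
        diff: str

    def generate_candidates(crash_signature: str) -> list[Patch]:
        # Candidates ordered from most targeted to most drastic, so a
        # revert serves as the fallback when no targeted fix works.
        return [
            Patch("template", f"add null check near {crash_signature}"),
            Patch("mutation", f"mutate the conditional near {crash_signature}"),
            Patch("partial_revert", "revert the hunk that introduced the crash"),
            Patch("full_revert", "revert the entire Diff"),
        ]

    def select_patch(crash_signature: str,
                     passes_tests: Callable[[Patch], bool]) -> Optional[Patch]:
        """Return the first candidate whose patched build passes the
        crash-reproducing tests, or None if no candidate works."""
        for patch in generate_candidates(crash_signature):
            if passes_tests(patch):   # build with the patch, re-run the tests
                return patch          # proposed to a human reviewer in Phabricator
        return None

    # Example: pretend only the partial revert makes the tests pass.
    chosen = select_patch("NullPointerException in FeedLoader",
                          lambda p: p.kind == "partial_revert")
    print(chosen)

The design point this sketch captures is that candidate fixes are cheap to generate but expensive to validate, so each one is only promoted to human review after passing the crash-reproducing tests.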



The one-line change that reduced Pinterest’s clone times by a whopping 99%

While it may sound cliché, small changes can definitely create a big impact.


The Engineering Productivity team at Pinterest witnessed this first-hand.

They made a small change in the Jenkins build pipeline of their monorepo codebase called Pinboard.

And it brought down clone times from 40 minutes to a staggering 30 seconds.

For reference, Pinboard is the oldest and largest monorepo at Pinterest.

Cloning monorepos with a lot of code and history is time-consuming. This was exactly what was happening with Pinboard.

The build pipeline (written in Groovy) started with a “Checkout” stage where the repository was cloned for the build and test steps.

The clone options were set to shallow clone, no fetching of tags, and fetching only the last 50 commits.

But it missed a vital optimization.

The Checkout step didn’t use the Git refspec option.

This meant that Git fell back to the default fetch refspec, which matches every branch. For the Pinboard monorepo, that meant fetching more than 2,500 branches on every build.

So, what was the fix?

The team simply added the refspec option and specified the only ref they cared about: the master branch.

This single change allowed Git clone to deal with only one branch and significantly reduced the overall build time of the monorepo.
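In plain git terms, the before and after look roughly like this. This is an illustrative sketch driven from Python (the actual change lived in the Groovy checkout configuration), with option values mirroring the ones described above:

    import subprocess

    def git_fetch(args: list[str]) -> None:
        # Run a git fetch with the given options; raises on failure.
        subprocess.run(["git", "fetch"] + args, check=True)

    # Before: shallow, no tags, last 50 commits. But with no explicit
    # refspec, git uses the default (+refs/heads/*:refs/remotes/origin/*),
    # which matches every one of Pinboard's 2,500+ branches.
    git_fetch(["--depth", "50", "--no-tags", "origin"])

    # After: the one-line fix pins the refspec to the single branch the
    # build cares about, so only master's history is transferred.
    git_fetch(["--depth", "50", "--no-tags", "origin",
               "+refs/heads/master:refs/remotes/origin/master"])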



SPONSOR US

Get your product in front of more than 500,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].