EP66: Comparison of URL, URI, and URN

This week’s system design refresher:

OAuth 2 explained in simple terms (Youtube video)
Comparison of URL, URI, and URN
Data Warehouse vs Data Lake
Twitter 1.0 Tech Stack
Sponsor us

OAuth 2 Explained in Simple Terms

URL, URI, URN - Do you know the differences?

The diagram below shows a comparison of URL, URI, and URN.

URI stands for Uniform Resource Identifier. It identifies a logical or physical resource on the web. URL and URN are subtypes of URI. URL locates a resource, while URN names a resource.

A URI is composed of the following parts:

scheme:[//authority]path[?query][#fragment]

URL stands for Uniform Resource Locator, the key concept of HTTP. It is the address of a unique resource on the web. It can be used with other protocols like FTP and JDBC.

URN stands for Uniform Resource Name. It uses the urn scheme. URNs cannot be used to locate a resource. A simple example given in the diagram is composed of a namespace and a namespace-specific string.

If you would like to learn more detail on the subject, I would recommend W3C’s clarification.

What are the differences between a data warehouse and a data lake?

The diagram below shows their comparison.

A data warehouse processes structured data, while a data lake processes structured, semi-structured, unstructured, and raw binary data.
A data warehouse leverages a database to store layers of structured data, which can be expensive. A data lake stores data in low-cost devices.
A data warehouse performs Extract-Transform-Load (ETL) on data. A data lake performs Extract-Load-Transform (ELT).
A data warehouse is schema-on-write, which means the data is already prepared when written into the data warehouse. A data lake is schema-on-read, so the data is stored as it is. The data can then be transformed and stored in a data warehouse for consumption.

Over to you: Do you use a data warehouse or a data lake to retrieve data?

Twitter 1.0 Tech Stack

This post is based on research from many Twitter engineering blogs and open-source projects. If you come across any inaccuracies, please feel free to inform us.

Mobile: Swift, Kotlin, PWA

Web: JS, React, Redux

Services: Mesos, Finagle

Caching: Pelikan Cache, Redis

Databases: Manhattan, MySQL, PostgreSQL, FlockDB, MetricsDB

Message queues: Kafka, Kestrel

Data processing: Heron, Flume, Tableau, SummingBird, Scalding

Data storage: Hadoop, blob store

Data centers: Twitter data center, AWS, Google Cloud

Tools: Puppet, Audubon, Wilson

SPONSOR US

📈 Feature your product in the biggest technology newsletter on Substack

ByteByteGo is the biggest technology newsletter on Substack with over 450,000 readers working at companies like Apple, Meta, Amazon, Google, etc. They have the influence and autonomy to make large purchase decisions. Learn more today.