Building a Tiny URL Service: Part 1 - The Backend Basics
You've probably clicked on a shortened URL from Twitter, a text message, or a QR code. That bit.ly/abc123 is doing something quietly clever: it's storing the mapping between that short code and a much longer URL like https://www.example.com/products/category/subcategory/item?utm_source=twitter&utm_campaign=summer2024.
In this article, we're building exactly that—a service that takes long URLs and gives you short, memorable codes. We'll start simple, then in Part 2 we'll explore what happens when your service needs to handle millions of requests per second.
Why Build This?
URL shortening is more than just making links prettier. It's useful for:
Analytics: You can track when and how often a shortened URL is accessed without modifying the destination site.
Brevity: Long URLs with complex parameters are unwieldy in marketing materials and QR codes; a short code is easier to print, type, and scan.
Branding: Custom short domains (like your own mysite.app/abc123 instead of a generic shortener) reinforce your brand.
For our purposes, though, this is an excellent learning problem. It touches databases, API design, encoding, and basic caching concepts, all critical skills in system design.
Part 1 Requirements
Let's keep this focused. For the first part, we need to implement three core features:
- Create: Accept a long URL and return a short code
- Retrieve: Accept a short code and redirect to the original URL
- Idempotent creation: If the same user submits the same long URL twice, return the same short code
We'll deliberately ignore distributed systems complexity, high-traffic optimization, and custom domain routing. Those are coming in Part 2.
System Design: The Simple Version
Here's how it works:
- User submits a long URL to our service
- We generate a short code (we'll use base62 encoding)
- We store the mapping in a database: short_code → original_url
- When someone visits the short code, we look it up and redirect them
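The magic is in the short code generation. Instead of creating random strings (which requires checking for collisions on every insert), we'll derive the code from a hash of the destination URL and map it onto a base62 alphabet. Because the hash is deterministic, the same URL always yields the same code. The trade-off is that cutting a hash down to a short code does not rule out collisions between different URLs; it only makes them unlikely.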
Base62 uses the characters 0-9, a-z, and A-Z. This is denser than hex (which only uses 0-9 and a-f) and avoids special characters that are awkward in URLs.
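To make the encoding concrete, here is a minimal sketch of positional base62 for a non-negative number. The class name Base62 and the digit order (0-9, then a-z, then A-Z) are assumptions for illustration; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```java
final class Base62 {
    // Digit order is a convention chosen for this sketch: 0-9, a-z, A-Z.
    static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Encodes a non-negative number as a base62 string, most significant digit first. */
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(ALPHABET.charAt((int) (value % 62))); // least significant digit first
            value /= 62;
        }
        return sb.reverse().toString();
    }

    /** Decodes a base62 string back to its number (no overflow check in this sketch). */
    static long decode(String code) {
        long result = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("not a base62 character: " + c);
            }
            result = result * 62 + digit;
        }
        return result;
    }
}
```

With this ordering, Base62.encode(61) is "Z", Base62.encode(62) is "10", and Base62.encode(125) is "21" (2 × 62 + 1).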
The Data Model
We need a simple table to store our URL mappings:
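CREATE TABLE tinyurls (
    code VARCHAR(10) PRIMARY KEY,  -- matches the 10-character codes generated by the service
    original_url VARCHAR NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- PostgreSQL syntax; "UNIQUE KEY name (col)" is MySQL-only
    CONSTRAINT unique_original_url UNIQUE (original_url)
);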
Why this design?
- code: Generated short code.
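- original_url: We store the full URL. Idempotency comes mainly from the deterministic hash: the same URL always produces the same code. The UNIQUE constraint is a safety net that stops one URL from ever being stored under two different codes.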
- created_at: Useful for analytics and debugging.
The Spring Boot Implementation
Let's build this with Spring Boot and PostgreSQL. First, the entity:
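// jakarta.persistence applies to Spring Boot 3; on Spring Boot 2 the package is javax.persistence.
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import java.time.LocalDateTime;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Entity
@Table(name = "tinyurls")
@AllArgsConstructor
@NoArgsConstructor
@Data
public class TinyUrl {
    @Id
    @Column(name = "code", unique = true)
    private String code;

    @Column(name = "original_url")
    private String url;

    // Filled in by the database default, so JPA never writes this column.
    @Column(name = "created_at", insertable = false, updatable = false)
    private LocalDateTime createdAt;
}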
Next, the repository:
@Repository
public interface TinyUrlRepository extends JpaRepository<TinyUrl, String> {
}
Now, the service to handle URL shortening:
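import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import lombok.RequiredArgsConstructor;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class TinyUrlService {
    private static final String ALPHANUMERIC = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final int CODE_LENGTH = 10;

    private final TinyUrlRepository tinyUrlRepository;

    // Public base URL of the service. It should include the /tinyurl path
    // so that the returned link reaches the controller's GET mapping.
    @Value("${application.url}")
    private String baseUrl;

    public String encode(String url) {
        validateUrl(url);
        var generatedCode = generateCode(url);
        var existing = tinyUrlRepository.findById(generatedCode);
        if (existing.isEmpty()) {
            tinyUrlRepository.save(new TinyUrl(generatedCode, url, null));
        } else if (!existing.get().getUrl().equals(url)) {
            // A different URL already owns this code. Calling save() here would
            // silently overwrite that mapping, so fail loudly instead.
            throw new IllegalStateException("Code collision for " + generatedCode);
        }
        return baseUrl + "/" + generatedCode;
    }

    private void validateUrl(String url) {
        if (url == null || url.isBlank()) {
            throw new IllegalArgumentException("URL cannot be null or empty");
        }
        try {
            URI uri = new URI(url);
            if (uri.getScheme() == null) {
                throw new IllegalArgumentException("URL must have a valid scheme (e.g., http, https)");
            }
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URL format: " + e.getMessage());
        }
    }

    /** Returns the original URL for the given code, or null if the code is unknown. */
    public String decode(String code) {
        return tinyUrlRepository.findById(code)
                .map(TinyUrl::getUrl)
                .orElse(null);
    }

    /**
     * Generates a deterministic 10-character alphanumeric code for the given URL.
     * Each of the first 10 bytes of the MD5 hash is mapped onto the 62-character
     * alphabet. The same URL always gives the same code; different URLs can
     * still collide, so the code is not guaranteed to be unique.
     */
    private String generateCode(String url) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder code = new StringBuilder();
            for (int i = 0; i < CODE_LENGTH; i++) {
                // Treat the byte as 0..255, then reduce it to an index into the alphabet.
                int index = Byte.toUnsignedInt(hash[i % hash.length]) % ALPHANUMERIC.length();
                code.append(ALPHANUMERIC.charAt(index));
            }
            return code.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException("MD5 algorithm not found", e);
        }
    }
}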
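Because the service derives each code from the URL's hash alone, it needs some policy for the rare case where two different URLs produce the same code. Below is a minimal sketch of one possible policy: re-hash with a small salt until a free code is found. The class CodeAllocator, its methods, the "#attempt" salt format, and the retry limit of 5 are all hypothetical, and a HashMap stands in for the tinyurls table so the example runs on its own. It also uses positional base62 over the whole digest rather than the per-byte mapping shown in the service.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class CodeAllocator {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final BigInteger BASE = BigInteger.valueOf(62);
    private static final int CODE_LENGTH = 10;
    private static final int MAX_ATTEMPTS = 5; // arbitrary limit for this sketch

    // Stand-in for the tinyurls table: code -> original URL.
    private final Map<String, String> store = new HashMap<>();

    /** Hashes the URL (plus a salt on retries) and keeps 10 base62 digits of the digest. */
    static String codeFor(String url, int attempt) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            String input = attempt == 0 ? url : url + "#" + attempt;
            // Interpret the 16-byte digest as one non-negative integer.
            BigInteger n = new BigInteger(1, md.digest(input.getBytes(StandardCharsets.UTF_8)));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < CODE_LENGTH; i++) {
                BigInteger[] quotientAndRemainder = n.divideAndRemainder(BASE);
                sb.append(ALPHABET.charAt(quotientAndRemainder[1].intValue()));
                n = quotientAndRemainder[0];
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Returns the code for the URL, storing it on first use and retrying on collisions. */
    String shorten(String url) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            String code = codeFor(url, attempt);
            String existing = store.putIfAbsent(code, url);
            if (existing == null || existing.equals(url)) {
                return code; // newly stored, or the same URL submitted again
            }
            // A different URL owns this code: try again with the next salt.
        }
        throw new IllegalStateException("could not allocate a code for " + url);
    }

    /** Returns the original URL, or null if the code is unknown. */
    String resolve(String code) {
        return store.get(code);
    }
}
```

Creation stays idempotent under this policy: a repeated URL walks the same sequence of candidate codes and stops at the one it already owns. Against a real database the check-then-insert would also need to be safe under concurrent requests, which this sketch does not address.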
Finally, the controller to expose the API:
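import java.net.URI;
import java.util.Map;
import lombok.AllArgsConstructor;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@AllArgsConstructor
@RestController
@RequestMapping("/tinyurl")
public class TinyUrlController {
    private final TinyUrlService tinyUrlService;

    @PostMapping
    public ResponseEntity<Map<String, String>> encode(@RequestBody TinyUrlRequest request) {
        var redirectUrl = tinyUrlService.encode(request.getUrl());
        return ResponseEntity.ok(Map.of("url", redirectUrl));
    }

    @GetMapping("/{code}")
    public ResponseEntity<Void> decode(@PathVariable String code) {
        var originalUrl = tinyUrlService.decode(code);
        if (originalUrl == null) {
            // Unknown code: answer 404 instead of failing inside URI.create(null).
            return ResponseEntity.notFound().build();
        }
        return ResponseEntity.status(HttpStatus.FOUND).location(URI.create(originalUrl)).build();
    }
}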
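The controller references a TinyUrlRequest type that this part does not show. A plausible minimal shape, assuming the request body is JSON with a single url field such as {"url": "https://www.example.com/some/long/path"}, could look like the sketch below. The field name is inferred from the call to request.getUrl(); everything else is an assumption.

```java
// Hypothetical request body for POST /tinyurl. In a real project this would
// normally be a public class in its own file.
final class TinyUrlRequest {
    private String url;

    // No-argument constructor so a JSON mapper can instantiate the object.
    public TinyUrlRequest() {
    }

    public TinyUrlRequest(String url) {
        this.url = url;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}
```

Jackson, which Spring Boot uses for JSON by default, can populate a class like this through the no-argument constructor and the setter.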
What We've Built
We've created a working URL shortening service with:
- A clean separation of concerns: controller, service, repository
- Idempotent creation: the same URL always maps to the same code
- A simple, deterministic encoding scheme (collisions between different URLs are unlikely, but not impossible)
- A testable architecture
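One caveat on the encoding scheme is worth quantifying: a 10-character code cut from a hash can, in principle, be shared by two different URLs. The sketch below uses the standard birthday approximation to estimate how likely at least one such collision is. It assumes codes are spread uniformly over all 62^10 possibilities, which the per-byte modulo mapping in the service only approximates, so treat the results as a rough lower bound. The class and method names are hypothetical.

```java
final class CollisionEstimate {
    /** Number of distinct codes for a given alphabet size and code length. */
    static double codeSpace(int alphabetSize, int length) {
        return Math.pow(alphabetSize, length);
    }

    /**
     * Birthday approximation for the chance that at least two of `urls`
     * uniformly random codes coincide: p = 1 - exp(-n(n-1) / (2N)).
     */
    static double probability(double urls, double codeSpace) {
        // expm1 keeps precision when the probability is very small.
        return -Math.expm1(-urls * (urls - 1) / (2.0 * codeSpace));
    }
}
```

Under these assumptions, CollisionEstimate.probability(1e6, CollisionEstimate.codeSpace(62, 10)) comes out at roughly 6 in 10 million, while a billion stored URLs push the estimate to around 45%. That is comfortable for a small service, but it is a reason to detect collisions rather than assume they cannot happen.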
This is enough for small-to-medium traffic. Every redirect is a single primary-key lookup, so one application server and one database can go a long way before you need anything more elaborate.
What's Coming in Part 2
But what happens when your tiny URL service becomes popular? What about when you need to handle millions of requests per second? When you need to shard your database? When you need to generate IDs without a single point of failure?
In Part 2, we'll explore:
- Caching strategies to avoid database hits for every redirect
- Distributed ID generation to avoid relying on a single database for sequence numbers
- Database sharding to split the load across multiple databases
- Read replicas and consistency considerations
- Rate limiting and abuse prevention
- Custom domain support for branded short URLs
- Performance testing to understand your bottlenecks
For now, you have a solid foundation. Deploy this, understand how it works, and then we'll build the infrastructure to scale it to handle the world.