Nathan played a pivotal role in the development of the backend for our project. His expertise and reliability were instrumental in ensuring the project’s success. Having worked with Nathan personally for several years, I can confidently say that his knowledge and dedication are unparalleled.

Nathan’s attention to detail and proactive approach ensured that our backend infrastructure was robust, efficient, and scalable. His deep understanding of Elixir and Elasticsearch allowed him to anticipate potential issues and address them before they became problems.

I highly recommend Nathan for any web development projects. His technical acumen, reliability, and attentiveness make him a standout developer and a trusted colleague.

Jack McLaughlin - Founder - The Car Comparison

Project Overview

The Car Comparison is a startup that aggregates used car listings from multiple major used car marketplaces, so customers can search, filter, and compare cars in one place.

Having already built a working prototype, The Car Comparison approached me to develop a production-ready back-end for their car comparison website, and to develop additional features. The system needed to stay up-to-date with data sources, integrate seamlessly with their front-end application, and return search results quickly.

Key Challenges

  • Multiple Varied Data Sources: Each provider exposed its data differently, and each source came with its own challenges and limitations.
  • Performance: We needed to index hundreds of thousands of car listings at a time, and search had to stay fast at that volume.
  • Listing Uniqueness: Cars could be listed on multiple marketplaces, and not all of them shared the car’s number plate. We needed a way to avoid duplicating car listings.

Solution

  • Maintain an Index of Available Cars: Scheduled worker processes fetched car data from each source using configurable ‘adapters’ to handle different connection protocols, data formats, and schemas. As listing removal was not explicitly communicated, we used a listing’s repeated presence as a heartbeat. If a listing hadn’t been seen for some time, we removed it from our index.
  • Performance with Large Datasets Under Load: We used an Elasticsearch cluster and integrated Elixir with the Elasticsearch API to handle fast searching and filtering of large numbers of car listings. The Phoenix Framework was used for a fast and scalable RESTful API.
  • De-duplication of Car Listings: We used deterministic fingerprinting of car listings for repeatable and accurate deduplication. While having the car’s number plate from all marketplaces would be ideal, our fingerprinting method proved to be sufficiently effective.
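To illustrate how the heartbeat and fingerprinting ideas above fit together, here is a minimal in-memory sketch. The production system was written in Elixir and backed by Elasticsearch; this Python version is only an illustration. The fingerprint fields (make, model, year, mileage, location), the `ListingIndex` class, and the two-day expiry window are all assumptions made for the example, not the actual fingerprint inputs or settings.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timedelta


def fingerprint(listing: dict) -> str:
    """Return a deterministic fingerprint for a listing.

    The fields used here are hypothetical. The point is that the same car,
    described slightly differently by two marketplaces, normalizes to the
    same string and therefore to the same hash.
    """
    parts = [
        str(listing.get("make", "")).strip().lower(),
        str(listing.get("model", "")).strip().lower(),
        str(listing.get("year", "")),
        str(listing.get("mileage", "")),
        str(listing.get("location", "")).replace(" ", "").lower(),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


@dataclass
class ListingIndex:
    """Toy index keyed by fingerprint, with a last-seen time per entry."""

    max_age: timedelta
    entries: dict = field(default_factory=dict)  # fingerprint -> (listing, last_seen)

    def observe(self, listing: dict, seen_at: datetime) -> None:
        # Seeing a listing again refreshes its heartbeat; a duplicate from
        # another marketplace overwrites the same key instead of adding a row.
        self.entries[fingerprint(listing)] = (listing, seen_at)

    def prune(self, now: datetime) -> int:
        # Drop every listing whose last heartbeat is older than max_age.
        stale = [
            fp
            for fp, (_, last_seen) in self.entries.items()
            if now - last_seen > self.max_age
        ]
        for fp in stale:
            del self.entries[fp]
        return len(stale)


index = ListingIndex(max_age=timedelta(days=2))
start = datetime(2024, 1, 1)
index.observe({"make": "Ford", "model": "Focus", "year": 2018,
               "mileage": 42000, "location": "AB1 2CD"}, start)
# Same car from a second source, formatted differently, seen a day later.
index.observe({"make": "ford ", "model": "focus", "year": 2018,
               "mileage": 42000, "location": "ab12cd"}, start + timedelta(days=1))
removed = index.prune(start + timedelta(days=2, hours=1))
```

Because the fingerprint is the index key, deduplication and heartbeat refresh happen in the same write. A real fingerprint would need more careful normalization than this sketch shows.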

Technical Implementation

  • Self-hosted Elasticsearch Cluster: Docker and Docker Swarm simplified the task of managing an Elasticsearch cluster over multiple servers. This approach was cost-effective, scalable, and highly performant for indexing car listings, service logs, and heartbeat data.
  • Kibana for Visualizing System Performance: In a distributed system, a centralized place to monitor each component’s performance is crucial. Kibana gave us that view of each service’s load.
  • Phoenix Framework: We used Phoenix for the RESTful API, alongside Oban for the background workers. Elixir, and consequently Phoenix, is well-suited to handling concurrency.
  • PostgreSQL: Used as the primary data store.
  • Infrastructure: The system was deployed as Docker containers, using Docker Swarm for container orchestration across multiple servers. An Nginx reverse proxy sat in front of the other containers.
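As a rough illustration of the kind of search request the API could send to the Elasticsearch cluster, the sketch below builds a query body using the standard Elasticsearch query DSL (`bool` with `filter`, `term`, and `range` clauses). The field names (`make`, `price`, `year`), the sort order, and the `build_search_body` helper are assumptions for the example; the real index mapping is not described here, and the production code was written in Elixir rather than Python.

```python
from typing import Optional


def build_search_body(
    make: Optional[str] = None,
    max_price: Optional[int] = None,
    min_year: Optional[int] = None,
    page: int = 0,
    page_size: int = 20,
) -> dict:
    """Build an Elasticsearch search body from optional user filters."""
    filters = []
    if make is not None:
        # Assumes `make` is mapped as a keyword field, so `term` matches exactly.
        filters.append({"term": {"make": make}})
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})
    if min_year is not None:
        filters.append({"range": {"year": {"gte": min_year}}})

    # Filter clauses do not affect scoring and can be cached by Elasticsearch.
    query = {"bool": {"filter": filters}} if filters else {"match_all": {}}
    return {
        "from": page * page_size,
        "size": page_size,
        "query": query,
        "sort": [{"price": {"order": "asc"}}],
    }


body = build_search_body(make="ford", max_price=10000, page=2)
```

The resulting dictionary can be serialized to JSON and posted to an index's `_search` endpoint. Keeping the user's choices in `filter` context rather than `must` suits this kind of yes/no filtering, where relevance scoring is not needed.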

Results

The result is a performant, scalable back-end system that can handle a growing number of listings. The configurable per-marketplace adapter pattern has already allowed us to add new marketplaces quickly and will facilitate future additions. The self-hosted Elasticsearch cluster is extremely performant and can be scaled up to handle an increased number of listings as needed. Kibana provides clear visualization of each service’s load, helping us anticipate scaling needs. The Phoenix API is efficient and provides an excellent platform for adding new features as the business grows.
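The per-marketplace adapter pattern can be sketched roughly as follows. Each adapter turns one source's raw payload into a common listing shape, so adding a marketplace means writing one new function and registering it. The marketplace names, payload formats, and field names below are hypothetical, and the real adapters were written in Elixir; this Python version only illustrates the shape of the idea.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterator, List

Listing = Dict[str, object]
Adapter = Callable[[str], Iterator[Listing]]


def json_feed_adapter(payload: str) -> Iterator[Listing]:
    """Hypothetical marketplace A: a JSON array with its own field names."""
    for item in json.loads(payload):
        yield {
            "source": "marketplace_a",
            "make": item["manufacturer"],
            "model": item["model"],
            "price": int(item["asking_price"]),
        }


def csv_feed_adapter(payload: str) -> Iterator[Listing]:
    """Hypothetical marketplace B: a CSV export with capitalized headers."""
    for row in csv.DictReader(io.StringIO(payload)):
        yield {
            "source": "marketplace_b",
            "make": row["Make"],
            "model": row["Model"],
            "price": int(float(row["Price"])),
        }


# Registry of adapters; a new marketplace is one more entry here.
ADAPTERS: Dict[str, Adapter] = {
    "marketplace_a": json_feed_adapter,
    "marketplace_b": csv_feed_adapter,
}


def ingest(source: str, payload: str) -> List[Listing]:
    """Normalize a raw payload using the adapter registered for its source."""
    return list(ADAPTERS[source](payload))


from_a = ingest(
    "marketplace_a",
    '[{"manufacturer": "Ford", "model": "Focus", "asking_price": "8500"}]',
)
from_b = ingest("marketplace_b", "Make,Model,Price\nFord,Focus,8500.00\n")
```

Everything downstream of `ingest` sees the same keys regardless of where a listing came from.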

Thoughts and Feelings

Architecting and building complex systems is exactly the kind of work I enjoy. There were several fresh coding problems to solve, and a chance to gain further experience with Elasticsearch. It was a pleasure to build this system, and I am proud of the finished result.