We designed a real-time recommendation engine that combines collaborative filtering with a neural matrix factorisation model. The system processes user behaviour signals (clicks, dwell time, cart additions, and purchases) to build dynamic user preference vectors that are updated in near real time.
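To make the idea of a dynamic preference vector concrete, here is a minimal sketch of one way such an update could work. It assumes each item already has an embedding from the factorisation model and nudges the user vector towards that embedding by a step that depends on the signal type. The weights in `SIGNAL_WEIGHTS`, the function names, and the moving-average update itself are illustrative assumptions, not the production implementation.

```python
from typing import Dict, List

# Hypothetical step sizes: stronger intent signals move the vector further.
SIGNAL_WEIGHTS: Dict[str, float] = {
    "click": 0.05,
    "dwell": 0.10,
    "cart_add": 0.30,
    "purchase": 0.50,
}


def update_preference(
    user_vec: List[float], item_vec: List[float], signal: str
) -> List[float]:
    """Move the user vector towards the item vector (exponential moving average)."""
    if len(user_vec) != len(item_vec):
        raise ValueError("user and item vectors must have the same dimension")
    alpha = SIGNAL_WEIGHTS[signal]  # raises KeyError for an unknown signal type
    return [(1.0 - alpha) * u + alpha * i for u, i in zip(user_vec, item_vec)]


def score(user_vec: List[float], item_vec: List[float]) -> float:
    """Matrix-factorisation style relevance score: a plain dot product."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


# Example: a purchase pulls a neutral user halfway towards the item embedding.
user = update_preference([0.0, 0.0], [1.0, 2.0], "purchase")
```

Because each update touches only one user vector and one item embedding, it is cheap enough to run per event, which is one plausible route to near-real-time freshness.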
Model training runs on AWS SageMaker and inference on AWS Lambda, with response times under 50 ms at peak load. A/B testing infrastructure was built in from day one, allowing continuous model iteration without production risk.
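A common building block for this kind of A/B testing is deterministic traffic splitting, so that a given user always sees the same model variant. The sketch below illustrates that general technique with a hash-based bucket. The function name, the experiment key, and the 10% default share are hypothetical and are not taken from the system described here.

```python
import hashlib

BUCKETS = 10_000  # resolution of the split: 0.01% of traffic per bucket


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.1) -> str:
    """Deterministically map a user to 'treatment' or 'control'."""
    if not 0.0 <= treatment_share <= 1.0:
        raise ValueError("treatment_share must be between 0 and 1")
    # Salting with the experiment name keeps splits independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return "treatment" if bucket < treatment_share * BUCKETS else "control"


# Example: the same user always lands in the same arm of a given experiment.
variant = assign_variant("user-123", "ranker-v2")
```

Keeping the assignment stateless means it needs no lookup at inference time, which suits a short-lived function handler; raising `treatment_share` gradually is one way to limit exposure to a new model.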
We also built a content-based fallback for new users (cold start), ensuring every visitor receives a relevant, personalised experience from their first page view.
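The cold-start path can be pictured as a simple branch: use the learned user vector when one exists, otherwise rank items by content similarity to whatever the visitor is looking at. The sketch below is a minimal illustration under that assumption. The `context_vec` input (content features of the first page viewed), the cosine-similarity ranking, and all names are hypothetical.

```python
import math
from typing import Dict, List, Optional


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def recommend(
    user_vec: Optional[List[float]],
    context_vec: List[float],
    item_latent: Dict[str, List[float]],
    item_content: Dict[str, List[float]],
    k: int = 3,
) -> List[str]:
    """Return the top-k item ids, falling back to content similarity for new users."""
    if user_vec is not None:
        # Known user: score with the learned latent factors.
        scores = {item: dot(user_vec, vec) for item, vec in item_latent.items()}
    else:
        # New user (cold start): compare item content features with the current page.
        scores = {item: cosine(context_vec, vec) for item, vec in item_content.items()}
    return sorted(scores, key=lambda item: scores[item], reverse=True)[:k]


# Example: a first-time visitor on a page with content features [1, 0].
content = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
latent = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [1.0, 1.0]}
cold_start_recs = recommend(None, [1.0, 0.0], latent, content, k=2)
```

Once the visitor generates enough behaviour signals to have a preference vector, the same function switches to the collaborative path without any change for the caller.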