
Cost-effective deployment of vision-language models for pet behavior detection on AWS Inferentia2

aws.amazon.com·May 6, 2026

Tomofun, the company behind the Furbo Pet Camera, cut its deployment costs by 83% by migrating its pet behavior detection model from GPU-based instances to AWS Inferentia2-powered EC2 Inf2 instances, while maintaining the throughput needed for real-time inference. The migration left the core model logic unchanged: lightweight wrapper classes preserved the existing serving interface while the backend moved to Inferentia2, letting Tomofun scale the service on the new hardware's cost efficiency and performance.
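The article does not publish Tomofun's code, but the "lightweight wrapper class" approach it describes can be sketched as follows. This is a hypothetical illustration: `BehaviorDetector`, `NeuronWrappedDetector`, and `predict` are names invented here, and in a real migration the compiled callable would typically come from the AWS Neuron SDK (e.g. tracing a PyTorch model with `torch_neuronx`), not the plain lambda used below.

```python
class BehaviorDetector:
    """Hypothetical interface the rest of the serving stack depends on.

    The point of the wrapper pattern is that callers only ever see
    predict(), so the inference backend can change underneath them.
    """

    def predict(self, frame):
        raise NotImplementedError


class NeuronWrappedDetector(BehaviorDetector):
    """Wraps a compiled model behind the unchanged predict() interface.

    In an actual Inferentia2 migration, `compiled_model` would be the
    output of compiling the original model with the Neuron SDK; here it
    is any callable, so the pattern can be shown without Neuron hardware.
    """

    def __init__(self, compiled_model):
        self._model = compiled_model

    def predict(self, frame):
        # Delegate to the compiled backend; callers are unaffected.
        return self._model(frame)


# Swapping backends then requires no changes to serving code:
detector = NeuronWrappedDetector(lambda frame: frame * 2)  # stand-in model
result = detector.predict(3)
```

The design choice this illustrates is interface stability: because the wrapper keeps the original call signature, the GPU-to-Inferentia2 switch becomes a deployment detail rather than an application rewrite.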

The key takeaway for enterprise AI and SaaS teams: purpose-built AI chips like Inferentia2 can deliver large cost reductions for real-time vision-language model inference without compromising performance. For your own projects, consider assessing the cost-effectiveness and scalability of specialized AI infrastructure for large-scale inference workloads.
