The Shift to Local AI: Why We’re Building The Catalyst Without the Cloud
In January 2025, a Chinese AI research group called DeepSeek released a family of open-weight language models that changed the conversation around what’s possible with local, non-cloud AI. DeepSeek reported that the final training run of its V3 base model, the foundation for DeepSeek-R1, used roughly 2,048 Nvidia H800 GPUs over about two months and cost around $5.6 million. That’s a small fraction of the estimated training cost of frontier models like GPT-4, yet R1 matched or exceeded leading reasoning models on key benchmarks.
That single release has already started to reshape AI infrastructure decisions worldwide. Microsoft, one of the largest investors in AI infrastructure, has reportedly begun rethinking its Azure buildout strategy, particularly in the face of advances from China. As we move deeper into 2025, it’s becoming clear: you no longer need a billion-dollar datacenter to run an intelligent system.
At What Comes Next, LLC, we’ve taken that reality to heart. As we build The Catalyst—our privacy-first behavioral coaching platform—we’re proving that advanced AI doesn’t require cloud hosting. Our proof of concept runs on a desktop development machine and a residential server, fully offline and under our control.
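To make that concrete, here’s a minimal sketch of what an offline inference call can look like. It assumes an OpenAI-compatible local server (for example, llama.cpp’s llama-server or Ollama) running on the residential box; the LAN address and model name below are placeholders, not our actual Catalyst configuration.

```python
# Minimal sketch: querying a locally hosted model over the LAN.
# Assumes an OpenAI-compatible server (e.g., llama.cpp's llama-server
# or Ollama) is running on a residential server. The address and model
# name are hypothetical, not our production setup.
import requests

LOCAL_ENDPOINT = "http://192.168.1.50:8080/v1/chat/completions"  # hypothetical LAN address

payload = {
    "model": "deepseek-r1",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Suggest one small habit change for better sleep."}
    ],
    "temperature": 0.7,
}

# The request never leaves the local network: no API keys, no third parties.
response = requests.post(LOCAL_ENDPOINT, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Nothing in that round trip touches the public internet, which is the whole point.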
Why Go Local?
Cloud-based AI comes with serious tradeoffs: bandwidth costs, vendor lock-in, latency, and the ever-present risk of your users’ data being exposed. That’s not a price we’re willing to pay. Local LLMs, especially with the rise of models like DeepSeek, let us build personal, performant, and private systems without compromise.
- Privacy: All user data stays local—no cloud sync, no third-party processors.
- Cost Efficiency: Eliminate monthly hosting bills by running on in-house hardware.
- Performance: Low latency and no internet dependency mean faster, more reliable experiences.
- Autonomy: No API keys, no throttling, no waiting for access.
Why DeepSeek Changed the Game
Beyond cost, DeepSeek-R1 and its distilled variants offer serious power for local deployments. With a context window of up to 128K tokens and throughput that rivals API-based models, DeepSeek’s open weights let developers like us fine-tune, self-host, and iterate fast, all while maintaining full control over the environment.
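As a rough illustration of what open weights buy you, here’s a hedged sketch of pulling one of DeepSeek’s published distilled checkpoints with Hugging Face transformers. The model ID is a real published checkpoint, but the dtype and device settings are assumptions you’d tune to your own hardware.

```python
# Sketch: self-hosting open weights with Hugging Face transformers.
# DeepSeek-R1-Distill-Qwen-7B is one of DeepSeek's published distilled
# checkpoints; dtype and device mapping will vary with your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision to fit consumer GPUs
    device_map="auto",           # spread layers across available devices
)

# Build a prompt with the model's own chat template, then generate.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Why does sleep consistency matter?"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```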
We’ll explore all of that (and more) in our upcoming whitepaper, including:
- Hardware and GPU specs used to host LLMs locally
- Benchmarks across model types and quantization strategies (a first sketch follows this list)
- Network design in non-enterprise environments (LAN segmentation, VPN access, firewall strategy)
- Future-state architecture for post-MVP growth
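As a preview of the quantization material, here’s a sketch of one strategy we’ll benchmark: loading a checkpoint in 4-bit via bitsandbytes through transformers. The model ID and settings are illustrative, not final whitepaper numbers.

```python
# Sketch of one quantization strategy: 4-bit loading via bitsandbytes.
# Settings here are illustrative; the whitepaper will report the actual
# configurations and their measured tradeoffs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # example checkpoint

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # normalized-float 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16, # compute in bf16 for speed/quality
)

# 4-bit weights cut VRAM use roughly 4x versus fp16, at some quality cost,
# which is exactly the tradeoff the benchmarks will quantify.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```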
If you’re a developer, founder, or even just an AI-curious tinkerer who’s felt locked out of this space—this is your invitation in. We’ll be open-sourcing our tools and sharing our lessons learned so others can do the same.
Follow the Build
We’re sharing the journey as we go. Follow What Comes Next on LinkedIn and check out the WCN Blog for whitepaper updates, demo videos, and real-world benchmarks.
Because ethical AI doesn’t start with a slogan—it starts with infrastructure.