Tag: self-hosted
21 discussions across 9 digest issues tagged "self-hosted".
AI Signal - April 28, 2026
-
A self-funded IT infrastructure professional spent 2 months building a local LLM cluster from 4 Mac Mini systems. While the main post is light on technical details, the project demonstrates the growing accessibility of serious local AI infrastructure for individual developers willing to invest in hardware, and it signals a broader trend toward democratized AI compute.
AI Signal - April 14, 2026
- 24/7 Headless AI Server on Xiaomi 12 Pro (Snapdragon 8 Gen 1 + Ollama/Gemma4) r/LocalLLaMA Score: 524
A detailed technical write-up on converting a Xiaomi 12 Pro smartphone into a dedicated local AI inference node: LineageOS flashed for minimal overhead, the Android framework frozen, headless networking via a custom-compiled wpa_supplicant, and custom thermal-management daemons. Gemma4 runs via Ollama in the ~9GB of freed RAM. This is a creative and replicable approach to always-on local AI that doesn't require dedicated server hardware.
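For readers curious what using such a node looks like, here is a minimal sketch of querying it from elsewhere on the LAN, assuming Ollama's default REST API on port 11434; the phone's address and the model tag are placeholders, not details from the post:

```python
# Minimal sketch: query an Ollama instance running on the phone over the LAN.
# Ollama serves a REST API on port 11434 by default; the IP below is a
# placeholder for wherever the phone sits on your network.
import json
import urllib.request

PHONE = "http://192.168.1.50:11434"  # hypothetical address of the phone

payload = json.dumps({
    "model": "gemma",  # placeholder tag; use whatever `ollama list` reports
    "prompt": "Summarize the tradeoffs of phone-hosted inference.",
    "stream": False,   # ask for a single JSON reply instead of a stream
}).encode()

req = urllib.request.Request(
    f"{PHONE}/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```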
-
A hardware upgrade post (2015-era machine to a new high-end GPU) paired with plans for a local-first AI project. Low informational density but notable as a community signal: mainstream engineers who previously wouldn't consider local AI are now investing serious hardware budgets in it. The comment thread likely contains useful configuration advice.
-
A detailed parts list and build log for a dual RTX PRO 6000 workstation: Threadripper PRO 7965WX, WRX90 motherboard, 256GB ECC DDR5, dual 10GbE, IPMI. This represents the high end of consumer/prosumer local AI infrastructure. Useful as a reference for anyone designing a serious multi-GPU inference node, and as a data point on what serious local AI investment looks like in 2026.
-
A community thread inviting members to share their most unconventional home inference setups — featuring oven grills, egg cartons, and improvised cooling solutions. Low-information but high-character. A reminder that local AI is a hands-on, tinkerer culture, and sometimes the best insight comes from how people are actually running things.
AI Signal - March 31, 2026
-
Developer built Phantom, an open-source system giving Claude its own persistent VM with vector memory, a self-evolution engine, and an MCP server. It runs continuously via Slack integration, maintains context across sessions, and autonomously evolves its capabilities. The project demonstrates what happens when AI agents get persistent infrastructure rather than ephemeral sessions.
AI Signal - March 17, 2026
- If you have your OpenClaw working 24/7 using frontier models like Opus, you're easily burning $300 a day. r/AIagents Score: 1101
A stark cost comparison between cloud-based AI agents and local deployments. Running OpenClaw 24/7 with Opus costs ~$300/day (~$110k/year), while the author's setup of 3 Mac Studios and a DGX Spark running local models cost roughly a third of that annual figure as a one-time purchase, stays usable for years, and keeps everything private. It makes a compelling economic and privacy case for local AI infrastructure.
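The arithmetic, as a quick sketch; the local-hardware figure is inferred from the post's "one-third of the yearly cost" claim rather than from an itemized parts list:

```python
# Back-of-the-envelope comparison of cloud agent spend vs local hardware.
CLOUD_PER_DAY = 300                    # reported 24/7 Opus burn rate ($/day)
cloud_per_year = CLOUD_PER_DAY * 365   # ~$109,500, i.e. the ~$110k/yr figure

local_hardware = cloud_per_year / 3    # ~$36.5k one-time, per the post

for years in (1, 2, 3):
    cloud = cloud_per_year * years
    print(f"year {years}: cloud ${cloud:,.0f} vs local ${local_hardware:,.0f}"
          f" -> savings ${cloud - local_hardware:,.0f}")
```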
AI Signal - March 03, 2026
-
Onyx is a self-hostable AI chat platform supporting any LLM, with built-in support for custom agents, knowledge source connections, and hybrid search/retrieval workflows. This sits squarely at the intersection of self-hosted AI and RAG: a production-grade platform, not a toy demo.
AI Signal - February 24, 2026
- I'm now running 3 of the most powerful AI models in the world on my desk, completely privately, for just the cost of power. r/AIagents Score: 2209
Developer running Kimi K2.5 (600GB), MiniMax 2.5 (120GB), Qwen 3.5 (220GB), and GPT-OSS 120B Heretic (60GB) across 3 Mac Studios with 512GB RAM each, using EXO Labs' exo for distributed inference. This demonstrates that frontier-class models are now accessible for completely private, self-hosted deployment at reasonable hardware cost. Running 4 OpenClaw instances enables 24/7 coding, writing, and research workflows without cloud dependencies or rate limits.
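exo advertises an OpenAI-compatible chat-completions endpoint, so consuming a cluster like this can look like the sketch below. The hostname, port, and model id are assumptions, not the author's configuration; check your exo version for the endpoint it actually exposes:

```python
# Sketch: send a chat request to an exo cluster's OpenAI-compatible API.
import json
import urllib.request

# Hypothetical coordinator address; recent exo builds document port 52415,
# but verify against your installation.
EXO_ENDPOINT = "http://mac-studio-1.local:52415/v1/chat/completions"

payload = json.dumps({
    "model": "kimi-k2.5",  # placeholder id; use one your cluster serves
    "messages": [{"role": "user", "content": "Hello from the LAN."}],
}).encode()

req = urllib.request.Request(
    EXO_ENDPOINT,
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
    print(reply["choices"][0]["message"]["content"])
```

Because the schema matches OpenAI's, any existing OpenAI-compatible client can target the cluster by swapping in the local base URL.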
AI Signal - February 17, 2026
- Built a 6-GPU local AI workstation for internal analytics + automation — looking for architectural feedback r/LocalLLM Score: 179
A detailed account of building a $38K 6-GPU local AI workstation running three open models concurrently for internal business analytics and automation. Rare real-world documentation of what a serious on-premise AI infrastructure deployment looks like, including hardware specifics and lessons learned. With 94 comments, the thread drew genuine architectural discussion useful for anyone planning self-hosted AI at scale.
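The post focuses on hardware rather than serving topology, so the following is only a generic sketch of one common pattern for running several models concurrently on a multi-GPU box: pinning each model server to its own GPU subset via CUDA_VISIBLE_DEVICES. The server module, model paths, and ports are hypothetical:

```python
# Generic multi-model serving pattern: one server process per model, each
# restricted to a subset of GPUs through CUDA_VISIBLE_DEVICES.
import os
import subprocess

DEPLOYMENTS = [
    # (model path, GPUs the process may see, port to serve on)
    ("models/analytics-70b", "0,1,2,3", 8001),
    ("models/sql-coder-14b", "4",       8002),
    ("models/embedder-7b",   "5",       8003),
]

procs = []
for model, gpus, port in DEPLOYMENTS:
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)  # isolate GPU subset
    procs.append(subprocess.Popen(
        ["python", "-m", "my_inference_server",  # hypothetical server module
         "--model", model, "--port", str(port)],
        env=env,
    ))

for p in procs:  # keep the launcher alive while the servers run
    p.wait()
```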
AI Signal - January 20, 2026
- 🧠💥 My HomeLab GPU Cluster – 12× RTX 5090, AI / K8s / Self-Hosted Everything r/StableDiffusion Score: 901
An impressive self-hosted GPU cluster featuring 12 RTX 5090s (384GB of VRAM total) across 6 machines running Kubernetes with GPU scheduling. Built for AI/LLM inference, training, image/video generation, and self-hosted APIs: a glimpse into serious local AI infrastructure.
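For the Kubernetes angle, a minimal sketch of GPU scheduling via the official Python client, assuming the NVIDIA device plugin is deployed on the cluster (that is what makes nvidia.com/gpu a schedulable resource); the pod name and image are placeholders:

```python
# Minimal sketch: schedule a pod onto cluster GPUs via the kubernetes client.
# Requires `pip install kubernetes` and a working kubeconfig.
from kubernetes import client, config

config.load_kube_config()  # reads your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-worker"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(
            name="inference",
            image="registry.local/llm-server:latest",  # hypothetical image
            resources=client.V1ResourceRequirements(
                # Claim 2 of a node's 5090s; the device plugin enforces this.
                limits={"nvidia.com/gpu": "2"},
            ),
        )],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```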
-
A detailed build log for a 4x AMD R9700 system (128GB VRAM) funded through a 50% digitalization subsidy in Germany. Built to run 120B+ models locally for data privacy, with comprehensive benchmarks and real-world performance data for local LLM deployment.
-
A practical guide to running Claude Code sessions remotely from a phone using Tailscale (a mesh VPN) and tmux. The setup enables terminal access to home MacBook sessions from anywhere, demonstrating creative mobile workflows for AI-assisted development.
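The core of that workflow fits in a few lines; a sketch, assuming Tailscale's MagicDNS resolves the home machine and a tmux session is already running there (hostname and session name are placeholders):

```python
# Sketch: from any device on the tailnet, attach to the persistent tmux
# session where Claude Code is running on the home machine.
import subprocess

HOST = "home-macbook"  # hypothetical MagicDNS name on the tailnet
SESSION = "claude"     # tmux session kept alive on that machine

# -t forces a TTY so tmux's interactive UI works over SSH.
subprocess.run(["ssh", "-t", HOST, "tmux", "attach", "-t", SESSION])
```

tmux is what makes the session survive disconnects; Tailscale simply provides a stable, encrypted path to reach it.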
-
A sequel build featuring 4x R9700 GPUs (128GB VRAM total) optimized for local LLM deployment. The post includes a detailed upgrade path from a previous MI100 setup, performance benchmarks, and lessons learned; valuable for anyone planning serious local AI infrastructure.
-
A detailed perspective on the shift from cloud to local AI, citing rising subscription costs and over-tuning/censorship as primary motivations. After weeks of testing Llama 3.3, Phi-4, and DeepSeek locally, the author argues that 2026 marks the inflection point for local AI viability.
-
A unique mobile AI workstation in a Thermaltake Core W200 case featuring 10 GPUs (8× RTX 3090 + 2× RTX 5090, 256GB of VRAM), a Threadripper Pro 3995WX, and 512GB DDR4. Built for extra-large MoE models and video generation at ~$17k total cost, with full enclosure and portability.
-
A fun comparison post from someone with both a maxed-out M3 Ultra (512GB) and an ASUS GB10 machine in the same room, asking the community for 24-hour experiment ideas. The discussion explores practical use cases and benchmarks for high-end local AI hardware.
AI Signal - January 02, 2026
-
Practical discussion of GPU procurement in Shenzhen's electronics markets for local AI deployment, including modded cards and domestic alternatives. Provides insight into the global GPU market and alternative sourcing strategies.
- Industry Update: Supermicro Policy on Standalone Motherboards Sales Discontinued r/LocalLLaMA Score: 60
Significant policy change affecting DIY server builders: Supermicro is discontinuing standalone motherboard sales in favor of complete systems only. This constrains options for custom AI infrastructure builds and drives up costs for self-hosting enthusiasts.
- TIL you can allocate 128 GB of unified memory to normal AMD iGPUs on Linux via GTT r/LocalLLaMA Score: 156
Technical discovery enabling AMD integrated GPUs to address large amounts of system RAM (128GB in the title's example) as GTT-backed unified memory on Linux, opening new possibilities for memory-bound AI workloads on consumer hardware. This demonstrates creative solutions for working around VRAM limitations.
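A small observation sketch, assuming the amdgpu driver's standard sysfs memory counters; this reads the current GTT ceiling rather than raising it (the post's method involves kernel boot parameters):

```python
# Sketch: report how much GTT (system RAM addressable by the iGPU) the
# amdgpu driver currently exposes, using its sysfs counters.
from pathlib import Path

card = Path("/sys/class/drm/card0/device")  # adjust the card index as needed

def read_gib(name: str) -> float:
    """Read a byte counter from amdgpu sysfs and convert to GiB."""
    return int((card / name).read_text()) / 2**30

print(f"GTT:  {read_gib('mem_info_gtt_used'):.1f}"
      f" / {read_gib('mem_info_gtt_total'):.1f} GiB in use")
print(f"VRAM: {read_gib('mem_info_vram_total'):.1f} GiB dedicated")
```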
- LLM server gear: a cautionary tale of a $1k EPYC motherboard sale gone wrong on eBay r/LocalLLaMA Score: 192
Detailed account of the challenges of selling high-end server hardware on eBay, including buyer disputes and platform limitations. Important practical advice for anyone in the self-hosting community buying or selling equipment.