Tag: image-generation
23 discussions across 4 posts tagged "image-generation".
AI Signal - January 27, 2026
- Alibaba's Tongyi-MAI released the Z-Image base model on HuggingFace, with official ComfyUI support merged within hours. The model marks a new generation of open-weight image generation, and the community rapidly integrated it into existing workflows.
- A high-rank LoRA adapter for LTX-Video 2 that substantially improves image-to-video generation quality. It embeds the image directly, without complex workflows, preprocessing, or compression tricks, and addresses reliability issues in the base model's image-to-video capabilities.
- A user tested Flux2 Klein's lighting capabilities by feeding the official prompting guide into an LLM to generate varied benchmark prompts. The takeaway: lighting has the single greatest impact on Klein output quality, and the model responds to photographer-style descriptions rather than generic terms.
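The benchmarking idea above can also be reproduced without an LLM by templating photographer-style lighting descriptors so that only the lighting varies per subject. The descriptor and subject lists below are illustrative assumptions, not taken from the official prompting guide or the original post.

```python
import itertools

# Photographer-style lighting descriptors (illustrative examples only).
LIGHTING = [
    "soft window light from camera left, gentle falloff",
    "hard noon sun overhead, deep crisp shadows",
    "golden-hour backlight with warm rim light",
    "single bare bulb above, moody chiaroscuro",
]

# Fixed subjects, so differences in output come from lighting alone.
SUBJECTS = [
    "portrait of an elderly fisherman",
    "still life of ceramic bowls on a wooden table",
]

def benchmark_prompts(subjects, lighting):
    """Cross every subject with every lighting setup; within each
    subject group, only the lighting description changes."""
    return [f"{s}, {l}" for s, l in itertools.product(subjects, lighting)]

prompts = benchmark_prompts(SUBJECTS, LIGHTING)
print(len(prompts))  # 2 subjects x 4 lighting setups = 8 prompts
```

Running the same seed across each subject group then isolates how much the lighting description alone moves the output.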
- An argument that output quality issues come down to settings, not workflows: good prompts + good settings + high resolution + patience = great output. Lock the seed and run a parameter search over CFG, model shift, and LoRA strength. ComfyUI isn't scary; build incrementally with clean, modular nodes.
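The locked-seed parameter search described above can be sketched as a simple grid sweep. The value ranges here are placeholders, not recommendations from the original post; the point is that the seed stays fixed while CFG, shift, and LoRA strength vary.

```python
import itertools

# Hypothetical search ranges -- tune these to your model.
SEED = 123456
CFG_VALUES = [2.5, 3.5, 4.5]
SHIFT_VALUES = [1.0, 3.0]
LORA_STRENGTHS = [0.6, 0.8, 1.0]

def parameter_grid():
    """Yield every (cfg, shift, lora_strength) combination with the
    seed locked, ready to feed into a ComfyUI API workflow."""
    for cfg, shift, lora in itertools.product(
        CFG_VALUES, SHIFT_VALUES, LORA_STRENGTHS
    ):
        yield {"seed": SEED, "cfg": cfg, "shift": shift, "lora_strength": lora}

runs = list(parameter_grid())
print(len(runs))  # 3 * 2 * 3 = 18 runs, all with the same seed
```

Because the seed never changes, any visible difference between the 18 outputs is attributable to the parameters rather than to sampling noise.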
AI Signal - January 20, 2026
- 🧠💥 My HomeLab GPU Cluster – 12× RTX 5090, AI / K8s / Self-Hosted Everything r/StableDiffusion Score: 901
An impressive self-hosted GPU cluster featuring 12 RTX 5090s (384 GB of VRAM in total) across 6 machines running Kubernetes with GPU scheduling. Built for AI/LLM inference, training, image/video generation, and self-hosted APIs: a glimpse into serious local AI infrastructure.
- LTX-2 video generation running successfully on modest consumer hardware (an RTX 3060 with 12GB of VRAM). The creator produced coherent spy-story scenes with a cyberpunk aesthetic, demonstrating that high-quality video generation is accessible without datacenter GPUs.
- The LTX-2 team released improvements based on community feedback just two weeks after launch. The post highlights rapid iteration, community engagement through configurations and LoRAs shared on Discord and Civitai, and the value of responsive open-source development.
- A technical deep-dive into generating authentic Japanese audio with LTX-2 video generation. The author tests whether the model can produce real Japanese (not gibberish), shares successful workflows, and provides practical guidance for multilingual content generation.
- Flux.2 Klein (Distilled)/ComfyUI - Use "File-Level" prompts to boost quality while maintaining max fidelity r/StableDiffusion Score: 195
A clever prompting technique for Flux 2 Klein: using "file-level" technical prompts (e.g., "sharpen edges," "increase local contrast") instead of descriptive prompts prevents the model from hallucinating new faces when upscaling/restoring old photos.
- A critique comparing Flux2 Klein's text-to-image quality unfavorably to Z Image Turbo, particularly for difficult poses, which result in "body horror almost every time." While Flux2's editing ability is praised, the comparison raises concerns about the distilled model's image generation quality.
- A curated weekly roundup of open-source image and video generation highlights, including the FLUX.2 Klein release, LTX-2 updates, and other multimodal AI developments. A useful digest for staying current without scrolling through everything.
AI Signal - January 06, 2026
- Lightricks released LTX-2, their multimodal model for synchronized audio and video generation, as fully open source with model weights, distilled versions, LoRAs, a modular trainer, and RTX-optimized inference. Runs in 20GB FP4 or 27GB FP8, works on 16GB GPUs, and integrates directly with ComfyUI.
- Prompting GPT to rewrite image prompts using lowest-probability tokens (avoiding clichés and default aesthetics) produces distinctly non-standard visual results. The technique forces the model away from common patterns and into more creative territory.
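The rewriting technique above amounts to a meta-prompt sent to any chat LLM before the image model sees the prompt. The wording of the template below is an assumption for illustration, not the exact instruction from the original post.

```python
# Illustrative rewrite instruction (assumed wording, not the original).
REWRITE_TEMPLATE = (
    "Rewrite the following image prompt. For every descriptive choice, "
    "deliberately pick low-probability wording: avoid cliches, stock "
    "aesthetics, and the first adjective that comes to mind.\n\n"
    "Prompt: {prompt}"
)

def build_rewrite_request(prompt: str) -> str:
    """Fill the template; send the result to a chat LLM of your choice,
    then pass the LLM's rewrite to the image model."""
    return REWRITE_TEMPLATE.format(prompt=prompt)

request = build_rewrite_request("a beautiful sunset over the ocean")
print(request.splitlines()[-1])  # Prompt: a beautiful sunset over the ocean
```

The LLM's rewrite, rather than the user's original phrasing, is what gets sent to the image generator, steering it away from its default aesthetic.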
- A tool that converts photos into playable Game Boy ROMs by generating pixel art and optimizing for Game Boy constraints (4 colors, 256 tiles, 8KB RAM). Output includes an animated character, a scrolling background, music, and sound effects. Open-source Windows tool.
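The 4-color constraint mentioned above can be illustrated with a minimal quantization sketch: bucket each grayscale pixel into one of the Game Boy's four shades. The even 64-value thresholds and the classic DMG green palette here are illustrative assumptions, not the tool's actual algorithm.

```python
# The four shades of the original Game Boy (DMG) green palette,
# darkest to lightest, as RGB tuples.
DMG_PALETTE = [(15, 56, 15), (48, 98, 48), (139, 172, 15), (155, 188, 15)]

def to_gameboy_shade(gray: int) -> tuple:
    """Bucket an 8-bit grayscale value (0-255) into one of 4 shades,
    using even 64-value thresholds (an assumed, naive scheme)."""
    return DMG_PALETTE[min(gray // 64, 3)]

row = [0, 70, 140, 250]
shades = [to_gameboy_shade(g) for g in row]
print(shades[0], shades[-1])  # darkest shade, lightest shade
```

A real converter would also dither and deduplicate 8x8 tiles to stay under the 256-tile limit; this sketch only shows the palette reduction step.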
- During the Venezuela crisis, AI-generated images of a Maduro arrest, crowds, and troops flooded social media before being identified as fake. Demonstrates real-time information warfare using generative AI to shape perception during breaking news.
- A workflow for Wan 2.2 that allows infinite video length with invisible transitions: a 1280x720, 20-second continuous video generated in 340 seconds, fully open source. Represents a significant improvement in coherent long-form video generation.
- The RePose workflow updated to Qwen Edit 2511, competing with AnyPose for pose capture. Includes Lazy Character Sheet and Lazy RePose workflows: community tooling for consistent character control across generations.
AI Signal - January 02, 2026
- SVI 2.0 Pro for Wan 2.2 is amazing, allowing infinite length videos with no visible transitions r/StableDiffusion Score: 1558
A breakthrough in video generation with SVI 2.0 Pro enabling truly continuous video creation at remarkable speed (340 seconds for 20s at 1280x720). This represents a significant leap in local video generation capabilities, making long-form video synthesis practical on consumer hardware with ComfyUI workflows.
- Qwen's latest image generation model release marks a significant improvement in human realism, natural detail rendering, and text accuracy. The model addresses the "AI-generated" look and delivers substantially enhanced quality for human subjects, landscapes, and text rendering compared to the previous version.
- Successful implementation of continuous video generation using Wan 2.2 with seamless transitions, a major milestone for open-source video AI. The workflow demonstrates that professional-quality continuous video is achievable with consumer hardware.
- Successful debugging and optimization of a Deep Convolutional GAN implementation, with community discussion around architecture optimization for resource-constrained training. Shows the continued relevance of classical generative approaches.
- Community-contributed training configurations optimized for 12GB VRAM, making fine-tuning accessible on consumer GPUs. Demonstrates the ongoing effort to democratize AI training through optimization and configuration sharing.
- A major update to popular ComfyUI workflows for Z-Image-Turbo, featuring style selectors and user-friendly interfaces. Represents the maturation of the ComfyUI ecosystem with increasingly polished user experiences.