pwshub.com

How to Run the Powerful Flux AI Image Generator on Less Powerful PCs

When Flux burst onto the scene a few days ago, it quickly earned a reputation as the crown jewel of open-source image generators. It matched Midjourney's aesthetic prowess while absolutely crushing it in prompt understanding and in rendering text inside images. The catch? You needed a beefy GPU with 24GB of VRAM or more just to get it running. That's more horsepower than most gaming rigs, let alone your average work laptop.

But the AI community, never one to back down from a challenge, rolled up its collective sleeves and got to work. Through the magic of quantization—a fancy term for compressing the model's data—they've managed to shrink Flux down to a more manageable size without sacrificing too much of its artistic mojo.

Let's break it down: The original Flux model used full 32-bit precision (FP32), which is like driving a Formula 1 car to the grocery store—overkill for most. The first round of optimizations brought us FP16 and FP8 versions, each trading a smidge of accuracy for a big boost in efficiency. The FP8 version was already a game-changer, letting folks with 6GB GPUs (think RTX 2060) join the party.
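To make that trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes Flux has roughly 12 billion parameters and counts only the main model's weights. The text encoders, the VAE, and the memory used during generation are ignored, so real-world VRAM use will be higher than these figures.

```python
# Rough size of the model weights alone at different precisions.
# Assumption: ~12 billion parameters; text encoders, VAE and
# activations are not counted, so real VRAM use is higher.
PARAMS = 12_000_000_000


def weight_size_gb(bits_per_param: int, params: int = PARAMS) -> float:
    """Return the storage needed for the weights, in gigabytes (10^9 bytes)."""
    return params * bits_per_param / 8 / 1e9


for name, bits in [("FP32", 32), ("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{name:>5}: {weight_size_gb(bits):5.1f} GB")
```

Each halving of the bit width halves the weight footprint, which is why every step down in precision opens the door to a smaller class of GPU.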

The Flux Schnell (FP8) runs smoothly on a 6GB RTX 2060 after disabling the shared memory fallback for ComfyUI.
Prompt executed in 107.47 seconds —4 steps, no OOM.
16.86s/it
512x768 Image.
1024x1024 takes considerably longer.
I would recommend a High-res fix or another upscaling…

— jaldps (@jaldpsd) August 5, 2024

To do this, you need to disable the System Memory Fallback option for Stable Diffusion in the Nvidia driver settings. With the fallback off, the driver no longer spills VRAM into shared system memory on its own, and the interface manages offloading from VRAM to system RAM itself. This avoids the infamous OOM (out-of-memory) error, albeit at the cost of it running considerably slower. To disable this option, follow this tutorial by Nvidia.
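While experimenting with these settings, it helps to see how much VRAM is actually in use. The following is a minimal sketch, assuming an Nvidia driver with the nvidia-smi command-line tool available on your PATH. The helper names are hypothetical and are not part of ComfyUI or any other interface.

```python
import subprocess


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one 'total, used' CSV line (values in MiB) into integers."""
    total, used = (int(field.strip()) for field in line.split(","))
    return total, used


def query_gpu_memory() -> list[tuple[int, int]]:
    """Ask nvidia-smi for (total, used) memory of every GPU, in MiB.

    Assumes nvidia-smi is installed; raises an error otherwise.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.total,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_memory_line(l) for l in result.stdout.splitlines() if l.strip()]


# Parsing a sample line of the kind nvidia-smi prints for a 6GB card:
print(parse_memory_line("6144, 5321"))
```

Calling query_gpu_memory() while an image is generating gives a quick read on how close the card is to its limit.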

But hold onto your hats, because it gets even better.

The real MVPs of the AI world have pushed the envelope further, releasing 4-bit quantized models. These bad boys use something called "NormalFloat" (NF) quantization, which delivers a sweet spot of quality and speed that'll make your potato PC feel like it just got a turbo boost. At the same bit width, NF quantization degrades quality less than plain FP quantization, so in general terms, running this model gives great results at high speed while requiring few resources.
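For the curious, here is a simplified sketch of the idea behind this kind of 4-bit quantization: each block of weights is scaled by its largest absolute value, then every weight is snapped to the nearest of 16 levels spaced like the quantiles of a normal distribution. This is an illustration only. It is not the exact NF4 codebook used by the bitsandbytes library (this toy version has no exact zero, for instance), and packing two 4-bit indices per byte is left out.

```python
from statistics import NormalDist


def make_codebook(levels: int = 16) -> list[float]:
    """Build levels at evenly spaced quantiles of N(0, 1), rescaled to [-1, 1]."""
    nd = NormalDist()
    # Offset by 0.5 so we never ask for the 0% or 100% quantile (infinite).
    qs = [nd.inv_cdf((i + 0.5) / levels) for i in range(levels)]
    top = max(abs(q) for q in qs)
    return [q / top for q in qs]


CODEBOOK = make_codebook()


def quantize_block(weights: list[float]) -> tuple[float, list[int]]:
    """Return (scale, indices): one float scale plus a 4-bit index per weight."""
    scale = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    indices = [
        min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - w / scale))
        for w in weights
    ]
    return scale, indices


def dequantize_block(scale: float, indices: list[int]) -> list[float]:
    """Map the stored indices back to approximate weight values."""
    return [CODEBOOK[i] * scale for i in indices]


block = [0.02, -0.5, 0.13, 0.9, -0.07, 0.31, -0.88, 0.0]
scale, idx = quantize_block(block)
restored = dequantize_block(scale, idx)
print(idx)
print([round(v, 3) for v in restored])
```

Because neural network weights tend to cluster around zero, placing more levels near zero and fewer at the extremes wastes less of the tiny 16-value budget than evenly spaced levels would.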

It's almost too good to be true, but it is true.

Results obtained with Flux Dev in fp8 and nf4. Same prompt, different seeds.

How to run Flux on lower-end GPUs

So, how do you actually run this streamlined version of Flux? First, you'll need to grab an interface like SwarmUI, ComfyUI, or Forge. We love ComfyUI for its versatility, but in our tests, Forge gave around a 10-20% speed boost over the others, so that's what we're rolling with here.

Head over to the Forge GitHub repository (https://github.com/lllyasviel/stable-diffusion-webui-forge) and download the one-click installation package. It's open-source and vetted by the community, so no sketchy business here.

For the NF4 Flux models themselves, Civitai is your one-stop shop. You've got two flavors to choose from: Schnell (for speed) and Dev (for quality). Both can be downloaded from this page.

Once you've got everything downloaded, it's installation time:

  1. Unzip the Forge file and open the Forge folder.
  2. Run update.bat to get all the dependencies.
  3. Fire up run.bat to complete the setup.

Now, drop those shiny new Flux models into the \webui\models\Stable-diffusion folder within your Forge installation. Refresh the Forge web interface (or restart if you're feeling old school), and boom—you're in business.
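If you would rather script the model drop than drag files around, something like the following Python sketch would do it. It assumes the downloaded checkpoints are .safetensors files and that your Forge folder has the webui\models\Stable-diffusion layout described above. The function name and the example paths are hypothetical.

```python
import shutil
from pathlib import Path


def install_models(download_dir: Path, forge_dir: Path) -> list[Path]:
    """Copy every .safetensors file from download_dir into Forge's model folder.

    Returns the list of files that were actually copied.
    """
    target = forge_dir / "webui" / "models" / "Stable-diffusion"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(download_dir.glob("*.safetensors")):
        dst = target / src.name
        if not dst.exists():  # never overwrite a model that is already there
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied


# Example call (hypothetical paths, adjust to your own setup):
# install_models(Path.home() / "Downloads", Path(r"C:\forge"))
```

Copying rather than moving keeps the original download around in case you want to try the same model in another interface.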

Pro tip: To really squeeze every last drop of performance out of your resurrected rig, dial back the resolution. Instead of going for full SDXL (1024x1024) resolutions, try the more modest SD1.5 sizes (768x768, 512x768, and similar). You can always upscale later and use ADetailer for those crispy details.
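To get a feel for how much work a lower resolution saves, here is a small Python sketch that compares pixel counts and scales a resolution up or down. Rounding each side to a multiple of 64 is an assumption based on common practice, not a documented Flux requirement, and pixel count is only a rough proxy for generation time.

```python
def pixel_ratio(a: tuple[int, int], b: tuple[int, int]) -> float:
    """How many times more pixels resolution a has than resolution b."""
    return (a[0] * a[1]) / (b[0] * b[1])


def scaled_resolution(
    width: int, height: int, factor: float, multiple: int = 64
) -> tuple[int, int]:
    """Scale a resolution by factor, snapping each side to a multiple of `multiple`."""

    def snap(value: float) -> int:
        return max(multiple, round(value / multiple) * multiple)

    return snap(width * factor), snap(height * factor)


# 1024x1024 has about 2.7 times the pixels of 512x768.
print(round(pixel_ratio((1024, 1024), (512, 768)), 2))

# Shrink an SDXL-sized canvas to 75%, then double a small portrait for upscaling.
print(scaled_resolution(1024, 1024, 0.75))
print(scaled_resolution(512, 768, 2.0))
```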

Let's talk numbers: On a humble RTX 2060 with 6GB of VRAM, Flux Schnell in NF4 mode can churn out a 512x768 image in about 30 seconds, versus the 107 seconds required by the FP8 version. Want to go big? It'll take about five minutes to upscale that bad boy to 1536x1024 with a high-res fix.
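The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted in this article: the 107.47-second FP8 run at 4 steps and 16.86 seconds per iteration, and the roughly 30-second 4-bit run. Treating the leftover FP8 time as loading and decoding overhead is an assumption, since the source does not break it down.

```python
FP8_TOTAL_SECONDS = 107.47      # total time of the quoted FP8 run
FOUR_BIT_TOTAL_SECONDS = 30.0   # "about 30 seconds" for the 4-bit run
FP8_STEPS = 4
FP8_SECONDS_PER_STEP = 16.86

# How much faster the 4-bit model is end to end.
speedup = FP8_TOTAL_SECONDS / FOUR_BIT_TOTAL_SECONDS

# Time spent in the sampling steps themselves during the FP8 run.
fp8_sampling = FP8_STEPS * FP8_SECONDS_PER_STEP

# Whatever is left over (assumed: model loading, text encoding, decoding).
fp8_overhead = FP8_TOTAL_SECONDS - fp8_sampling

print(f"Speedup: {speedup:.2f}x")
print(f"FP8 sampling time: {fp8_sampling:.2f}s")
print(f"FP8 remaining time: {fp8_overhead:.2f}s")
```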

Want to go big without breaking your GPU? A better option is to start with Flux Schnell at SD1.5 resolutions, then send that creation through img2img and upscale it using a standard Stable Diffusion model (SD1.5 or SDXL) with low denoise strength. The whole process clocks in at around 50 seconds, rivaling Midjourney's output on a sluggish day. You'll get impressive large-scale results without melting your graphics card.

The real kicker? Some mad lads have reportedly gotten Flux Schnell NF4 running on a GTX 1060 with 3GB of VRAM, with Flux Dev taking 7.90s per iteration. We're talking about a GPU that's wheezing on life support, and it's out here generating cutting-edge AI art. Not too shabby for hardware that's practically eligible for a pension.


Source: decrypt.co
