Are edits with img2img producing washed-out faces, lost detail, or stylistic drift? Many photographers and creators try image-to-image (img2img) expecting precise photo edits but end up with painterly artifacts or lost resolution. This guide provides a compact, repeatable, step-by-step img2img photo-editing tutorial that focuses on practical parameter values, masking technique, prompt recipes, and export tips to deliver client-ready images using free tools and models.
Key takeaways: what to know in 60 seconds
- Img2img is image-conditioned editing: it uses a source photo plus parameters (denoising, mask, prompt) to change content while preserving details.
- Start with low-to-moderate denoising (0.2–0.5) for subtle edits and increase for stronger stylization; denoising controls how much the model rewrites the source image.
- Use precise masks and feathering for localized edits; inpainting keeps background intact and edits only masked areas.
- Combine prompt engineering with CFG, sampler, and seed to make edits predictable; document exact values for client reproducibility.
- Export as TIFF/PNG and upscale with Real-ESRGAN for client delivery to preserve color and avoid compression artifacts.
Choose where to run img2img first. Options: a local UI (Automatic1111 stable-diffusion-webui), lightweight desktop clients (InvokeAI), or cloud services (Hugging Face Inference, Replicate). For free and full control, the Automatic1111 stable-diffusion-webui is the recommended starting point because it exposes all parameters and supports inpainting, negative prompts, and many samplers.
Recommended free tools: Automatic1111 stable-diffusion-webui (full parameter control), InvokeAI (desktop client), GIMP (mask editing), Real-ESRGAN (upscaling).
Models to consider (free checkpoints or permissive licenses):
- Stable Diffusion v1.5 / v2.x (for photoreal edits)
- Photorealism-focused finetunes (look for "photoreal" or "face-finetune") on Hugging Face
- Inpainting checkpoints for more accurate localized fills
Hardware and environment tips:
- A GPU with at least 6–8GB VRAM supports small-to-moderate resolutions; use lower batch sizes and enable xformers if available (see the sketch after this list).
- For heavier edits or 4k upscaling, consider cloud GPU instances only for rendering, then bring results back locally for post-processing.
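If you script renders with the Hugging Face diffusers library instead of a UI, the usual VRAM savers look like this; the model ID and settings here are one common choice, not the only one:

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline

# fp16 weights roughly halve VRAM versus fp32 on CUDA GPUs.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.enable_attention_slicing()  # lower peak VRAM at a small speed cost
try:
    pipe.enable_xformers_memory_efficient_attention()  # only if xformers is installed
except Exception:
    print("xformers not available; continuing without it")
```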
This section gives a sequential workflow with concrete values that can be applied in the Automatic1111 web UI or analogous tools.
Step 1: prepare the source image and canvas
- Use the highest-quality original available (RAW/JPEG at target resolution). Crop to the final composition before editing.
- Avoid heavily compressed sources where possible. For faces, prefer images with neutral expressions and diffuse lighting for best model results.
Step 2: choose model and sampler
- Model: use a photoreal checkpoint (example: Stable Diffusion v1.5 or a photoreal finetune).
- Sampler: start with Euler a or DPM++ 2M Karras for smoother photoreal outputs.
Step 3: set img2img parameters (recommended baseline; a scripted equivalent follows this list)
- Denoising strength: 0.25–0.45 for photo edits (0.2–0.35 preserves more detail; 0.5+ for stronger stylization).
- Steps: 20–40 (30 is a balanced default).
- CFG scale: 3.5–7.0 for photoreal edits (lower keeps closer to the source; higher forces prompt adherence).
- Seed: set a fixed seed to reproduce results.
- Resize method: resize the source image to match the model's native dimensions (512/768/1024) using a high-quality algorithm (Lanczos).
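For readers who script rather than click, here is the same baseline expressed with the Hugging Face diffusers library; the checkpoint ID and file names are placeholders, so substitute your photoreal model:

```python
import torch
from diffusers import DPMSolverMultistepScheduler, StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# DPM++ 2M Karras equivalent in diffusers.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

# High-quality Lanczos resize to the model's native resolution.
source = Image.open("portrait.jpg").convert("RGB").resize((512, 512), Image.LANCZOS)

result = pipe(
    prompt=(
        "subtle skin smoothing, remove blemishes, maintain skin texture, "
        "natural color, photographic lighting"
    ),
    negative_prompt="painting, watercolor, cartoon, oversmoothed, blurry, deformed face",
    image=source,
    strength=0.30,           # denoising strength: photo-edit baseline
    num_inference_steps=30,  # balanced default
    guidance_scale=5.0,      # CFG: low-to-mid stays close to the source
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for reproducibility
).images[0]
result.save("preview_pass.png")
```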
Step 4: write the edit prompt and negative prompt
- Prompt: write a concise directive describing the target change. Example for skin correction: "subtle skin smoothing, remove blemishes, maintain skin texture, natural color, photographic lighting".
- Negative prompt: include terms to avoid ("painting, watercolor, cartoon, oversmoothed, blurry, deformed face").
Step 5: run a preview pass
- Use a single pass with the baseline denoising strength. Inspect for artifacts.
- If the subject becomes stylized, reduce denoising or lower CFG.
Step 6: refine with masks and inpainting (if needed)
- Create a mask for the exact area to change (see masking section). Keep mask edges feathered 8–20 px depending on resolution.
- Use an inpainting model if available and set the inpaint denoising strength low for subtle changes.
Step 7: multi-stage editing (iterative approach)
- For complex retouches, use multiple img2img passes: background edits first (higher denoising), then face-level passes with lower denoising.
- Keep a changelog of parameter values for each pass to reproduce client deliverables; a minimal JSON sketch follows these steps.
Step 8: select the best candidate and finalize
- Compare multiple seeds/runs. Use side-by-side comparisons to select the best balance of fidelity and edit.
Step 9: post-process and export (see export section below)
- Export as 16-bit TIFF (if color work required) or high-quality PNG. Apply upscaling only after final selection.
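To keep the Step 7 changelog machine-readable, one record per pass in JSON works well; the field names below are illustrative, not any webui's native preset format:

```python
import json

# One record per img2img pass; extend as needed (mask path, resize method, ...).
pass_log = {
    "pass": 1,
    "model": "stable-diffusion-v1-5-photoreal",  # illustrative checkpoint name
    "sampler": "DPM++ 2M Karras",
    "seed": 42,
    "denoising_strength": 0.30,
    "steps": 30,
    "cfg_scale": 5.0,
    "prompt": "subtle skin smoothing, remove blemishes, maintain skin texture",
    "negative_prompt": "painting, watercolor, cartoon, oversmoothed, blurry",
}
with open("client_job_pass01.json", "w") as f:
    json.dump(pass_log, f, indent=2)
```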
| Stage | Key settings | Typical values |
|---|---|---|
| Initial pass | Sampler, denoise, steps | Euler a, 0.3, 30 steps |
| Mask inpaint | Feather, inpaint model | Feather 10 px, denoise 0.2–0.35 |
| Final upscale | Upscaler, denoise (if any) | Real-ESRGAN x2/x4 |

Masking and inpainting in img2img for precise edits
Masking is the most important technique to obtain precise photo edits without altering the rest of the image. A well-made mask confines the model's changes and preserves context.
How to create effective masks
- Use a mask channel in the UI or an external editor (Photoshop/GIMP) to paint white where the edit should appear and black where it should remain unchanged.
- Add a soft transition: feather edges 8–20 px at 1k resolution, and scale the feather relative to image size (see the sketch after this list).
- For hair and fine edges, create a two-layer approach: a tight mask for structure and a wider faded mask for color blending.
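If your UI lacks feather control, you can feather a hard mask with a Gaussian blur in Pillow; the width-relative radius below is an illustrative heuristic, not a fixed rule:

```python
from PIL import Image, ImageFilter

# Hard-edged mask: white = area to edit, black = keep unchanged.
mask = Image.open("mask_hard.png").convert("L")

# Scale the feather with image size; ~1.2% of width is roughly 12 px at 1k.
feather_px = max(8, round(mask.width * 0.012))
feathered = mask.filter(ImageFilter.GaussianBlur(radius=feather_px))
feathered.save("mask_feathered.png")
```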
Inpainting settings for photo edits
- Use an inpainting model if available; otherwise, use img2img with a mask and low denoising (a scripted example follows this list).
- For reconstructing small details like eyes or blemishes, use denoising 0.15–0.30 and higher steps (25–40) to let the model refine textures.
- For larger area replacement (background swaps), denoising 0.35–0.6 gives cleaner reimagining while keeping some original framing.
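Scripted with the diffusers library, a localized fill looks roughly like this; the inpainting checkpoint ID and file names are assumptions, so substitute your own:

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("portrait.png").convert("RGB").resize((512, 512), Image.LANCZOS)
mask = Image.open("mask_feathered.png").convert("L").resize((512, 512), Image.LANCZOS)

result = pipe(
    prompt="natural blue sky with soft clouds, consistent lighting",
    negative_prompt="painting, cartoon, blurry",
    image=image,
    mask_image=mask,          # white pixels get regenerated
    strength=0.45,            # larger-area replacement per the settings above
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]
result.save("sky_replaced.png")
```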
Practical examples
- Fix stray hair: tight mask around strands, denoising 0.18, prompt "remove stray hair, preserve skin texture"
- Replace sky: wide mask, denoising 0.45, prompt "natural blue sky with soft clouds, consistent lighting"
Img2img prompt engineering and parameters to control results
Prompt engineering in img2img blends natural-language control with parameter tuning. The same prompt behaves differently depending on CFG, sampler, and denoising strength.
Core parameters and how they affect edits
- Denoising strength: main control for how much the model rewrites the source (0 = identical; 1 = full regeneration).
- CFG scale: how strongly the generation follows the prompt vs. the model's prior (lower = closer to source, higher = more prompt-driven).
- Sampler: affects noise scheduling and detail; experiment between Euler a, DPM++ and Karras variants for smoothness.
- Steps: more steps add refinement, but returns diminish past ~50 for photoreal tasks; the sweep sketch below isolates one parameter at a time.
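To see these interactions clearly, sweep one parameter while holding the seed fixed. A self-contained diffusers sketch (checkpoint and file names are placeholders):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
source = Image.open("portrait.jpg").convert("RGB").resize((512, 512), Image.LANCZOS)

# Same seed for every run so denoising strength is the only variable.
for strength in (0.20, 0.35, 0.50):
    out = pipe(
        prompt="warm cinematic grade, soft contrast boost, natural skin tones",
        image=source,
        strength=strength,
        num_inference_steps=30,
        guidance_scale=5.0,
        generator=torch.Generator("cuda").manual_seed(42),
    ).images[0]
    out.save(f"sweep_strength_{strength:.2f}.png")
```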
Prompt templates for common photo edits
- Skin retouch (subtle): "subtle skin smoothing, preserve pores and texture, natural tones, photographic detail, minimal retouch"
- Teeth whitening: "natural teeth whitening, maintain shape and realistic shading"
- Color grading: "warm cinematic grade, soft contrast boost, filmic highlights, natural skin tones"
Negative prompts and safety
- Use negative prompts to avoid painterly output: "(painting, watercolor, cartoon, lowres, deformed, overprocessed)". Parentheses or brackets can increase the weight in many UIs.
- For face edits, include "(ugly, deformed face, extra limbs)" in the negative prompt to reduce facial artifacts.
Export, upscaling, and client-ready post-processing for img2img
Final delivery matters. Proper export and post-processing ensure the image meets client specs and survives print or web compression.
- For retouch deliveries: export as 16-bit TIFF (for color grading) or maximum-quality PNG for web.
- Avoid JPEG until final delivery to prevent compression artifacts during processing.
- Embed color profile (sRGB for web, Adobe RGB or ProPhoto for print) and document metadata.
Upscaling options (free)
- Real-ESRGAN: reliable perceptual upscaling with minimal artifacts.
- Waifu2x: can work for some portraits, though it targets illustration-style images and is not always ideal for photorealism.
Recommended workflow:
- Finalize best candidate PNG/TIFF.
- Upscale x2 with Real-ESRGAN, then inspect sharpness and texture; a subprocess sketch follows this list.
- Apply local sharpening in an editor (unsharp mask or high-pass) conservatively.
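If you use the portable realesrgan-ncnn-vulkan binary, the upscale step can be scripted via subprocess; the flag names below follow the project's README, but verify them against your installed version:

```python
import subprocess

# x2 upscale with the portable Real-ESRGAN binary (flags per the project's
# README; verify against your installed version).
subprocess.run(
    [
        "realesrgan-ncnn-vulkan",
        "-i", "final_selected.png",   # input image
        "-o", "final_x2.png",         # output image
        "-n", "realesrgan-x4plus",    # model name
        "-s", "2",                    # output scale factor
    ],
    check=True,
)
```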
Client-ready finishing touches
- Color grade to match client brand or brief; use subtle curves and selective color adjustments.
- Check skin tones with a vectorscope or by sampling values; aim for natural luminance.
- Provide both the full-resolution image and a web-optimized JPEG (quality 85) with an sRGB profile, as in the sketch below.
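A minimal Pillow export sketch for the two deliverables; note that Pillow writes 8-bit TIFF for RGB images, so for a true 16-bit master, export from your editor or a library that supports it:

```python
from PIL import Image

img = Image.open("final_x2.png")

# Lossless full-resolution master (8-bit via Pillow; use an editor for 16-bit).
img.save("client_master.tiff", compression="tiff_lzw")

# Web delivery: quality-85 JPEG, carrying over the embedded profile if present
# (assumes the working image is already in sRGB).
img.convert("RGB").save(
    "client_web.jpg",
    quality=85,
    optimize=True,
    icc_profile=img.info.get("icc_profile"),
)
```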
Advantages, risks and common mistakes
Benefits / when to apply ✅
- Quick localized retouches without manual cloning or frequency separation.
- Fast global style changes (mood, color grade) with retained composition.
- Great for concept iterations or batch stylization when fidelity is secondary.
Errors to avoid / risks ⚠️
- Using high denoising for portraits can produce uncanny or oversmoothed faces.
- Not saving parameter logs makes results non-reproducible for clients.
- Over-reliance on default samplers can introduce artifacts; test multiple samplers.
Troubleshooting tips ⚠️
- If faces degrade: lower denoising, increase steps slightly, add negative face prompts.
- If background leaks into edits: refine mask edges and increase mask feather.
- If colors shift: ensure color profile matches between source and model input; convert to sRGB for web.
img2img workflow at a glance
📸 Step 1 → Prepare high-quality source and crop
⚙️ Step 2 → Choose photoreal model and sampler
🧩 Step 3 → Create mask and feather edges
✍️ Step 4 → Prompt: describe edit + negative prompt
🔍 Step 5 → Preview, iterate, select best
🔼 Step 6 → Upscale with Real-ESRGAN and export
Frequently asked questions
What is the best denoising strength for subtle portrait edits?
For portraits, start between 0.18 and 0.35. Lower values preserve details; higher values add stylistic change. Adjust in small increments.
How do I keep facial features realistic when using img2img?
Use a tight mask around the face, set denoising low, add negative prompts for deformities, and keep CFG moderate (3.5–6).
Which sampler provides the most photoreal output?
DPM++ 2M Karras and Euler a are strong starting points. Test both; DPM++ often gives smoother detail while Euler a can be sharper.
Can img2img replace manual retouching entirely?
Img2img speeds many retouches but may not replace manual cloning or frequency separation for extreme precision; combine both when needed.
How do I reproduce the same edit for multiple photos?
Document model, sampler, seed, denoising strength, steps, CFG, and the exact prompt/negative prompt. Use presets or scripts in the UI.
Is upscaling required after img2img?
If the target delivery requires larger dimensions or print, upscaling (Real-ESRGAN) is recommended to preserve perceived detail.
Are there legal or copyright concerns when editing client photos with AI models?
Check model and checkpoint licenses and obtain client consent for AI-assisted edits. For models with restrictive licenses, use alternatives or obtain commercial rights.
Next steps
- Run three quick tests on a recent client photo: one with denoising 0.2, one 0.35, and one 0.5; compare results.
- Create a reusable preset file logging model, sampler, seed, denoising, steps, and CFG for quick replication.
- Export one final image as TIFF and upscale with Real-ESRGAN; deliver a web-optimized JPEG alongside.