We ❤️ Open Source
A community education resource
How to create your own AI self-portrait with Gemini, ChatGPT, and Stable Diffusion
Turn reference photos into professional headshots or creative avatars in minutes.
This article is part of the eBook: Everyday AI guide: Practical genAI life hacks from real users, a free download from We Love Open Source.
AI-generated images have evolved dramatically over the past two years, going from novelty to powerful creative tool. This guide walks you through creating your own custom AI self-portraits using both commercial and open source solutions, from Google’s Gemini and ChatGPT to customizable models like Stable Diffusion, whose original releases use the CreativeML Open RAIL-M license.
Over the last few months, the quality of AI image generation has improved quickly. The ability to generate realistic and artistic images from text prompts has captured the imagination of developers, artists, and enthusiasts alike. It has also led to an increase in deepfakes.
I’ve been creating images for my newsletter, The Artificially Intelligent Enterprise, for the last two years. You can see the quality and consistency of the images evolve if you look through the archives. Early on I used Midjourney, by most accounts the leading image model of the time. Today, I use ChatGPT (the 4o model gives faster results) or Google’s Nano Banana for the most cutting-edge results.
Honestly, for the All Things Open community, open source models aren’t yet the strongest option for image generation. This article is therefore a hybrid: it covers the best models for the job as well as open source models that can do it, though probably not with the same level of results…yet.
Creating professional portraits without the photo shoot
Let’s face it—not everyone loves being photographed. Between finding good lighting, nailing that natural smile, and dealing with inevitable bad hair days, traditional headshots can be stressful. AI-generated portraits solve these problems, giving you consistent, professional results every time.
Google Gemini 2.5 Flash
Google’s Gemini 2.5 Flash Image model (nicknamed “Nano Banana”) is currently leading the pack in image generation quality and ease of use. Access it through Google AI Studio for a streamlined experience that balances power with simplicity.
Here’s an example prompt:
Professional headshot portrait of a software developer at a technology
conference, wearing a casual button-up shirt, friendly smile, modern
office background with computer screens, natural lighting,
photorealistic style, high quality, detailed facial features

But that isn’t really helpful, since the result won’t look like you. Instead, upload a picture of yourself as a reference, then change the prompt above to this:
Use this image to generate a professional headshot portrait of a
software developer at a technology conference, wearing a casual
button-up shirt, friendly smile, modern office background with
computer screens, natural lighting, photorealistic style,
high quality, detailed facial features
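The two prompts above differ only in their opening words, so the shared details can be kept in one place and reused. The following is a minimal sketch of that idea; the `PortraitPrompt` class and its method names are hypothetical helpers made up for illustration, not part of any Gemini or ChatGPT tooling.

```python
from dataclasses import dataclass


@dataclass
class PortraitPrompt:
    """Hypothetical container for the parts of a headshot prompt."""

    subject: str
    details: list

    def text_only(self) -> str:
        # No reference photo: the model invents a face that fits the description.
        lead = f"Professional headshot portrait of {self.subject}"
        return ", ".join([lead, *self.details])

    def with_reference(self) -> str:
        # A reference photo is uploaded alongside: ask the model to start from it.
        lead = f"Use this image to generate a professional headshot portrait of {self.subject}"
        return ", ".join([lead, *self.details])


prompt = PortraitPrompt(
    subject="a software developer at a technology conference",
    details=[
        "wearing a casual button-up shirt",
        "friendly smile",
        "modern office background with computer screens",
        "natural lighting",
        "photorealistic style",
        "high quality",
        "detailed facial features",
    ],
)

print(prompt.text_only())
print(prompt.with_reference())
```

Swapping a single entry in `details` (say, the background or the lighting) gives a new variant to paste into the chat window while the rest of the prompt stays the same.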

Text-based image editing
Now that you have an image, you can refine it: Nano Banana excels at natural language editing commands. The model understands the content of your image, so you can edit by describing the change rather than opening GIMP or Photoshop.
Simply describe the changes you want:
| Command Example | Result |
| “Make the car red instead of black” | Changes color while preserving lighting |
| “Remove background, replace with beach” | Seamless background replacement |
| “Add ‘Happy Holidays’ in snow font” | Adds styled text overlay |
| “Blur the background slightly” | Creates depth-of-field effect |
| “Make me look thinner” | Gets rid of that double chin 🤣 |
The system uses vision-language models to interpret prompts and apply edits through inpainting, object manipulation, or style transfer. This technique works with most current vision models.
Artistic portraits with ChatGPT
For more creative and stylized portraits, ChatGPT offers nearly unlimited artistic possibilities (I choose the legacy 4o model because it seems to be faster). This approach is perfect for unique avatars or exploring AI’s creative potential. Like Nano Banana, ChatGPT can create and edit images based on the prompt and the reference images.
When aiming for an artistic style, your prompt can be more abstract and evocative. Instead of focusing on photorealism, you can use descriptive language to suggest a particular mood, aesthetic, or artistic movement.
Here is an example of a prompt designed to generate a stylized, digital art portrait:
Digital art portrait of a tech professional, artistic interpretation with
vibrant colors, stylized features, modern digital art style reminiscent
of DALL-E generations, creative lighting effects, contemporary art aesthetic
This prompt encourages the AI to take creative liberties, resulting in an image that is more of an artistic interpretation than a literal representation. The use of terms like “vibrant colors,” “stylized features,” and “creative lighting effects” guides the AI towards a more imaginative and visually dynamic result. I also used ChatGPT to meta-prompt, that is, to write the prompt for the image model. It’s a lot more effective than typing “just make me look cool.”
Here is an example of an image generated using the prompt above, showcasing the artistic capabilities of OpenAI’s GPT-4o model:

The resulting image is a striking and futuristic portrait that blends the human form with abstract digital elements. This style is perfect for anyone looking to create a unique and memorable visual identity that reflects a passion for technology and innovation. Think of an X.com or GitHub avatar.
Unlike photorealistic prompts, artistic ones encourage creative interpretation through terms like “vibrant colors” and “stylized features.”
That result was detailed and well done, but I didn’t include a reference picture. Here’s the same prompt with a reference picture to seed the image.
Create a digital art portrait of a tech professional, artistic interpretation
with vibrant colors, stylized features, modern digital art style reminiscent
of DALL-E generations, creative lighting effects, contemporary art aesthetic.
Use the uploaded image for reference.

With ChatGPT’s integrated image generation, you can upload a reference photo and specify the artistic style you want (e.g., “Ghibli-style cartoon” or “cyberpunk aesthetic”). The model will maintain your likeness while applying the requested artistic transformation.
Here’s an example I created in ChatGPT of me and my dog, though this will work with other modern vision models.
First, I provided a picture of myself; giving the model a reference image is important. I used one of my AI-enhanced portraits along with the items I wanted in the picture: my dog, his car, and me. Then I gave it a simple prompt to create a Ghibli-style cartoon.

Open source solutions for maximum control
For ultimate customization, open source models like Stable Diffusion provide complete control over the generation process. While requiring more technical expertise, they offer unmatched flexibility. I used the prompt below with a Stable Diffusion model hosted on Hugging Face, which made it easy to try the model and check its results before committing to it.
Realistic portrait of an open source developer, casual attire,
sitting at desk with multiple monitors showing code, warm ambient
lighting, detailed textures, community-focused atmosphere,
(best quality, 4k, 8k, ultra highres, raw photo:1.2),
(realistic, photo-realistic:1.37)

The image is pretty good, but the screens on the monitors are a little fuzzy and the developer’s hand is in a weird position. Nothing is technically wrong with the picture; it’s just not that clear.
Here’s what’s good about open source models. They allow:
- Custom model checkpoints
- LoRA adaptations for specific styles (there’s a how-to from open source vector database provider Zilliz)
- Fine-tuning for precise control
- Weight parameters for quality enhancement
Getting started with open source
- Stable Diffusion: Available on Hugging Face
- Requirements: GPU with 6GB+ VRAM recommended
- Interfaces: ComfyUI, Automatic1111, or command line
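For the command-line route, one common option is the Hugging Face `diffusers` library. The sketch below is a rough outline under several assumptions: `diffusers` and `torch` are installed, a CUDA GPU is available, and the checkpoint is a Stable Diffusion 1.x/2.x-style model. The function names, the default of 30 steps, and the guidance scale of 7.5 are illustrative choices, not values from this article, and the model ID is left as a parameter because it depends on which checkpoint you pick on Hugging Face.

```python
from typing import Optional


def generation_settings(
    prompt: str,
    negative_prompt: Optional[str] = None,
    steps: int = 30,              # assumed default, tune to taste
    guidance_scale: float = 7.5,  # assumed default, tune to taste
) -> dict:
    """Collect the keyword arguments that will be passed to the pipeline."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        # Collapse line breaks so a wrapped prompt becomes a single line.
        "prompt": " ".join(prompt.split()),
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate_portrait(model_id: str, settings: dict, out_path: str = "portrait.png") -> str:
    """Load a checkpoint from Hugging Face, run it once, and save the image."""
    # Imported here so the settings helper works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")  # half precision helps fit within limited VRAM
    image = pipe(**settings).images[0]
    image.save(out_path)
    return out_path


settings = generation_settings(
    """Realistic portrait of an open source developer, casual attire,
    sitting at desk with multiple monitors showing code, warm ambient lighting""",
    negative_prompt="blurry, distorted hands",
)
print(settings)
```

Calling `generate_portrait(model_id, settings)` with the ID of a checkpoint you have chosen would download the weights on first use and write `portrait.png`. Note that plain `diffusers` treats parenthesized weights such as `(raw photo:1.2)` as ordinary text; that notation is a convention of interfaces like Automatic1111 and ComfyUI.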
For the true open source enthusiast, the ability to run and fine-tune your own AI image generation models offers the ultimate level of control and customization. Open source models like Stable Diffusion, and the ecosystems built around them, provide a powerful platform for creating highly personalized and specific images.
If you want to run your own model, I suggest you run a vision model on Ollama or check out the vision models hosted on Hugging Face.
While this approach requires more technical expertise, the rewards are well worth the effort if you need more customization. Open source models allow you to tailor every aspect of the image generation process to your exact specifications. You can also fine-tune these models on your own images to further refine the style and results.
The prompt
When working with open source models, you have the flexibility to use more complex and nuanced prompts. You can also incorporate specific model checkpoints and LoRAs (Low-Rank Adaptations) to achieve a particular style or likeness, though this is fairly advanced.
Here is an example of a prompt that could be used with a fine-tuned Stable Diffusion model to generate a realistic portrait of an open source developer:
Realistic portrait of an open source developer, casual attire, sitting at a
desk with multiple monitors showing code, warm ambient lighting, detailed
textures, community-focused atmosphere, approachable expression, (best
quality, 4k, 8k, ultra highres, raw photo:1.2), (realistic, photo-realistic:1.37)
This prompt includes not only descriptive language but also specific keywords and weights (e.g., (best quality, 4k, 8k, ultra highres, raw photo:1.2)) that are commonly used in the Stable Diffusion community to enhance the quality and realism of the generated images. This level of control is one of the key advantages of using open source models.
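To make the weight notation concrete: in interfaces such as Automatic1111, `(text:1.2)` asks the model to pay roughly 1.2 times the usual attention to that text, and unmarked text keeps a weight of 1.0. The sketch below only illustrates how such a prompt breaks down into weighted pieces; `parse_weighted_prompt` is a hypothetical helper, it ignores nested parentheses, and it is not how any particular interface implements parsing.

```python
import re

# Matches "(some text:1.2)"; nested parentheses are deliberately not handled.
WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def parse_weighted_prompt(prompt: str) -> list:
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    prompt = " ".join(prompt.split())  # collapse line breaks and extra spaces
    chunks = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            chunks.append((plain, 1.0))
        chunks.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        chunks.append((tail, 1.0))
    return chunks


example = """Realistic portrait of an open source developer, casual attire,
(best quality, 4k, 8k, ultra highres, raw photo:1.2),
(realistic, photo-realistic:1.37)"""

for text, weight in parse_weighted_prompt(example):
    print(f"{weight:>5}  {text}")
```

Reading the output top to bottom shows which parts of the prompt are emphasized: the descriptive text stays at 1.0 while the quality and realism keywords are boosted.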
The result
Here is an example of an image that could be generated using a fine-tuned open source model with the prompt above:

This image captures the authentic and relatable atmosphere of an open source developer’s workspace. The attention to detail, from the code on the monitors to the warm and inviting lighting, creates a sense of realism and community that is often a hallmark of open source culture.
Summary guide for AI image creation
Here is an overview of how to get the best results from whatever image model you choose.
- Be specific: Include details about lighting, pose, and environment
- Reference images: Upload photos for better likeness
- Iterate: Refine prompts based on results
- Style keywords: Use terms like “photorealistic,” “artistic,” or specific art movements
Common use cases
- Professional headshots for conferences
- Creative avatars for social media
- Team portraits with consistent styling
- Technical documentation illustrations
Choosing your tool
| Tool | Best For | Skill Level | Cost |
| Google Gemini | Quick professional portraits | Beginner | Free/Paid tiers |
| ChatGPT/DALL-E | Creative, artistic styles | Beginner-Intermediate | Subscription |
| Stable Diffusion | Full customization | Intermediate-Advanced | Free and open source (local hardware) |
Conclusion
As we’ve seen, the world of AI image generation offers a diverse range of tools and techniques for creating stunning self-portraits. Whether you prefer the user-friendly simplicity of Nano Banana, the artistic flexibility of ChatGPT and DALL-E, or the ultimate control of open source models, there is a solution that is right for you. For the All Things Open community, these tools represent not just a new form of creative expression but also a powerful new frontier for innovation and collaboration. We encourage you to experiment with these techniques, share your creations, and continue to explore the exciting intersection of art and open source technology.
The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.
