1. Introduction

Nano Banana is Google’s state-of-the-art image generation and editing model, officially known as Gemini 2.5 Flash Image. It creates realistic images from natural language prompts and also supports detailed editing and compositing that earlier AI models handled poorly. By combining multi-image fusion, style transformation, and character consistency in a single model, Nano Banana changes how developers, designers, and content creators approach image generation and manipulation.

In this guide, we present a step-by-step tutorial on using Nano Banana in Google AI Studio and via the Gemini API. We explore how to access the tool, generate images from descriptive prompts, edit existing images while keeping characters consistent, and use advanced features such as multi-image fusion and style transfer. Each step comes with explanations, practical code examples, and visualizations to help you understand the process.

2. Understanding Nano Banana and Gemini 2.5 Flash Image

Nano Banana (codename “nano-banana”) represents a significant evolution in AI-powered image editing and generation. Released as Gemini 2.5 Flash Image, it is available through Google AI Studio and Vertex AI, catering to both developers and enterprise users. Its advanced features include:

Targeted Image Generation: Generate photorealistic images from text prompts using natural language instructions.

Seamless Image Editing: Make precise edits—such as targeted transformations, background swaps, or addition of specific objects—without the need for manual adjustments.

Character Consistency: Maintain the visual integrity of subjects across multiple images, ensuring brand and character consistency for storytelling.

Multi-Image Fusion: Blend multiple images into one cohesive photorealistic scene that adapts to different design contexts.

Style Transfer: Transform images into various artistic styles (e.g., watercolor, vintage, or modern minimalism) while preserving structural elements.

These capabilities, along with the model’s speed and cost-effectiveness, make Nano Banana an ideal tool for creative professionals, marketers, and developers seeking both efficiency and precision. Pricing is $30 per one million output tokens; each image consumes about 1,290 output tokens, so it costs approximately $0.039.
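The per-image figure follows directly from the token price. Below is a minimal sketch of that arithmetic, assuming the rates quoted above still apply; the function name and its defaults are illustrative and not part of any Google SDK.

```python
def estimate_image_cost(num_images, tokens_per_image=1290, usd_per_million_tokens=30.0):
    """Estimate the output cost in USD for a batch of generated images.

    The defaults are the figures quoted in this guide; check the current
    pricing page before relying on them for budgeting.
    """
    total_tokens = num_images * tokens_per_image
    return total_tokens * usd_per_million_tokens / 1_000_000


single = estimate_image_cost(1)
batch = estimate_image_cost(500)
print(f"1 image:    ${single:.4f}")
print(f"500 images: ${batch:.2f}")
```

At these rates one image comes to $0.0387, which rounds to the $0.039 quoted above.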

3. Getting Started with Google AI Studio

Google AI Studio offers a user-friendly web interface that allows users to experiment with Nano Banana without requiring deep programming knowledge. Here’s how to get started:

3.1 Sign In and Access the Model

Access Google AI Studio: Open your web browser and navigate to the Google AI Studio portal. Use your Google account to sign in.

Select the Gemini Model: Once signed in, navigate to the section dedicated to Gemini models. Look for “Gemini 2.5 Flash Image” (also known as nano-banana) and click on it. This section provides a simple “build mode” where you can test out image generation and editing features directly from the browser.

Explore the Template Apps: Google AI Studio also includes template applications that demonstrate key features such as character consistency and multi-image compositing. These apps are customizable and provide a first-hand look at Nano Banana’s capabilities without you having to write code from scratch.

3.2 Using the Studio Interface

Within the Google AI Studio interface:

Prompt-Based Generation: Enter a detailed natural language prompt in the designated text input field. For example, you might enter: "A photorealistic view of a modern cafe interior with warm lighting and soft textures."

Image Upload for Editing: If you wish to edit an existing image, simply upload it using the provided image uploader and then describe the modifications you need. For instance, "Add a subtle watercolor effect and increase brightness" would be an acceptable input.

Generate and Remix: The interface allows you to not only generate images but also to remix, iterate, and refine the resulting outputs by applying further edits, ensuring that the final product perfectly meets your creative vision.

Google provides feedback on prompt quality and even a live preview of remixing, enabling you to see how the image evolves with each modification. This workflow is especially useful when you intend to maintain character consistency across multiple images or require precise control over specific visual elements.

4. Using the Gemini API for Image Generation and Editing

For developers and those comfortable with coding, the Gemini API offers programmatic access to Nano Banana. The API provides flexibility and integration capabilities with custom applications, mobile apps, and other enterprise platforms.

4.1 Setting Up the Environment

Before writing code, you need to set up your development environment:

Install Required Packages: The setup relies on the following Python libraries:

Package	Purpose
google-genai	Official client for accessing Gemini models
python-dotenv	Loads the API key from a .env file
Pillow (imported as PIL)	Image processing and saving capabilities

Example terminal command:

pip install google-genai python-dotenv pillow

This command installs all required packages so you can start connecting to the Gemini API.

Secure the API Key: Create a .env file to store your API key securely:

GEMINI_API_KEY=your_actual_api_key

Ensure this file is added to your .gitignore file to prevent accidental exposure of sensitive credentials.
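A missing or placeholder key can be hard to diagnose once requests start failing, so it may help to check for it up front. The following is a minimal sketch using only the standard library. The helper name `require_api_key` is hypothetical, and it assumes `load_dotenv()` has already been called when the key lives in a .env file.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to try without touching the real environment.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # Reject both a missing value and the placeholder from the .env example.
    if not key or key == "your_actual_api_key":
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    return key
```

The returned value can then be passed as `api_key` when the client is created.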

4.2 Writing the Basic Code

Below is a simplified Python script that demonstrates how to generate an image using Nano Banana on the Gemini API:

from google import genai  
from PIL import Image  
from io import BytesIO  
import os  
from dotenv import load_dotenv  
load_dotenv()  # Load API key from .env file  
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))  
# Define the prompt for image generation  
prompt = "A futuristic cityscape at dusk with neon lights reflecting on rain-soaked streets"  
# Request image generation from Gemini 2.5 Flash Image  
response = client.models.generate_content(  
    model="gemini-2.5-flash-image-preview",  
    contents=[prompt]  
)  
# Extract and save the generated image  
for part in response.candidates[0].content.parts:  
    if part.inline_data is not None:  
        image = Image.open(BytesIO(part.inline_data.data))  
        image.save("generated_image.png")  
        print("Image saved as generated_image.png")

This example script demonstrates how to interact with the Gemini API in Python. The script:

Loads the API key

Creates an instance of the client

Sends a natural language prompt to generate an image

Parses the response to extract and save the image as PNG.
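The loop in the script assumes that a candidate is always present and it skips any text the model returns alongside the image. The sketch below is a slightly more defensive way to walk the same response structure. The helper name `split_parts` is hypothetical, and the stand-in response only imitates the attributes used in the script above (`candidates`, `content.parts`, `text`, `inline_data.data`); it is not a real API object.

```python
from types import SimpleNamespace  # only needed for the stand-in response below


def split_parts(response):
    """Return (texts, images) from a generate_content-style response.

    texts  : list of str taken from parts that carry text
    images : list of raw image bytes taken from parts that carry inline data
    Both lists are empty if the response has no usable candidate.
    """
    texts, images = [], []
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return texts, images
    content = getattr(candidates[0], "content", None)
    for part in getattr(content, "parts", None) or []:
        if getattr(part, "text", None):
            texts.append(part.text)
        inline = getattr(part, "inline_data", None)
        if inline is not None and inline.data:
            images.append(inline.data)
    return texts, images


# Stand-in shaped like the response in the script above, so the helper
# can be tried without calling the API.
fake = SimpleNamespace(candidates=[SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(text="Here is your image.", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(data=b"\x89PNG...")),
]))])
texts, images = split_parts(fake)
print(texts, len(images))
```

Each entry in `images` can be opened with `Image.open(BytesIO(data))` exactly as in the script.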

4.3 Editing an Existing Image

The same API can be used to edit images. For instance, if you have an image stored locally and want to adjust it using Nano Banana, you could write:

import os
from io import BytesIO
from dotenv import load_dotenv
from google import genai
from PIL import Image
# Create the client as in section 4.2 so this snippet runs on its own
load_dotenv()
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
# Load the existing image for editing
image_to_edit = Image.open("path/to/your/image.png")
# Define the edit prompt
edit_prompt = "Using the provided image, add a touch of vintage style, warm sepia tones, and slightly blurred background"
# Send the prompt and the image together in one request
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[edit_prompt, image_to_edit]
)
# Process the response to save the edited image
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        edited_image = Image.open(BytesIO(part.inline_data.data))
        edited_image.save("edited_image.png")
        print("Edited image saved as edited_image.png")

This code snippet shows how to supply both a prompt and an image to the model, thus instructing it to modify the input image according to your specifications while preserving essential visual elements such as character expressions or background details.

5. Detailed Walkthrough: Image Generation Process

In this section, we provide an in-depth walkthrough of generating a new image using Nano Banana in Google AI Studio and with the Gemini API. The following steps outline the process from initial prompt design to final image output.

5.1 Designing the Prompt

Effective image generation starts with a well-crafted prompt. The prompt must be detailed and specific to ensure that the model understands the desired outcome. For example, consider this prompt:

"Generate a photorealistic image of a serene lakeside cabin during autumn. The scene should include colorful fall trees, gentle ripples on the lake, and a warm glow emanating from the cabin windows, evoking a sense of tranquility."

This prompt provides a clear description of the elements to be included—cabin, lakeside setting, autumn colors, reflective water, and atmospheric lighting. The more specific the description, the more likely the model will generate an image that meets your expectations.

5.2 Using Google AI Studio for Generation

Input the Prompt: Navigate to the Gemini 2.5 Flash Image section in Google AI Studio. Paste your prompt into the text input field.

Generate the Image: Hit the “Generate” button. The system processes your natural language prompt and displays a preview of the generated image. The model draws on Gemini’s broad world knowledge to depict details like lighting, texture, and color accurately.

Review and Iterate: If the generated image does not fully meet your criteria, refine the prompt. Adjust details such as lighting conditions or the composition of objects until you achieve the desired result. This iterative process is central to reaching high-quality outputs in creative projects.

5.3 Saving and Refining the Output

Once satisfied with the preview, you can save the image directly from Google AI Studio. The platform also offers options to further edit the output using additional prompts—a process that facilitates dynamic tweaking of the content.

Example Outcome

Imagine an image that perfectly captures your vision: a lakeside cabin lit by a soft, warm glow reflecting off gently rippling water, framed by a forest ablaze with autumnal hues. Each element—from the texture of the cabin's wood to the subtleties of seasonal foliage—is rendered with high fidelity.

6. Detailed Walkthrough: Image Editing with Nano Banana

Editing an existing image is one of Nano Banana’s most impressive strengths. Whether you want to add new elements, change colors, or modify specific features, the process is straightforward.

6.1 Uploading the Base Image

Select Your Image: In Google AI Studio, use the image upload function to import an image that you want to edit. For example, you might upload a portrait or a scenic photograph.

Define the Edit Requirement: Clearly describe precisely what needs to be changed. A useful prompt might be: "Using the provided image, add a pair of elegant, thin reading glasses to the subject’s face while maintaining the original lighting and style."

6.2 Processing the Edit via the Gemini API

When using the Gemini API for edits, follow these steps:

Load the Image: Employ the Pillow library to read the image from your local file system.

Provide the Edit Prompt: Accompany the image with a detailed text prompt specifying the desired edit. This might include instructions for stylistic modifications (such as adding vintage filters or changing hairstyles), ensuring that the system understands the context.

Extract the Edited Version: The API returns the edited image data and you can easily extract and save this image using code similar to the generation example provided earlier.

6.3 Maintaining Visual Consistency

One of the notable innovations in Nano Banana is its ability to maintain character consistency. When editing, the model is designed to preserve the subject's essential features even through drastic changes in background, lighting, or style. This is particularly useful for:

Portrait Editing: Modifying the background or adding accessories while keeping facial features identical.

Branding and Marketing: Ensuring that logos, mascots, or key characters remain consistent across multiple images.

By instructing the API with a phrase like "Show this exact character with the same facial features and posture," you harness the model’s capacity to recognize and replicate detailed image aspects.

Code Example for Maintaining Consistency

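import os
from io import BytesIO
from dotenv import load_dotenv
from google import genai
from PIL import Image
# Create the client as in section 4.2 so this snippet runs on its own
load_dotenv()
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
# Load the portrait that serves as the identity reference
base_image = Image.open("path/to/your/portrait.png")
consistency_prompt = "Generate a new image using the provided portrait as reference. The subject should be smiling and looking directly at the camera, with the same facial features and style."
# Send the prompt together with the reference portrait
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[consistency_prompt, base_image]
)
# Save the first image part returned by the model
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        consistent_image = Image.open(BytesIO(part.inline_data.data))
        consistent_image.save("consistent_image.png")
        print("Consistent image saved as consistent_image.png")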

This script asks the model to keep the subject’s core identity while changing expression and pose. Review the output, since adherence depends on the model and the prompt.

7. Advanced Features: Multi-Image Fusion, Style Transfer, and Character Consistency

Nano Banana goes beyond simple generation and editing—its advanced functionalities truly set it apart.

7.1 Multi-Image Fusion

Multi-image fusion is the process of combining elements from different images into one cohesive image. This is particularly useful for product photography, collage creation, and complex design projects.

How It Works:

Input Multiple Images: Provide up to three images as input, the range in which the model works best. For example, one image can serve as the background, another as the main subject, and a third as an overlay or accessory.

Detail the Fusion Process: Use a comprehensive prompt such as "Blend these images to create a photorealistic scene where the product is seamlessly placed in a modern urban setting."

Generate the Fusion: The Gemini API intelligently merges the visual elements, ensuring proper blending of shadows, textures, and lighting.

Example Fusion Code

import os
from io import BytesIO
from dotenv import load_dotenv
from google import genai
from PIL import Image
# Create the client as in section 4.2 so this snippet runs on its own
load_dotenv()
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
# Load multiple images
background = Image.open("path/to/background.png")
product = Image.open("path/to/product.png")
fusion_prompt = "Combine the background with the product image. Place the product naturally within the scene, ensuring that lighting and shadows match."
# Send the prompt followed by every input image
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[fusion_prompt, background, product]
)
# Save the fused result
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        fusion_image = Image.open(BytesIO(part.inline_data.data))
        fusion_image.save("fused_image.png")
        print("Fused image saved as fused_image.png")
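When the number of input images varies, it can help to assemble the `contents` list in one place and check it before sending. The following is a minimal sketch. The helper name `build_fusion_contents` is hypothetical, and the default limit of three simply mirrors the figure given in this section rather than a value read from the API.

```python
def build_fusion_contents(prompt, images, max_images=3):
    """Return a contents list: the prompt first, then each image in order.

    max_images is a local guard in this sketch; adjust it if the
    documented limits for your model differ.
    """
    images = list(images)
    if not prompt or not prompt.strip():
        raise ValueError("a fusion prompt is required")
    if not images:
        raise ValueError("provide at least one image")
    if len(images) > max_images:
        raise ValueError(f"got {len(images)} images; this sketch allows at most {max_images}")
    return [prompt.strip(), *images]
```

With the variables from the fusion example, `build_fusion_contents(fusion_prompt, [background, product])` produces the same list that is passed as `contents` there.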

7.2 Style Transfer

Style transfer allows you to transform the aesthetic of an image while preserving its central theme. For example, you might want to convert a photograph into a watercolor painting or a vintage illustration.

Steps for Style Transfer:

Choose a Base Image: Start with a clear and high-quality image.

Describe the Desired Style: Your prompt should include specifics such as, "Transform this image into a delicate watercolor painting on cold-press paper with soft bleeding edges and subtle texture differences."

Apply the Style: The model accepts the prompt along with the image and generates a new image featuring the desired artistic style while keeping the key attributes of the original image intact.
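The steps above can be sketched as a small function that follows the same request pattern as the editing examples. The function name `style_transfer` and the wording of the prompt template are assumptions made for illustration. The `client` argument is expected to be a google-genai client as created in section 4.2, and `image` a Pillow image.

```python
def style_transfer(client, image, style_description, model="gemini-2.5-flash-image-preview"):
    """Ask the model to restyle `image`; return the raw bytes of each image part.

    client : object exposing models.generate_content(model=..., contents=...)
    image  : a PIL.Image.Image, as in the earlier examples
    """
    prompt = (
        f"Transform this image into {style_description}. "
        "Keep the composition and the main subject unchanged."
    )
    response = client.models.generate_content(model=model, contents=[prompt, image])
    results = []
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            results.append(part.inline_data.data)
    return results
```

A call might look like `style_transfer(client, Image.open("photo.png"), "a delicate watercolor painting on cold-press paper with soft bleeding edges")`, where `photo.png` is a placeholder path.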

7.3 Advanced Character Consistency

For projects requiring the same subject to appear in multiple contexts (such as branding or sequential storytelling), Nano Banana is designed to keep key visual features unchanged between generations or edits. You encourage this by instructing the model clearly with phrases such as "maintain identical facial features", "keeping this exact character", or "preserving the subject's visual DNA."

This capability underlines one of Nano Banana's core strengths, allowing creators to build a consistent visual narrative across various media formats.

8. Prompt Engineering Best Practices

A critical factor that determines the quality of image generation and editing is the way prompts are framed. Here are some best practices to ensure optimal results:

8.1 Be Descriptive and Specific

Focus on Context: Detail the setting, lighting, and mood in your prompt. For instance, "a modern office with ambient lighting" provides more context than simply "an office."

Include Technical Descriptors: Use terms like "85 mm portrait lens," "f/2 for shallow depth of field," or "soft, warm glow" to influence the photographic style of the output.
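One way to keep prompts consistently descriptive is to assemble them from named fields. The sketch below is purely illustrative: `build_prompt` and its field names are hypothetical, and the comma-separated format is a convenience, not something the model requires.

```python
def build_prompt(subject, setting="", lighting="", mood="", camera=""):
    """Join the non-empty descriptors into one comma-separated prompt."""
    if not subject or not subject.strip():
        raise ValueError("a subject is required")
    fields = [subject, setting, lighting, mood, camera]
    return ", ".join(f.strip() for f in fields if f and f.strip())


prompt = build_prompt(
    subject="A photorealistic portrait of an architect at her desk",
    setting="in a modern office with ambient lighting",
    lighting="soft, warm glow",
    camera="85 mm portrait lens, f/2 for shallow depth of field",
)
print(prompt)
```

Leaving a field empty simply omits it, so the same helper covers both quick drafts and fully specified prompts.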

8.2 Iterative Refinement

Start with a basic prompt to get an initial result, then incrementally refine it:

Apply specific modifications in iterative steps, for example, "first add the accessory, then adjust the background lighting."

Use sequential editing approaches to refine details without starting from scratch each time.
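Sequential edits can also be expressed as turns in a single chat session. The sketch below assumes the chat interface of the google-genai client (`client.chats.create` and `send_message`) keeps earlier turns, including generated images, in context. The function name `refine_in_steps` is hypothetical.

```python
def refine_in_steps(client, first_message, follow_ups, model="gemini-2.5-flash-image-preview"):
    """Send an initial message, then each follow-up edit, in one chat session.

    first_message : a prompt string, or a list such as [prompt, pil_image]
    follow_ups    : iterable of edit instructions, applied in order
    Returns the list of responses, one per turn.
    """
    chat = client.chats.create(model=model)
    responses = [chat.send_message(first_message)]
    for instruction in follow_ups:
        responses.append(chat.send_message(instruction))
    return responses
```

For the example above, the follow-ups would be `["Add the accessory", "Adjust the background lighting"]`, and the last response in the returned list carries the most refined image.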

8.3 Use Semantic Positive Prompts

Rather than telling the model what you do not want, focus on what you desire. For example, instead of saying "remove the dark shadow," say "add a soft, diffused light to enhance the subject’s features." This positive direction leads to more plausible and natural edits.

8.4 Maintain Visual Consistency in Multi-Step Edits

When planning a series of edits on the same image:

Provide the same visual reference every time.

Clearly specify that the subject’s identity should remain unchanged with phrases like "this exact character" or "maintaining identical features."
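Both points can be combined by preparing every request in a series from the same reference image and the same identity phrase. The following is a minimal sketch. The helper name `consistency_requests` and the default anchor sentence are assumptions for illustration, and each returned list is meant to be passed as `contents` to `generate_content`.

```python
def consistency_requests(
    reference_image,
    scenes,
    anchor="Show this exact character, maintaining identical facial features.",
):
    """Build one contents list per scene, each carrying the same reference image."""
    requests = []
    for scene in scenes:
        # The identity phrase comes first, followed by the scene-specific request.
        prompt = f"{anchor} {scene.strip()}"
        requests.append([prompt, reference_image])
    return requests
```

Because every request repeats the reference image, a weak result in one scene does not carry over into the next.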

8.5 Prompt Examples Comparison Table

Aspect	Generic Prompt	Detailed Prompt
Scene Description	"A cabin by a lake"	"A photorealistic image of a rustic lakeside cabin during autumn with colorful foliage and gentle water ripples"
Style Transfer	"Make it look artistic"	"Transform this image into a delicate watercolor painting with soft bleeding edges and subtle paper texture"
Character Consistency	"Keep the same person"	"Generate an image using the provided portrait as reference, ensuring this exact character maintains identical facial features"

This table illustrates how increasing specificity leads to more accurate results.

9. Real-World Use Cases and Integration Tips

Nano Banana is highly versatile, making it applicable in various scenarios. Here are some practical examples:

9.1 Marketing and Branding

Consistent Brand Assets: Marketers can use Nano Banana to generate high-quality images that maintain a consistent look and feel across different campaigns. For instance, generating product images with constant characteristics or creating a unified visual identity for a brand mascot.

Campaign Adaptation: With multi-turn editing, a single image can be quickly transformed to fit different marketing messages. For example, changing the style from photorealistic to a more artistic rendition (e.g., vintage or minimalistic) for various campaigns.

9.2 Content Creation for Social Media

Rapid Prototype Visuals: Content creators can generate visually appealing images for platforms like Instagram, TikTok, or YouTube with minimal effort. Nano Banana’s low latency and low per-image cost help bring content ideas to life quickly.

Dynamic Image Editing: When follower feedback directs slight modifications (such as adding or removing elements), Nano Banana allows for prompt-based adjustments that maintain visual quality and consistency.

9.3 Professional Image Editing and Design

Iterative Refinement in Product Photography: Designers can use the iterative editing capabilities to refine product images—whether it’s adjusting lighting and shadows or integrating new props into a scene.

Multi-Image Fusion for Creative Projects: Photographers can blend images to create composite shots that would otherwise require complex manual editing. For example, merging different backgrounds with a product shot to simulate various environments.

9.4 Integration into Existing Workflows

Nano Banana integrates seamlessly with popular creative tools:

Photoshop Integration: Some community-developed plugins allow Nano Banana to be used directly within Photoshop. This “last mile” integration helps designers maintain their familiar workflow while leveraging AI’s capabilities for complex edits.

Enterprise Deployment with Vertex AI: For large-scale projects, companies can integrate Nano Banana through Vertex AI, providing a scalable solution for bulk image generation and editing—ideal for automated content pipelines in digital marketing.

9.5 Code Integration Example

Below is a consolidated example demonstrating how to generate and edit images using Python and the Gemini API:

from google import genai  
from PIL import Image  
from io import BytesIO  
import os  
from dotenv import load_dotenv  
load_dotenv()  
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))  
# Step 1: Generate an image based on a text prompt  
generation_prompt = "A serene sunset over a calm lake with soft clouds and warm hues"  
response_gen = client.models.generate_content(  
    model="gemini-2.5-flash-image-preview",  
    contents=[generation_prompt]  
)  
for part in response_gen.candidates[0].content.parts:  
    if part.inline_data is not None:  
        gen_image = Image.open(BytesIO(part.inline_data.data))  
        gen_image.save("sunset_lake.png")  
        print("Generated image saved as sunset_lake.png")  
# Step 2: Edit an existing image by adding an object  
base_image = Image.open("sunset_lake.png")  
edit_prompt = "Using the provided image, insert a small wooden boat gently sailing on the calm lake, with realistic reflections."  
response_edit = client.models.generate_content(  
    model="gemini-2.5-flash-image-preview",  
    contents=[edit_prompt, base_image]  
)  
for part in response_edit.candidates[0].content.parts:  
    if part.inline_data is not None:  
        edited_image = Image.open(BytesIO(part.inline_data.data))  
        edited_image.save("sunset_lake_boat.png")  
        print("Edited image saved as sunset_lake_boat.png")

This code sample integrates both image generation and editing functionalities using Nano Banana and demonstrates the ease with which advanced edits can be performed programmatically.

10. Visualizations and Workflow Diagrams

Visualizing the workflow and data comparisons can greatly enhance the understanding of how Nano Banana operates within different contexts. Below are three visualizations that encapsulate key aspects of the process.

Figure 1: Access Methods Comparison Table

Platform	Access Method	Key Features
Google AI Studio	Web Interface	User-friendly; free tier available; template apps for demos
Gemini API	Programmatic (Python etc.)	Custom integration; flexible development with secure API key
Vertex AI	Enterprise Solution	Scalable; robust for large-scale deployments; integrates with enterprise workflows

Figure 1: This table compares the different platforms through which Nano Banana can be accessed, highlighting their unique features and use cases.

Figure 2: Workflow Diagram for Using Nano Banana

flowchart TD  
    A["Start: Define Requirements"]  
    B["Choose Platform: AI Studio or Gemini API"]  
    C["Sign In and Access Gemini 2.5 Flash Image"]  
    D["Enter Descriptive Prompt / Upload Image"]  
    E["Generate Initial Image"]  
    F["Review Output"]  
    G["Iterate: Refine Prompt or Edit Image"]  
    H["Finalize Image and Save Output"]  
    A --> B  
    B --> C  
    C --> D  
    D --> E  
    E --> F  
    F --> G  
    G --> H

Figure 2: This flowchart depicts the sequential process from defining requirements to finalizing the image using Nano Banana. The workflow emphasizes iterative refinement and multi-stage processing.

Figure 3: Code Integration Flow

flowchart TD  
    A["Start: Setup Environment"]  
    B["Install Required Packages"]  
    C["Secure API Key (.env)"]  
    D["Write Python Script using google-genai"]  
    E["Define Prompt and/or Upload Image"]  
    F["Call Gemini API for Generation/Editing"]  
    G["Process Response and Extract Image"]  
    H["Save and Review Output"]  
    A --> B  
    B --> C  
    C --> D  
    D --> E  
    E --> F  
    F --> G  
    G --> H

Figure 3: This diagram illustrates the code integration flow for setting up the environment and using the Gemini API to generate or edit images with Nano Banana.

11. Conclusion

Nano Banana, branded as Gemini 2.5 Flash Image, represents a transformative step forward in the realm of AI-powered image generation and editing. By combining natural language processing with advanced visual features like multi-image fusion, style transfer, and character consistency, Google has provided developers and designers with a tool that streamlines creative workflows while maintaining high quality and precision.

Key Findings:

Seamless Integration: Nano Banana is accessible through Google AI Studio, Gemini API, and Vertex AI, enabling both casual experimentation and enterprise-level deployments.

Advanced Editing Capabilities: Users can not only generate images from detailed prompts but also perform precise edits—including alterations in style, character consistency, and object fusion—without losing the original visual integrity.

Iterative Refinement: The iterative editing process allows progressive improvements, making it possible to fine-tune images over multiple sessions.

Prompt Engineering Importance: Crafting detailed, positive prompts that include technical descriptors is essential for harnessing the full potential of the model.

Real-World Applications: From marketing and branding to content creation and professional design, Nano Banana offers versatility to serve a broad spectrum of use cases.

Main Advantages Summarized:

Ease of Use: Google AI Studio’s intuitive web interface allows users to generate and edit images effortlessly.

API Flexibility: The Gemini API provides a scalable and programmable approach to integrate Nano Banana into custom applications.

Cost-Effectiveness: Predictable token-based pricing ensures that creative projects remain within budget.

Consistency and Quality: Maintaining visual consistency across edits and ensuring high-fidelity image outputs are key improvements over previous models.

Nano Banana has set a new benchmark in AI-driven image creation. It is not only an impressive technical achievement but also a practical tool that empowers creators to rapidly prototype, iterate, and produce professional-quality visuals with minimal manual intervention.

In summary, whether you are a developer integrating advanced image editing into your application, a designer creating consistent brand visuals, or a content creator exploring new artistic mediums, Nano Banana’s comprehensive suite of features can significantly enhance your creative process. Embracing this technology now can pave the way for more efficient and innovative visual storytelling in the future.

By following this step-by-step guide and applying the best practices outlined above, you can get the most out of Nano Banana in Google AI Studio and through the Gemini API. The steps covered here, from signing in and setting up the environment to refining prompts and processing outputs, are meant to keep both output quality and ease of integration high across a diverse range of applications.

This comprehensive guide has been crafted based entirely on supporting information and practical examples from Google Developers Blog, ImagineArt articles, and various detailed step-by-step tutorials on Nano Banana. All aspects of the tool—from basic operations to advanced creative techniques—are supported by documented features and user experiences.

How to Use Nano Banana on Google AI Studio and Gemini