Switching models in Gemini CLI can be done in several ways, depending on your authentication setup and requirements. The most straightforward approach is the --model command-line flag: for example, run gemini --model gemini-2.5-flash to use a particular model for your session. This lets you pick a model to match the task at hand, such as Flash for faster responses or Pro for complex tasks requiring deeper analysis.
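For example, the flag is passed at launch (this assumes the gemini CLI is installed and the 2.5-family model names are available to your account):

```shell
# Start a session on the faster Flash model:
gemini --model gemini-2.5-flash

# Or start one on Pro for tasks that need deeper analysis:
gemini --model gemini-2.5-pro
```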
You can also configure a default model with an environment variable in your shell profile or a .env file: set GEMINI_MODEL to your preferred model, for example export GEMINI_MODEL="gemini-2.5-pro". This makes your model preference persistent across sessions without passing a flag each time. However, some users have reported that on the free tier with a personal Google account, the CLI may automatically fall back from Pro to Flash due to quota limitations, even when a model is explicitly configured.
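A minimal sketch of the persistent setup (the model name and file locations are examples, not requirements):

```shell
# Append this line to your shell profile (e.g. ~/.bashrc or ~/.zshrc),
# or put it in a .env file in your project directory, so Gemini CLI
# picks it up on every launch:
export GEMINI_MODEL="gemini-2.5-pro"

# Verify the variable is set in the current environment:
echo "$GEMINI_MODEL"
```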
For more control over model selection, you can authenticate with an API key from Google AI Studio or Vertex AI instead of a personal Google account. With a paid API key, you get consistent model usage without automatic fallbacks triggered by quota restrictions. The CLI reads different environment variables depending on the authentication method: GEMINI_API_KEY for Google AI Studio keys, or GOOGLE_API_KEY together with GOOGLE_GENAI_USE_VERTEXAI=true for Vertex AI access. Professional developers who need guaranteed access to specific models or higher rate limits should consider usage-based billing or a Gemini Code Assist Standard or Enterprise license, both of which provide more predictable model access and prevent unwanted model switching due to quota limitations.
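The two key-based setups can be sketched side by side (the key values below are placeholders, not real credentials):

```shell
# Option 1 - Google AI Studio: a single key variable is enough.
export GEMINI_API_KEY="your-ai-studio-key"    # placeholder value

# Option 2 - Vertex AI: supply the key AND opt in to the Vertex backend.
export GOOGLE_API_KEY="your-vertex-key"       # placeholder value
export GOOGLE_GENAI_USE_VERTEXAI=true

# Confirm the Vertex toggle is visible to the CLI:
echo "$GOOGLE_GENAI_USE_VERTEXAI"
```

Only one of the two options should be active at a time; unset the variables you are not using to avoid ambiguity about which backend the CLI selects.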