Google Gemini API

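Google Gemini is a multimodal AI model that natively processes text, images, video, and audio. With a 1M-token context window, among the largest in the industry, and a generous free tier, it can serve as the basis for a wide range of AI applications.

Key Points

Gemini 3.1 Pro: 1M-token context, the latest flagship reasoning model
Gemini 3 Flash: a balance of speed and performance
Gemini 3.1 Flash-Lite: ultra-light, lowest price
Free tier: limited free usage through AI Studio
Native multimodality: text, images, video, audio
Google service integration: Search, Maps, YouTube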

Gemini API Overview

Model Lineup

Models

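Gemini 3.1 series (latest, 2026):
┌───────────────────────────────┬───────────┬───────────────────────┬─────────────────────────────────┐
│ Model                         │ Context   │ Price per 1M (in/out) │ Notes                           │
├───────────────────────────────┼───────────┼───────────────────────┼─────────────────────────────────┤
│ gemini-3.1-pro-preview        │ 1M tokens │ $2/$12 (≤200K)        │ Flagship reasoning (2026-02-19) │
│                               │           │ $4/$18 (>200K)        │                                 │
│ gemini-3.1-flash-lite-preview │ 1M tokens │ $0.25/$1.50           │ Ultra-light (2026-03-03)        │
└───────────────────────────────┴───────────┴───────────────────────┴─────────────────────────────────┘

Gemini 3 series:
┌───────────────────────────────┬───────────┬───────────────────────┬─────────────────────────────────┐
│ Model                         │ Context   │ Price per 1M (in/out) │ Notes                           │
├───────────────────────────────┼───────────┼───────────────────────┼─────────────────────────────────┤
│ gemini-3-flash-preview        │ 1M tokens │ $0.50/$3              │ Balanced (speed/performance)    │
└───────────────────────────────┴───────────┴───────────────────────┴─────────────────────────────────┘

Gemini 2.5 series (previous generation, still available):
┌───────────────────────────────┬───────────┬───────────────────────┬─────────────────────────────────┐
│ Model                         │ Context   │ Price per 1M (in/out) │ Notes                           │
├───────────────────────────────┼───────────┼───────────────────────┼─────────────────────────────────┤
│ gemini-2.5-pro                │ 1M tokens │ varies                │ High performance                │
│ gemini-2.5-flash              │ 1M tokens │ varies                │ Fast                            │
└───────────────────────────────┴───────────┴───────────────────────┴─────────────────────────────────┘

Legacy / scheduled for shutdown:
┌───────────────────────────────┬───────────┬───────────────────────┬─────────────────────────────────┐
│ Model                         │ Context   │ Price per 1M (in/out) │ Notes                           │
├───────────────────────────────┼───────────┼───────────────────────┼─────────────────────────────────┤
│ gemini-2.0-flash              │ 1M tokens │ varies                │ ⚠ Shutdown planned 2026-06-01   │
│ gemini-2.0-flash-lite         │ 1M tokens │ varies                │ ⚠ Shutdown planned              │
└───────────────────────────────┴───────────┴───────────────────────┴─────────────────────────────────┘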
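The two price tiers listed for gemini-3.1-pro-preview can be worked through with a little arithmetic. The sketch below assumes the listed prices are USD per 1M tokens and that the whole request is billed at the tier selected by the prompt size. The helper name `estimate_pro_cost` and that tier rule are illustrative assumptions, so confirm the billing rule on the official pricing page.

```python
# Prices for gemini-3.1-pro-preview as listed in this guide,
# assumed to be USD per 1M tokens.
PRO_TIERS = {
    "standard": {"input": 2.00, "output": 12.00},  # prompts up to 200K tokens
    "long": {"input": 4.00, "output": 18.00},      # prompts over 200K tokens
}
TIER_THRESHOLD = 200_000


def estimate_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated USD cost for one request.

    Assumption: the tier is chosen by the prompt size and applies to both
    input and output. Hypothetical helper, not part of any SDK.
    """
    tier = "standard" if input_tokens <= TIER_THRESHOLD else "long"
    prices = PRO_TIERS[tier]
    cost = (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
    return round(cost, 6)


if __name__ == "__main__":
    print(estimate_pro_cost(100_000, 2_000))  # 100K-token prompt, standard tier
    print(estimate_pro_cost(500_000, 2_000))  # 500K-token prompt, long tier
```

Under these assumptions a 100K-token prompt with 2K output tokens costs about $0.224, while a 500K-token prompt with the same output costs about $2.036, so trimming a prompt below the 200K threshold roughly halves the per-token rate.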

Special features:
• Prompt caching (50% discount)
• JSON mode
• Function Calling
• Code Execution (automatic code execution)
• Grounding (Google Search integration)

End-of-support notice: Gemini 2.0 Flash and 2.0 Flash-Lite will be shut down on June 1, 2026. If an existing project uses either model, migrate to Gemini 3 Flash or Gemini 3.1 Flash-Lite.
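One low-risk way to handle this migration is to route every model name through a small lookup, so that old identifiers stored in configuration keep working. The sketch below is a minimal illustration: the pairing of each 2.0 model with a specific replacement, and the names `MIGRATION_MAP` and `resolve_model`, are assumptions rather than an official mapping.

```python
# Assumed pairing of retiring models with the replacements suggested in this guide.
MIGRATION_MAP = {
    "gemini-2.0-flash": "gemini-3-flash-preview",
    "gemini-2.0-flash-lite": "gemini-3.1-flash-lite-preview",
}


def resolve_model(name: str) -> str:
    """Return the replacement for a retiring model, or the name unchanged."""
    return MIGRATION_MAP.get(name, name)


if __name__ == "__main__":
    for configured in ("gemini-2.0-flash", "gemini-2.5-pro"):
        print(configured, "->", resolve_model(configured))
```

Passing `resolve_model(configured_name)` as the `model` argument keeps the change in one place and makes it easy to test output quality on the new model before removing the old name.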

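Multimodal Capabilities

Capabilities

// Text
• Conversational AI
• Long-document analysis (up to 1M tokens)
• Code generation and analysis
• Translation, summarization, question answering

// Images
• Image description and analysis
• OCR (text extraction)
• Object recognition
• Chart and diagram interpretation
• Processing multiple images at once

// Video (a distinctive Gemini strength)
• Video content understanding
• Frame-by-frame analysis
• Action recognition
• Video summarization

// Audio
• Speech recognition
• Audio content analysis
• Multilingual support

// Code execution
• Automatic execution of Python code
• Returning and reusing the results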

Getting Started

Getting an API Key

Steps

1. Open Google AI Studio
   https://aistudio.google.com

2. Sign in with a Google account

3. Click "Get API key"

4. Select "Create API key"
   • Create a new project, or
   • Choose an existing Google Cloud project

5. Copy the API key
   AIzaSy...

6. Store it securely
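For the final step, a common approach is to keep the key out of source code and read it from an environment variable. The sketch below assumes the variable is named `GEMINI_API_KEY`; the helpers `load_api_key` and `mask_key` are hypothetical names used only for illustration.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from an environment variable instead of source code."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def mask_key(key: str) -> str:
    """Return a log-safe form that shows only the first 6 characters."""
    return key[:6] + "..." if len(key) > 6 else "..."


if __name__ == "__main__":
    os.environ.setdefault("GEMINI_API_KEY", "AIzaSy-example-only")  # demo value
    print("Using key:", mask_key(load_api_key()))
```

The client can then be created with `genai.Client(api_key=load_api_key())`, and the key never appears in version control or logs.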

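Free Tier

Gemini 3 Flash: 15 requests per minute (the daily limit may change, so check the official pricing page)
Gemini 3.1 Pro family: no free tier as of April 1, 2026 (paid only)
For prototyping, Gemini 3 Flash or Gemini 3.1 Flash-Lite is recommended
Flash-family models can be used without billing information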
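To stay under a per-minute request limit such as the 15 requests per minute quoted above, requests can be spaced on the client side. The following is a minimal sliding-window sketch. The class name `RequestThrottle` and its parameters are illustrative assumptions, and the real limit should be taken from the official pricing page.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests=15, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._sent = deque()     # timestamps of recent requests

    def wait(self) -> float:
        """Block until a request slot is free; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        waited = 0.0
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest request leaves the window.
            waited = self.window - (now - self._sent[0])
            self._sleep(waited)
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
        return waited
```

Calling `throttle.wait()` immediately before each `generate_content` call keeps a single process within the window. It does not coordinate across multiple processes or machines sharing one key.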

Quick Start

Python

# Install
$ pip install google-genai

# Usage
from google import genai

client = genai.Client(api_key="AIzaSy...")

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Hello!"
)

print(response.text)

Text Generation

Basic Usage

Python

from google import genai
from google.genai import types

client = genai.Client(api_key="AIzaSy...")

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Implement quicksort in Python.",
    config=types.GenerateContentConfig(
        temperature=0.9,
        top_p=1,
        max_output_tokens=2048,
        system_instruction="You are a friendly AI assistant."
    )
)

print(response.text)

# Token usage
print(f"\nInput tokens: {response.usage_metadata.prompt_token_count}")
print(f"Output tokens: {response.usage_metadata.candidates_token_count}")

Multi-Turn Conversation

Python

from google import genai

client = genai.Client(api_key="AIzaSy...")
chat = client.chats.create(model="gemini-3-flash-preview")

# First message
response1 = chat.send_message("My name is Chulsoo.")
print("Gemini:", response1.text)

# Second message (context is preserved)
response2 = chat.send_message("What was my name?")
print("Gemini:", response2.text)  # the model should recall "Chulsoo"

# Inspect the conversation history
print("\nConversation history:")
for message in chat.get_history():
    print(f"{message.role}: {message.parts[0].text}")

Streaming

Python

from google import genai

client = genai.Client(api_key="AIzaSy...")

# Streaming generation
response = client.models.generate_content_stream(
    model="gemini-3-flash-preview",
    contents="Tell me a long story."
)

for chunk in response:
    # A chunk may carry no text (for example, a metadata-only chunk)
    print(chunk.text or "", end="", flush=True)

print()

Working with Multimodal Input

Image Input

Python

from google import genai
from PIL import Image

client = genai.Client(api_key="AIzaSy...")

# Load the image
image = Image.open("photo.jpg")

# Send the image together with text
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=["What is in this image? Please describe it in detail.", image]
)

print(response.text)

Multiple Images

Python

from google import genai
from PIL import Image

client = genai.Client(api_key="AIzaSy...")

# Load several images
image1 = Image.open("before.jpg")
image2 = Image.open("after.jpg")

# Compare the images
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=["Describe the differences between these two images:", image1, "vs", image2]
)

print(response.text)

Video Input (a Gemini Specialty)

Python

from google import genai
import time

client = genai.Client(api_key="AIzaSy...")

# 1. Upload the video file
video_file = client.files.upload(file="video.mp4")

print(f"Upload complete: {video_file.uri}")

# 2. Wait for processing to finish
while not video_file.state or video_file.state.name == "PROCESSING":
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

if video_file.state.name != "ACTIVE":
    raise ValueError("Video processing failed")

# 3. Analyze the video
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents=[video_file, "Summarize the main content of this video. Describe the important scenes with timestamps."]
)

print(response.text)

# 4. Delete the file (optional)
client.files.delete(name=video_file.name)

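Video Support Details

Supported formats: MP4, AVI, MOV, MPEG, and others
Maximum size: 2GB
Maximum length: no fixed limit (must fit within the context window)
Frame-by-frame analysis is possible
The audio track is analyzed together with the video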
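Because an upload that exceeds these limits only fails after the transfer, a quick local check can save time. The sketch below checks only the size and the four formats named above. The helper name `check_video_file` is hypothetical, and reading "2GB" as 2 GiB is an assumption.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 2 * 1024**3  # "2GB" from this guide, read as 2 GiB (assumption)
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mpeg"}  # only the formats named above


def check_video_file(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: file not found"]
    problems = []
    if p.suffix.lower() not in VIDEO_EXTENSIONS:
        problems.append(f"{p.suffix or '(no extension)'} is not in the checked format list")
    if p.stat().st_size > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 2GB")
    return problems


if __name__ == "__main__":
    for issue in check_video_file("video.mp4"):
        print("Problem:", issue)
```

Since the format list in this guide ends with "and others", a failed extension check here is a prompt to consult the official format list rather than proof that the file is unsupported.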

Audio Input

Python

from google import genai
import time

client = genai.Client(api_key="AIzaSy...")

# Upload the audio file for analysis
audio_file = client.files.upload(file="audio.mp3")

# Wait for processing to finish
while not audio_file.state or audio_file.state.name == "PROCESSING":
    time.sleep(5)
    audio_file = client.files.get(name=audio_file.name)

if audio_file.state.name != "ACTIVE":
    raise ValueError("Audio processing failed")

# Analyze the audio
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents=[audio_file, "Transcribe this audio to text and summarize it."]
)

print(response.text)
client.files.delete(name=audio_file.name)

Very Long Context

Large Document Analysis

Python

from google import genai

client = genai.Client(api_key="AIzaSy...")

# Load a long document (e.g. the text of a 200-page PDF)
with open("long_document.txt", "r", encoding="utf-8") as f:
    document = f.read()  # e.g. 500,000 tokens

print(f"Document length: {len(document)} characters")

# Gemini 3.1 Pro can process up to 1M tokens
# Ask a question about the whole document
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents=f"Here is a long document:\n\n{document}\n\nSummarize the main points of this document in five items."
)

print(response.text)

# Follow-up questions (the same document stays in the chat context)
chat = client.chats.create(model="gemini-3.1-pro-preview")

response1 = chat.send_message(f"Document:\n{document}")
response2 = chat.send_message("What is the key content of chapter 3?")
response3 = chat.send_message("Give a counterargument to the author's claims.")

print(response2.text)
print(response3.text)
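Before sending a very large document, it can help to confirm that it actually fits in the window. The sketch below relies on the SDK's `client.models.count_tokens` call and the `total_tokens` field of its result. The helper name `fits_in_context`, the `CONTEXT_LIMIT` constant, and the output reserve are illustrative assumptions.

```python
CONTEXT_LIMIT = 1_000_000  # the 1M-token window described in this guide


def fits_in_context(client, model: str, text: str,
                    reserve_for_output: int = 8_192) -> bool:
    """Return True if `text` plus an output reserve fits in the context window.

    `client` is expected to be a google-genai `genai.Client`; only its
    `models.count_tokens` method is used. `reserve_for_output` is an
    arbitrary illustrative margin, not an official value.
    """
    result = client.models.count_tokens(model=model, contents=text)
    return result.total_tokens + reserve_for_output <= CONTEXT_LIMIT
```

A call such as `fits_in_context(client, "gemini-3.1-pro-preview", document)` before `generate_content` lets the program fall back to chunking or summarizing instead of failing on an oversized request.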

Codebase Analysis

Python

from google import genai
import os

client = genai.Client(api_key="AIzaSy...")

# Collect every Python file in the project
codebase = ""
for root, dirs, files in os.walk("./src"):
    for file in files:
        if file.endswith(".py"):
            filepath = os.path.join(root, file)
            with open(filepath, "r", encoding="utf-8") as f:
                codebase += f"\n\n{'='*50}\n"
                codebase += f"File: {filepath}\n"
                codebase += f"{'='*50}\n"
                codebase += f.read()

print(f"Codebase size: {len(codebase)} characters")

# Analyze the whole codebase
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents=f"""Here is the complete code of a Python project:

{codebase}

Analyze this project and provide the following:
1. The project's main features
2. An architecture overview
3. Areas that need improvement
4. Security vulnerabilities
5. Suggestions for test coverage"""
)

print(response.text)

Function Calling

Defining Functions

Python

from google import genai
from google.genai import types

# The type hints and docstring are converted into the function declaration schema.
def get_weather(location: str, unit: str = "celsius") -> dict:
    """Get the current weather for a given location.

    Args:
        location: City name, e.g. Seoul
        unit: Temperature unit, celsius or fahrenheit
    """
    return {
        "location": location,
        "temperature": 22,
        "unit": unit,
        "condition": "Clear"
    }

client = genai.Client(api_key="AIzaSy...")
config = types.GenerateContentConfig(tools=[get_weather])

# The Python SDK handles the function call, execution, and result return automatically
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="What is the weather like in Seoul?",
    config=config
)

print(response.text)

Function Execution Loop

Python

from google import genai
from google.genai import types

# Actual function implementation
def get_weather(location: str, unit: str = "celsius") -> dict:
    """Get the current weather for a given location."""
    # Dummy data in place of a real API call
    return {
        "location": location,
        "temperature": 22,
        "unit": unit,
        "condition": "Clear"
    }

client = genai.Client(api_key="AIzaSy...")
config = types.GenerateContentConfig(
    tools=[get_weather],
    automatic_function_calling=types.AutomaticFunctionCallingConfig(disable=True)
)

prompt = "Tell me the weather in Seoul"
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=prompt,
    config=config
)

# Handle the function call
if response.function_calls:
    function_call = response.function_calls[0]

    # Run the function
    result = get_weather(**dict(function_call.args))

    # Return the result to Gemini, replaying the full turn sequence:
    # user prompt -> model function call -> function response
    response2 = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents=[
            types.Content(role="user", parts=[types.Part(text=prompt)]),
            response.candidates[0].content,
            types.Content(
                role="user",
                parts=[types.Part(
                    function_response=types.FunctionResponse(
                        name=function_call.name,
                        response={"result": result},
                        id=function_call.id
                    )
                )]
            )
        ],
        config=config
    )

    print(response2.text)  # e.g. "The weather in Seoul is clear at 22 degrees Celsius."
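When more than one function is exposed, the "run the function" step is usually a lookup by name rather than a hard-coded call. The sketch below shows one way to do that in plain Python. `get_time`, `TOOL_REGISTRY`, and `run_function_call` are hypothetical names introduced only for illustration.

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    """Dummy implementation with the same shape as the weather example."""
    return {"location": location, "temperature": 22, "unit": unit, "condition": "Clear"}


def get_time(timezone: str) -> dict:
    """Hypothetical second tool, added only to demonstrate dispatch."""
    return {"timezone": timezone, "time": "12:00"}


# Map the function name returned by the model to the local callable.
TOOL_REGISTRY = {"get_weather": get_weather, "get_time": get_time}


def run_function_call(name: str, args: dict) -> dict:
    """Execute one requested call and wrap the outcome for the function response."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return {"error": f"unknown function: {name}"}
    try:
        return {"result": func(**args)}
    except TypeError as exc:  # e.g. a missing or unexpected argument from the model
        return {"error": str(exc)}


if __name__ == "__main__":
    print(run_function_call("get_weather", {"location": "Seoul"}))
    print(run_function_call("get_stock_price", {"ticker": "GOOG"}))
```

Returning an `error` entry instead of raising lets the model see what went wrong and either correct its arguments or explain the failure to the user.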

Code Execution

Automatic Code Execution

Python

from google import genai
from google.genai import types

client = genai.Client(api_key="AIzaSy...")

# Gemini writes Python code and runs it
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="""Calculate the 20th term of the Fibonacci sequence.
Write Python code, run it, and tell me the result.""",
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())]
    )
)

print(response.text)

# Inspect the executed code
for part in response.candidates[0].content.parts:
    if part.executable_code is not None:
        print("\nExecuted code:")
        print(part.executable_code.code)
    if part.code_execution_result is not None:
        print("\nExecution result:")
        print(part.code_execution_result.output)

Use Cases for Code Execution

Complex mathematical calculations
Data analysis and visualization
Algorithm verification
File processing
Simulating API calls

Grounding (Google Search)

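Real-Time Information Retrieval

Python

# Note: in the google-genai SDK, Grounding with Google Search is configured
# as the built-in google_search tool. Availability and billing differ by
# platform and model, so check the official documentation.

from google import genai
from google.genai import types

client = genai.Client(api_key="AIzaSy...")

# Configure the Grounding tool
google_search_tool = types.Tool(google_search=types.GoogleSearch())

# A question that needs up-to-date information
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="What is the state of the world economy as of January 2026?",
    config=types.GenerateContentConfig(tools=[google_search_tool])
)

print(response.text)

# Check the search sources (grounding metadata is attached to each candidate)
metadata = response.candidates[0].grounding_metadata
if metadata and metadata.grounding_chunks:
    print("\nSearch sources:")
    for chunk in metadata.grounding_chunks:
        if chunk.web:
            print(f"- {chunk.web.uri}")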

Safety Settings

Content Filtering

Python

from google import genai
from google.genai import types

client = genai.Client(api_key="AIzaSy...")

# Safety settings
safety_settings = [
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    ),
]

# Threshold options:
# OFF, BLOCK_NONE, BLOCK_ONLY_HIGH, BLOCK_MEDIUM_AND_ABOVE, BLOCK_LOW_AND_ABOVE

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="...",
    config=types.GenerateContentConfig(safety_settings=safety_settings)
)

# Check the safety ratings
print(response.candidates[0].safety_ratings)

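Best Practices

Recommendations

// 1. Model selection
Simple tasks → gemini-3.1-flash-lite-preview (ultra-light, lowest price)
General tasks → gemini-3-flash-preview (speed/performance balance)
Long context / complex tasks → gemini-3.1-pro-preview (flagship reasoning)

// 2. Context optimization
• Cases that need the full 1M tokens are rare
• Send only the parts that are needed
• Use a summarize-first, then analyze-in-detail pattern

// 3. Using multimodal input
• Compress images before sending
• Upload video only for long files (for short clips, extract frames instead)
• Delete uploaded files afterward to manage storage

// 4. Reducing cost
• Make full use of the free tier (prototyping)
• Prompt caching (50% discount)
• Try the Flash-Lite model first

// 5. Error handling
try:
    response = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents=prompt
    )
    if response.prompt_feedback and response.prompt_feedback.block_reason:
        print("Blocked by the safety filter")
except genai.errors.ClientError as e:
    if e.code == 400:
        print("Check the request format or the safety policy")
    else:
        raise
except genai.errors.ServerError:
    print("Temporary server error. Retry with exponential backoff")

// 6. Handling rate limits
• Free tier: 15 requests per minute for Flash (free access to the Pro family ended on 2026-04-01)
• Paid tier: higher limits
• Check the official pricing page (https://ai.google.dev/pricing) for current limits
• Implement retries with exponential backoff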
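The exponential backoff mentioned above can be sketched as a small wrapper around any call. This is a minimal illustration rather than an SDK feature: the function name `retry_with_backoff`, the default delays, and the 10% jitter are all assumptions.

```python
import random
import time


def retry_with_backoff(call, retryable=(Exception,), max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `call()` and retry on `retryable` errors with exponential backoff.

    The delay before retry n (0-based) is min(max_delay, base_delay * 2**n)
    plus up to 10% random jitter. All defaults are illustrative.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the SDK this could be used as `retry_with_backoff(lambda: client.models.generate_content(model=..., contents=prompt), retryable=(genai.errors.ServerError,))`. Restricting `retryable` to transient errors avoids retrying requests that will never succeed, such as malformed ones.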

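Key Takeaways

Model selection: use Gemini 3.1 Flash-Lite (ultra-light) for simple tasks, Gemini 3 Flash (speed/performance balance) for general tasks, and Gemini 3.1 Pro (1M tokens, flagship reasoning) for complex analysis and very long inputs.
Native multimodality: Gemini can process text, images, video, and audio directly without separate preprocessing, and video analysis in particular is one of its differentiating strengths.
Very long context: the 1M-token context makes it possible to analyze documents of several hundred pages or an entire codebase in a single request.
Function Calling: connect to external APIs to automate real-time data lookups, calculations, and system integration. Make sure you understand the six-step flow (request → function selection → call → execution → result return → response).
Code execution: Gemini writes and runs Python code itself, returning results for mathematical calculations, data analysis, and algorithm verification right away.
Access paths: use Google AI Studio (free) for prototyping and Vertex AI (SLA, Grounding support) for enterprise use.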

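Practical Tips

Using the free tier: Gemini 3 Flash is free for up to 15 requests per minute. Details such as the daily limit can change, so always check the official pricing page. The Gemini 3.1 Pro family was removed from the free tier on April 1, 2026 and is now paid only.
Prompt caching: when the same system prompt or document is reused repeatedly, enabling prompt caching can cut the cost by 50%.
Adjusting safety settings: if the default safety filter is too strict, it can be relaxed to BLOCK_ONLY_HIGH. For production, BLOCK_MEDIUM_AND_ABOVE is recommended.
Managing uploaded files: after uploading video or audio files, delete them with client.files.delete(name=file.name) to manage storage.
Error handling: handle genai.errors.ClientError, genai.errors.ServerError, and prompt_feedback.block_reason separately, and implement retries with exponential backoff.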