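OpenAI API Guide

The OpenAI API is a REST API that provides programmatic access to the GPT models behind ChatGPT. It offers current models such as GPT-4o and o1, along with features such as function calling, vision, and batch processing, for building AI applications.

Update notice: Time-sensitive details such as models, pricing, versions, and policies are subject to change. Check the official documentation for the latest information.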

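Key points

GPT-4o: multimodal, fast, reasonably priced
o1: advanced reasoning model for complex problem solving
Official SDKs for Python and Node.js
Function Calling, Vision, Audio, and other features
Batch API: 50% lower cost than the synchronous API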

OpenAI API overview

Model lineup

Models

GPT-4 series (prices per 1M tokens):
┌──────────────────┬──────────┬──────────┬────────────────────┐
│ Model            │ Input    │ Output   │ Notes              │
├──────────────────┼──────────┼──────────┼────────────────────┤
│ gpt-4o           │ varies   │ varies   │ Multimodal, fast   │
│ gpt-4o-mini      │ varies   │ varies   │ Low cost, fast     │
│ gpt-4-turbo      │ varies   │ varies   │ General, stable    │
│ gpt-4            │ varies   │ varies   │ Legacy             │
└──────────────────┴──────────┴──────────┴────────────────────┘

Reasoning models (o series, prices per 1M tokens):
┌──────────────────┬──────────┬──────────┬────────────────────┐
│ Model            │ Input    │ Output   │ Notes              │
├──────────────────┼──────────┼──────────┼────────────────────┤
│ o1               │ varies   │ varies   │ Advanced reasoning │
│ o1-mini          │ varies   │ varies   │ Faster reasoning   │
└──────────────────┴──────────┴──────────┴────────────────────┘

GPT-3.5 (prices per 1M tokens):
┌──────────────────┬──────────┬──────────┬────────────────────┐
│ Model            │ Input    │ Output   │ Notes              │
├──────────────────┼──────────┼──────────┼────────────────────┤
│ gpt-3.5-turbo    │ varies   │ varies   │ Low cost, fast     │
└──────────────────┴──────────┴──────────┴────────────────────┘

Context windows:
• GPT-4o: 128K tokens
• o1: 200K tokens
• GPT-3.5-turbo: 16K tokens
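The context window limits the prompt and the generated output together, so it can help to check the budget before sending a request. The following is a minimal sketch under stated assumptions: the sizes are the approximate figures listed above with "K" read as 1,000 tokens, and `fits_context` is a hypothetical helper, not part of the OpenAI SDK.

```python
# Approximate context-window sizes from the list above ("K" read as 1,000 tokens)
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "o1": 200_000,
    "gpt-3.5-turbo": 16_000,
}

def fits_context(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    window = CONTEXT_WINDOWS[model]
    return prompt_tokens + max_output_tokens <= window

print(fits_context("gpt-4o", 120_000, 4_000))        # True
print(fits_context("gpt-3.5-turbo", 15_000, 2_000))  # False
```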

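Key features

Features

// Chat Completions (conversational AI)
• Multi-turn conversations
• Behavior control through system messages
• Streaming support

// Function Calling
• Calls to external APIs and functions
• Structured data extraction
• Parallel function calls

// Vision (image input)
• Image analysis and description
• OCR and chart interpretation
• Images passed by URL or Base64

// Audio
• Speech-to-text (Whisper API)
• Text-to-speech (TTS API)
• Multilingual support

// JSON Mode
• Returns syntactically valid JSON (output can still be truncated at the token limit)
• Structured data extraction

// Batch API
• Asynchronous bulk processing
• 50% cost reduction
• Completes within 24 hours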

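Getting started

Getting an API key

Steps

1. Go to the OpenAI platform
   https://platform.openai.com

2. Create an account or log in

3. Open the API Keys menu (left sidebar)

4. Click "Create new secret key"

5. Enter a name for the key (optional)

6. Copy the generated key (it is shown only once)
   sk-proj-...

7. Store it securely

New account credits

New accounts may receive a variable amount of free credit (valid for 3 months)
After registering a payment method, you can use additional credit
Rate limits depend on your usage tier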

Quick start

Python

# Install
$ pip install openai

# Usage
from openai import OpenAI

# Reads the key from the OPENAI_API_KEY environment variable;
# avoid hardcoding keys such as "sk-proj-..." in source code
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

Node.js

// Install
$ npm install openai

// Usage
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);

Chat Completions API

Basic usage

Python

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful AI assistant."
        },
        {
            "role": "user",
            "content": "Implement quicksort in Python."
        }
    ],
    temperature=0.7,  # 0.0 ~ 2.0
    max_tokens=1000,
    top_p=1.0,
    frequency_penalty=0.0,  # -2.0 ~ 2.0
    presence_penalty=0.0,   # -2.0 ~ 2.0
)

print(response.choices[0].message.content)

# Check token usage
print(f"\nInput: {response.usage.prompt_tokens}")
print(f"Output: {response.usage.completion_tokens}")
print(f"Total: {response.usage.total_tokens}")

Multi-turn conversation

Python

from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "system", "content": "You are a kind teacher."},
]

while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )

    assistant_message = response.choices[0].message.content
    print(f"Assistant: {assistant_message}\n")

    messages.append({"role": "assistant", "content": assistant_message})
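Because the loop above appends every turn, the message list grows without bound and will eventually exceed the context window. One way to cap it is sketched below, assuming a plain user/assistant history with string content and no tool messages. `trim_history` and `max_turns` are hypothetical names introduced for illustration.

```python
def trim_history(messages, max_turns=10):
    """Keep system messages plus the most recent max_turns user/assistant pairs."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]

    keep = 2 * max_turns  # one turn = a user message plus the assistant reply
    recent = dialogue[-keep:] if keep > 0 else []

    # Make sure the kept window starts with a user message
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]

    return system + recent
```

Calling `messages = trim_history(messages)` at the end of each loop iteration would keep only the latest turns. Summarizing the dropped turns instead of discarding them is another option.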

JSON Mode

Python

import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a tool that extracts data in JSON format."
        },
        {
            "role": "user",
            "content": """Extract the name, age, and job from the following text as JSON:
"Cheolsu is 28 years old and works as a software engineer."

Answer in JSON format only."""
        }
    ],
    response_format={"type": "json_object"}  # Enable JSON mode
)

data = json.loads(response.choices[0].message.content)
print(data)
# Example output (key names are chosen by the model and may differ):
# {'name': 'Cheolsu', 'age': 28, 'job': 'software engineer'}
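JSON mode constrains the reply to valid JSON syntax but does not enforce a particular schema, so it is worth validating the parsed result before using it. The sketch below assumes the three keys from the example above. `REQUIRED_KEYS` and `parse_person` are hypothetical names for this illustration.

```python
import json

REQUIRED_KEYS = {"name", "age", "job"}  # hypothetical schema for this example

def parse_person(raw: str) -> dict:
    """Parse a JSON-mode reply and check that it has the expected keys."""
    data = json.loads(raw)  # raises json.JSONDecodeError on truncated output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

In the example above, `parse_person(response.choices[0].message.content)` would replace the bare `json.loads` call.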

Streaming

Basic streaming

Python

from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a long story."}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no choices or no text, so check before printing
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

print()
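When the full text is needed after streaming (for example, to append it to the conversation history), the pieces can be collected as they arrive. The sketch below is illustrative only: `collect_stream` is a hypothetical helper, and `fake_chunk` builds stand-in objects with the same `choices[0].delta.content` shape so the example runs without a network call.

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Join the text pieces of a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # skip None and empty pieces
            parts.append(piece)
    return "".join(parts)

def fake_chunk(text):
    # Stand-in for an SDK chunk object, used only for this demonstration
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

chunks = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(", world"),
]
print(collect_stream(chunks))  # Hello, world
```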

Node.js streaming

TypeScript

const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a long story.' }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}

console.log();

Function Calling

Defining functions

Python

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name (e.g., Seoul)"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me the weather in Seoul"}],
    tools=tools,
    tool_choice="auto"  # "auto", "none", {"type": "function", "function": {"name": "..."}}
)

print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))

Function execution loop

Python

import json

def get_weather(location: str, unit: str = "celsius"):
    # Dummy data instead of a real API call
    return json.dumps({
        "location": location,
        "temperature": 22,
        "unit": unit,
        "forecast": "Sunny"
    })

messages = [{"role": "user", "content": "Compare the weather in Seoul and Busan"}]

while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools
    )

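    choice = response.choices[0]

    # Stop once the model answers without requesting any tool calls
    if not choice.message.tool_calls:
        print(choice.message.content)
        break

    # Keep the assistant turn that requested the calls in the history
    messages.append(choice.message)

    for tool_call in choice.message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        print(f"Calling {function_name} with {arguments}")

        if function_name == "get_weather":
            result = get_weather(**arguments)
        else:
            # Report unknown tools to the model instead of hitting a NameError
            result = json.dumps({"error": f"unknown function: {function_name}"})

        # Every tool_call_id needs a matching tool message
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": result
        })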
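As the number of tools grows, a chain of `if function_name == ...` checks becomes hard to maintain. One common alternative is a lookup table from tool name to callable, sketched below. `TOOL_REGISTRY` and `dispatch` are hypothetical names, and the error payload format is an assumption made for illustration rather than an API requirement.

```python
import json

def get_weather(location: str, unit: str = "celsius") -> str:
    # Dummy data, mirroring the example above
    return json.dumps({"location": location, "temperature": 22, "unit": unit})

# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch(name: str, arguments_json: str) -> str:
    """Run the named tool and always return a string for the tool message."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        return func(**json.loads(arguments_json))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or unexpected argument names from the model
        return json.dumps({"error": str(exc)})
```

Inside the loop, `result = dispatch(tool_call.function.name, tool_call.function.arguments)` would then cover every registered tool.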

Parallel function calls

Python

# GPT-4o can request several function calls in a single response
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Tell me the weather in Seoul, Busan, and Jeju"
    }],
    tools=tools
)

# tool_calls typically holds one call per city here; it is None when the
# model answers directly instead of calling a function
for tool_call in response.choices[0].message.tool_calls or []:
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")

Vision (image input)

Image from a URL

Python

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg",
                        "detail": "high"  # "low", "high", "auto"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Base64 image

Python

import base64

def encode_image(image_path):
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

image_base64 = encode_image("chart.png")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please analyze this chart."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_base64}"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
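The example above hardcodes `image/png` in the data URL, which is wrong for JPEG or other formats. A small helper can derive the MIME type from the file extension instead. This is a sketch using only the standard library, and `to_data_url` is a hypothetical name.

```python
import base64
import mimetypes

def to_data_url(image_path: str) -> str:
    """Encode an image file as a data URL with a MIME type guessed from its name."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image type: {image_path}")
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The result can be passed directly as the `url` value of an `image_url` content part.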

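o1 reasoning models

Using o1

Python

# o1 models are called the same way but accept fewer parameters
response = client.chat.completions.create(
    model="o1",  # or "o1-mini"
    messages=[
        {
            "role": "user",
            "content": """Please solve the following math problem:

A farmer keeps chickens and rabbits.
There are 20 heads and 56 legs in total.
How many chickens and how many rabbits are there?"""
        }
    ]
    # Note: o1 does not accept sampling parameters such as temperature or top_p
    # Note: system-message and streaming support depends on the model snapshot;
    #       the earliest o1 releases supported neither
)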

print(response.choices[0].message.content)

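o1 model characteristics

Performs chain-of-thought reasoning internally
Strong at complex math, science, and programming problems
Put instructions in the user message where system messages are not accepted
Sampling parameters such as temperature and top_p are not supported
Streaming and function calling were unavailable in the earliest o1 releases; support varies by model snapshot, so check the current model documentation
Responses can take longer than with GPT-4o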
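For a model that does not accept system messages, an existing message list can be adapted by moving the system text into the first user message. The following is a minimal sketch, assuming every message has plain string content. `fold_system_into_user` is a hypothetical helper, not an SDK function.

```python
def fold_system_into_user(messages):
    """Move system-message text to the front of the first user message."""
    instructions = [m["content"] for m in messages if m["role"] == "system"]
    rest = [dict(m) for m in messages if m["role"] != "system"]  # shallow copies
    if not instructions:
        return rest

    prefix = "\n\n".join(instructions)
    for m in rest:
        if m["role"] == "user":
            m["content"] = f"{prefix}\n\n{m['content']}"
            return rest

    # No user message to attach to: send the instructions as a user message
    return [{"role": "user", "content": prefix}] + rest
```

The input list is left unmodified, so the same history can still be sent to a model that does accept system messages.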

Batch API

Creating a batch

Python

import json

# 1. Create the batch request file (JSONL)
batch_requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": "Hello"}]
        }
    },
    {
        "custom_id": "request-2",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": "How is the weather?"}]
        }
    }
]

with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    for req in batch_requests:
        f.write(json.dumps(req, ensure_ascii=False) + "\n")

# 2. Upload the file (the with-block closes the handle after the upload)
with open("batch_input.jsonl", "rb") as f:
    batch_file = client.files.create(file=f, purpose="batch")

# 3. Create the batch
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

print(f"Batch ID: {batch.id}")
print(f"Status: {batch.status}")  # validating, in_progress, completed, failed
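In practice the request list is usually generated from data rather than written by hand. The sketch below builds one JSONL line per prompt in the same shape as the requests above. `build_batch_lines` is a hypothetical helper, and the `request-N` naming for `custom_id` is an arbitrary choice for this example.

```python
import json

def build_batch_lines(prompts, model="gpt-4o"):
    """Return one JSONL line per prompt, each with a unique custom_id."""
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines

for line in build_batch_lines(["Hello", "How is the weather?"]):
    print(line)
```

Writing `"\n".join(lines) + "\n"` to `batch_input.jsonl` produces a file ready for the upload step.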

Retrieving batch results

Python

# Check the batch status
batch = client.batches.retrieve(batch.id)
print(f"Status: {batch.status}")

if batch.status == "completed":
    # Download the result file
    result_file_id = batch.output_file_id
    result = client.files.content(result_file_id)

    # Parse the results: one JSON object per line, not necessarily in input order
    for line in result.text.strip().split("\n"):
        result_obj = json.loads(line)
        custom_id = result_obj["custom_id"]
        content = result_obj["response"]["body"]["choices"][0]["message"]["content"]

        print(f"{custom_id}: {content}")
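Individual requests in a batch can fail even when the batch as a whole completes, so indexing results by `custom_id` and separating failures is often safer than assuming every line has a successful body. The sketch below works on the downloaded JSONL text. The `error` and `status_code` fields reflect the output format as commonly documented and should be treated as assumptions to verify, and `index_batch_results` is a hypothetical helper.

```python
import json

def index_batch_results(jsonl_text: str):
    """Split batch output into {custom_id: answer} and {custom_id: error info}."""
    answers, errors = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        response = obj.get("response")
        # Assumed format: failed requests carry "error" or a non-200 status_code
        if obj.get("error") or not response or response.get("status_code") != 200:
            errors[obj["custom_id"]] = obj.get("error") or response
            continue
        body = response["body"]
        answers[obj["custom_id"]] = body["choices"][0]["message"]["content"]
    return answers, errors
```

With the retrieval code above, `answers, errors = index_batch_results(result.text)` gives a lookup that does not depend on line order.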

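Batch API advantages

50% lower cost than the synchronous API
Well suited to bulk processing
Completes within 24 hours
Uses a separate rate-limit pool, so it does not consume your synchronous API limits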

Cost management

Pricing overview

Pricing

Synchronous API (prices per 1M tokens):
┌──────────────────┬──────────┬──────────┬────────────────┐
│ Model            │ Input    │ Output   │ Cost (example) │
├──────────────────┼──────────┼──────────┼────────────────┤
│ gpt-4o           │ varies   │ varies   │ varies         │
│ gpt-4o-mini      │ varies   │ varies   │ varies         │
│ o1               │ varies   │ varies   │ varies         │
│ gpt-3.5-turbo    │ varies   │ varies   │ varies         │
└──────────────────┴──────────┴──────────┴────────────────┘
Example: 2K input + 500 output tokens

Batch API (50% discount, prices per 1M tokens):
┌──────────────────┬──────────┬──────────┐
│ Model            │ Input    │ Output   │
├──────────────────┼──────────┼──────────┤
│ gpt-4o           │ varies   │ varies   │
│ gpt-4o-mini      │ varies   │ varies   │
└──────────────────┴──────────┴──────────┘
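The per-request cost for the 2K input + 500 output example follows directly from the per-1M-token prices. The sketch below works through the arithmetic. The prices passed in (2.50 and 10.0 USD per 1M tokens) are illustrative values only, matching the placeholder figures in this guide's cost-tracking code, and `request_cost` is a hypothetical helper.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price, batch=False):
    """Cost in USD, given prices in USD per 1M tokens."""
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    # The Batch API is billed at a 50% discount
    return cost * 0.5 if batch else cost

# 2K input + 500 output tokens at the illustrative prices
print(f"{request_cost(2_000, 500, 2.50, 10.0):.4f}")              # 0.0100
print(f"{request_cost(2_000, 500, 2.50, 10.0, batch=True):.4f}")  # 0.0050
```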

Cost tracking

Python

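import re

class CostTracker:
    # Example prices in USD per 1M tokens as (input, output); real prices
    # change, so update these from the official pricing page
    PRICING = {
        "gpt-4o": (2.50, 10.0),
        "gpt-4o-mini": (0.15, 0.60),
        "o1": (15.0, 60.0),
        "gpt-3.5-turbo": (0.50, 1.50),
    }

    def __init__(self):
        self.total_cost = 0.0

    def track(self, response):
        # response.model is usually a dated snapshot such as "gpt-4o-2024-08-06";
        # strip the date suffix so it matches the PRICING keys
        model = re.sub(r"-(\d{4}-\d{2}-\d{2}|\d{4})$", "", response.model)
        input_tokens = response.usage.prompt_tokens
        output_tokens = response.usage.completion_tokens

        # Models missing from PRICING are counted as zero cost
        input_price, output_price = self.PRICING.get(model, (0, 0))
        cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000

        self.total_cost += cost

        print(f"Request cost: ${cost:.6f}")
        print(f"Total cost: ${self.total_cost:.6f}")

        return cost

# Usage
tracker = CostTracker()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hi"}]
)

tracker.track(response)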

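Cost optimization tips

Optimization

// 1. Choose the right model
Simple tasks → gpt-4o-mini
General tasks → gpt-4o
Complex reasoning → o1

// 2. Use the Batch API
50% savings on bulk processing

// 3. Limit max_tokens
response = client.chat.completions.create(
    model="gpt-4o",
    max_tokens=500,  # cap the output length
    ...
)

// 4. Remove unnecessary context
• Drop old conversation history
• Summarize long context, then remove the original

// 5. Set usage limits
Set a monthly budget limit in the OpenAI dashboard

// 6. Semantic caching (external)
Cache identical or similar queries (Redis, Pinecone, etc.)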

Best practices

Recommendations

// 1. Error handling
from openai import OpenAI, APIError, RateLimitError
import time

def call_with_retry(client, **kwargs):
    max_retries = 3
    for i in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if i < max_retries - 1:
                time.sleep(2 ** i)  # exponential backoff
            else:
                raise
        except APIError as e:
            print(f"API Error: {e}")
            raise

// 2. Set a timeout
client = OpenAI(timeout=30.0)  # 30 seconds

// 3. API key security
import os
api_key = os.getenv("OPENAI_API_KEY")

// 4. Prompt optimization
• Give clear, specific instructions
• Use few-shot examples
• State the output format explicitly

// 5. Monitoring
• Check the usage dashboard regularly
• Watch for abnormal usage patterns
• Set budget alerts
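The retry pattern under item 1 can be generalized so the delay schedule is testable and includes random jitter, which helps avoid many clients retrying at the same instant. The sketch below is independent of the OpenAI SDK. `retry_with_backoff` and its parameters are hypothetical names, and in real use `retryable` would be set to the SDK's `RateLimitError`.

```python
import random
import time

def retry_with_backoff(func, retryable=(Exception,), max_retries=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt (base_delay, 2x, 4x, ...) plus up to 1s jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

A call such as `retry_with_backoff(lambda: client.chat.completions.create(**kwargs), retryable=(RateLimitError,))` would wrap the request from the example above. Passing a custom `sleep` makes the function easy to exercise without real waiting.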

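Working with Codex

You can use the OpenAI API to build coding-agent workflows like Codex. For an overview of Codex and its security flow, see the dedicated guide.

Codex is a coding agent that writes, modifies, and tests code. It can run tasks in parallel in a cloud sandbox or work locally from the CLI or an IDE.

Codex guide - overview, usage flow, security and governance
CLI best practices - approval modes and safe workflows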

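Summary

This guide covers the core concepts and workflow of the OpenAI API.
It walks through the API overview step by step.
It notes the criteria and caveats to check before applying the API in practice.

Practical tips

Pin your input/output examples to keep results reproducible.
Start with a small scope for your OpenAI API integration and expand it step by step.
Document your assumptions about the OpenAI API (models, limits, pricing) to shorten response time when something changes.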