Chapter 3: Building the Framework - The BrightWorks AI Platform

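Why a Small Success Becomes a Big Problem

The multi-agent system from Chapter 2 worked so well that it created a problem. Once the BrightWorks content team reported its results, the marketing team said, "We want to try AI agents too," and the sales team asked, "We'd like to use AI for writing customer proposals." Even the HR team asked, "Could we generate job postings automatically?"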

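Trying to accommodate each team made the problem clear. The current system is hardcoded for a single purpose: content production. Adding a new agent means opening `orchestrator.py` and editing the code by hand, and connecting tools such as Slack or Google Docs means implementing the integration separately inside each agent. At this rate, supporting five teams would mean building five entirely separate systems.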

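This is like every company building its own power plant. What we need instead is shared AI infrastructure that works like a power grid for all teams. Adding a new agent should be as simple as plugging in an appliance, and connecting a new tool should be as simple as installing an app. That is the core value of an AI platform.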

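Limitations of the Chapter 2 Multi-Agent System

Limited scalability: adding an agent requires editing the code in orchestrator.py
Difficult tool integration: each agent implements its own connection to external tools such as Google Docs and Slack
Duplicated code: API key management, error handling, and similar logic are repeated in every agent
No monitoring: real-time progress, cost, and performance cannot be tracked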

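Solution: Build a Unified AI Platform

The BrightWorks AI platform is infrastructure that manages agents and tools dynamically. New agents and tools can be added without modifying existing code.

This platform approach cuts the time needed to add a new agent by roughly 96-97%, from 2-3 hours to about 5 minutes, and it integrates 10 external tools (Google Docs, Slack, Analytics, and others) in a standardized way. Above all, real-time cost monitoring helps prevent budget overruns.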

Platform Architecture

graph TB
    subgraph Platform["BrightWorks AI Platform"]
        Registry["Agent Registry<br/>- Dynamic registration<br/>- Versioning<br/>- Metadata"]
        Library["Tool Library<br/>- Google Docs<br/>- Slack<br/>- Analytics"]

        Registry --> |Provides agents| Engine
        Library --> |Provides tools| Engine

        Engine["Workflow Engine<br/>- Sequential Execution<br/>- Parallel Execution<br/>- Error Recovery"]

        Engine --> Monitor["Monitoring & Analytics<br/>- Logs<br/>- Metrics<br/>- Costs"]
    end

    User[User request] --> Engine
    Engine --> Result[Final result]

    style Platform fill:#f5f5f5,stroke:#333,stroke-width:2px
    style Registry fill:#fff3e0
    style Library fill:#e8f5e9
    style Engine fill:#e3f2fd
    style Monitor fill:#fce4ec
    style User fill:#c8e6c9
    style Result fill:#c8e6c9

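Part 1: Agent Registry - Managing Agents Dynamically

Think of the Agent Registry as a registration office for agents. Like a company's HR system, it centrally manages who plays which role, which version is in use, and what inputs each agent accepts and what outputs it produces.

Why does this matter? In Chapter 2, adding a new agent meant editing the code directly. With the Agent Registry, an agent is available as soon as it is registered, much as a new hire gets a company email account and access to work tools once they are entered into the HR system. This is especially helpful when several teams use different agents: it prevents confusion and lets multiple versions of the same agent be managed side by side.

The Agent Registry lets agents be registered, looked up, and executed at runtime.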

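Core Features

Agent registration: add new agents without modifying code
Auto-discovery: load agents automatically by scanning a directory
Versioning: manage multiple versions of the same agent
Metadata: define each agent's role and input/output types

Implementation (platform/agent_registry.py)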

from typing import List, Optional, Type


class AgentRegistry:
    """Dynamic registration and management of agents."""

    def __init__(self):
        # Maps "name:version" to the agent class and its metadata.
        self._agents = {}

    def register(
        self,
        name: str,
        agent_class: Type,
        version: str = "v1",
        input_type: Optional[Type] = None,
        output_type: Optional[Type] = None,
        description: str = ""
    ):
        """Register an agent class under the key name:version."""
        key = f"{name}:{version}"
        self._agents[key] = {
            "class": agent_class,
            "input_type": input_type,
            "output_type": output_type,
            "description": description,
            "instance": None  # created lazily on the first get()
        }

    def get(self, name: str, version: str = "v1"):
        """Return the agent instance (a singleton per name:version)."""
        key = f"{name}:{version}"
        if key not in self._agents:
            raise ValueError(f"Agent {key} not found")

        agent_info = self._agents[key]
        if agent_info["instance"] is None:
            agent_info["instance"] = agent_info["class"]()

        return agent_info["instance"]

    def list_agents(self) -> List[str]:
        """List the keys of all registered agents."""
        return list(self._agents.keys())

Usage Example

# Initialize the platform
platform = BrightWorksPlatform()

# Register agents
# (ResearchAgent, WriterAgent, ResearchResult, and DraftPost are assumed
# to be defined elsewhere in the project.)
platform.registry.register(
    name="research",
    agent_class=ResearchAgent,
    input_type=dict,
    output_type=ResearchResult,
    description="Research and trend analysis"
)

platform.registry.register(
    name="writer",
    agent_class=WriterAgent,
    input_type=ResearchResult,
    output_type=DraftPost,
    description="Content writing"
)

# Run an agent (topic and keywords are supplied by the caller)
research_agent = platform.registry.get("research")
result = research_agent.research(topic, keywords)
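
The auto-discovery feature listed among the core features is not shown in the chapter's code. The following is a minimal sketch of one way a directory scan could work, assuming a convention that agent classes are named with an `Agent` suffix. The function name `discover_agents` and that naming convention are assumptions made for illustration, not part of the original platform.

```python
import importlib.util
import inspect
from pathlib import Path


def discover_agents(directory: str, suffix: str = "Agent") -> dict:
    """Import every *.py file in `directory` and collect classes named *Agent.

    Returns a mapping such as {"research": ResearchAgent}.
    The suffix convention is an assumption of this sketch.
    """
    found = {}
    for path in sorted(Path(directory).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private modules
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for cls_name, cls in inspect.getmembers(module, inspect.isclass):
            defined_here = cls.__module__ == module.__name__
            if defined_here and cls_name.endswith(suffix) and cls_name != suffix:
                # "ResearchAgent" is registered under the name "research".
                found[cls_name[: -len(suffix)].lower()] = cls
    return found


# Hypothetical usage with the registry from this chapter:
# for name, cls in discover_agents("agents").items():
#     platform.registry.register(name=name, agent_class=cls)
```

Importing a file executes its top-level code, so a scan like this should only point at directories you trust.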

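Part 2: Tool Library - Integrating 10 External Tools

No matter how capable an AI agent is, it can do very little on its own. Generated content has to be stored somewhere (Google Docs), the team has to be notified (Slack), and results have to be measured (Analytics). But when each agent connects to these tools directly, the code becomes complicated and hard to maintain.

The Tool Library standardizes how tools are used. Just as electrical appliances all use the same plug, every tool is called the same way, in the form `library.execute("google_docs", "create", ...)`. If you later want to replace Google Docs with Notion, you only change the implementation in the Tool Library; the agent code does not need to change at all.

This is more than a technical convenience; it is central to business flexibility. When the company moves from Slack to Microsoft Teams or adopts a new CRM system, there is no need to rebuild every agent. Updating the Tool Library is enough, which helps the organization adapt quickly to change.

The Tool Library exposes the external tools that agents can use through a standardized interface.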

Integrated Tools

Content Tools

• Google Docs API
• Medium API
• WordPress REST API

Communication

• Slack Webhooks
• Email (SMTP)
• Discord Webhooks

Data Analytics

• Google Analytics API
• Google Search Console

Storage

• Google Cloud Storage
• Firebase Storage

Tool Abstraction (platform/tools/base.py)

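from abc import ABC, abstractmethod
from typing import Any

from googleapiclient.discovery import build  # google-api-python-client


class BaseTool(ABC):
    """Base class for all tools."""

    @abstractmethod
    def execute(self, action: str, **kwargs) -> Any:
        """Run the named action on the tool."""

    @abstractmethod
    def validate_config(self) -> bool:
        """Check that the tool's configuration is valid."""


class GoogleDocsTool(BaseTool):
    """Google Docs tool."""

    # _load_credentials, _read_document, and _update_document are
    # not shown in this excerpt.

    def __init__(self, credentials_path: str):
        self.creds = self._load_credentials(credentials_path)
        self.docs_service = build('docs', 'v1', credentials=self.creds)

    def validate_config(self) -> bool:
        """Minimal check (an assumption): credentials were loaded."""
        return self.creds is not None

    def execute(self, action: str, **kwargs) -> Any:
        """Dispatch a Google Docs action."""
        if action == "create":
            return self._create_document(kwargs['title'], kwargs['content'])
        elif action == "read":
            return self._read_document(kwargs['document_id'])
        elif action == "update":
            return self._update_document(kwargs['document_id'], kwargs['content'])
        # Fail loudly instead of silently returning None.
        raise ValueError(f"Unknown action: {action}")

    def _create_document(self, title: str, content: str):
        """Create a new document and return its ID."""
        doc = self.docs_service.documents().create(
            body={'title': title}
        ).execute()
        # Add the content
        self._update_document(doc['documentId'], content)
        return doc['documentId']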

Tool Registry

class ToolLibrary:
    """Library that manages tools."""

    def __init__(self):
        self._tools = {}

    def register_tool(self, name: str, tool: BaseTool):
        """Register a tool under a name."""
        self._tools[name] = tool

    def get_tool(self, name: str) -> BaseTool:
        """Look up a tool by name."""
        if name not in self._tools:
            raise ValueError(f"Tool {name} not found")
        return self._tools[name]

    def execute(self, tool_name: str, action: str, **kwargs):
        """Run a tool action (convenience method)."""
        tool = self.get_tool(tool_name)
        return tool.execute(action, **kwargs)


# Usage example
# (creds_path, webhook_url, and final_post are supplied by the caller)
library = ToolLibrary()
library.register_tool("google_docs", GoogleDocsTool(creds_path))
library.register_tool("slack", SlackTool(webhook_url))

# An agent uses the tools
doc_id = library.execute(
    "google_docs",
    "create",
    title="AI Marketing Strategy",
    content=final_post.content
)

library.execute(
    "slack",
    "send_message",
    channel="#content-team",
    text=f"New blog post created: {final_post.title}"
)
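
The claim that a backend can be swapped without touching agent code can be illustrated with two in-memory stand-ins. This is only a sketch: `InMemoryDocs` and `InMemoryNotes` are hypothetical classes that imitate two providers, and registering the tool under a neutral name (`"docs"`) rather than a vendor name is an assumption of this example, since the chapter itself registers the tool as `"google_docs"`.

```python
from abc import ABC, abstractmethod
from typing import Any


class DocsBackend(ABC):
    """Shared interface: every backend accepts the same action names."""

    @abstractmethod
    def execute(self, action: str, **kwargs) -> Any:
        ...


class InMemoryDocs(DocsBackend):
    """Hypothetical stand-in for one provider; stores documents in a dict."""

    def __init__(self):
        self.docs = {}

    def execute(self, action: str, **kwargs) -> Any:
        if action == "create":
            doc_id = f"mem-{len(self.docs) + 1}"
            self.docs[doc_id] = (kwargs["title"], kwargs["content"])
            return doc_id
        raise ValueError(f"Unknown action: {action}")


class InMemoryNotes(DocsBackend):
    """Hypothetical stand-in for a second provider with its own ID scheme."""

    def __init__(self):
        self.notes = []

    def execute(self, action: str, **kwargs) -> Any:
        if action == "create":
            self.notes.append((kwargs["title"], kwargs["content"]))
            return f"note-{len(self.notes)}"
        raise ValueError(f"Unknown action: {action}")


def publish(tools: dict, title: str, content: str) -> str:
    """Agent-side code: it knows only the tool name and the action."""
    return tools["docs"].execute("create", title=title, content=content)


tools = {"docs": InMemoryDocs()}
first_id = publish(tools, "Draft", "Hello")

tools["docs"] = InMemoryNotes()  # swap the backend; publish() is unchanged
second_id = publish(tools, "Draft", "Hello")
```

The swap only works cleanly if both backends honor the same action names and keyword arguments, which is what the shared base class is meant to enforce.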

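Part 3: Workflow Engine - A Flexible Execution Engine

This is where the most innovative part comes in. So far, we have hardcoded the sequence in Python: run Research, pass the result to Writer, then pass that result to Editor. But the content team and the marketing team have different workflows, and the sales team needs yet another sequence. Should every team get its own Python file?

The core idea of the Workflow Engine is to define workflows through configuration rather than code. With YAML, a simple text-based format, even people without programming experience can create and modify workflows, much like writing a formula in Excel.

The business impact is significant: each team can experiment with its own workflows independently. If the marketing team wants to test a new workflow such as "Research → Writer → simultaneous publishing to social media," it only needs to edit a single YAML file, with no help from a developer. This shortens the organization's experiment cycle and speeds up innovation.

The Workflow Engine defines the order and manner of agent execution declaratively.

Workflow Definition (YAML)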

# workflows/content_pipeline.yaml

name: "Content Creation Pipeline"
description: "Research → Write → Edit → Publish"

steps:
  - name: research
    agent: research:v1
    input:
      topic: "${input.topic}"
      keywords: "${input.keywords}"
    output: research_result

  - name: write
    agent: writer:v1
    input:
      research: "${research.output}"
      keywords: "${input.keywords}"
    output: draft_post

  - name: edit
    agent: editor:v1
    input:
      draft: "${write.output}"
    output: final_post

  - name: publish
    tool: google_docs
    action: create
    input:
      title: "${edit.output.title}"
      content: "${edit.output.content}"
    output: document_id

  - name: notify
    tool: slack
    action: send_message
    input:
      channel: "#content-team"
      text: "New post published: ${edit.output.title} (Google Docs: ${publish.output})"

on_error:
  strategy: retry
  max_retries: 3
  fallback: slack_notification
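
The YAML above never states a step type, yet the executor that follows reads `step.type`, `workflow.steps`, and `workflow.on_error`. One plausible reading is that the type is inferred from whether a step has an `agent` or a `tool` key. The sketch below shows a `Workflow` wrapper built on that assumption; the `Step` dataclass and the inference rule are illustrative guesses, and the `config` dict stands in for what a YAML loader would return.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    name: str
    type: str  # "agent" or "tool", inferred from the keys present
    input: dict = field(default_factory=dict)
    agent: Optional[str] = None
    tool: Optional[str] = None
    action: Optional[str] = None
    output: Optional[str] = None


class Workflow:
    """Thin wrapper around the parsed workflow configuration."""

    def __init__(self, config: dict):
        self.name = config["name"]
        self.description = config.get("description", "")
        self.on_error = config.get("on_error")
        self.steps = [self._parse_step(raw) for raw in config["steps"]]

    @staticmethod
    def _parse_step(raw: dict) -> Step:
        # Exactly one of "agent" or "tool" must be present.
        if ("agent" in raw) == ("tool" in raw):
            raise ValueError(
                f"Step {raw.get('name')!r} needs exactly one of 'agent' or 'tool'"
            )
        step_type = "agent" if "agent" in raw else "tool"
        if step_type == "tool" and "action" not in raw:
            raise ValueError(f"Tool step {raw.get('name')!r} needs an 'action'")
        return Step(
            name=raw["name"],
            type=step_type,
            input=raw.get("input", {}),
            agent=raw.get("agent"),
            tool=raw.get("tool"),
            action=raw.get("action"),
            output=raw.get("output"),
        )


# A dict shaped like the parsed YAML (shortened to two steps).
config = {
    "name": "Content Creation Pipeline",
    "steps": [
        {"name": "research", "agent": "research:v1",
         "input": {"topic": "${input.topic}"}, "output": "research_result"},
        {"name": "publish", "tool": "google_docs", "action": "create",
         "input": {"title": "${edit.output.title}"}, "output": "document_id"},
    ],
    "on_error": {"strategy": "retry", "max_retries": 3},
}
workflow = Workflow(config)
```

Validating steps at load time means a typo in the YAML surfaces before any agent runs, rather than halfway through a pipeline.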

Workflow Executor

import yaml  # PyYAML


class WorkflowEngine:
    """Workflow execution engine."""

    # Workflow, _execute_tool_step, _handle_error, and _get_from_context
    # are not shown in this excerpt.

    def __init__(self, registry: AgentRegistry, library: ToolLibrary):
        self.registry = registry
        self.library = library

    def load_workflow(self, yaml_path: str) -> Workflow:
        """Load a workflow from a YAML file."""
        with open(yaml_path, encoding="utf-8") as f:
            config = yaml.safe_load(f)
        return Workflow(config)

    def execute(self, workflow: Workflow, inputs: dict) -> dict:
        """Run the steps in order, storing each output in a shared context."""
        context = {"input": inputs}

        for step in workflow.steps:
            try:
                if step.type == "agent":
                    result = self._execute_agent_step(step, context)
                elif step.type == "tool":
                    result = self._execute_tool_step(step, context)
                else:
                    raise ValueError(f"Unknown step type: {step.type}")

                context[step.name] = {"output": result}

            except Exception as e:
                if workflow.on_error:
                    self._handle_error(step, e, workflow.on_error)
                else:
                    raise

        return context

    def _execute_agent_step(self, step, context):
        """Run an agent step."""
        # step.agent looks like "research:v1", while registry.get()
        # takes the name and version as separate arguments.
        name, _, version = step.agent.partition(":")
        agent = self.registry.get(name, version or "v1")
        inputs = self._resolve_variables(step.input, context)
        return agent.execute(**inputs)

    def _resolve_variables(self, template, context):
        """Substitute variables of the form ${step.output.field}."""
        # e.g. "${research.output.trends}" -> context['research']['output']['trends']
        result = {}
        for key, value in template.items():
            # Look for "${" anywhere in the string so that text with
            # embedded references (the notify step's message) is handled too.
            if isinstance(value, str) and "${" in value:
                result[key] = self._get_from_context(value, context)
            else:
                result[key] = value
        return result
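
The lookup helper `_get_from_context` is referenced above but not shown. As a rough sketch of what such resolution could look like, the standalone functions below handle two cases: a value that is exactly one `${...}` reference keeps the referenced object's type, and a reference embedded in longer text is converted to a string. The names `lookup` and `resolve` are hypothetical, and this is one possible design rather than the platform's actual implementation.

```python
import re
from typing import Any

# Matches ${dotted.path} and captures the path.
_REF = re.compile(r"\$\{([^}]+)\}")


def lookup(path: str, context: dict) -> Any:
    """Follow a dotted path such as 'edit.output.title' through the context."""
    value: Any = context
    for part in path.split("."):
        # Dicts are indexed by key; other objects are read by attribute.
        value = value[part] if isinstance(value, dict) else getattr(value, part)
    return value


def resolve(value: Any, context: dict) -> Any:
    """Replace ${...} references inside a string, dict, or list."""
    if isinstance(value, dict):
        return {k: resolve(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, context) for v in value]
    if not isinstance(value, str):
        return value
    whole = _REF.fullmatch(value)
    if whole:
        # The entire value is one reference: return the object itself.
        return lookup(whole.group(1), context)
    # References embedded in text are interpolated as strings.
    return _REF.sub(lambda m: str(lookup(m.group(1), context)), value)


context = {
    "input": {"keywords": ["AI agents", "automation"]},
    "edit": {"output": {"title": "Hello"}},
    "publish": {"output": "doc-1"},
}
keywords = resolve("${input.keywords}", context)
message = resolve(
    "New post published: ${edit.output.title} (Google Docs: ${publish.output})",
    context,
)
```

A missing step or field raises `KeyError` or `AttributeError` here; a production version would likely wrap that in an error that names the offending reference.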

Execution Example

# Initialize the platform
platform = BrightWorksPlatform()

# Load the workflow
workflow = platform.engine.load_workflow("workflows/content_pipeline.yaml")

# Run it
result = platform.engine.execute(workflow, {
    "topic": "Digital Marketing in the Age of AI Agents",
    "keywords": ["AI agents", "digital marketing", "automation"]
})

print(f"Published Document ID: {result['publish']['output']}")
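
The workflow's `on_error` block asks for a retry strategy with `max_retries: 3` and a fallback, but the error handler itself is not shown. The sketch below is one way such a policy could be expressed as a plain function. It assumes that `max_retries` counts retries after the first attempt (so up to four calls in total) and that the fallback receives the last exception; both are interpretations, and `run_with_retry` is a hypothetical name.

```python
import time
from typing import Any, Callable, Optional


def run_with_retry(
    fn: Callable[[], Any],
    max_retries: int = 3,
    fallback: Optional[Callable[[Exception], Any]] = None,
    base_delay: float = 0.0,
) -> Any:
    """Call fn(); on failure, retry up to max_retries more times.

    If every attempt fails, return fallback(last_error) when a fallback
    is given, otherwise re-raise the last error.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as e:
            last_error = e
            if attempt < max_retries and base_delay > 0:
                # Exponential backoff between attempts.
                time.sleep(base_delay * (2 ** attempt))
    if fallback is not None:
        return fallback(last_error)
    raise last_error


# A step that fails twice and then succeeds.
calls = []

def flaky_step():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("temporary failure")
    return "ok"

outcome = run_with_retry(flaky_step, max_retries=3)
```

Retrying is only safe for steps that can be repeated without side effects; retrying a publish step, for example, could create duplicate documents.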

Part 4: Monitoring & Analytics

Real-time monitoring tracks cost, performance, and errors.

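Collected Metrics

Cost tracking: number of API calls, token usage, estimated cost
Performance: execution time of each agent and tool
Quality: quality score, number of edits, user satisfaction
Errors: failure rate, number of retries, error types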
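
The "estimated cost" metric is typically derived from token counts and a per-token price. Here is a minimal sketch of that arithmetic. The `Pricing` class is hypothetical and the rates are placeholder values chosen to make the numbers easy to follow, not real prices for any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. Placeholder values, not real prices."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimated cost = input tokens x input rate + output tokens x output rate."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


PLACEHOLDER = Pricing(input_per_million=1.0, output_per_million=2.0)

# (2000 * 1.0 + 500 * 2.0) / 1,000,000 = 0.003 USD
cost = estimate_cost(2000, 500, PLACEHOLDER)
```

Input and output tokens are priced separately because providers commonly charge different rates for each, which is why the monitor records both counts.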

Implementation (platform/monitoring.py)

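from collections import defaultdict
from datetime import datetime

import numpy as np


class Monitor:
    """Monitoring and logging."""

    def __init__(self):
        self.metrics = defaultdict(list)

    def log_agent_execution(
        self,
        agent_name: str,
        duration: float,
        input_tokens: int,
        output_tokens: int,
        cost: float
    ):
        """Record one agent execution."""
        self.metrics["executions"].append({
            "agent": agent_name,
            "timestamp": datetime.now(),
            "duration": duration,
            "tokens": {"input": input_tokens, "output": output_tokens},
            "cost": cost
        })

    def get_dashboard_data(self):
        """Aggregate data for the dashboard."""
        executions = self.metrics["executions"]
        durations = [e["duration"] for e in executions]

        return {
            "total_cost": sum(e["cost"] for e in executions),
            "total_executions": len(executions),
            # np.mean of an empty list returns nan with a warning, so guard it.
            "avg_duration": float(np.mean(durations)) if durations else 0.0,
            "cost_by_agent": self._group_by_agent()
        }

    def _group_by_agent(self):
        """Sum cost per agent (a minimal version; the original omits this helper)."""
        cost_by_agent = defaultdict(float)
        for e in self.metrics["executions"]:
            cost_by_agent[e["agent"]] += e["cost"]
        return dict(cost_by_agent)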
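
The platform's `run_workflow` method later in this chapter wraps execution in `self.monitor.track_execution(workflow_name)`, which the `Monitor` class above does not define. The sketch below shows how such a method could be written as a context manager that records duration and success or failure. `TimingMonitor`, the `"workflows"` metric key, and `success_rate` are assumptions of this example.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from datetime import datetime


class TimingMonitor:
    """Hypothetical monitor that times whole workflow runs."""

    def __init__(self):
        self.metrics = defaultdict(list)

    @contextmanager
    def track_execution(self, workflow_name: str):
        """Record duration and status of the code inside the with-block."""
        started_at = datetime.now()
        start = time.perf_counter()
        status = "success"
        try:
            yield
        except Exception:
            status = "error"
            raise  # the caller still sees the original exception
        finally:
            self.metrics["workflows"].append({
                "workflow": workflow_name,
                "timestamp": started_at,
                "duration": time.perf_counter() - start,
                "status": status,
            })

    def success_rate(self) -> float:
        """Fraction of recorded runs that finished without an exception."""
        runs = self.metrics["workflows"]
        if not runs:
            return 0.0
        return sum(r["status"] == "success" for r in runs) / len(runs)


monitor = TimingMonitor()
with monitor.track_execution("content_pipeline"):
    pass  # a workflow run would go here
```

Recording in the `finally` clause means failed runs are logged too, which is what a success-rate figure on the dashboard would need.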

Dashboard Example

BrightWorks AI Platform Dashboard

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📊 Total runs: 124 (today)

💰 Total cost: $12.34 (this month)

⏱ Average run time: 42 seconds

✅ Success rate: 98.4%

Per-agent statistics:

Research Agent: 45 runs | $3.21 | avg 15 s

Writer Agent: 42 runs | $6.78 | avg 35 s

Editor Agent: 37 runs | $2.35 | avg 12 s

Full System Integration Example

BrightWorks Platform (main.py)

# AgentRegistry, ToolLibrary, WorkflowEngine, and Monitor are the classes
# defined earlier in this chapter.


class BrightWorksPlatform:
    """BrightWorks AI platform."""

    # _load_config, _auto_discover_agents, and _auto_discover_tools are
    # not shown in this excerpt.

    def __init__(self, config_path: str = "config.yaml"):
        # Initialize components
        self.registry = AgentRegistry()
        self.library = ToolLibrary()
        self.engine = WorkflowEngine(self.registry, self.library)
        self.monitor = Monitor()

        # Load configuration
        self.config = self._load_config(config_path)

        # Auto-register agents
        self._auto_discover_agents()

        # Auto-register tools
        self._auto_discover_tools()

    def run_workflow(self, workflow_name: str, inputs: dict):
        """Run a workflow by name."""
        workflow = self.engine.load_workflow(f"workflows/{workflow_name}.yaml")

        with self.monitor.track_execution(workflow_name):
            result = self.engine.execute(workflow, inputs)

        return result

    def get_dashboard(self):
        """Return data for the monitoring dashboard."""
        return self.monitor.get_dashboard_data()


# Usage example
if __name__ == "__main__":
    platform = BrightWorksPlatform()

    # Run the workflow
    result = platform.run_workflow("content_pipeline", {
        "topic": "Digital Marketing in the Age of AI Agents",
        "keywords": ["AI agents", "digital marketing"]
    })

    print("✅ Done!")
    print(f"Google Docs: {result['publish']['output']}")

    # Check the dashboard
    dashboard = platform.get_dashboard()
    total = dashboard['total_cost']
    print(f"Total cost: ${total:.2f}")

Completed Project Structure

brightworks-ai-platform/
├── platform/
│   ├── agent_registry.py      # Agent registry
│   ├── workflow_engine.py     # Workflow engine
│   ├── monitoring.py          # Monitoring
│   └── tools/
│       ├── base.py            # Tool base class
│       ├── google_docs.py     # Google Docs tool
│       ├── slack.py           # Slack tool
│       └── analytics.py       # Analytics tool
├── agents/
│   ├── research_agent.py
│   ├── writer_agent.py
│   └── editor_agent.py
├── workflows/
│   ├── content_pipeline.yaml   # Content production workflow
│   └── social_media.yaml       # Social media workflow
├── shared/
│   ├── models.py
│   └── state_manager.py
├── config.yaml                  # Platform configuration
├── main.py                      # Platform entry point
└── requirements.txt

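Results Comparison: Multi-Agent vs. Platform

Item	Chapter 2 (multi-agent)	Chapter 3 (platform)	Improvement
Time to add an agent	2-3 hours	5 minutes	About -97%
Tool integration difficulty	High	Low	Standardized
Monitoring	Not available	Real-time	Newly available
Code reusability	Low	High	Modularized
Workflow flexibility	Fixed	Changeable at runtime	Declarative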

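Key Takeaways

1. Principles of Scalability

Dynamic registration: add new capabilities without modifying code
Standard interface: every tool follows the same pattern
Modularity: each component works independently

2. Advantages of Declarative Workflows

Readability: YAML expresses the workflow clearly
Reusability: the same workflow can be used across projects
Version control: workflow changes can be tracked over time

3. Observability

Real-time monitoring: check system status at any time
Cost tracking: manage API usage and cost transparently
Performance optimization: identify bottlenecks from data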

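🎯 Exercises

Exercise 1: Build the Platform

Implement the Agent Registry and Tool Library
Integrate at least three tools (Google Docs, Slack, and one of your choice)
Implement a monitoring dashboard

Exercise 2: Write a New Workflow

Write a social media content production workflow in YAML:

Research Agent → Writer Agent → simultaneous publishing to three platforms (Facebook, Twitter, LinkedIn)

Exercise 3: Performance Analysis

Use the platform for one week, measure cost, execution time, and quality, and propose optimizations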
