종합 의사결정 매트릭스 / Comprehensive Decision Matrix

Last verified: 2026년 2월 / February 2026

한국어

개요

본 문서는 혜경궁 홍씨(Lady Hyegyong) AI NPC 프로젝트의 성공적인 구현을 위해 검토된 모든 기술적 선택지를 하나의 통합된 프레임워크로 정리한 종합 의사결정 매트릭스입니다. 시스템 아키텍처, 대화 엔진, 음성 파이프라인, 애니메이션, MR 기술, 전시 인프라 등 9개 섹션에서 도출된 핵심 결정 포인트들을 분석하여, 프로젝트의 목표와 제약 조건에 따른 최적의 경로를 제시합니다.

본 매트릭스는 특정 기술을 최종적으로 확정하기보다는, 개발 기간, 예산, 품질 목표, 오프라인 안정성 등 팀이 직면한 상황에 따라 어떤 기술을 선택해야 하는지에 대한 조건부 가이드를 제공합니다. 또한, 전시 운영 중 발생할 수 있는 주요 위험 요소와 이에 대한 대응 전략, 그리고 단계별 구현 로드맵을 포함하여 프로젝트의 전체적인 실행 전략을 수립하는 데 기여합니다.

핵심 발견

품질 vs 속도: Convai와 같은 통합 플랫폼은 개발 속도에서 압도적이지만, 커스텀 스택(GPT-4o + ElevenLabs)은 한국어 품질과 세밀한 페르소나 제어에서 우위를 점합니다.
오프라인 안정성: 전시 환경의 특성상 인터넷 단절에 대비한 4단계 폴백(Fallback) 전략이 필수적이며, 이를 위해 Jetson AGX Orin 기반의 엣지 서버 구축이 권장됩니다.
레이턴시 관리: 1.5초 이내의 체감 레이턴시를 달성하기 위해 ElevenLabs Turbo v2.5와 같은 저지연 TTS와 ‘생각 중’ 애니메이션 마스킹 기술의 결합이 핵심입니다.
Samsung Galaxy XR: 현재 모든 기술적 검토는 Quest 3를 기준으로 완료되었으며, Samsung Galaxy XR은 ‘알려진 갭(Known Gap)‘으로서 향후 이식성을 고려한 OpenXR 표준 준수가 필요합니다.

마스터 의사결정 매트릭스 (Master Decision Matrix)

2026년 2월 기준

결정 포인트	옵션 A (통합/속도)	옵션 B (고품질/커스텀)	옵션 C (안정성/로컬)	추천 조건
1. AI 플랫폼	Convai	Custom Stack	Azure AI	빠른 구축은 A, 품질은 B, 보안은 C
2. LLM 모델	Convai Built-in	GPT-4o	Claude 3.5	레이턴시는 A, 추론은 B, 지침 준수는 C
3. TTS 서비스	ElevenLabs Turbo	Typecast	NAVER CLOVA	속도는 A, 감정은 B, 한국어 자연스러움은 C
4. STT 솔루션	Whisper Sentis	Azure Speech	Google Speech	오프라인은 A, 정확도는 B, 범용성은 C
5. 립싱크 (LipSync)	OVRLipSync	SALSA v2	Audio2Face	Quest 최적화는 A, 범용은 B, 실사는 C
6. 엣지 하드웨어	Jetson AGX Orin	Mini PC (RTX 4070)	Cloud Only	산업용 안정성은 A, 성능은 B, 저예산은 C
7. 로컬 LLM	Llama 3.2 3B / Llama 3.1 8B	HyperCLOVA X	OPEN-SOLAR-KO	속도는 A, 한국어 맥락은 B, 오픈소스는 C
8. Unity XR 프레임워크	Meta XR SDK	AR Foundation	OpenXR Native	Quest 전용은 A, 확장성은 B, 표준은 C
9. 애니메이션 시스템	Mecanim Layers	Playables API	Timeline	직관적 제어는 A, 동적 합성은 B, 시퀀스는 C
10. 메모리 시스템	Sliding Window	ENGRAM Triple Memory	Vector DB (RAG)	단순 대화는 A, 장기 기억은 B, 지식 검색은 C
11. 대화 엔진	Arbor	ChatSOP	Native Scripting	논리 흐름은 A, 절차 제어는 B, 단순 로직은 C
12. 페르소나 프레임워크	MemorIA	EsthaAI	AHA Guidelines	빠른 생성은 A, 일관성은 B, 역사 고증은 C
13. 레이턴시 마스킹	Thinking Animation	UI Loading	Audio Filler	몰입감은 A, 정보 전달은 B, 단순 대기는 C
14. MR 오클루전	Depth API	Scene Mesh	Static Mesh	실시간은 A, 가구 인식은 B, 고정 환경은 C
15. MR 조명	Light Estimation	Static Lighting	Fake Shadows	실시간 동기화는 A, 성능 최적화는 B, 단순 구현은 C
16. 폴백 전략	4단계 (Cloud-Edge-On-Pre)	2단계 (Cloud-Pre)	Cloud Only	전시 안정성은 A, 중급은 B, 테스트용은 C
17. 네트워크	Wi-Fi 6E	Wi-Fi 6	Ethernet	다수 기기 간섭 방지는 A, 일반은 B, 고정형은 C
18. 모니터링	Prometheus/Grafana	Cloud Dashboard	None	실시간 관제는 A, 사후 분석은 B, 소규모는 C
19. 보안/개인정보	Zero-Retention	Local Storage	Cloud Storage	법규 준수는 A, 데이터 분석은 B, 편의성은 C
20. 캐릭터 모델	High-poly (50k)	Mid-poly (35k)	Low-poly (20k)	근접 체험은 A, 표준은 B, 다수 NPC는 C

기술 선택 의사결정 트리 (Technology Selection Decision Tree)

위험 매트릭스 (Risk Matrix)

위험 요소	발생 가능성	영향도	완화 전략 (Mitigation)
전시 중 네트워크 단절	높음	매우 높음	4단계 폴백(Fallback) 구축, Jetson 기반 엣지 서버 운영
API 비용 초과	중간	중간	ElevenLabs Turbo 사용, 문장 단위 캐싱, 일일 쿼터 설정
한국어 TTS 품질 저하	낮음	높음	ElevenLabs/Typecast 이중화, 특정 시나리오 NAVER CLOVA 사용
Samsung Galaxy XR 호환성	높음	중간	OpenXR 표준 준수, AR Foundation 추상화 레이어 강화
페르소나 열화 (오프라인)	중간	중간	로컬 모델용 경량화 프롬프트 최적화, 규칙 기반 보조 시스템
레이턴시 1.5초 초과	중간	높음	스트리밍 TTS 적용, ‘생각 중’ 애니메이션 즉시 트리거
Quest 3 발열/쓰로틀링	중간	높음	FFR(Fixed Foveated Rendering) 적용, 외부 쿨링 솔루션 검토

구현 로드맵 (Implementation Roadmap)

Phase 1: 프로토타입 개발 (4-6주)

목표: 핵심 파이프라인 검증 및 기본 페르소나 구현
주요 과업:
- STT-LLM-TTS-LipSync 기본 연동 (Path A 또는 B)
- 혜경궁 홍씨 기본 시스템 프롬프트 설계
- 단일 활동(예: 인사 및 대화) 프로토타입 구축
- Quest 3 Passthrough 기본 환경 설정

Phase 2: 시스템 통합 및 고도화 (6-8주)

목표: 4대 활동 통합 및 엣지 서버 구축
주요 과업:
- 4가지 활동(편지, 서예, 예절, 다과) 시퀀스 구현
- Jetson AGX Orin 기반 엣지 서버 및 로컬 LLM 최적화
- 4단계 폴백(Fallback) 로직 완성
- ENGRAM 3중 메모리 시스템 적용

Phase 3: 최적화 및 전시 준비 (4-6주)

목표: 성능 튜닝 및 안정성 테스트
주요 과업:
- 90fps 유지를 위한 렌더링 최적화 (LOD, GPU Skinning)
- 전시장 네트워크(Wi-Fi 6E) 환경 스트레스 테스트
- 모니터링 대시보드(Grafana) 연동
- 최종 역사 고증 검수 및 가드레일 강화

평가 기준 가중치 (Evaluation Criteria Weights)

평가 항목	가중치	산출 근거
한국어 대화 품질	25%	역사적 인물과의 몰입감을 결정하는 가장 핵심적인 요소
레이턴시 성능	20%	상호작용의 자연스러움을 좌우하며 멀미 방지에 기여
운영 비용 (6개월)	15%	한정된 예산 내에서 지속 가능한 전시 운영 가능 여부
오프라인 구동 능력	15%	전시장 네트워크 불안정 시에도 중단 없는 체험 보장
커스터마이징 깊이	15%	혜경궁 홍씨만의 독특한 페르소나와 활동 구현의 자유도
Quest 3 호환성	10%	주 타깃 기기에서의 성능 최적화 및 SDK 지원 수준

알려진 갭 및 향후 과제

Samsung Galaxy XR (알려진 갭): 본 매트릭스의 성능 데이터는 Quest 3 기준이며, Samsung 기기에서의 실제 레이턴시 및 렌더링 효율은 추가 검증이 필요합니다.
멀티모달 연동: 사용자의 동작(서예, 다과)을 실시간으로 분석하여 대화에 반영하는 기술은 현재 레이턴시 문제로 인해 Phase 2 이후의 과제로 설정되었습니다.

출처 및 참고문헌

Convai, Inworld, ElevenLabs 공식 기술 문서 (2026).
Meta Quest 3 Developer Guide: Performance Optimization (2025).
NVIDIA Jetson AGX Orin vLLM Deployment Guide (2025).
“Hybrid AI NPC Architecture for Cultural Heritage,” Journal of Digital Heritage (2025).

English

Overview

This document consolidates every technical option reviewed for the Lady Hyegyong (혜경궁 홍씨) AI NPC project into a single decision matrix. It analyzes the key decision points drawn from nine sections, including system architecture, conversation engine, voice pipeline, animation, MR technology, and exhibition infrastructure, and recommends a path for each according to the project's goals and constraints.

Rather than fixing specific technologies, the matrix is a conditional guide: it indicates which technology to choose given the team's situation, such as development time, budget, quality goals, and offline stability. It also covers the main risks of exhibition operation, their mitigation strategies, and a phased implementation roadmap, which together form the project's overall execution strategy.

Key Findings

Quality vs. Speed: Integrated platforms such as Convai are by far the fastest to develop on, but a custom stack (GPT-4o + ElevenLabs) has the edge in Korean-language quality and fine-grained persona control.
Offline Stability: Exhibition venues must tolerate internet outages, so a 4-tier fallback strategy is essential; an edge server based on Jetson AGX Orin is recommended to support it.
Latency Management: Keeping perceived latency under 1.5 seconds depends on combining a low-latency TTS such as ElevenLabs Turbo v2.5 with ‘Thinking’ animation masking.
Samsung Galaxy XR: All technical reviews to date were carried out on Quest 3. Samsung Galaxy XR remains a ‘Known Gap’; adhering to the OpenXR standard is needed to keep a future port feasible.
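The latency-masking finding can be illustrated with a minimal sketch. It is not the project's implementation: `fake_pipeline` is a hypothetical stand-in for the STT-LLM-TTS round trip, and the event names are placeholders for whatever the Unity client actually triggers. The only value taken from this document is the 1.5-second budget.

```python
import asyncio
import time

LATENCY_BUDGET_S = 1.5  # perceived-latency target stated in the findings


async def fake_pipeline(user_text: str) -> str:
    """Hypothetical stand-in for the STT -> LLM -> TTS round trip."""
    await asyncio.sleep(0.05)  # simulated network and inference time
    return f"(reply to) {user_text}"


async def respond(user_text, pipeline, events):
    """Start the 'Thinking' mask at once, then wait for the real reply."""
    started = time.monotonic()
    events.append("thinking_start")  # mask begins before any network call
    try:
        reply = await pipeline(user_text)
    finally:
        events.append("thinking_stop")  # always clear the mask, even on error
    elapsed = time.monotonic() - started
    # Flag replies that missed the budget so they can be logged or tuned.
    events.append("speak" if elapsed <= LATENCY_BUDGET_S else "speak_late")
    return reply


events = []
reply = asyncio.run(respond("안녕하십니까", fake_pipeline, events))
print(events, reply)
```

Starting the mask before the request is sent means the visitor sees a reaction immediately, while the `finally` clause keeps the character from being stuck in the thinking pose if the pipeline fails.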

Master Decision Matrix

As of February 2026

Decision Point	Option A (Integrated/Speed)	Option B (High Quality/Custom)	Option C (Stability/Local)	Recommended Condition
1. AI Platform	Convai	Custom Stack	Azure AI	A for rapid build, B for quality, C for security
2. LLM Model	Convai Built-in	GPT-4o	Claude 3.5	A for latency, B for reasoning, C for instruction following
3. TTS Service	ElevenLabs Turbo	Typecast	NAVER CLOVA	A for speed, B for emotion, C for Korean naturalness
4. STT Solution	Whisper Sentis	Azure Speech	Google Speech	A for offline, B for accuracy, C for versatility
5. LipSync	OVRLipSync	SALSA v2	Audio2Face	A for Quest optimization, B for general use, C for photorealism
6. Edge Hardware	Jetson AGX Orin	Mini PC (RTX 4070)	Cloud Only	A for industrial stability, B for performance, C for low budget
7. Local LLM	Llama 3.2 3B / Llama 3.1 8B	HyperCLOVA X	OPEN-SOLAR-KO	A for speed, B for Korean context, C for open source
8. Unity XR Framework	Meta XR SDK	AR Foundation	OpenXR Native	A for Quest-only, B for scalability, C for standards
9. Animation System	Mecanim Layers	Playables API	Timeline	A for intuitive control, B for dynamic blending, C for sequences
10. Memory System	Sliding Window	ENGRAM Triple Memory	Vector DB (RAG)	A for simple dialogue, B for long-term memory, C for knowledge search
11. Conversation Engine	Arbor	ChatSOP	Native Scripting	A for logic flow, B for procedural control, C for simple logic
12. Persona Framework	MemorIA	EsthaAI	AHA Guidelines	A for fast generation, B for consistency, C for historical accuracy
13. Latency Masking	Thinking Animation	UI Loading	Audio Filler	A for immersion, B for info delivery, C for simple waiting
14. MR Occlusion	Depth API	Scene Mesh	Static Mesh	A for real-time, B for furniture recognition, C for fixed environments
15. MR Lighting	Light Estimation	Static Lighting	Fake Shadows	A for real-time sync, B for performance, C for simple implementation
16. Fallback Strategy	4-tier (Cloud-Edge-On-Pre)	2-tier (Cloud-Pre)	Cloud Only	A for exhibition stability, B for intermediate, C for testing
17. Network	Wi-Fi 6E	Wi-Fi 6	Ethernet	A to avoid interference among many devices, B for general use, C for fixed installations
18. Monitoring	Prometheus/Grafana	Cloud Dashboard	None	A for real-time monitoring, B for post-hoc analysis, C for small scale
19. Security/Privacy	Zero-Retention	Local Storage	Cloud Storage	A for compliance, B for data analysis, C for convenience
20. Character Model	High-poly (50k)	Mid-poly (35k)	Low-poly (20k)	A for close-up experience, B for standard, C for multiple NPCs
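One way to make the matrix usable in tooling is to encode each row as data and look up the option recommended for a given priority. The sketch below copies three rows from the table; the `DecisionPoint` type, the `recommend` function, and the priority keywords are illustrative assumptions, not part of the project.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DecisionPoint:
    name: str
    options: Dict[str, str]   # option letter -> technology
    best_for: Dict[str, str]  # priority keyword -> option letter


# Three rows copied from the matrix; the remaining rows follow the same shape.
MATRIX = [
    DecisionPoint(
        "TTS Service",
        {"A": "ElevenLabs Turbo", "B": "Typecast", "C": "NAVER CLOVA"},
        {"speed": "A", "emotion": "B", "korean naturalness": "C"},
    ),
    DecisionPoint(
        "STT Solution",
        {"A": "Whisper Sentis", "B": "Azure Speech", "C": "Google Speech"},
        {"offline": "A", "accuracy": "B", "versatility": "C"},
    ),
    DecisionPoint(
        "Edge Hardware",
        {"A": "Jetson AGX Orin", "B": "Mini PC (RTX 4070)", "C": "Cloud Only"},
        {"industrial stability": "A", "performance": "B", "low budget": "C"},
    ),
]


def recommend(point_name: str, priority: str) -> Optional[str]:
    """Return the technology the matrix recommends, or None if the
    priority is not one of the row's listed conditions."""
    for point in MATRIX:
        if point.name == point_name:
            letter = point.best_for.get(priority)
            return point.options[letter] if letter else None
    raise KeyError(f"unknown decision point: {point_name}")


print(recommend("TTS Service", "speed"))
print(recommend("STT Solution", "offline"))
```

Returning `None` for an unlisted priority, rather than guessing, keeps the lookup faithful to the matrix: it only answers questions the table itself answers.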

Technology Selection Decision Tree

Risk Matrix

Risk Factor	Probability	Impact	Mitigation Strategy
Network Outage during Exhibition	High	Very High	Build 4-tier fallback, operate Jetson-based edge server
API Cost Overrun	Medium	Medium	Use ElevenLabs Turbo, sentence-level caching, set daily quotas
Korean TTS Quality Degradation	Low	High	Redundancy with ElevenLabs/Typecast, use NAVER CLOVA for specific scenarios
Samsung Galaxy XR Compatibility	High	Medium	Comply with OpenXR standards, strengthen AR Foundation abstraction layer
Persona Degradation (Offline)	Medium	Medium	Optimize lightweight prompts for local models, add a rule-based assist system
Latency Exceeding 1.5s	Medium	High	Apply streaming TTS, immediately trigger ‘Thinking’ animation
Quest 3 Thermal Throttling	Medium	High	Apply FFR (Fixed Foveated Rendering), review external cooling solutions
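The table above can be turned into a rough priority order by scoring each risk as probability times impact. The numeric scale (Low = 1 through Very High = 4) is an assumption introduced here for illustration; the document itself gives only the qualitative levels.

```python
# Assumed ordinal scale; the source table gives only qualitative levels.
LEVEL = {"Low": 1, "Medium": 2, "High": 3, "Very High": 4}

# (risk factor, probability, impact) copied from the risk matrix.
RISKS = [
    ("Network Outage during Exhibition", "High", "Very High"),
    ("API Cost Overrun", "Medium", "Medium"),
    ("Korean TTS Quality Degradation", "Low", "High"),
    ("Samsung Galaxy XR Compatibility", "High", "Medium"),
    ("Persona Degradation (Offline)", "Medium", "Medium"),
    ("Latency Exceeding 1.5s", "Medium", "High"),
    ("Quest 3 Thermal Throttling", "Medium", "High"),
]


def rank_risks(risks):
    """Return (name, score) pairs sorted by probability x impact, highest first.
    Ties keep their table order because Python's sort is stable."""
    scored = [(name, LEVEL[prob] * LEVEL[impact]) for name, prob, impact in risks]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rank_risks(RISKS)
for name, score in ranked:
    print(f"{score:>2}  {name}")
```

Under this assumed scale the network outage stands well apart from the rest, which is consistent with the emphasis the roadmap places on the fallback strategy.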

Implementation Roadmap

Phase 1: Prototype Development (4-6 weeks)

Goal: Verify core pipeline and implement basic persona.
Key Tasks:
- Basic integration of STT-LLM-TTS-LipSync (Path A or B).
- Design basic system prompt for Lady Hyegyong.
- Build prototype for a single activity (e.g., greeting and dialogue).
- Set up basic Quest 3 Passthrough environment.

Phase 2: System Integration & Advancement (6-8 weeks)

Goal: Integrate 4 main activities and build edge server.
Key Tasks:
- Implement sequences for 4 activities (Letter, Calligraphy, Etiquette, Tea Ceremony).
- Set up the Jetson AGX Orin-based edge server and optimize the local LLM.
- Complete 4-tier fallback logic.
- Apply ENGRAM triple memory system.
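The 4-tier fallback logic can be sketched as an ordered chain of handlers, each tried only when the previous one fails. This is a minimal illustration under stated assumptions: the tier names read the matrix's "Cloud-Edge-On-Pre" shorthand as cloud, edge server, on-device, and pre-scripted, which this document does not spell out, and all four handler functions are hypothetical stubs.

```python
from typing import Callable, List, Tuple

Tier = Tuple[str, Callable[[str], str]]


# Hypothetical stubs; real handlers would call the cloud API, the edge
# server, an on-device model, and a table of pre-scripted lines.
def cloud_llm(prompt: str) -> str:
    raise ConnectionError("internet unavailable")  # simulated outage


def edge_llm(prompt: str) -> str:
    raise TimeoutError("edge server busy")  # simulated overload


def on_device_llm(prompt: str) -> str:
    return f"[on-device] {prompt}"


def prescripted(prompt: str) -> str:
    return "Please allow me a moment to gather my thoughts."  # placeholder line


TIERS: List[Tier] = [
    ("cloud", cloud_llm),
    ("edge", edge_llm),
    ("on-device", on_device_llm),
    ("pre-scripted", prescripted),
]


def answer(prompt: str, tiers: List[Tier]) -> Tuple[str, str]:
    """Try each tier in order; return (tier name, reply) from the first success."""
    failures = []
    for name, handler in tiers:
        try:
            return name, handler(prompt)
        except (ConnectionError, TimeoutError) as exc:
            failures.append(f"{name}: {exc}")  # keep for monitoring
    raise RuntimeError("all tiers failed: " + "; ".join(failures))


print(answer("greeting", TIERS))
```

Keeping the tiers in a plain list makes the 2-tier and cloud-only variants from the matrix a matter of passing a shorter list.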

Phase 3: Optimization & Exhibition Prep (4-6 weeks)

Goal: Performance tuning and stability testing.
Key Tasks:
- Rendering optimization (LOD, GPU Skinning) to maintain 90fps.
- Stress test exhibition network (Wi-Fi 6E) environment.
- Integrate monitoring dashboard (Grafana).
- Final historical verification and reinforcement of guardrails.

Evaluation Criteria Weights

Criterion	Weight	Rationale
Korean Dialogue Quality	25%	The single most important factor in immersion with a historical figure.
Latency Performance	20%	Determines how natural the interaction feels and helps prevent motion sickness.
Operational Cost (6mo)	15%	Whether the exhibition can be operated sustainably within a limited budget.
Offline Capability	15%	Ensures uninterrupted experience even during exhibition network instability.
Customization Depth	15%	Freedom to implement Lady Hyegyong’s unique persona and activities.
Quest 3 Compatibility	10%	Level of performance optimization and SDK support on the primary target device.
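The weights above can be applied as a simple weighted sum over per-criterion ratings. In the sketch below the weights come from the table, while the 1-to-5 rating scale and the two example rating sets are hypothetical placeholders, not measured results.

```python
# Weights copied from the evaluation table (they sum to 100%).
WEIGHTS = {
    "korean_dialogue_quality": 0.25,
    "latency_performance": 0.20,
    "operational_cost": 0.15,
    "offline_capability": 0.15,
    "customization_depth": 0.15,
    "quest3_compatibility": 0.10,
}


def weighted_score(ratings: dict) -> float:
    """Weighted sum of ratings; every criterion must be rated."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical ratings on an assumed 1-5 scale, for illustration only.
integrated_platform = {
    "korean_dialogue_quality": 3, "latency_performance": 4,
    "operational_cost": 3, "offline_capability": 2,
    "customization_depth": 2, "quest3_compatibility": 5,
}
custom_stack = {
    "korean_dialogue_quality": 5, "latency_performance": 3,
    "operational_cost": 3, "offline_capability": 3,
    "customization_depth": 5, "quest3_compatibility": 4,
}

for label, ratings in [("integrated", integrated_platform), ("custom", custom_stack)]:
    print(label, round(weighted_score(ratings), 2))
```

Because Korean dialogue quality carries a quarter of the total, a candidate that rates highly there can outrank one that is faster to build, which mirrors the quality-versus-speed trade-off described in the key findings.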

Known Gaps & Future Work

Samsung Galaxy XR (Known Gap): Performance data in this matrix is based on Quest 3; actual latency and rendering efficiency on Samsung devices still need to be verified.
Multimodal Integration: Analyzing user movements (calligraphy, tea ceremony) in real time and reflecting them in dialogue is deferred until after Phase 2 because of current latency issues.

Sources & References

Convai, Inworld, ElevenLabs Official Technical Documentation (2026).
Meta Quest 3 Developer Guide: Performance Optimization (2025).
NVIDIA Jetson AGX Orin vLLM Deployment Guide (2025).
“Hybrid AI NPC Architecture for Cultural Heritage,” Journal of Digital Heritage (2025).