[Contents]
[Intern Report] Building a Call-Center-Style Voice Conversational Multi-Agent Demo with the OpenAI Agents SDK (Python Version) (Bonus Included)
Introduction
1. AI Agents × Voice = Voice Agents
1.1 The Spread of AI Agents
1.2 The Benefits of Voice Agents
1.3 Real-Time Voice Conversation APIs and Voice Agent Development Tools
2. Building a Voice Conversational Multi-Agent Tool with the OpenAI Agents SDK (Python Version)
2.1 What Is the OpenAI Agents SDK?
2.2 Two Voice Agent Architectures
2.3 The Demo
2.4 Outlook
Conclusion
References

Introduction

Hello!!! I'm Tanaka, a second-year master's student at the Graduate School of the Institute of Science Tokyo, working part-time at Insight Edge. In graduate school I belong to a management engineering lab, where I study the analysis of soccer match video. My lab is active in applied research that brings (knowledge) graphs, LLMs, and reinforcement learning to industries such as finance and autonomous driving, so I get to learn about research in a wide range of fields.

My relationship with Insight Edge began with a one-month internship last year, during which I took part in a PoC for vision-language models. Thanks to that connection, I will be joining as a data scientist from the next fiscal year, so I look forward to working with you all 🔥🔥🔥

That was a long preamble. As the title says, this article introduces a demo of a call-center-style voice conversational multi-agent system built with the OpenAI Agents SDK (Python version). I focus on the technology stack and on what it actually felt like to use, with demo videos along the way.

Chapter 1 provides background. To explain why voice agents have been attracting attention recently, it covers the current state of AI agents, voice conversation models, and voice agents in turn.

Chapter 2 introduces the core features of the OpenAI Agents SDK (Python version), released in March of this year. By walking through how I built a voice multi-agent system with these features, I describe what the SDK is really like to use and what to watch out for, and close with an outlook.

Let's get started!!

(Note) This article reflects the situation as of early June 2025, when it was written. Please also bear in mind that the author is a graduate student with little field experience, and read it with a generous heart.

1. AI Agents × Voice = Voice Agents

This chapter starts not so much with a definition of AI agents as with how they have spread and which tools have been driving their development. Next, it turns to voice agents, clarifying how they differ from conventional text-based AI agents and covering their uses and the points to keep in mind when using them. Finally, it summarizes several real-time conversational models and development kits that are indispensable for building voice agents. I hope this chapter makes voice agents feel a little more familiar.

1.1 The Spread of AI Agents

We often hear that "2025 is the year of AI agents." Use cases have indeed exploded this year, but the groundwork was laid about two years ago. The first foundation was the rise of frameworks such as LangChain in 2023. These treat an LLM as a single agent and let you build flows in which multiple agents reason toward an answer in a chain. The second is the infrastructure proposed between last year and this year to meet the demand for connecting agents with external tools and with other agents that follow different conventions. Until then, agents were limited to general tasks that could be completed using only the internal knowledge of the LLM acting as the agent's brain. New protocols (conventions, agreements) such as MCP and A2A have made it possible to build agents for specialized tasks that require operating external tools, such as handling email apps or local files, or cooperating with agents from other vendors (see the figure below).

In other words, by granting agents permission to access information that is not in the LLM's internal knowledge, agents have gained the ability to operate applications as humans used to, in addition to the RAG-style retrieval they already had.

- Model Context Protocol (MCP): Proposed by Anthropic in November 2024. A set of rules for how AI tools exchange information with servers running locally or on the internet.
- Agent2Agent (A2A): Proposed by Google in April 2025. A set of rules shared among AI agents that have each been given a separate role.

Overview diagram of A2A and MCP (source: A2A Protocol ( https://a2aproject.github.io/A2A/latest/#why-a2a-matters ))

As if in response to this trend, major AI and cloud providers such as OpenAI, Google, and AWS have released agent development kits, namely the OpenAI Agents SDK (Python version, March 2025), the Agent Development Kit (April 2025), and the Strands Agents SDK (May 2025), with the aim of making it easy to integrate their own services with external tools and agents.

Many other AI agent development kits are appearing one after another. When you actually build an AI agent, you will need to choose the one that suits your own problem and development environment.

1.2 The Benefits of Voice Agents

At present, most of what is called a conversational multi-agent system is text-based. With a text-based agent, the user types a query on a keyboard, the agent returns an answer as text, the user types another query in response, and the loop continues. I expect that we will see more and more voice conversational multi-agent systems, in which the user's input and the agent's output are replaced by speech, because agents with voice capabilities bring the following benefits.

- Hands-free, easy, and fast communication: Even between people, speaking is faster and easier than typing. Voice also helps when no keyboard is at hand, when there is no time to type, or when a keyboard is hard to use.
- Emotion can be conveyed: In settings such as a customer support center, it may be better for the AI to understand the user's emotions before responding. Having the agent speak expressively may also make it easier to listen to. The audio feature of NotebookLM is a particularly clear example.
- A new user experience: With text, I always had a strong sense of using the AI as a tool. With voice conversation, the sense of using the AI as a colleague or a friend becomes stronger (based on the author's own impression).

Compared with text-based agents, voice-based agents convey information faster in both directions, make communication simpler, and can convey emotional expression as humans do. Interaction with a voice agent therefore demands not only what was demanded of text, "how to draw out correct information quickly", but also "how to get closer to an exchange between humans". In other words, reproducing the flow of human conversation, backchannel responses, language-specific intonation, the timing of breaths, and flexible word choice and handling based on an understanding of the other person's emotions all become important.

AI agents have many uses, but the following applications are specific to agents with voice capabilities.

- Customer support center: Responds to customers in real time, processes customer information, consults internal knowledge as appropriate to provide answers, and hands over to a human operator when needed.
- Meeting facilitator or note-taker: Covers meeting support in general, such as writing minutes and sending reminders.
- Navigation system: Installed on PCs, smartphones, or cars to guide users in cooking, machine operation, route finding, and other tasks.
- Entertainment: Conversing with an AI modeled on a specific character by using custom voice (voice cloning) features. (The multi-agent element is small here.)

What are the advantages of a voice agent over a human? Examples include:

- Available 24 hours a day, 365 days a year
- Reduced burden on operators
- Reduced labor costs
- Multilingual support

Alongside these advantages, there are also the following disadvantages.

- Mishearing and misspeaking: If speech recognition or speech synthesis performs poorly for a given language, the system cannot be put to practical use.
- Difficulty of covering everything: It is hard to turn what humans have been doing into explicit knowledge and then reproduce it with prompts and mechanisms.
- Trade-off between workload and latency: The more work the multi-agent part of a voice multi-agent system performs, the slower the response becomes.

When building a voice agent, you therefore need to choose and build environments and modules that mitigate each of these risks.

Mishearing, misspeaking, and latency in particular are issues you must be quite sensitive to when dealing with voice.
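The workload versus latency trade-off can be made concrete with a rough latency budget. The sketch below is only a toy model: the stage names and millisecond figures are hypothetical placeholders, not measurements from the demo, and it assumes the stages run strictly one after another.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: int  # hypothetical figure, for illustration only


def total_latency_ms(stages):
    """Sequential stages add up: the caller waits for every one of them."""
    return sum(stage.latency_ms for stage in stages)


# A minimal voice turn: recognize speech, generate an answer, synthesize speech.
BASE = [
    Stage("speech-to-text", 300),
    Stage("agent LLM", 800),
    Stage("text-to-speech", 300),
]

# Extra multi-agent work added to the same turn.
EXTRA = [
    Stage("handoff to a specialist agent", 800),
    Stage("file lookup through a tool", 500),
]

simple_turn = total_latency_ms(BASE)
heavy_turn = total_latency_ms(BASE + EXTRA)
print(f"simple turn: {simple_turn} ms, heavy turn: {heavy_turn} ms")
```

Every handoff or tool call placed on the critical path adds directly to the silence the caller hears, which is why the amount of agent work per turn has to be budgeted.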

1.3 Real-Time Voice Conversation APIs and Voice Agent Development Tools

The ears and mouth (or throat?) of a voice agent are speech recognition (Speech-To-Text) and speech synthesis (Text-To-Speech). There are far too many tools to cover here, including multilingual tools, Japanese-specific tools, and local tools, and both technologies are advancing rapidly.

For speech synthesis, for example, the article below shows Google's multilingual speech synthesis model (gemini-2.5-flash-preview-tts), released in May 2025, in use. The model supports multi-speaker output (up to two speakers at the time of writing), and the article lets you listen to the model performing a manzai comedy script for two Japanese speakers. The intonation of the Japanese is more natural than you might expect.

Gemini API TTS(Text-to-Speech)で漫才音声を生成してみた

Speech-to-Speech models, designed for talking with an AI directly in real time, are also on the rise. Speech-to-Speech refers to a model architecture that performs speech recognition, answer generation, and speech synthesis end to end, aiming for natural, human-like conversation at low latency. Unlike earlier architectures that combine several separate models, a Speech-to-Speech model uses the recognized audio directly as features without converting it to text. Non-verbal characteristics such as the speaker's emotion and tone, which are lost in transcription, can therefore be used in answer generation and speech synthesis.

The table below summarizes representative APIs that offered Speech-to-Speech models as of June, together with their characteristics. Essentially all of them support tool calls, so they can be used in an agent-like manner.

| API | Release | Models available as of early June | Distinguishing feature |
| OpenAI Realtime API | 2024.10 (WebSocket), 2024.12 (WebRTC) | gpt-4o-realtime-preview-2025-06-03, gpt-4o-mini-realtime-preview-2025-06-03 | Can be used over WebRTC |
| Google LiveAPI | 2025.4 (Preview) | gemini-2.0-flash-live-001, gemini-live-2.5-flash-preview, gemini-2.5-flash-preview-native-audio-dialog (improved intonation, emotion recognition), gemini-2.5-flash-exp-native-audio-thinking-dialog (Deep think version) | Specialized in real-time conversation through images and video, such as a PC camera or screen sharing |
| Azure Voice LiveAPI | 2025.5 (Preview) | gpt-4o-realtime-preview, gpt-4o-mini-realtime-preview, phi4-mm-realtime | Integrates with Azure speech tools (built-in and custom avatars and voices) |
| AWS SDK Bedrock API | 2025.4 | Amazon Nova Sonic | Specialized for use on AWS |

Going forward, I expect that real-time conversational models will be pluggable into every agent development framework. The picture is one of building low-latency voice conversational multi-agent systems by combining low-latency user-to-server and server-to-server communication over WebRTC, as with LiveKit, with Speech-to-Speech models and with external tools and agents from different vendors. The table below lists agent development kits from major AI vendors that you might use for this. In general, they can incorporate the real-time voice models introduced above and also provide external integration through MCP and A2A.

| SDK | Release |
| OpenAI Agents SDK | 2025.3 (Python version), 2025.6 (TypeScript version) |
| Google Agent Development Kit (ADK), Google Vertex AI Agents | ADK: 2025.4 |
| Azure AI Foundry Agent Service | 2025.5 (general availability) |
| AWS Strands Agents | 2025.5 |

That concludes Chapter 1. I hope it has conveyed the background, the benefits, and some of the tools currently available for voice agents. The next chapter digs a little deeper into the technical side: I show a demo actually built with one of the agent development kits listed above, and I hope it gives you a feel for the kit, the basic techniques, and what voice agents are like.

2. Building a Voice Conversational Multi-Agent Tool with the OpenAI Agents SDK (Python Version)

This chapter shows a demo of a voice conversational multi-agent system built with an agent development kit. Of the kits introduced in Section 1.3, I used the OpenAI Agents SDK (Python version), which became available comparatively early. I first introduce the basics and features of the OpenAI Agents SDK. Next, I introduce two common voice agent architectures, the Chained Architecture and the Speech-to-Speech Architecture. I then present the demo: the actual code, the overall picture of the multi-agent system, and videos of the demo running. Finally, I discuss the remaining issues and the outlook.

2.1 What Is the OpenAI Agents SDK?

The OpenAI Agents SDK is an open-source library for Python and TypeScript provided by OpenAI, designed to simplify the development of AI agents. Its basic features include the following.

- Handoffs: A mechanism by which an agent delegates to a specialist agent when it encounters a task beyond its own role, so that complex workflows proceed smoothly. The SDK was released before A2A was proposed, but the underlying idea is the same.
- Agents as tools (Agent as a tool): Other agents are used as tools, so that a query to an LLM can be made in the form of a function call.
- MCP: An extension mechanism that lets agents access external tools and execute specific functions.
- Function calling (Function calling / Tools): Python functions defined by the developer are provided to the AI agent as tools and executed as needed.
- Built-in tools: A standard set of tools such as web search, file search, and computer operation.
- Guardrails: Validate and control the agent's input and output to ensure safety and quality.
- Tracing: Visualizes and records the agent's execution flow in chronological order, making debugging and performance analysis easier.
- Streaming generation: Receives the output and events of a running agent sequentially, chunk by chunk.

The core technologies for this demo are handoffs, MCP, tools, guardrails, and streaming generation. I combine these features to build a multi-agent system with voice capabilities.

2.2 Two Voice Agent Architectures

The Voice agents page on the OpenAI Platform website introduces two architectures for voice multi-agent systems.

The first, the Chained Architecture with separate STT and TTS, sandwiches a text-based multi-agent system between a dedicated STT model and a dedicated TTS model. Its advantages are that the input and output of each stage are easy to manage and that it is easy to build.

Chained architecture: agent structure with separate STT and TTS (source: OpenAI platform, "Voice agents" ( https://platform.openai.com/docs/guides/voice-agents?voice-agent-architecture=speech-to-speech ))

The second, the Speech-to-Speech Architecture, refers to a multi-agent system built on the Speech-to-Speech models introduced in Section 1.3. Compared with the Chained Architecture, its advantages are lower latency and the ability to convey non-verbal elements such as emotion and tone of voice.

Speech-to-speech (realtime) architecture: Speech-to-Speech agent structure (source: OpenAI platform, "Voice agents" ( https://platform.openai.com/docs/guides/voice-agents?voice-agent-architecture=speech-to-speech ))

Which architecture to adopt needs to be decided according to the use case.
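The difference between the two architectures can be sketched with stand-in functions. The code below is a toy illustration, not the SDK's API: `Audio`, `stt`, `text_agent`, `tts`, and both `*_turn` functions are hypothetical names, and the "models" are trivial stubs. It only shows where the non-verbal information is lost in the chained case.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    words: str  # what was said
    tone: str   # non-verbal cue, e.g. "angry" or "neutral"


def stt(audio: Audio) -> str:
    """Stand-in speech-to-text: keeps the words, drops the tone."""
    return audio.words


def text_agent(text: str) -> str:
    """Stand-in for the text-based multi-agent system."""
    return f"Reply to: {text}"


def tts(text: str) -> Audio:
    """Stand-in text-to-speech with a fixed neutral voice."""
    return Audio(words=text, tone="neutral")


def chained_turn(audio: Audio) -> Audio:
    # Chained architecture: STT -> text agents -> TTS.
    return tts(text_agent(stt(audio)))


def speech_to_speech_turn(audio: Audio) -> Audio:
    # Speech-to-speech: a single model receives the audio itself,
    # so the caller's tone can influence how the reply is spoken.
    style = "apologetic" if audio.tone == "angry" else "neutral"
    return Audio(words=f"Reply to: {audio.words}", tone=style)


caller = Audio(words="My speaker will not turn on", tone="angry")
chained = chained_turn(caller)
s2s = speech_to_speech_turn(caller)
print(chained.tone, s2s.tone)
```

Both turns produce the same words here; only the speech-to-speech stub can react to the caller's tone, because the chained stub discards it at the transcription step.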
The OpenAI Agents SDK comes in a Python version and a TypeScript version, and at present the Python version supports only the first, the Chained Architecture. The TypeScript version can reportedly build Speech-to-Speech agents, but it was released only two or three weeks before this article was written, so unfortunately I cannot cover it here. For this article, I used the Python version of the SDK to build a multi-agent system with the Chained Architecture.

2.3 The Demo

Let's go through the steps of building the demo. Before writing any code, the first task is to decide on the setting of the call center. For example:

- Company name: any
- Products and services: 10 kinds of digital products
- Question types: product orders, product handling, complaints, completely unrelated questions
- Other: a response manual (for example, ask for the caller's name first; in this demo, the per-agent prompts stand in for it)

If you prepare multiple agents for a multi-agent system, one approach is to prepare one per question type. For this example, the following composition is conceivable. First, a triage agent answers the call and identifies the question type. Next, specialist agents handle product orders, product handling, complaints, and unrelated questions respectively. For completely unrelated questions in particular, the guardrails introduced in Section 2.1 are useful. A guardrail works like a special-purpose agent, and two kinds are provided: an input guardrail, which raises an error as a trigger when the question is inappropriate for the situation, and an output guardrail, which raises an error when the agent's output is inappropriate for the situation.

For this demo, I assumed that callers might ask inappropriate questions such as "What is 20+30?" or "Who was the first astronaut to land on the moon?", and gave the guardrail agent a prompt that rejects such questions.
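The input guardrail idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: `TripwireTriggered`, `input_guardrail`, and `answer` are hypothetical names, and a toy keyword check stands in for the LLM-based guardrail agent used in the demo.

```python
class TripwireTriggered(Exception):
    """Raised when the input guardrail rejects a question (hypothetical name)."""


# Toy stand-in for the guardrail agent's judgment: a few off-topic hints.
OFF_TOPIC_HINTS = ("+", "astronaut", "favorite color")


def input_guardrail(question: str) -> None:
    """Raise TripwireTriggered if the question looks unrelated to the call center."""
    lowered = question.lower()
    if any(hint in lowered for hint in OFF_TOPIC_HINTS):
        raise TripwireTriggered(question)


def answer(question: str) -> str:
    """Check the input first; only on-topic questions reach the agents."""
    try:
        input_guardrail(question)
    except TripwireTriggered:
        return "Sorry, I can't answer that question."
    return f"Routing to an agent: {question}"


print(answer("What is 20+30?"))
print(answer("I would like to order a tablet."))
```

The point of the pattern is that the rejection is an exception raised before the main agent's answer is used, so the caller of the workflow decides what to say instead.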
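
Based on the above, I defined the roles of the agents and the relationships between them as follows.

- Triage agent: First asks for the caller's name and question, and infers the question type from the question. Depending on the question type, it passes the caller's name and question type to the responsible agent as context and delegates the conversation (handoff). To update the context, I prepared a function that stores the name and question once the caller has stated them, and registered it as a tool call. Completely unrelated questions are not accepted: the guardrail agent is invoked and a message to the effect of "I can't answer that question" is output.
- Product order agent: Confirms the product the caller wants to buy and looks for it among the names of the product information text files collected in a folder named products. If a matching product exists, it confirms once more, and after obtaining the caller's consent it sends an order confirmation message to Slack. It then hands the work back to the triage agent.
- Product handling agent: Searches the individual product information text files in the products folder for information about the product the caller owns, extracts the parts likely to answer the question, and composes an answer. If it cannot answer, it replies "I'm sorry, but I can't answer that." It then hands the work back to the triage agent. (Reconnecting to a human operator would be another option.)
- Error, trouble, and complaint agent: Responds to the issue the caller raises. If it concerns a product, it searches the product information text and composes an answer. If it cannot answer, it replies "I'm sorry, but I can't answer that." It then hands the work back to the triage agent.

The tool and MCP servers used are as follows.

- Tool: update_customer_info (the triage agent updates the caller's name and passes it on to the other agents)
- MCP: Filesystem Server MCP (allows operations on the contents of a specified folder), SSE Slack API Server (a bot is invited to a Slack channel I prepared so that the bot can post there)

Finally, the agents are integrated into the voice model pipeline and configured to answer with streaming.

That is everything. Now let's look at the overview diagram and the code of the agent system I built first.

Overview diagram of the first voice conversational multi-agent system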
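Before the full listing, the routing idea described above (a triage step that records the caller's details in a shared context and then hands off by question type) can be sketched without the SDK. Everything here is a hypothetical stand-in: `CallContext`, `SPECIALISTS`, and `triage` are illustrative names, and the question-type keys are assumptions rather than values used by the demo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallContext:
    """Shared state that travels with the conversation across handoffs."""
    customer_name: Optional[str] = None
    question_type: Optional[str] = None


# Hypothetical mapping from question type to the specialist that handles it.
SPECIALISTS = {
    "order": "product order agent",
    "how_to": "product handling agent",
    "trouble": "error, trouble, and complaint agent",
}


def update_customer_info(ctx: CallContext, customer_name: str, question_type: str) -> None:
    """Record what the caller said so that later agents can read it."""
    ctx.customer_name = customer_name
    ctx.question_type = question_type


def triage(ctx: CallContext, customer_name: str, question_type: str) -> str:
    """Update the context, then pick the agent to hand off to."""
    update_customer_info(ctx, customer_name, question_type)
    # Unknown types stay with the triage agent instead of being misrouted.
    return SPECIALISTS.get(question_type, "triage agent")


ctx = CallContext()
next_agent = triage(ctx, "Tanaka", "order")
print(next_agent, ctx)
```

Keeping the name and question type in a context object rather than in the transcript means each specialist can rely on structured fields instead of re-asking the caller.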
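
# Example product information text file (created with Claude)
Product ID: PROD_004

[Basic information]
Product name: Smart Speaker D47 Air
Model number: スD-6658
Release year: 2022
Manufacturer: Innovation Kobo
Factory address: 1-5-6 Kakucho, Chuo-ku, Sapporo, Hokkaido

[Dimensions]
Height: 7.9 cm
Width: 7.2 cm
Depth: 2.9 cm
Weight: 625 g

[Color options]
- Silver
- White
- Green

Price: 114,800 yen
Warranty period: 36 months

[User manual overview]
1. Initial setup: Turn on the product and follow the on-screen instructions to complete the initial setup.
2. Basic operation: Describes the main functions of the Smart Speaker D47 Air and how to operate them.
3. Charging: Charge with the supplied dedicated charger or another recommended method. Includes tips for extending battery life.
4. Troubleshooting: A step-by-step guide to solving simple problems.
5. Safety precautions: Important information for using the product safely.

[Support information]
■ Frequently asked questions
Q: What should I do if the Smart Speaker D47 Air does not turn on?
A: First, check that the product is sufficiently charged. Next, try a forced restart by holding down the power button for at least 10 seconds. If that does not solve the problem, please contact the support center.
Q: What is the warranty period of the Smart Speaker D47 Air?
A: The warranty period of the Smart Speaker D47 Air is normally 12 months from the date of purchase. See the warranty card for details.

■ Error codes
E301: Network connection error. Check the connection settings.
E302: Insufficient storage. Delete unnecessary data.
E203: Low battery. Charge the device.

■ Support center
Phone: 0120-12x-26x (hours: weekdays 9:00-18:00)
Email: support.d-6658@example-company.co.jp
Website: http://www.example-company.co.jp/support/スd-6658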
é»ããŠãã ããã â ãµããŒãã»ã³ã¿ãŒ é»è©±: 0120-12x-26x (å仿é: å¹³æ¥9:00-18:00) ã¡ãŒã«: support.d-6658@example-company.co.jp ãŠã§ããµã€ã: http://www.example-company.co.jp/support/ã¹d-6658 # config.py import numpy as np from pydantic import BaseModel MODEL = "gpt-4o-mini" SAMPLE_RATE = 24000 FORMAT = np.int16 CHANNELS = 1 VOICE_INSTRUCTION = "ããªãã¯ãã³ãŒã«ã»ã³ã¿ãŒã®ãšãŒãžã§ã³ãã§ããäžå¯§ãªæ¥æ¬èªã§è©±ããŠãã ããã" VOICE_SPEED = 1.0 PRODUCTS_LIST = [ "ã¿ãã¬ãã A68 Air" , "ã¹ããŒããŠã©ãã B27 Max" , "ã¹ããŒããã©ã³ C82 Lite" , "ã¹ããŒãã¹ããŒã«ãŒ D47 Air" , "ã¹ããŒããã©ã³ E51 Mini" , "ã¹ããŒãã¹ããŒã«ãŒ F29 Pro" , "ã¹ããŒããã©ã³ G81 Standard" , "ã¯ã€ã€ã¬ã¹ã€ã€ãã³ H61 Air" , "ã¯ã€ã€ã¬ã¹ã€ã€ãã³ I79 Pro" , "ã²ãŒã æ© J87 Max" ] JA_RECOMMENDED_PROMPT_PREFIX = """ #ã·ã¹ãã ã³ã³ããã¹ã \n ããªãã¯ããšãŒãžã§ã³ãã®å調ãšå®è¡ãç°¡åã«ããããã«èšèšããããã«ããšãŒãžã§ã³ãã·ã¹ãã ãAgents SDKãã®äžéšã§ãã Agentsã¯äž»ã«2ã€ã®æœè±¡æŠå¿µã**Agent**ãš**Handoffs**ã䜿çšããŸãããšãŒãžã§ã³ãã¯æç€ºãšããŒã«ãå«ã¿ãé©åãªã¿ã€ãã³ã°ã§äŒè©±ãä»ã®ãšãŒãžã§ã³ãã«åŒãç¶ãããšãã§ããŸãã ãã³ããªãã¯éåžž transfer_to_<agent_name> ãšããååã®ãã³ããªã颿°ãåŒã³åºãããšã§å®çŸãããŸãããšãŒãžã§ã³ãéã®åŒãç¶ãã¯ããã¯ã°ã©ãŠã³ãã§ã·ãŒã ã¬ã¹ã«åŠçãããŸãã ãŠãŒã¶ãŒãšã®äŒè©±ã®äžã§ããããã®åŒãç¶ãã«ã€ããŠèšåããããæ³šæãåŒãããããªãã§ãã ããã \n """ # CONTEXT class CallCenterAgentContext (BaseModel): customer_name: str | None = None question_type: str | None = None # my_workflow.py from __future__ import annotations import os import uuid from collections.abc import AsyncIterator from typing import Callable from agents import (Agent, GuardrailFunctionOutput, InputGuardrailTripwireTriggered, RunContextWrapper, Runner, TResponseInputItem, function_tool, input_guardrail, trace) from agents.mcp import MCPServerStdio from agents.voice import VoiceWorkflowBase, VoiceWorkflowHelper from config import JA_RECOMMENDED_PROMPT_PREFIX, MODEL, CallCenterAgentContext from pydantic import BaseModel, Field # TOOLS @ function_tool async def update_customer_info ( context: RunContextWrapper[CallCenterAgentContext], customer_name: str , 
question_type: str ) -> None : """ Update the customer information. Args: customer_name: The name of the customer. question_type: The type of question being asked. """ # Update the context based on the customer's input context.context.customer_name = customer_name context.context.question_type = question_type # Guardrails class AbnormalOutput (BaseModel): reasoning: str | None = Field( default= None , description= "ç°åžžãªè³ªåãã©ããã®çç±" ) is_abnormal: bool = Field(default= False , description= "ç°åžžãªè³ªåãã©ãã" ) guardrail_agent = Agent( name= "Guardrail check" , instructions=( "ã«ã¹ã¿ããŒãã³ãŒã«ã»ã³ã¿ãŒã«ããªããããªè³ªåãããŠãããã©ããã確èªããŠãã ããã" "ããšãšãããããªãã®å¥œããªè²ã¯äœã§ããïŒãããããªãã®è¶£å³ã¯äœã§ããïŒããªã©ã®è³ªåã¯ãã³ãŒã«ã»ã³ã¿ãŒã«ããã¹ãã§ã¯ãããŸããã" "ä»ã«ãã210ãã4ã¯ïŒããšãã£ãèšç®åé¡ããã仿¥ã®çµæžãã¥ãŒã¹ã¯ïŒããšãã£ãäžè¬çãªéè«ãã³ãŒã«ã»ã³ã¿ãŒã«ããã¹ãã§ã¯ãããŸããã" "ãã®ãããªè³ªåãèŠã€ããããis_abnormalãTrueã«ããŠãã ããã" ), output_type=AbnormalOutput, model=MODEL, ) @ input_guardrail async def abnormal_guardrail ( context: RunContextWrapper[ None ], agent: Agent, input : str | list [TResponseInputItem] ) -> GuardrailFunctionOutput: """This is an input guardrail function, which happens to call an agent to check if the input is a abnormal question. """ result = await Runner.run(guardrail_agent, input , context=context.context) final_output = result.final_output_as(AbnormalOutput) return GuardrailFunctionOutput( output_info=final_output, tripwire_triggered=final_output.is_abnormal, ) # Voice Call Center Workflow class VoiceCallCenterWorkflow (VoiceWorkflowBase): def __init__ (self, on_start: Callable[[ str ], None ], tts_output: Callable[[ str ], None ], on_agent_change: Callable[[ str ], None ] = None , on_context_change: Callable[[CallCenterAgentContext], None ] = None ): """ Args: on_start: A callback that is called when the workflow starts. The transcription is passed in as an argument. tts_output: A callback that is called when the TTS output is generated. 
on_agent_change: A callback that is called when the agent changes. on_context_change: A callback that is called when the context changes. """ self._input_history: list [TResponseInputItem] = [] self._context = CallCenterAgentContext() self._conversation_id = uuid.uuid4().hex[: 16 ] self._on_start = on_start self._tts_output = tts_output self._on_agent_change = on_agent_change self._on_context_change = on_context_change self._current_agent = None self._agents_initialized = False async def _initialize_agents (self): """MCPãµãŒããŒãåæåããŠãšãŒãžã§ã³ããèšå®""" if self._agents_initialized: return try : # MCPãµãŒããŒã®åæå self.file_mcp_server = MCPServerStdio( name= "Filesystem Server, via npx" , params={ "command" : "npx" , "args" : [ "-y" , "@modelcontextprotocol/server-filesystem" , "path/to/products" ] } ) self.slack_mcp_server = MCPServerStdio( name= "SSE Slack API Server" , params={ "command" : "npx" , "args" : [ "-y" , "@modelcontextprotocol/server-slack" ], "env" : { "SLACK_BOT_TOKEN" : os.environ.get( "SLACK_BOT_TOKEN" ), "SLACK_TEAM_ID" : os.environ.get( "SLACK_TEAM_ID" ), "SLACK_CHANNEL_IDS" : os.environ.get( "SLACK_CHANNEL_ID" ), } } ) # MCPãµãŒããŒãéå§ await self.file_mcp_server.__aenter__() await self.slack_mcp_server.__aenter__() # ãšãŒãžã§ã³ãã®åæå self.error_trouble_agent = Agent[CallCenterAgentContext]( name= "ãšã©ãŒã»ãã©ãã«ã»ã¯ã¬ãŒã 察å¿ãšãŒãžã§ã³ã" , handoff_description= "ãšã©ãŒã»ãã©ãã«ã»ã¯ã¬ãŒã 察å¿ãšãŒãžã§ã³ãã¯ãååã®ãšã©ãŒããã©ãã«ãã¯ã¬ãŒã ã«é¢ãã質åã«å¯Ÿå¿ã§ããŸãã" , instructions=f """{JA_RECOMMENDED_PROMPT_PREFIX} ããªãã¯ãšã©ãŒã»ãã©ãã«ã»ã¯ã¬ãŒã 察å¿ãšãŒãžã§ã³ãã§ãããã顧客ãšè©±ããŠããå Žåãããªãã¯ããããããªã¢ãŒãžãšãŒãžã§ã³ãããä»äºãå§è²ãããŸããã ã³ãŒã«ã»ã³ã¿ãŒããã¥ã¢ã«ãšã以äžã®ã«ãŒãã³ã«åŸã£ãŠé¡§å®¢ã®è³ªåã«å¯Ÿå¿ããŠãã ããã # ã«ãŒãã³ 1. 顧客ãã©ã®ååã®ãã©ã®ãããªãšã©ãŒããã©ãã«ã«ã€ããŠè³ªåããŠãããã確èªããŸããã¯ã¬ãŒã ã§ããã°ãã©ã®ãããªã¯ã¬ãŒã ãã確èªããããã¥ã¢ã«ã«åŸã£ãŠå¯Ÿå¿ããŠãã ããã 2. ç¹å®ã®ååã«é¢ãããã®ã§ããå Žåãfile_mcp_serverã§æäŸãããŠãããã£ã¬ã¯ããªã®ãã¡ã€ã«ã®äžã«ãäžèŽããããã¹ããã¡ã€ã«ããããã©ããã確èªããŸãã 3. 
ããå Žåããã®ããã¹ããã¡ã€ã«ã®äžããã顧客ã®è³ªåã«çããããæ
å ±ãæœåºããåçããŠãã ããã質åã®å
容ãçããããªãå Žåã¯ããç³ãèš³ãããŸããããããã€ããŠã¯ãçãã§ããŸãããããšäŒããŸãã 4. ãµããŒãã»ã³ã¿ãŒã®é»è©±çªå·ãã¡ãŒã«ã¢ãã¬ã¹ãæžãããŠããå Žåã¯ã顧客ã«ãã®æ
å ±ãäŒããSlackã®ãã£ã³ãã«ã«ãã®å
容ãéä¿¡ããŠãã ããã 5. ãªãå Žåããç³ãèš³ãããŸãããããã®ãšã©ãŒããã©ãã«ã«ã€ããŠã¯ãçãã§ããŸãããããšäŒããŸãã ãã顧客ãã«ãŒãã³ã«é¢é£ããªã質åãããå Žåããããã倧äžå€«ã§ãããšããå
容ããã£ãå Žåã¯ãããªã¢ãŒãžãšãŒãžã§ã³ãã«åŒãç¶ããŸãã """ , mcp_servers=[self.file_mcp_server, self.slack_mcp_server], ) self.how_to_agent = Agent[CallCenterAgentContext]( name= "åååãæ±ããšãŒãžã§ã³ã" , handoff_description= "åååãæ±ããšãŒãžã§ã³ãã¯ãååã«é¢ãã質åã«çããããšãã§ããŸãã" , instructions=f """{JA_RECOMMENDED_PROMPT_PREFIX} ããªãã¯åååãæ±ããšãŒãžã§ã³ãã§ãããã顧客ãšè©±ããŠããå Žåãããªãã¯ããããããªã¢ãŒãžãšãŒãžã§ã³ãããä»äºãå§è²ãããŸããã 顧客ããµããŒãããããã«ã以äžã®ã«ãŒãã³ã䜿çšããŠãã ããã # ã«ãŒãã³ 1. 顧客ãã©ã®ãããªååã«ã€ããŠè³ªåããŠãããã確èªããŸãã 2. file_mcp_serverã§æäŸãããŠãããã£ã¬ã¯ããªã®ãã¡ã€ã«ã®äžã«ãäžèŽããããã¹ããã¡ã€ã«ããããã©ããã確èªããŸãã 3. ããå Žåããã®ããã¹ããã¡ã€ã«ã®äžããã顧客ã®è³ªåã«çããããæ
å ±ãæœåºããåçããŠãã ããã質åã®å
容ãçããããªãå Žåã¯ããç³ãèš³ãããŸããããããã€ããŠã¯ãçãã§ããŸãããããšäŒããŸãã 4. ãªãå Žåããç³ãèš³ãããŸãããããã®ååã¯åãæ±ã£ãŠãããŸãããããšäŒããŸãã ãã顧客ãã«ãŒãã³ã«é¢é£ããªã質åãããå Žåããããã倧äžå€«ã§ãããšããå
容ããã£ãå Žåã¯ãããªã¢ãŒãžãšãŒãžã§ã³ãã«åŒãç¶ããŸãã """ , mcp_servers=[self.file_mcp_server], ) self.order_agent = Agent[CallCenterAgentContext]( name= "ååæ³šæã»è³Œå
¥å¯Ÿå¿ãšãŒãžã§ã³ã" , handoff_description= "ååæ³šæã»è³Œå
¥ã«é¢ãã質åã«çãããšãŒãžã§ã³ãã§ãã" , instructions=f """{JA_RECOMMENDED_PROMPT_PREFIX} ããªãã¯ååæ³šæã»è³Œå
¥å¯Ÿå¿ãšãŒãžã§ã³ãã§ãããã顧客ãšè©±ããŠããå Žåãããªãã¯ããããããªã¢ãŒãžãšãŒãžã§ã³ãããä»äºãå§è²ãããŸããã 顧客ããµããŒãããããã«ã以äžã®ã«ãŒãã³ã䜿çšããŠãã ããã # ã«ãŒãã³ 1. 顧客ãã©ã®ãããªååã賌å
¥ããããã確èªããŸãã 2. file_mcp_serverã§æäŸãããŠãããã£ã¬ã¯ããªã®ãã¡ã€ã«ã®äžã«ãäžèŽããããããã¯é¡äŒŒããããã¹ããã¡ã€ã«ããããã©ããã確èªããŸããããšãã°ããã¹ãããã®ããã«ã¹ããŒããã©ã³ã®ç¥ç§°ã䜿ã£ãŠããå Žåããåååã®äžéšãç°ãªãå Žåãªã©ã§ãã 3. ããå ŽåãäžåºŠé¡§å®¢ã«ç¢ºèªã®ããã<åå>ã§ãããæ³šæããŠãããããã§ããïŒããšå°ããŸããåæãåŸãããslack_file_mcp_serverã§#泚æç®¡çã«ã<ååå>ãæ³šæããŸãããããšéä¿¡ããŠãã ãããæåŠãããããããªã¢ãŒãžãšãŒãžã§ã³ãã«åŒãç¶ããŸãã 4. ãªãå Žåããç³ãèš³ãããŸãããããã®ååã¯åãæ±ã£ãŠãããŸãããããšäŒããŸããå°ãã ãã§ã䌌ãŠããååã®ååãããå Žåã¯ãã<䌌ãŠããååå>ã¯ãããŸããã<ååå>ã¯ãããŸãããããšäŒããŸãã ãã顧客ãã«ãŒãã³ã«é¢é£ããªã質åãããå Žåããããã倧äžå€«ã§ãããããããŸããããšããå
容ããã£ãå Žåã¯ãããªã¢ãŒãžãšãŒãžã§ã³ãã«åŒãç¶ããŸãã """ , mcp_servers=[self.file_mcp_server, self.slack_mcp_server], ) self.triage_agent = Agent[CallCenterAgentContext]( name= "ããªã¢ãŒãžãšãŒãžã§ã³ã" , instructions=( f "{JA_RECOMMENDED_PROMPT_PREFIX} " "ããªãã¯åªç§ãªããªã¢ãŒãžãšãŒãžã§ã³ãã§ãã ããªãã¯ã顧客ã®ãªã¯ãšã¹ããé©åãªãšãŒãžã§ã³ãã«å§ä»»ããããšãã§ããŸãã \n " "顧客ã®è³ªåãã³ãŒã«ã»ã³ã¿ãŒã«ããªããããªè³ªåãããŠãããããããªãå Žåã¯ãã¬ãŒãã¬ãŒã«ãšãŒãžã§ã³ãã䜿çšããŠãã ããã \n " "顧客ã®ååããå
ã«è³ªåãæ¥ãå Žåã質åãèšæ¶ãã€ã€ãååãèããupdate_customer_infoãåŒã³åºããŠãã ããã \n " "顧客ã®è³ªåã¯ã以äžã®3ã€ã®ã«ããŽãªã«åããããŸãã \n " "1. ååã®åãæ±ãã«é¢ãã質å \n " "2. ååã®æ³šæã»è³Œå
¥ã«é¢ãã質å \n " "3. ãšã©ãŒã»ãã©ãã«ã»ãµããŒãã«é¢ãã質å \n " "é©åãªãšãŒãžã§ã³ãã«åŒãç¶ãã§ãã ããã" ), handoffs=[ self.how_to_agent, self.order_agent, self.error_trouble_agent, ], input_guardrails=[abnormal_guardrail], tools=[update_customer_info], ) # åã³ããªã¢ãŒãžãšãŒãžã§ã³ãã«æ»ãããã®ãã³ããªã self.order_agent.handoffs.append(self.triage_agent) self.how_to_agent.handoffs.append(self.triage_agent) self.error_trouble_agent.handoffs.append(self.triage_agent) self._current_agent = self.triage_agent self._agents_initialized = True except Exception as e: print (f "ãšãŒãžã§ã³ãåæåãšã©ãŒ: {e}" ) async def run (self, transcription: str ) -> AsyncIterator[ str ]: self._on_start(transcription) # ãšãŒãžã§ã³ãã®åæå(åºæ¬çã«ã¯äžåºŠã ã) await self._initialize_agents() # Add the transcription to the input history self._input_history.append( { "role" : "user" , "content" : transcription, } ) try : with trace( "Customer service" , group_id=self._conversation_id): # Run the agent current_context_customer = self._context.customer_name current_context_question_type = self._context.question_type result = Runner.run_streamed(self._current_agent, self._input_history, context=self._context) full_response = "" async for chunk in VoiceWorkflowHelper.stream_text_from(result): full_response += chunk yield chunk self._tts_output(full_response) if self._context.customer_name != current_context_customer or self._context.question_type != current_context_question_type: if self._on_context_change: self._on_context_change(self._context.customer_name, self._context.question_type) # Update the input history and current agent self._input_history = result.to_input_list() if self._current_agent != result.last_agent: self._current_agent = result.last_agent if self._on_agent_change: self._on_agent_change(self._current_agent.name) except InputGuardrailTripwireTriggered as e: message = "ãã¿ãŸããããã®è³ªåã«ã¯ãçãã§ããŸããã" self._tts_output(message) # ã¬ãŒãã¬ãŒã«äœåã®éç¥ if self._on_agent_change: self._on_agent_change( 
"ã¬ãŒãã¬ãŒã«äœå" ) self._input_history.append( { "role" : "assistant" , "content" : message, } ) self._current_agent = self.triage_agent if self._on_agent_change: self._on_agent_change(self._current_agent.name) yield message except Exception as e: error_message = f "ç³ãèš³ãããŸãããã·ã¹ãã ãšã©ãŒãçºçããŸãã: {str(e)}" self._tts_output(error_message) yield error_message async def cleanup (self): """ãªãœãŒã¹ã®ã¯ãªãŒã³ã¢ãã""" try : if hasattr (self, 'file_mcp_server' ): await self.file_mcp_server.__aexit__( None , None , None ) if hasattr (self, 'slack_mcp_server' ): await self.slack_mcp_server.__aexit__( None , None , None ) except Exception as e: print (f "ã¯ãªãŒã³ã¢ãããšã©ãŒ: {e}" ) # main.py from __future__ import annotations import asyncio import shutil import sounddevice as sd from agents.voice import (StreamedAudioInput, StreamedAudioResult, STTModelSettings, TTSModelSettings, VoicePipeline, VoicePipelineConfig) from config import (CHANNELS, FORMAT, SAMPLE_RATE, VOICE, VOICE_INSTRUCTION, VOICE_SPEED) from dotenv import load_dotenv from my_workflow import VoiceCallCenterWorkflow from textual import events from textual.app import App, ComposeResult from textual.containers import Container from textual.reactive import reactive from textual.widgets import Button, RichLog, Static from typing_extensions import override load_dotenv() # UI Components class Header (Static): """A header widget.""" session_id = reactive( "" ) current_agent = reactive( "ããªã¢ãŒãžãšãŒãžã§ã³ã" ) @ override def render (self) -> str : return f "é³å£°ã³ãŒã«ã»ã³ã¿ãŒ | çŸåšã®ãšãŒãžã§ã³ã: {self.current_agent}" class AudioStatusIndicator (Static): """A widget that shows the current audio recording status.""" is_recording = reactive( False ) @ override def render (self) -> str : status = ( "ðŽ é²é³äž... 
(KããŒã§åæ¢)" if self.is_recording else "⪠KããŒã§é²é³éå§ (QããŒã§çµäº)" ) return status # Main Application class VoiceCallCenterApp (App[ None ]): CSS = """ Screen { background: #1a1b26; /* Dark blue-grey background */ } Container { border: double rgb(91, 164, 91); } Horizontal { width: 100%; } #input-container { height: 5; /* Explicit height for input container */ margin: 1 1; padding: 1 2; } Input { width: 80%; height: 3; /* Explicit height for input */ } Button { width: 20%; height: 3; /* Explicit height for button */ } #bottom-pane { width: 100%; height: 82%; /* Reduced to make room for session display */ border: round rgb(205, 133, 63); content-align: center middle; } #status-indicator { height: 3; content-align: center middle; background: #2a2b36; border: solid rgb(91, 164, 91); margin: 1 1; } #session-display { height: 3; content-align: center middle; background: #2a2b36; border: solid rgb(91, 164, 91); margin: 1 1; } Static { color: white; } """ should_send_audio: asyncio.Event audio_player: sd.OutputStream last_audio_item_id: str | None connected: asyncio.Event def __init__ (self) -> None : super ().__init__() self.last_audio_item_id = None self.should_send_audio = asyncio.Event() self.connected = asyncio.Event() self.workflow = VoiceCallCenterWorkflow( on_start=self._on_transcription, tts_output=self._tts_output, on_agent_change=self._on_agent_change, on_context_change=self._on_context_change, ) self.voice_config = VoicePipelineConfig( tts_settings=TTSModelSettings( speed=VOICE_SPEED, instructions=VOICE_INSTRUCTION, ), stt_settings=STTModelSettings( turn_detection={ "type" : "server_vad" , "threshold" : 0.5 , "prefix_padding_ms" : 300 , "silence_duration_ms" : 1000 , } ), ) self.pipeline = VoicePipeline(workflow=self.workflow, config=self.voice_config) self._audio_input = StreamedAudioInput() self.audio_player = sd.OutputStream( samplerate=SAMPLE_RATE, channels=CHANNELS, dtype=FORMAT, ) def _on_transcription (self, transcription: str ) -> None : try : 
self.query_one( "#bottom-pane" , RichLog).write( f "ããªã: {transcription}" ) except Exception : pass def _tts_output (self, text: str ) -> None : try : self.query_one( "#bottom-pane" , RichLog).write(f "ãšãŒãžã§ã³ãå¿ç: {text}" ) except Exception : pass def _on_agent_change (self, agent_name: str ) -> None : try : header = self.query_one( "#session-display" , Header) header.current_agent = agent_name self.query_one( "#bottom-pane" , RichLog).write(f "ð ãšãŒãžã§ã³ãåãæ¿ã: {agent_name}" ) except Exception : pass def _on_context_change (self, customer_name: str , question_type: str ) -> None : try : self.query_one( "#bottom-pane" , RichLog).write( f "ð ã³ã³ããã¹ã倿Ž: 顧客å={customer_name}, 質åã¿ã€ã={question_type}" ) except Exception : pass @ override def compose (self) -> ComposeResult: """Create child widgets for the app.""" with Container(): yield Header( id = "session-display" ) yield AudioStatusIndicator( id = "status-indicator" ) yield RichLog( id = "bottom-pane" , wrap= True , highlight= True , markup= True ) async def on_mount (self) -> None : self.run_worker(self.start_voice_pipeline()) self.run_worker(self.send_mic_audio()) async def start_voice_pipeline (self) -> None : try : self.audio_player.start() self.result: StreamedAudioResult = await self.pipeline.run( self._audio_input ) async for event in self.result.stream(): bottom_pane = self.query_one( "#bottom-pane" , RichLog) if event.type == "voice_stream_event_audio" : self.audio_player.write(event.data) # Play the audio elif event.type == "voice_stream_event_lifecycle" : bottom_pane.write(f "ã©ã€ããµã€ã¯ã«ã€ãã³ã: {event.event}" ) except Exception as e: bottom_pane = self.query_one( "#bottom-pane" , RichLog) bottom_pane.write(f "ãšã©ãŒ: {e}" ) finally : self.audio_player.close() # ã¯ãªãŒã³ã¢ãã await self.workflow.cleanup() async def send_mic_audio (self) -> None : device_info = sd.query_devices() print (device_info) read_size = int (SAMPLE_RATE * 0.02 ) stream = sd.InputStream( channels=CHANNELS, 
            samplerate=SAMPLE_RATE,
            dtype="int16",
        )
        stream.start()

        status_indicator = self.query_one(AudioStatusIndicator)

        try:
            while True:
                # Yield to the event loop until a full chunk is available.
                if stream.read_available < read_size:
                    await asyncio.sleep(0)
                    continue

                # Block here while recording is paused (toggled with the K key).
                await self.should_send_audio.wait()
                status_indicator.is_recording = True

                data, _ = stream.read(read_size)

                await self._audio_input.add_audio(data)
                await asyncio.sleep(0)
        except KeyboardInterrupt:
            pass
        finally:
            stream.stop()
            stream.close()

    async def on_key(self, event: events.Key) -> None:
        """Handle key press events."""
        if event.key == "enter":
            # NOTE: compose() above does not yield a Button, so this lookup is
            # expected to fail if Enter is pressed.
            self.query_one(Button).press()
            return

        if event.key == "q":
            await self.workflow.cleanup()  # Clean up before exiting
            self.exit()
            return

        if event.key == "k":
            status_indicator = self.query_one(AudioStatusIndicator)
            if status_indicator.is_recording:
                self.should_send_audio.clear()
                status_indicator.is_recording = False
            else:
                self.should_send_audio.set()
                status_indicator.is_recording = True


if __name__ == "__main__":
    if not shutil.which("npx"):
        raise RuntimeError("npx is not installed. Please install it with `npm install -g npx`.")
    app = VoiceCallCenterApp()
    app.run()

In main.py, the VoiceCallCenterWorkflow created in my_workflow.py is plugged into the VoicePipeline. The front end uses Textual, a framework for building terminal UIs in Python.

Now let's look at three demos of this code actually running.

The first video shows a question about how to handle a product. The response is slow enough to feel awkward, but MCP appears to be working properly.

The second video shows a product order. The agent organizes the product information through MCP and goes as far as sending the order email.

The third is an example that deliberately tries to trip the guardrail... and it fails. Why is that?

The likely cause is that streaming generation and input guardrails do not work well together. I will explain using the figure below.
The input guardrail runs only after speech recognition has completely finished. The speech synthesis side, by contrast, has the agent's LLM generate its answer while speech recognition is still progressing incrementally, and tries to output that answer chunk by chunk. As a result, output is produced before the input guardrail can flag the problem, so the LLM ended up answering freely before the guardrail could stop it.

Figure: Guardrails do not work well with streaming generation

So I learned that when building a voice multi-agent system with streaming generation, it is better not to use an input guardrail agent. As a workaround, the guardrail can be reproduced directly in the triage agent's prompt. Below are an overview diagram of the agents after detaching the guardrail agent from the triage agent and changing the prompt, and the code for only the part that changed.

Figure: Overview of the voice multi-agent system after the change

    # Remove the guardrail part of my_workflow.py and its link to the triage agent.
    # Instead, modify the prompt.
    self.triage_agent = Agent[CallCenterAgentContext](
        name="Triage Agent",
        instructions=(
            f"{JA_RECOMMENDED_PROMPT_PREFIX} "
            "You are an excellent triage agent. You can delegate customer requests to the appropriate agent.\n"
            "If the customer asks a question that is not appropriate for a call center, tell them: 'Sorry, I cannot answer that question.' "
            "Questions that are not appropriate for a call center include general knowledge, small talk, and calculation problems.\n"
            "For example, questions such as 'What is your favorite color?' or 'What is 210 plus 4?' should not be put to a call center.\n"
            "Even among questions about the company, ones such as 'When was this company founded?' should not be put to a call center.\n"
            "To answer the customer's question, call update_customer_info to save the customer's name and question type.\n"
            "Even if you only know the customer's name, call update_customer_info and save the question as None.\n"
            "If the question comes before the customer's name, call update_customer_info and save the customer's name as None. Then ask for the customer's name.\n"
            "If the question type changes partway through the conversation, call update_customer_info to update it.\n"
            "Customer questions fall into the following four categories.\n"
            "1. Questions about how to handle products\n"
            "2. Questions about ordering or purchasing products\n"
            "3. Questions about errors, trouble, or complaints\n"
            "4. Other questions that cannot be answered or that require specialist knowledge\n"
            "Hand off to the appropriate agent.\n"
        ),
        handoffs=[
            self.how_to_agent,
            self.order_agent,
            self.error_trouble_agent,
        ],
        tools=[update_customer_info],
    )

Let's look at the demo after making these changes.

Even without a guardrail agent, the guardrail behavior is reproduced nicely!

The remaining issue is, unsurprisingly, latency. It takes more than 10 seconds to get the first answer, and subjectively about 6 to 7 seconds for the second and later answers, so the experience is honestly not good.

However, switching from the Chained Architecture to the Speech-to-Speech Architecture is expected to improve this latency.
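Both observations in this section, the guardrail that reacted too late and the long wait before the first answer, come down to timing. The following is a minimal simulation sketch of that trade-off, not the Agents SDK's actual implementation: `input_guardrail`, `stream_answer`, `run_streaming`, `run_blocking`, and all delay values are hypothetical stand-ins chosen only for illustration.

```python
import asyncio
import time

# Hypothetical timings in seconds, chosen only for illustration.
GUARDRAIL_DELAY = 0.5   # time the guardrail needs to reach a verdict
CHUNK_INTERVAL = 0.01   # time between streamed answer chunks
ANSWER = ["My", "favorite color", "is blue."]


async def input_guardrail(text: str) -> bool:
    """Return True when the input should be blocked (stand-in for an LLM check)."""
    await asyncio.sleep(GUARDRAIL_DELAY)
    return "favorite color" in text


async def stream_answer():
    """Yield answer chunks one at a time (stand-in for a streaming LLM)."""
    for chunk in ANSWER:
        await asyncio.sleep(CHUNK_INTERVAL)
        yield chunk


async def run_streaming(text: str) -> dict:
    """Start the guardrail and the streamed answer at the same time."""
    start = time.perf_counter()
    spoken, first_chunk_at = [], None
    guard = asyncio.create_task(input_guardrail(text))
    async for chunk in stream_answer():
        if guard.done() and guard.result():
            break  # verdict arrived: stop speaking
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
        spoken.append(chunk)  # a real pipeline would already be playing audio
    return {"spoken": spoken, "blocked": await guard, "first_chunk_at": first_chunk_at}


async def run_blocking(text: str) -> dict:
    """Wait for the guardrail verdict before generating anything."""
    start = time.perf_counter()
    spoken, first_chunk_at = [], None
    blocked = await input_guardrail(text)
    if not blocked:
        async for chunk in stream_answer():
            if first_chunk_at is None:
                first_chunk_at = time.perf_counter() - start
            spoken.append(chunk)
    return {"spoken": spoken, "blocked": blocked, "first_chunk_at": first_chunk_at}


if __name__ == "__main__":
    off_topic = "What is your favorite color?"
    print("streaming:", asyncio.run(run_streaming(off_topic)))
    print("blocking: ", asyncio.run(run_blocking(off_topic)))
```

With these assumed timings, the streaming variant has already emitted every chunk by the time the guardrail reports a violation, while the blocking variant emits nothing for the off-topic question but delays the first chunk of every legitimate answer by the guardrail's full duration. Moving the check into the triage agent's prompt, as done above, avoids running a separate check at all.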
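Taking into account how much delay is actually acceptable at a call center staffed by humans, I felt we need to think about an experience that reduces latency or keeps users from noticing it, and to design it while incorporating a range of opinions on the way to practical use.

2.4 Future outlook

The previous section covered the performance and issues of a voice multi-agent system built on the Chained Architecture. I think the main takeaway this time was being able to pin down the issues that voice agents have under this architecture. As mentioned in the previous section, one way to reduce the delay is to change the structure to a Speech-to-Speech Architecture based on a real-time conversational model. If I get the chance, I would like to report how much the delay improves after switching to that structure.

This is a somewhat bigger topic, but to close the chapter, let me say a little about where AI agents are headed. I built an AI agent for the first time in this part-time job, and I felt that simply adding voice to a text-based agent immediately strengthened the sense that "someone is there on the other side." I believe AI agents will become more and more human-like from here on. As concrete examples, applications of avatar-based AI agents modeled on people or characters, and of robot-based AI agents, may become more active. After that, we may see a future where customized, company-specific digital humans shape a company's new brand, or where robots cooperate with each other to support work. While looking forward to this, I think we need to keep discussing how humans and AI can work together.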
Conclusion

In this article I summarized part of what I learned from a survey of voice agents, which was also the theme of this part-time job.

Through this job I was able to work with both voice and agent technologies for the first time, and my interest in both has grown.

Going forward, I would like to explore building multi-agent systems with Speech-to-Speech models, and the potential of avatar-based voice agents that use custom voices.

I would like to sincerely thank everyone at Insight Edge who helped me during this job, and especially Suga-san, who supported my work day to day as my tutor.

Thank you for reading this far!!

References

Google Cloud, "Announcing the Agent2Agent Protocol (A2A): A new era of agent interoperability", https://cloud.google.com/blog/ja/products/ai-machine-learning/a2a-a-new-era-of-agent-interoperability
Agent2Agent (A2A) Protocol, Home, https://a2aproject.github.io/A2A/latest/#why-a2a-matters
Google AI Developers, "Speech generation (text-to-speech)", https://ai.google.dev/gemini-api/docs/speech-generation?hl=ja
AWS, "Amazon Nova Documentation", https://docs.aws.amazon.com/nova/
taku_sid, introductory article on Amazon Nova Sonic (in Japanese), https://zenn.dev/taku_sid/articles/20250413_nova_sonic
OpenAI, "Voice agents", https://platform.openai.com/docs/guides/voice-agents?voice-agent-architecture=speech-to-speech
OpenAI Agents SDK, https://openai.github.io/openai-agents-python/ja/
takemo101, SonicMoov Co., Ltd., "I tried generating manzai audio with Gemini API TTS (Text-to-Speech)", https://zenn.dev/sonicmoov/articles/bd862039bcba46