[{"data":1,"prerenderedAt":1110},["ShallowReactive",2],{"article-alternates":3,"article-\u002Fru\u002Fai\u002Fmulti-agent-orchestration-llm":13},{"i18nKey":4,"paths":5},"ai-008-2026-05",{"de":6,"en":7,"es":8,"fr":9,"it":10,"ru":11,"tr":12},"\u002Fde\u002Fai\u002Fmulti-agent-orchestrierung-llm-aufrufe-systeme","\u002Fen\u002Fai\u002Fmulti-agent-orchestration-single-llm-call","\u002Fes\u002Fai\u002Forquestracion-multi-agente","\u002Ffr\u002Fai\u002Fmulti-agent-orchestration-systemes","\u002Fit\u002Fai\u002Fmulti-agent-orchestration-sistemi","\u002Fru\u002Fai\u002Fmulti-agent-orchestration-llm","\u002Ftr\u002Fai\u002Fmulti-agent-orchestration-tek-llm-cagrisindan-sistemlere",{"_path":11,"_dir":14,"_draft":15,"_partial":15,"_locale":16,"title":17,"description":18,"publishedAt":19,"modifiedAt":19,"category":14,"i18nKey":4,"tags":20,"readingTime":26,"author":27,"body":28,"_type":1104,"_id":1105,"_source":1106,"_file":1107,"_stem":1108,"_extension":1109},"ai",false,"","Multi-Agent Orchestration: От одного вызова LLM к системам","Agent SDK'и, tool use и параллельные\u002Fпоследовательные топологии — трансформация LLM в production-систему с анализом latency, стоимости и надежности.","2026-05-23",[21,22,23,24,25],"multi-agent","llm-orchestration","tool-use","agent-sdk","ai-engineering",8,"Roibase",{"type":29,"children":30,"toc":1094},"root",[31,39,46,51,56,61,231,236,242,262,272,289,305,312,419,424,430,442,667,672,682,859,864,870,875,883,923,931,962,972,982,988,993,1001,1006,1018,1024,1036,1041,1074,1079,1083,1088],{"type":32,"tag":33,"props":34,"children":35},"element","p",{},[36],{"type":37,"value":38},"text","В 2024 году «AI-ассистент» означал один цикл prompt-response. В 2026-м production выглядит иначе: параллельные agent mesh'и, последовательные orchestration pipeline'ы, агенты, подключённые к внешним системам через tool use. На смену единому LLM-вызову приходит система агентов, обменивающихся сообщениями, и она меняет баланс reliability, cost и latency. 
Multi-agent orchestration — это архитектурный слой, превращающий LLM в компонент production infrastructure.",{"type":32,"tag":40,"props":41,"children":43},"h2",{"id":42},"agent-sdkи-и-слой-tool-use",[44],{"type":37,"value":45},"Agent SDK'и и слой Tool Use",{"type":32,"tag":33,"props":47,"children":48},{},[49],{"type":37,"value":50},"Agent framework'и (LangGraph, AutoGen, CrewAI) дают LLM возможность «вызывать функции». Tool use работает так: модель формирует function call в соответствии с JSON schema, а orchestrator выполняет функцию и возвращает результат обратно в prompt. OpenAI function calling, tool use API Claude от Anthropic, function declaration Gemini от Google — все построены на одном принципе: LLM не может выполнять детерминированный код, но может указать, какую функцию с какими параметрами вызвать.",{"type":32,"tag":33,"props":52,"children":53},{},[54],{"type":37,"value":55},"SDK'и управляют этим циклом: приходит query пользователя, модель говорит «обратись к weather API с параметром city=Istanbul», orchestrator вызывает API, возвращает результат в prompt, модель создаёт final output. Это три последовательных шага (LLM-вызов, вызов API, второй LLM-вызов), и latency каждого из них суммируется. В production цепь tool call'ов растёт до 5–7 шагов, каждый добавляет 200–800ms — итого 1–5 секунд response time. 
В multi-agent цель — разбить эту latency параллелизацией и кешированием.",{"type":32,"tag":33,"props":57,"children":58},{},[59],{"type":37,"value":60},"Пример определения tool'а:",{"type":32,"tag":62,"props":63,"children":67},"pre",{"code":64,"language":65,"meta":16,"className":66,"style":16},"tools = [\n    {\n        \"name\": \"query_analytics\",\n        \"description\": \"Получить метрику из BigQuery\",\n        \"parameters\": {\n            \"metric\": \"string (revenue|sessions|conversions)\",\n            \"date_range\": \"string (7d|30d|90d)\"\n        }\n    }\n]\n","python","language-python shiki shiki-themes github-dark",[68],{"type":32,"tag":69,"props":70,"children":71},"code",{"__ignoreMap":16},[72,95,104,129,151,165,187,205,213,222],{"type":32,"tag":73,"props":74,"children":77},"span",{"class":75,"line":76},"line",1,[78,84,90],{"type":32,"tag":73,"props":79,"children":81},{"style":80},"--shiki-default:#E1E4E8",[82],{"type":37,"value":83},"tools ",{"type":32,"tag":73,"props":85,"children":87},{"style":86},"--shiki-default:#F97583",[88],{"type":37,"value":89},"=",{"type":32,"tag":73,"props":91,"children":92},{"style":80},[93],{"type":37,"value":94}," [\n",{"type":32,"tag":73,"props":96,"children":98},{"class":75,"line":97},2,[99],{"type":32,"tag":73,"props":100,"children":101},{"style":80},[102],{"type":37,"value":103},"    {\n",{"type":32,"tag":73,"props":105,"children":107},{"class":75,"line":106},3,[108,114,119,124],{"type":32,"tag":73,"props":109,"children":111},{"style":110},"--shiki-default:#9ECBFF",[112],{"type":37,"value":113},"        \"name\"",{"type":32,"tag":73,"props":115,"children":116},{"style":80},[117],{"type":37,"value":118},": 
",{"type":32,"tag":73,"props":120,"children":121},{"style":110},[122],{"type":37,"value":123},"\"query_analytics\"",{"type":32,"tag":73,"props":125,"children":126},{"style":80},[127],{"type":37,"value":128},",\n",{"type":32,"tag":73,"props":130,"children":132},{"class":75,"line":131},4,[133,138,142,147],{"type":32,"tag":73,"props":134,"children":135},{"style":110},[136],{"type":37,"value":137},"        \"description\"",{"type":32,"tag":73,"props":139,"children":140},{"style":80},[141],{"type":37,"value":118},{"type":32,"tag":73,"props":143,"children":144},{"style":110},[145],{"type":37,"value":146},"\"Получить метрику из BigQuery\"",{"type":32,"tag":73,"props":148,"children":149},{"style":80},[150],{"type":37,"value":128},{"type":32,"tag":73,"props":152,"children":154},{"class":75,"line":153},5,[155,160],{"type":32,"tag":73,"props":156,"children":157},{"style":110},[158],{"type":37,"value":159},"        \"parameters\"",{"type":32,"tag":73,"props":161,"children":162},{"style":80},[163],{"type":37,"value":164},": {\n",{"type":32,"tag":73,"props":166,"children":168},{"class":75,"line":167},6,[169,174,178,183],{"type":32,"tag":73,"props":170,"children":171},{"style":110},[172],{"type":37,"value":173},"            \"metric\"",{"type":32,"tag":73,"props":175,"children":176},{"style":80},[177],{"type":37,"value":118},{"type":32,"tag":73,"props":179,"children":180},{"style":110},[181],{"type":37,"value":182},"\"string (revenue|sessions|conversions)\"",{"type":32,"tag":73,"props":184,"children":185},{"style":80},[186],{"type":37,"value":128},{"type":32,"tag":73,"props":188,"children":190},{"class":75,"line":189},7,[191,196,200],{"type":32,"tag":73,"props":192,"children":193},{"style":110},[194],{"type":37,"value":195},"            \"date_range\"",{"type":32,"tag":73,"props":197,"children":198},{"style":80},[199],{"type":37,"value":118},{"type":32,"tag":73,"props":201,"children":202},{"style":110},[203],{"type":37,"value":204},"\"string 
(7d|30d|90d)\"\n",{"type":32,"tag":73,"props":206,"children":207},{"class":75,"line":26},[208],{"type":32,"tag":73,"props":209,"children":210},{"style":80},[211],{"type":37,"value":212},"        }\n",{"type":32,"tag":73,"props":214,"children":216},{"class":75,"line":215},9,[217],{"type":32,"tag":73,"props":218,"children":219},{"style":80},[220],{"type":37,"value":221},"    }\n",{"type":32,"tag":73,"props":223,"children":225},{"class":75,"line":224},10,[226],{"type":32,"tag":73,"props":227,"children":228},{"style":80},[229],{"type":37,"value":230},"]\n",{"type":32,"tag":33,"props":232,"children":233},{},[234],{"type":37,"value":235},"Если модель решает использовать tool, orchestrator вызывает BigQuery client, результат append'ит в prompt, модель выполняет финальный синтез. Сила tool use: LLM получает доступ во внешний мир, не жертвуя детерминизмом.",{"type":32,"tag":40,"props":237,"children":239},{"id":238},"параллельные-и-последовательные-топологии-агентов",[240],{"type":37,"value":241},"Параллельные и последовательные топологии агентов",{"type":32,"tag":33,"props":243,"children":244},{},[245,247,253,255,260],{"type":37,"value":246},"Один агент = последовательная обработка. Multi-agent = гибрид параллели и последовательности. Два основных паттерна: ",{"type":32,"tag":248,"props":249,"children":250},"strong",{},[251],{"type":37,"value":252},"scatter-gather",{"type":37,"value":254}," и ",{"type":32,"tag":248,"props":256,"children":257},{},[258],{"type":37,"value":259},"pipeline",{"type":37,"value":261},".",{"type":32,"tag":33,"props":263,"children":264},{},[265,270],{"type":32,"tag":248,"props":266,"children":267},{},[268],{"type":37,"value":269},"Scatter-gather:",{"type":37,"value":271}," главный orchestrator разбивает задачу на 3 подагента, каждый одновременно работает с разным tool'ом, результаты сходятся в центральном агенте. 
Пример: «Проанализируй производительность кампании за прошлый месяц» → agent_1 к Google Ads API, agent_2 к Meta Ads API, agent_3 к BigQuery, все параллельно. Orchestrator получает 3 ответа, синтезирует, выдаёт финальный отчёт. Latency: max(agent_1, agent_2, agent_3) + synthesis latency. Если бы было последовательно: agent_1 + agent_2 + agent_3 + synthesis. Если каждый агент отвечает за 800ms, а синтез занимает 300ms, то вместо 3×800ms + 300ms = 2.7s получаем 800ms + 300ms = 1.1s.",{"type":32,"tag":33,"props":273,"children":274},{},[275,280,282,287],{"type":32,"tag":248,"props":276,"children":277},{},[278],{"type":37,"value":279},"Pipeline:",{"type":37,"value":281}," output агента A становится input'ом агента B. Пример: (1) query planner агент пишет SQL → (2) execution агент выполняет SQL → (3) visualization агент создаёт spec графика. Каждый шаг — dependency следующего. Latency последовательная, но ",{"type":32,"tag":248,"props":283,"children":284},{},[285],{"type":37,"value":286},"каждый агент специализирован",{"type":37,"value":288},": query planner работает на маленькой модели (GPT-4o-mini, 50ms) и не требует execution logic, а visualization агент может работать на Gemini Flash. Три маленькие модели вместо одной большой в ряде случаев оказываются дешевле и быстрее.",{"type":32,"tag":33,"props":290,"children":291},{},[292,294,303],{"type":37,"value":293},"В сервисах Roibase ",{"type":32,"tag":295,"props":296,"children":300},"a",{"href":297,"rel":298},"https:\u002F\u002Fwww.roibase.com.tr\u002Fru\u002Ffirstparty",[299],"nofollow",[301],{"type":37,"value":302},"First-Party Veri & Ölçüm Mimarisi",{"type":37,"value":304}," multi-agent orchestration'ы используются в attribution pipeline'ах: один агент парсит raw event, один привязывает к session, один мэпит revenue, финальный считает cross-channel attribution. 
Pipeline topology = детерминированные шаги, каждый со своим набором tool'ов.",{"type":32,"tag":306,"props":307,"children":309},"h3",{"id":308},"параллель-vs-последовательность-tradeoff",[310],{"type":37,"value":311},"Параллель vs последовательность: tradeoff",{"type":32,"tag":313,"props":314,"children":315},"table",{},[316,345],{"type":32,"tag":317,"props":318,"children":319},"thead",{},[320],{"type":32,"tag":321,"props":322,"children":323},"tr",{},[324,330,335,340],{"type":32,"tag":325,"props":326,"children":327},"th",{},[328],{"type":37,"value":329},"Топология",{"type":32,"tag":325,"props":331,"children":332},{},[333],{"type":37,"value":334},"Latency",{"type":32,"tag":325,"props":336,"children":337},{},[338],{"type":37,"value":339},"Cost",{"type":32,"tag":325,"props":341,"children":342},{},[343],{"type":37,"value":344},"Использование",{"type":32,"tag":346,"props":347,"children":348},"tbody",{},[349,373,396],{"type":32,"tag":321,"props":350,"children":351},{},[352,358,363,368],{"type":32,"tag":353,"props":354,"children":355},"td",{},[356],{"type":37,"value":357},"Параллель (scatter-gather)",{"type":32,"tag":353,"props":359,"children":360},{},[361],{"type":37,"value":362},"Низкая (макс операция)",{"type":32,"tag":353,"props":364,"children":365},{},[366],{"type":37,"value":367},"Высокая (N агент × LLM call)",{"type":32,"tag":353,"props":369,"children":370},{},[371],{"type":37,"value":372},"Независимые запросы (pull данных из разных источников)",{"type":32,"tag":321,"props":374,"children":375},{},[376,381,386,391],{"type":32,"tag":353,"props":377,"children":378},{},[379],{"type":37,"value":380},"Последовательность (pipeline)",{"type":32,"tag":353,"props":382,"children":383},{},[384],{"type":37,"value":385},"Высокая (сумма операций)",{"type":32,"tag":353,"props":387,"children":388},{},[389],{"type":37,"value":390},"Средняя (каждый агент может быть маленькой моделью)",{"type":32,"tag":353,"props":392,"children":393},{},[394],{"type":37,"value":395},"Зависимые операции 
(parse → enrich → analyze)",{"type":32,"tag":321,"props":397,"children":398},{},[399,404,409,414],{"type":32,"tag":353,"props":400,"children":401},{},[402],{"type":37,"value":403},"Гибрид (параллель → merge → последовательность)",{"type":32,"tag":353,"props":405,"children":406},{},[407],{"type":37,"value":408},"Средняя",{"type":32,"tag":353,"props":410,"children":411},{},[412],{"type":37,"value":413},"Средняя-Высокая",{"type":32,"tag":353,"props":415,"children":416},{},[417],{"type":37,"value":418},"Сложные задачи (параллельный сбор данных, синтез через pipeline)",{"type":32,"tag":33,"props":420,"children":421},{},[422],{"type":37,"value":423},"В production параллельные вызовы в scatter-gather ограничивают с помощью concurrency limit (например: max 5 одновременных LLM call'ов), чтобы не упереться в rate limit. В pipeline кешируют intermediate результаты: если output агента A валиден 10 минут, то при повторе того же query агент B стартует с cached output, а не с нуля.",{"type":32,"tag":40,"props":425,"children":427},{"id":426},"ответственность-orchestratorа-routing-и-error-handling",[428],{"type":37,"value":429},"Ответственность Orchestrator'а: routing и error handling",{"type":32,"tag":33,"props":431,"children":432},{},[433,435,440],{"type":37,"value":434},"Orchestrator не просто триггерит агентов, а ",{"type":32,"tag":248,"props":436,"children":437},{},[438],{"type":37,"value":439},"решает, кому какую задачу дать",{"type":37,"value":441},". В LangGraph это называется «supervisor agent»: он классифицирует query и выполняет routing. 
Логика:",{"type":32,"tag":62,"props":443,"children":445},{"code":444,"language":65,"meta":16,"className":66,"style":16},"def route_query(user_query: str) -> str:\n    # LLM-based router (маленькая модель, быстро)\n    classification = llm.classify(user_query, categories=[\"data_query\", \"content_gen\", \"code_review\"])\n    \n    if classification == \"data_query\":\n        return \"analytics_agent\"\n    elif classification == \"content_gen\":\n        return \"writer_agent\"\n    else:\n        return \"code_agent\"\n",[446],{"type":32,"tag":69,"props":447,"children":448},{"__ignoreMap":16},[449,488,497,558,566,593,606,631,643,655],{"type":32,"tag":73,"props":450,"children":451},{"class":75,"line":76},[452,457,463,468,474,479,483],{"type":32,"tag":73,"props":453,"children":454},{"style":86},[455],{"type":37,"value":456},"def",{"type":32,"tag":73,"props":458,"children":460},{"style":459},"--shiki-default:#B392F0",[461],{"type":37,"value":462}," route_query",{"type":32,"tag":73,"props":464,"children":465},{"style":80},[466],{"type":37,"value":467},"(user_query: ",{"type":32,"tag":73,"props":469,"children":471},{"style":470},"--shiki-default:#79B8FF",[472],{"type":37,"value":473},"str",{"type":32,"tag":73,"props":475,"children":476},{"style":80},[477],{"type":37,"value":478},") -> ",{"type":32,"tag":73,"props":480,"children":481},{"style":470},[482],{"type":37,"value":473},{"type":32,"tag":73,"props":484,"children":485},{"style":80},[486],{"type":37,"value":487},":\n",{"type":32,"tag":73,"props":489,"children":490},{"class":75,"line":97},[491],{"type":32,"tag":73,"props":492,"children":494},{"style":493},"--shiki-default:#6A737D",[495],{"type":37,"value":496},"    # LLM-based router (маленькая модель, быстро)\n",{"type":32,"tag":73,"props":498,"children":499},{"class":75,"line":106},[500,505,509,514,520,524,529,534,539,544,548,553],{"type":32,"tag":73,"props":501,"children":502},{"style":80},[503],{"type":37,"value":504},"    classification 
",{"type":32,"tag":73,"props":506,"children":507},{"style":86},[508],{"type":37,"value":89},{"type":32,"tag":73,"props":510,"children":511},{"style":80},[512],{"type":37,"value":513}," llm.classify(user_query, ",{"type":32,"tag":73,"props":515,"children":517},{"style":516},"--shiki-default:#FFAB70",[518],{"type":37,"value":519},"categories",{"type":32,"tag":73,"props":521,"children":522},{"style":86},[523],{"type":37,"value":89},{"type":32,"tag":73,"props":525,"children":526},{"style":80},[527],{"type":37,"value":528},"[",{"type":32,"tag":73,"props":530,"children":531},{"style":110},[532],{"type":37,"value":533},"\"data_query\"",{"type":32,"tag":73,"props":535,"children":536},{"style":80},[537],{"type":37,"value":538},", ",{"type":32,"tag":73,"props":540,"children":541},{"style":110},[542],{"type":37,"value":543},"\"content_gen\"",{"type":32,"tag":73,"props":545,"children":546},{"style":80},[547],{"type":37,"value":538},{"type":32,"tag":73,"props":549,"children":550},{"style":110},[551],{"type":37,"value":552},"\"code_review\"",{"type":32,"tag":73,"props":554,"children":555},{"style":80},[556],{"type":37,"value":557},"])\n",{"type":32,"tag":73,"props":559,"children":560},{"class":75,"line":131},[561],{"type":32,"tag":73,"props":562,"children":563},{"style":80},[564],{"type":37,"value":565},"    \n",{"type":32,"tag":73,"props":567,"children":568},{"class":75,"line":153},[569,574,579,584,589],{"type":32,"tag":73,"props":570,"children":571},{"style":86},[572],{"type":37,"value":573},"    if",{"type":32,"tag":73,"props":575,"children":576},{"style":80},[577],{"type":37,"value":578}," classification ",{"type":32,"tag":73,"props":580,"children":581},{"style":86},[582],{"type":37,"value":583},"==",{"type":32,"tag":73,"props":585,"children":586},{"style":110},[587],{"type":37,"value":588}," 
\"data_query\"",{"type":32,"tag":73,"props":590,"children":591},{"style":80},[592],{"type":37,"value":487},{"type":32,"tag":73,"props":594,"children":595},{"class":75,"line":167},[596,601],{"type":32,"tag":73,"props":597,"children":598},{"style":86},[599],{"type":37,"value":600},"        return",{"type":32,"tag":73,"props":602,"children":603},{"style":110},[604],{"type":37,"value":605}," \"analytics_agent\"\n",{"type":32,"tag":73,"props":607,"children":608},{"class":75,"line":189},[609,614,618,622,627],{"type":32,"tag":73,"props":610,"children":611},{"style":86},[612],{"type":37,"value":613},"    elif",{"type":32,"tag":73,"props":615,"children":616},{"style":80},[617],{"type":37,"value":578},{"type":32,"tag":73,"props":619,"children":620},{"style":86},[621],{"type":37,"value":583},{"type":32,"tag":73,"props":623,"children":624},{"style":110},[625],{"type":37,"value":626}," \"content_gen\"",{"type":32,"tag":73,"props":628,"children":629},{"style":80},[630],{"type":37,"value":487},{"type":32,"tag":73,"props":632,"children":633},{"class":75,"line":26},[634,638],{"type":32,"tag":73,"props":635,"children":636},{"style":86},[637],{"type":37,"value":600},{"type":32,"tag":73,"props":639,"children":640},{"style":110},[641],{"type":37,"value":642}," \"writer_agent\"\n",{"type":32,"tag":73,"props":644,"children":645},{"class":75,"line":215},[646,651],{"type":32,"tag":73,"props":647,"children":648},{"style":86},[649],{"type":37,"value":650},"    else",{"type":32,"tag":73,"props":652,"children":653},{"style":80},[654],{"type":37,"value":487},{"type":32,"tag":73,"props":656,"children":657},{"class":75,"line":224},[658,662],{"type":32,"tag":73,"props":659,"children":660},{"style":86},[661],{"type":37,"value":600},{"type":32,"tag":73,"props":663,"children":664},{"style":110},[665],{"type":37,"value":666}," \"code_agent\"\n",{"type":32,"tag":33,"props":668,"children":669},{},[670],{"type":37,"value":671},"Router агент обычно GPT-4o-mini или Claude Haiku — быстрые, дешёвые модели. 
Добавляют 50–100ms overhead, но предотвращают ненужное использование больших моделей. Если пользователь говорит «суммируй производительность кампании» — идёт к analytics_agent (BigQuery tool use), если «напиши блог» — к writer_agent (web search + writing LLM).",{"type":32,"tag":33,"props":673,"children":674},{},[675,680],{"type":32,"tag":248,"props":676,"children":677},{},[678],{"type":37,"value":679},"Error handling в multi-agent критичен.",{"type":37,"value":681}," Если один агент hallucinate'ит и выдаёт неправильный output, агент_2 работает с этой ошибкой и cascade failure распространяется. Orchestrator должен валидировать output каждого агента:",{"type":32,"tag":62,"props":683,"children":685},{"code":684,"language":65,"meta":16,"className":66,"style":16},"def validate_agent_output(output: dict, schema: dict) -> bool:\n    # JSON schema validation\n    if not matches_schema(output, schema):\n        raise AgentOutputError(\"Output агента не соответствует schema\")\n    \n    # Семантическая проверка (опционально, дорого)\n    if confidence_score(output) \u003C 0.7:\n        return False  # retry или fallback\n    \n    return True\n",[686],{"type":32,"tag":69,"props":687,"children":688},{"__ignoreMap":16},[689,733,741,758,781,788,796,822,839,846],{"type":32,"tag":73,"props":690,"children":691},{"class":75,"line":76},[692,696,701,706,711,716,720,724,729],{"type":32,"tag":73,"props":693,"children":694},{"style":86},[695],{"type":37,"value":456},{"type":32,"tag":73,"props":697,"children":698},{"style":459},[699],{"type":37,"value":700}," validate_agent_output",{"type":32,"tag":73,"props":702,"children":703},{"style":80},[704],{"type":37,"value":705},"(output: ",{"type":32,"tag":73,"props":707,"children":708},{"style":470},[709],{"type":37,"value":710},"dict",{"type":32,"tag":73,"props":712,"children":713},{"style":80},[714],{"type":37,"value":715},", schema: 
",{"type":32,"tag":73,"props":717,"children":718},{"style":470},[719],{"type":37,"value":710},{"type":32,"tag":73,"props":721,"children":722},{"style":80},[723],{"type":37,"value":478},{"type":32,"tag":73,"props":725,"children":726},{"style":470},[727],{"type":37,"value":728},"bool",{"type":32,"tag":73,"props":730,"children":731},{"style":80},[732],{"type":37,"value":487},{"type":32,"tag":73,"props":734,"children":735},{"class":75,"line":97},[736],{"type":32,"tag":73,"props":737,"children":738},{"style":493},[739],{"type":37,"value":740},"    # JSON schema validation\n",{"type":32,"tag":73,"props":742,"children":743},{"class":75,"line":106},[744,748,753],{"type":32,"tag":73,"props":745,"children":746},{"style":86},[747],{"type":37,"value":573},{"type":32,"tag":73,"props":749,"children":750},{"style":86},[751],{"type":37,"value":752}," not",{"type":32,"tag":73,"props":754,"children":755},{"style":80},[756],{"type":37,"value":757}," matches_schema(output, schema):\n",{"type":32,"tag":73,"props":759,"children":760},{"class":75,"line":131},[761,766,771,776],{"type":32,"tag":73,"props":762,"children":763},{"style":86},[764],{"type":37,"value":765},"        raise",{"type":32,"tag":73,"props":767,"children":768},{"style":80},[769],{"type":37,"value":770}," AgentOutputError(",{"type":32,"tag":73,"props":772,"children":773},{"style":110},[774],{"type":37,"value":775},"\"Output агента не соответствует schema\"",{"type":32,"tag":73,"props":777,"children":778},{"style":80},[779],{"type":37,"value":780},")\n",{"type":32,"tag":73,"props":782,"children":783},{"class":75,"line":153},[784],{"type":32,"tag":73,"props":785,"children":786},{"style":80},[787],{"type":37,"value":565},{"type":32,"tag":73,"props":789,"children":790},{"class":75,"line":167},[791],{"type":32,"tag":73,"props":792,"children":793},{"style":493},[794],{"type":37,"value":795},"    # Семантическая проверка (опционально, 
дорого)\n",{"type":32,"tag":73,"props":797,"children":798},{"class":75,"line":189},[799,803,808,813,818],{"type":32,"tag":73,"props":800,"children":801},{"style":86},[802],{"type":37,"value":573},{"type":32,"tag":73,"props":804,"children":805},{"style":80},[806],{"type":37,"value":807}," confidence_score(output) ",{"type":32,"tag":73,"props":809,"children":810},{"style":86},[811],{"type":37,"value":812},"\u003C",{"type":32,"tag":73,"props":814,"children":815},{"style":470},[816],{"type":37,"value":817}," 0.7",{"type":32,"tag":73,"props":819,"children":820},{"style":80},[821],{"type":37,"value":487},{"type":32,"tag":73,"props":823,"children":824},{"class":75,"line":26},[825,829,834],{"type":32,"tag":73,"props":826,"children":827},{"style":86},[828],{"type":37,"value":600},{"type":32,"tag":73,"props":830,"children":831},{"style":470},[832],{"type":37,"value":833}," False",{"type":32,"tag":73,"props":835,"children":836},{"style":493},[837],{"type":37,"value":838},"  # retry или fallback\n",{"type":32,"tag":73,"props":840,"children":841},{"class":75,"line":215},[842],{"type":32,"tag":73,"props":843,"children":844},{"style":80},[845],{"type":37,"value":565},{"type":32,"tag":73,"props":847,"children":848},{"class":75,"line":224},[849,854],{"type":32,"tag":73,"props":850,"children":851},{"style":86},[852],{"type":37,"value":853},"    return",{"type":32,"tag":73,"props":855,"children":856},{"style":470},[857],{"type":37,"value":858}," True\n",{"type":32,"tag":33,"props":860,"children":861},{},[862],{"type":37,"value":863},"Если агент_1 неудачен, orchestrator идёт по fallback chain: сначала retry (1×), потом альтернативный агент (более крупная модель), потом human-in-the-loop. 
Без этой логики multi-agent ненадёжен.",{"type":32,"tag":40,"props":865,"children":867},{"id":866},"latency-и-cost-benchmarkи",[868],{"type":37,"value":869},"Latency и Cost: benchmark'и",{"type":32,"tag":33,"props":871,"children":872},{},[873],{"type":37,"value":874},"Сценарий: «Проанализируй тренд дохода за 30 дней, суммируй производительность кампании, подготовь summary email для CEO» — 3 независимых задачи.",{"type":32,"tag":33,"props":876,"children":877},{},[878],{"type":32,"tag":248,"props":879,"children":880},{},[881],{"type":37,"value":882},"Один агент (GPT-4, последовательно):",{"type":32,"tag":884,"props":885,"children":886},"ul",{},[887,893,898,903,913],{"type":32,"tag":888,"props":889,"children":890},"li",{},[891],{"type":37,"value":892},"Query BigQuery → 800ms (LLM + API)",{"type":32,"tag":888,"props":894,"children":895},{},[896],{"type":37,"value":897},"Query ad platform'ы → 900ms",{"type":32,"tag":888,"props":899,"children":900},{},[901],{"type":37,"value":902},"Generate email → 600ms",{"type":32,"tag":888,"props":904,"children":905},{},[906,911],{"type":32,"tag":248,"props":907,"children":908},{},[909],{"type":37,"value":910},"Итого:",{"type":37,"value":912}," 2300ms",{"type":32,"tag":888,"props":914,"children":915},{},[916,921],{"type":32,"tag":248,"props":917,"children":918},{},[919],{"type":37,"value":920},"Cost:",{"type":37,"value":922}," 3 turn × $0.03\u002F1K token = ~$0.09 (среднее mix input\u002Foutput)",{"type":32,"tag":33,"props":924,"children":925},{},[926],{"type":32,"tag":248,"props":927,"children":928},{},[929],{"type":37,"value":930},"Multi-agent (scatter-gather + pipeline):",{"type":32,"tag":884,"props":932,"children":933},{},[934,939,944,953],{"type":32,"tag":888,"props":935,"children":936},{},[937],{"type":37,"value":938},"Agent_1, 2, 3 параллельно (BigQuery, ads, email prep) → max 900ms",{"type":32,"tag":888,"props":940,"children":941},{},[942],{"type":37,"value":943},"Orchestrator merge + synthesis → 
400ms",{"type":32,"tag":888,"props":945,"children":946},{},[947,951],{"type":32,"tag":248,"props":948,"children":949},{},[950],{"type":37,"value":910},{"type":37,"value":952}," 1300ms",{"type":32,"tag":888,"props":954,"children":955},{},[956,960],{"type":32,"tag":248,"props":957,"children":958},{},[959],{"type":37,"value":920},{"type":37,"value":961}," 3 агента × $0.02 (маленькие модели) + synthesis $0.03 = ~$0.09 (то же, но снижается при оптимизации моделей)",{"type":32,"tag":33,"props":963,"children":964},{},[965,970],{"type":32,"tag":248,"props":966,"children":967},{},[968],{"type":37,"value":969},"Выигрыш:",{"type":37,"value":971}," latency снижается на 43% (с 2300ms до 1300ms). Cost тот же, но при оптимизации моделей (agent_1 → Gemini Flash, agent_2 → Claude Haiku, orchestrator → GPT-4o-mini) падает до $0.05.",{"type":32,"tag":33,"props":973,"children":974},{},[975,980],{"type":32,"tag":248,"props":976,"children":977},{},[978],{"type":37,"value":979},"Но:",{"type":37,"value":981}," параллельные агенты = параллельное потребление rate limit. Если tier limit у OpenAI равен 500 RPM, а один пользовательский запрос порождает 10 параллельных LLM-вызовов, лимита хватит примерно на 50 запросов в минуту; при одном вызове на запрос его хватило бы на 500. В production этот tradeoff управляется через queue и кеш.",{"type":32,"tag":40,"props":983,"children":985},{"id":984},"observability-и-debug",[986],{"type":37,"value":987},"Observability и debug",{"type":32,"tag":33,"props":989,"children":990},{},[991],{"type":37,"value":992},"В multi-agent системе ответ на вопрос «где произошла ошибка?» сложен. Инструменты типа LangSmith, Helicone, Arize Phoenix визуализируют agent trace: какой агент когда какой tool вызвал, с каким prompt, что вернул, где был retry. 
Пример trace:",{"type":32,"tag":62,"props":994,"children":996},{"code":995},"orchestrator → classify_query (50ms, GPT-4o-mini) → \"data_query\"\n→ analytics_agent → query_bigquery (800ms, tool_call) → success\n→ writer_agent → generate_summary (600ms, GPT-4) → success\n→ orchestrator → merge_results (200ms) → final_output\n",[997],{"type":32,"tag":69,"props":998,"children":999},{"__ignoreMap":16},[1000],{"type":37,"value":995},{"type":32,"tag":33,"props":1002,"children":1003},{},[1004],{"type":37,"value":1005},"На каждом шаге логируются token count, latency, cost. Без этой телеметрии multi-agent невозможно debug'ить. Если tool call агента A timeout'ится, это видно в trace, и можно добавить retry logic.",{"type":32,"tag":33,"props":1007,"children":1008},{},[1009,1011,1016],{"type":37,"value":1010},"Ещё одна метрика: ",{"type":32,"tag":248,"props":1012,"children":1013},{},[1014],{"type":37,"value":1015},"agent utilization",{"type":37,"value":1017},". Если определили 5 агентов, но 80% пользовательских запросов идёт к одному агенту, стоит перепроверить routing logic. Измеряют accuracy классификации router агента: на основе user feedback собирают labelled dataset и fine-tune'ят router (lightweight classifier вместо few-shot prompt'а).",{"type":32,"tag":40,"props":1019,"children":1021},{"id":1020},"ограничения-multi-agent",[1022],{"type":37,"value":1023},"Ограничения Multi-Agent",{"type":32,"tag":33,"props":1025,"children":1026},{},[1027,1029,1034],{"type":37,"value":1028},"Multi-agent не решает все проблемы. ",{"type":32,"tag":248,"props":1030,"children":1031},{},[1032],{"type":37,"value":1033},"Coordination overhead",{"type":37,"value":1035}," существует: обмен сообщениями между агентами, orchestration logic, error handling — всё добавляет latency. Простой query, завершаемый одним агентом за 1 секунду, может занять 1.5 секунды в multi-agent системе (orchestrator + routing + merge). 
Архитектурная сложность растёт: кодовая база больше, тестирование сложнее, deployment требует большей осторожности.",{"type":32,"tag":33,"props":1037,"children":1038},{},[1039],{"type":37,"value":1040},"Multi-agent имеет смысл, когда:",{"type":32,"tag":884,"props":1042,"children":1043},{},[1044,1054,1064],{"type":32,"tag":888,"props":1045,"children":1046},{},[1047,1052],{"type":32,"tag":248,"props":1048,"children":1049},{},[1050],{"type":37,"value":1051},"Нужен параллельный pull данных:",{"type":37,"value":1053}," если данные приходят из 5 разных API, scatter-gather даёт выигрыш",{"type":32,"tag":888,"props":1055,"children":1056},{},[1057,1062],{"type":32,"tag":248,"props":1058,"children":1059},{},[1060],{"type":37,"value":1061},"Оптимальны специализированные модели:",{"type":37,"value":1063}," маленькая для query planning, большая для code generation, pipeline topology снижает cost",{"type":32,"tag":888,"props":1065,"children":1066},{},[1067,1072],{"type":32,"tag":248,"props":1068,"children":1069},{},[1070],{"type":37,"value":1071},"Long-running task:",{"type":37,"value":1073}," агент_1 инициирует, агент_2 async мониторит, агент_3 завершает, orchestrator уведомляет; это event-driven схема вместо sync LLM call",{"type":32,"tag":33,"props":1075,"children":1076},{},[1077],{"type":37,"value":1078},"На коротких, частых, простых запросах один агент + кеш лучше. Multi-agent создаёт value, когда сложную задачу можно декомпозировать и оптимизировать по частям.",{"type":32,"tag":1080,"props":1081,"children":1082},"hr",{},[],{"type":32,"tag":33,"props":1084,"children":1085},{},[1086],{"type":37,"value":1087},"Multi-agent orchestration трансформирует LLM из stateless function call в stateful, observable, scalable систему. Параллельная топология сокращает latency, pipeline снижает cost, orchestrator обеспечивает reliability. В production начни со scatter-gather, мониторь rate limit и cost, переходи на pipeline, если нужно. Логируй agent trace, наслаивай error handling, тестируй routing logic. 
Multi-agent — это переход от LLM engineering к LLM infrastructure.",{"type":32,"tag":1089,"props":1090,"children":1091},"style",{},[1092],{"type":37,"value":1093},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}",{"title":16,"searchDepth":106,"depth":106,"links":1095},[1096,1097,1100,1101,1102,1103],{"id":42,"depth":97,"text":45},{"id":238,"depth":97,"text":241,"children":1098},[1099],{"id":308,"depth":106,"text":311},{"id":426,"depth":97,"text":429},{"id":866,"depth":97,"text":869},{"id":984,"depth":97,"text":987},{"id":1020,"depth":97,"text":1023},"markdown","content:ru:ai:multi-agent-orchestration-llm.md","content","ru\u002Fai\u002Fmulti-agent-orchestration-llm.md","ru\u002Fai\u002Fmulti-agent-orchestration-llm","md",1780898616973]