[{"data":1,"prerenderedAt":2637},["ShallowReactive",2],{"article-alternates":3,"article-\u002Ffr\u002Fai\u002Fversionamento-prompt-ab-test":13},{"i18nKey":4,"paths":5},"ai-004-2026-05",{"de":6,"en":7,"es":8,"fr":9,"it":10,"ru":11,"tr":12},"\u002Fde\u002Fai\u002Fprompt-versionierung-llm-evaluation","\u002Fen\u002Fai\u002Fllm-ops-prompt-versioning-ab-testing","\u002Fes\u002Fai\u002Fversionado-prompts-ab-testing-llm-ops","\u002Ffr\u002Fai\u002Fversionamento-prompt-ab-test","\u002Fit\u002Fai\u002Fversionamento-prompt-e-a-b-test-disciplina-llm-ops","\u002Fru\u002Fai\u002Fprompt-versionierung-und-ab-tests-llm-ops-disziplin","\u002Ftr\u002Fai\u002Fprompt-versiyonlama-ve-a-b-testi-llm-operasyonun-disiplini",{"_path":9,"_dir":14,"_draft":15,"_partial":15,"_locale":16,"title":17,"description":18,"publishedAt":19,"modifiedAt":19,"category":14,"i18nKey":4,"tags":20,"readingTime":26,"author":27,"body":28,"_type":2631,"_id":2632,"_source":2633,"_file":2634,"_stem":2635,"_extension":2636},"ai",false,"","Versionamento dei Prompt e A\u002FB Test: La Disciplina delle Operazioni LLM","Come costruire versioning dei prompt, pipeline di valutazione e controllo di qualità deterministico con Promptfoo e LangSmith nei sistemi LLM in production.","2026-05-13",[21,22,23,24,25],"llm-ops","prompt-engineering","valutazione","mlops","qualità-ai",8,"Roibase",{"type":29,"children":30,"toc":2619},"root",[31,39,44,51,56,78,83,103,108,114,119,130,148,161,296,306,324,329,643,653,671,687,694,699,704,1020,1033,1117,1122,1128,1133,1185,1190,1480,1485,1704,1709,1715,1720,1732,1737,2062,2067,2073,2085,2090,2419,2424,2430,2435,2440,2473,2478,2490,2496,2501,2590,2595,2599,2613],{"type":32,"tag":33,"props":34,"children":35},"element","p",{},[36],{"type":37,"value":38},"text","Nei sistemi che utilizzano LLM, ci sono 15 passi tra \"funziona\" e \"affidabile in production\". 
L'automazione marketing produce output Markdown con Claude API, la segmentazione del customer journey utilizza GPT — ma quando modifichi il prompt, come sai di non aver introdotto una regressione? In ingegneria del software, il versionamento, la copertura dei test e la CI\u002FCD sono standard; nelle operazioni LLM, senza questa disciplina, ogni deployment è una scommessa.",{"type":32,"tag":33,"props":40,"children":41},{},[42],{"type":37,"value":43},"Strumenti come Promptfoo e LangSmith forniscono questa disciplina: versionamento dei prompt, valutazione deterministica, A\u002FB test, tracking delle metriche. Questo articolo mostra come costruire il controllo di qualità in un sistema LLM in production — non a livello di codice, ma a livello di infrastruttura.",{"type":32,"tag":45,"props":46,"children":48},"h2",{"id":47},"lillusione-che-il-prompt-non-sia-codice",[49],{"type":37,"value":50},"L'Illusione che il Prompt Non Sia Codice",{"type":32,"tag":33,"props":52,"children":53},{},[54],{"type":37,"value":55},"La maggior parte dei team vede il prompt come un \"file di configurazione\" — un editor nell'interfaccia, documentazione in Notion, nodo di testo hardcoded nel workflow n8n. In realtà, il prompt è una specification eseguibile che definisce il comportamento del sistema. Ma non c'è versionamento, niente diff, niente rollback.",{"type":32,"tag":33,"props":57,"children":58},{},[59,61,68,70,76],{"type":37,"value":60},"Un commit Git con messaggio \"fix typo\" può cambiare il tono dell'output del modello e abbassare le metriche. Specialmente negli scenari di structured output (schema JSON, frontmatter Markdown, query SQL), una singola parola che rompe il formato crea errori a cascata. 
Esempio: scrivere ",{"type":32,"tag":62,"props":63,"children":65},"code",{"className":64},[],[66],{"type":37,"value":67},"OUTPUT FORMAT: JSON",{"type":37,"value":69}," invece di ",{"type":32,"tag":62,"props":71,"children":73},{"className":72},[],[74],{"type":37,"value":75},"OUTPUT FORMAT: Valid JSON",{"type":37,"value":77}," fa sì che il modello talvolta aggiunga paragrafi esplicativi — il parser downstream crasha, gli alert si attivano, debugging per 3 ore.",{"type":32,"tag":33,"props":79,"children":80},{},[81],{"type":37,"value":82},"La disciplina del versionamento deve rispondere a queste domande:",{"type":32,"tag":84,"props":85,"children":86},"ul",{},[87,93,98],{"type":32,"tag":88,"props":89,"children":90},"li",{},[91],{"type":37,"value":92},"Quale versione del prompt è attualmente in production?",{"type":32,"tag":88,"props":94,"children":95},{},[96],{"type":37,"value":97},"Qual è la differenza di performance tra la versione di due settimane fa e quella attuale?",{"type":32,"tag":88,"props":99,"children":100},{},[101],{"type":37,"value":102},"Quale variante nell'A\u002FB test ha aumentato la conversione dell'8%?",{"type":32,"tag":33,"props":104,"children":105},{},[106],{"type":37,"value":107},"Se non riesci a rispondere a queste domande, non stai facendo \"operazioni AI\", stai conducendo esperimenti manuali.",{"type":32,"tag":45,"props":109,"children":111},{"id":110},"pipeline-di-valutazione-i-tre-livelli-della-misurazione-delloutput",[112],{"type":37,"value":113},"Pipeline di Valutazione: I Tre Livelli della Misurazione dell'Output",{"type":32,"tag":33,"props":115,"children":116},{},[117],{"type":37,"value":118},"Valutare l'output di un LLM sembra soggettivo, ma nei sistemi in production è possibile costruire metriche deterministiche. 
La valutazione funziona su tre livelli: sintassi, semantica, risultato di business.",{"type":32,"tag":33,"props":120,"children":121},{},[122,128],{"type":32,"tag":123,"props":124,"children":125},"strong",{},[126],{"type":37,"value":127},"Livello di sintassi",{"type":37,"value":129}," — conformità del formato:",{"type":32,"tag":84,"props":131,"children":132},{},[133,138,143],{"type":32,"tag":88,"props":134,"children":135},{},[136],{"type":37,"value":137},"Il JSON viene parsato correttamente?",{"type":32,"tag":88,"props":139,"children":140},{},[141],{"type":37,"value":142},"Il frontmatter Markdown è valido?",{"type":32,"tag":88,"props":144,"children":145},{},[146],{"type":37,"value":147},"Sono presenti i campi previsti?",{"type":32,"tag":33,"props":149,"children":150},{},[151,153,159],{"type":37,"value":152},"In Promptfoo, si controlla con un'asserzione ",{"type":32,"tag":62,"props":154,"children":156},{"className":155},[],[157],{"type":37,"value":158},"javascript",{"type":37,"value":160},":",{"type":32,"tag":162,"props":163,"children":166},"pre",{"className":164,"code":165,"language":158,"meta":16,"style":16},"language-javascript shiki shiki-themes github-dark","assert: [\n  {\n    type: \"javascript\",\n    value: \"JSON.parse(output).title.length \u003C= 60\"\n  },\n  {\n    type: \"is-json\",\n    value: true\n  }\n]\n",[167],{"type":32,"tag":62,"props":168,"children":169},{"__ignoreMap":16},[170,188,197,217,231,240,248,265,278,287],{"type":32,"tag":171,"props":172,"children":175},"span",{"class":173,"line":174},"line",1,[176,182],{"type":32,"tag":171,"props":177,"children":179},{"style":178},"--shiki-default:#B392F0",[180],{"type":37,"value":181},"assert",{"type":32,"tag":171,"props":183,"children":185},{"style":184},"--shiki-default:#E1E4E8",[186],{"type":37,"value":187},": [\n",{"type":32,"tag":171,"props":189,"children":191},{"class":173,"line":190},2,[192],{"type":32,"tag":171,"props":193,"children":194},{"style":184},[195],{"type":37,"value":196},"  
{\n",{"type":32,"tag":171,"props":198,"children":200},{"class":173,"line":199},3,[201,206,212],{"type":32,"tag":171,"props":202,"children":203},{"style":184},[204],{"type":37,"value":205},"    type: ",{"type":32,"tag":171,"props":207,"children":209},{"style":208},"--shiki-default:#9ECBFF",[210],{"type":37,"value":211},"\"javascript\"",{"type":32,"tag":171,"props":213,"children":214},{"style":184},[215],{"type":37,"value":216},",\n",{"type":32,"tag":171,"props":218,"children":220},{"class":173,"line":219},4,[221,226],{"type":32,"tag":171,"props":222,"children":223},{"style":184},[224],{"type":37,"value":225},"    value: ",{"type":32,"tag":171,"props":227,"children":228},{"style":208},[229],{"type":37,"value":230},"\"JSON.parse(output).title.length \u003C= 60\"\n",{"type":32,"tag":171,"props":232,"children":234},{"class":173,"line":233},5,[235],{"type":32,"tag":171,"props":236,"children":237},{"style":184},[238],{"type":37,"value":239},"  },\n",{"type":32,"tag":171,"props":241,"children":243},{"class":173,"line":242},6,[244],{"type":32,"tag":171,"props":245,"children":246},{"style":184},[247],{"type":37,"value":196},{"type":32,"tag":171,"props":249,"children":251},{"class":173,"line":250},7,[252,256,261],{"type":32,"tag":171,"props":253,"children":254},{"style":184},[255],{"type":37,"value":205},{"type":32,"tag":171,"props":257,"children":258},{"style":208},[259],{"type":37,"value":260},"\"is-json\"",{"type":32,"tag":171,"props":262,"children":263},{"style":184},[264],{"type":37,"value":216},{"type":32,"tag":171,"props":266,"children":267},{"class":173,"line":26},[268,272],{"type":32,"tag":171,"props":269,"children":270},{"style":184},[271],{"type":37,"value":225},{"type":32,"tag":171,"props":273,"children":275},{"style":274},"--shiki-default:#79B8FF",[276],{"type":37,"value":277},"true\n",{"type":32,"tag":171,"props":279,"children":281},{"class":173,"line":280},9,[282],{"type":32,"tag":171,"props":283,"children":284},{"style":184},[285],{"type":37,"value":286},"  
}\n",{"type":32,"tag":171,"props":288,"children":290},{"class":173,"line":289},10,[291],{"type":32,"tag":171,"props":292,"children":293},{"style":184},[294],{"type":37,"value":295},"]\n",{"type":32,"tag":33,"props":297,"children":298},{},[299,304],{"type":32,"tag":123,"props":300,"children":301},{},[302],{"type":37,"value":303},"Livello di semantica",{"type":37,"value":305}," — qualità del contenuto:",{"type":32,"tag":84,"props":307,"children":308},{},[309,314,319],{"type":32,"tag":88,"props":310,"children":311},{},[312],{"type":37,"value":313},"La risposta è rilevante all'argomento? (embedding, similarità coseno > 0,85)",{"type":32,"tag":88,"props":315,"children":316},{},[317],{"type":37,"value":318},"Sono presenti parole vietate? (regex, token filtering)",{"type":32,"tag":88,"props":320,"children":321},{},[322],{"type":37,"value":323},"Il tono è corretto? (modello classifier, sentiment score)",{"type":32,"tag":33,"props":325,"children":326},{},[327],{"type":37,"value":328},"In LangSmith, valutatore personalizzato:",{"type":32,"tag":162,"props":330,"children":334},{"className":331,"code":332,"language":333,"meta":16,"style":16},"language-python shiki shiki-themes github-dark","from langsmith import evaluate\n\ndef check_brand_compliance(run, example):\n    forbidden = [\"esperto\", \"leader\", \"rivoluzionario\"]\n    output = run.outputs[\"text\"].lower()\n    violations = [w for w in forbidden if w in output]\n    return {\"score\": 0 if violations else 1, \"violations\": violations}\n\nevaluate(\n    dataset_name=\"marketing_blog_posts\",\n    
evaluators=[check_brand_compliance]\n)\n","python",[335],{"type":32,"tag":62,"props":336,"children":337},{"__ignoreMap":16},[338,362,371,389,435,462,517,579,586,594,616,634],{"type":32,"tag":171,"props":339,"children":340},{"class":173,"line":174},[341,347,352,357],{"type":32,"tag":171,"props":342,"children":344},{"style":343},"--shiki-default:#F97583",[345],{"type":37,"value":346},"from",{"type":32,"tag":171,"props":348,"children":349},{"style":184},[350],{"type":37,"value":351}," langsmith ",{"type":32,"tag":171,"props":353,"children":354},{"style":343},[355],{"type":37,"value":356},"import",{"type":32,"tag":171,"props":358,"children":359},{"style":184},[360],{"type":37,"value":361}," evaluate\n",{"type":32,"tag":171,"props":363,"children":364},{"class":173,"line":190},[365],{"type":32,"tag":171,"props":366,"children":368},{"emptyLinePlaceholder":367},true,[369],{"type":37,"value":370},"\n",{"type":32,"tag":171,"props":372,"children":373},{"class":173,"line":199},[374,379,384],{"type":32,"tag":171,"props":375,"children":376},{"style":343},[377],{"type":37,"value":378},"def",{"type":32,"tag":171,"props":380,"children":381},{"style":178},[382],{"type":37,"value":383}," check_brand_compliance",{"type":32,"tag":171,"props":385,"children":386},{"style":184},[387],{"type":37,"value":388},"(run, example):\n",{"type":32,"tag":171,"props":390,"children":391},{"class":173,"line":219},[392,397,402,407,412,417,422,426,431],{"type":32,"tag":171,"props":393,"children":394},{"style":184},[395],{"type":37,"value":396},"    forbidden ",{"type":32,"tag":171,"props":398,"children":399},{"style":343},[400],{"type":37,"value":401},"=",{"type":32,"tag":171,"props":403,"children":404},{"style":184},[405],{"type":37,"value":406}," [",{"type":32,"tag":171,"props":408,"children":409},{"style":208},[410],{"type":37,"value":411},"\"esperto\"",{"type":32,"tag":171,"props":413,"children":414},{"style":184},[415],{"type":37,"value":416},", 
",{"type":32,"tag":171,"props":418,"children":419},{"style":208},[420],{"type":37,"value":421},"\"leader\"",{"type":32,"tag":171,"props":423,"children":424},{"style":184},[425],{"type":37,"value":416},{"type":32,"tag":171,"props":427,"children":428},{"style":208},[429],{"type":37,"value":430},"\"rivoluzionario\"",{"type":32,"tag":171,"props":432,"children":433},{"style":184},[434],{"type":37,"value":295},{"type":32,"tag":171,"props":436,"children":437},{"class":173,"line":233},[438,443,447,452,457],{"type":32,"tag":171,"props":439,"children":440},{"style":184},[441],{"type":37,"value":442},"    output ",{"type":32,"tag":171,"props":444,"children":445},{"style":343},[446],{"type":37,"value":401},{"type":32,"tag":171,"props":448,"children":449},{"style":184},[450],{"type":37,"value":451}," run.outputs[",{"type":32,"tag":171,"props":453,"children":454},{"style":208},[455],{"type":37,"value":456},"\"text\"",{"type":32,"tag":171,"props":458,"children":459},{"style":184},[460],{"type":37,"value":461},"].lower()\n",{"type":32,"tag":171,"props":463,"children":464},{"class":173,"line":242},[465,470,474,479,484,489,494,499,504,508,512],{"type":32,"tag":171,"props":466,"children":467},{"style":184},[468],{"type":37,"value":469},"    violations ",{"type":32,"tag":171,"props":471,"children":472},{"style":343},[473],{"type":37,"value":401},{"type":32,"tag":171,"props":475,"children":476},{"style":184},[477],{"type":37,"value":478}," [w ",{"type":32,"tag":171,"props":480,"children":481},{"style":343},[482],{"type":37,"value":483},"for",{"type":32,"tag":171,"props":485,"children":486},{"style":184},[487],{"type":37,"value":488}," w ",{"type":32,"tag":171,"props":490,"children":491},{"style":343},[492],{"type":37,"value":493},"in",{"type":32,"tag":171,"props":495,"children":496},{"style":184},[497],{"type":37,"value":498}," forbidden 
",{"type":32,"tag":171,"props":500,"children":501},{"style":343},[502],{"type":37,"value":503},"if",{"type":32,"tag":171,"props":505,"children":506},{"style":184},[507],{"type":37,"value":488},{"type":32,"tag":171,"props":509,"children":510},{"style":343},[511],{"type":37,"value":493},{"type":32,"tag":171,"props":513,"children":514},{"style":184},[515],{"type":37,"value":516}," output]\n",{"type":32,"tag":171,"props":518,"children":519},{"class":173,"line":250},[520,525,530,535,540,545,550,555,560,565,569,574],{"type":32,"tag":171,"props":521,"children":522},{"style":343},[523],{"type":37,"value":524},"    return",{"type":32,"tag":171,"props":526,"children":527},{"style":184},[528],{"type":37,"value":529}," {",{"type":32,"tag":171,"props":531,"children":532},{"style":208},[533],{"type":37,"value":534},"\"score\"",{"type":32,"tag":171,"props":536,"children":537},{"style":184},[538],{"type":37,"value":539},": ",{"type":32,"tag":171,"props":541,"children":542},{"style":274},[543],{"type":37,"value":544},"0",{"type":32,"tag":171,"props":546,"children":547},{"style":343},[548],{"type":37,"value":549}," if",{"type":32,"tag":171,"props":551,"children":552},{"style":184},[553],{"type":37,"value":554}," violations ",{"type":32,"tag":171,"props":556,"children":557},{"style":343},[558],{"type":37,"value":559},"else",{"type":32,"tag":171,"props":561,"children":562},{"style":274},[563],{"type":37,"value":564}," 1",{"type":32,"tag":171,"props":566,"children":567},{"style":184},[568],{"type":37,"value":416},{"type":32,"tag":171,"props":570,"children":571},{"style":208},[572],{"type":37,"value":573},"\"violations\"",{"type":32,"tag":171,"props":575,"children":576},{"style":184},[577],{"type":37,"value":578},": 
violations}\n",{"type":32,"tag":171,"props":580,"children":581},{"class":173,"line":26},[582],{"type":32,"tag":171,"props":583,"children":584},{"emptyLinePlaceholder":367},[585],{"type":37,"value":370},{"type":32,"tag":171,"props":587,"children":588},{"class":173,"line":280},[589],{"type":32,"tag":171,"props":590,"children":591},{"style":184},[592],{"type":37,"value":593},"evaluate(\n",{"type":32,"tag":171,"props":595,"children":596},{"class":173,"line":289},[597,603,607,612],{"type":32,"tag":171,"props":598,"children":600},{"style":599},"--shiki-default:#FFAB70",[601],{"type":37,"value":602},"    dataset_name",{"type":32,"tag":171,"props":604,"children":605},{"style":343},[606],{"type":37,"value":401},{"type":32,"tag":171,"props":608,"children":609},{"style":208},[610],{"type":37,"value":611},"\"marketing_blog_posts\"",{"type":32,"tag":171,"props":613,"children":614},{"style":184},[615],{"type":37,"value":216},{"type":32,"tag":171,"props":617,"children":619},{"class":173,"line":618},11,[620,625,629],{"type":32,"tag":171,"props":621,"children":622},{"style":599},[623],{"type":37,"value":624},"    evaluators",{"type":32,"tag":171,"props":626,"children":627},{"style":343},[628],{"type":37,"value":401},{"type":32,"tag":171,"props":630,"children":631},{"style":184},[632],{"type":37,"value":633},"[check_brand_compliance]\n",{"type":32,"tag":171,"props":635,"children":637},{"class":173,"line":636},12,[638],{"type":32,"tag":171,"props":639,"children":640},{"style":184},[641],{"type":37,"value":642},")\n",{"type":32,"tag":33,"props":644,"children":645},{},[646,651],{"type":32,"tag":123,"props":647,"children":648},{},[649],{"type":37,"value":650},"Livello di risultato di business",{"type":37,"value":652}," — l'impatto reale:",{"type":32,"tag":84,"props":654,"children":655},{},[656,661,666],{"type":32,"tag":88,"props":657,"children":658},{},[659],{"type":37,"value":660},"Il CTR è cambiato?",{"type":32,"tag":88,"props":662,"children":663},{},[664],{"type":37,"value":665},"La 
conversione è diminuita?",{"type":32,"tag":88,"props":667,"children":668},{},[669],{"type":37,"value":670},"Il bounce rate è aumentato?",{"type":32,"tag":33,"props":672,"children":673},{},[674,676,685],{"type":37,"value":675},"Questo livello si collega alla telemetria in production — nel sistema di ",{"type":32,"tag":677,"props":678,"children":682},"a",{"href":679,"rel":680},"https:\u002F\u002Fwww.roibase.com.tr\u002Ffr\u002Ffirstparty",[681],"nofollow",[683],{"type":37,"value":684},"First-Party Data & Architettura di Misurazione",{"type":37,"value":686},", la versione del prompt viene aggiunta ai metadati del tracking degli eventi, unita in BigQuery, il modello dbt calcola il conversion rate di ogni versione.",{"type":32,"tag":688,"props":689,"children":691},"h3",{"id":690},"promptfoo-costruire-una-suite-di-test-deterministica",[692],{"type":37,"value":693},"Promptfoo: Costruire una Suite di Test Deterministica",{"type":32,"tag":33,"props":695,"children":696},{},[697],{"type":37,"value":698},"Promptfoo è un framework di evaluation basato su YAML che gira localmente. 
L'obiettivo: verificare con test di regressione prima di ogni modifica del prompt.",{"type":32,"tag":33,"props":700,"children":701},{},[702],{"type":37,"value":703},"Configurazione semplice:",{"type":32,"tag":162,"props":705,"children":709},{"className":706,"code":707,"language":708,"meta":16,"style":16},"language-yaml shiki shiki-themes github-dark","prompts:\n  - file:\u002F\u002Fprompts\u002Fmarketing_blog_v1.md\n  - file:\u002F\u002Fprompts\u002Fmarketing_blog_v2.md\n\nproviders:\n  - anthropic:messages:claude-3-5-sonnet-20241022\n\ntests:\n  - vars:\n      topic: \"Server-side GTM\"\n      category: \"tech\"\n    assert:\n      - type: is-json\n      - type: javascript\n        value: \"JSON.parse(output).title.length \u003C= 60\"\n      - type: similar\n        value: \"server-side tracking architecture\"\n        threshold: 0.8\n      - type: not-contains\n        value: \"rivoluzionario\"\n","yaml",[710],{"type":32,"tag":62,"props":711,"children":712},{"__ignoreMap":16},[713,727,740,752,759,771,783,790,802,818,835,852,864,887,908,926,947,964,982,1003],{"type":32,"tag":171,"props":714,"children":715},{"class":173,"line":174},[716,722],{"type":32,"tag":171,"props":717,"children":719},{"style":718},"--shiki-default:#85E89D",[720],{"type":37,"value":721},"prompts",{"type":32,"tag":171,"props":723,"children":724},{"style":184},[725],{"type":37,"value":726},":\n",{"type":32,"tag":171,"props":728,"children":729},{"class":173,"line":190},[730,735],{"type":32,"tag":171,"props":731,"children":732},{"style":184},[733],{"type":37,"value":734},"  - 
",{"type":32,"tag":171,"props":736,"children":737},{"style":208},[738],{"type":37,"value":739},"file:\u002F\u002Fprompts\u002Fmarketing_blog_v1.md\n",{"type":32,"tag":171,"props":741,"children":742},{"class":173,"line":199},[743,747],{"type":32,"tag":171,"props":744,"children":745},{"style":184},[746],{"type":37,"value":734},{"type":32,"tag":171,"props":748,"children":749},{"style":208},[750],{"type":37,"value":751},"file:\u002F\u002Fprompts\u002Fmarketing_blog_v2.md\n",{"type":32,"tag":171,"props":753,"children":754},{"class":173,"line":219},[755],{"type":32,"tag":171,"props":756,"children":757},{"emptyLinePlaceholder":367},[758],{"type":37,"value":370},{"type":32,"tag":171,"props":760,"children":761},{"class":173,"line":233},[762,767],{"type":32,"tag":171,"props":763,"children":764},{"style":718},[765],{"type":37,"value":766},"providers",{"type":32,"tag":171,"props":768,"children":769},{"style":184},[770],{"type":37,"value":726},{"type":32,"tag":171,"props":772,"children":773},{"class":173,"line":242},[774,778],{"type":32,"tag":171,"props":775,"children":776},{"style":184},[777],{"type":37,"value":734},{"type":32,"tag":171,"props":779,"children":780},{"style":208},[781],{"type":37,"value":782},"anthropic:messages:claude-3-5-sonnet-20241022\n",{"type":32,"tag":171,"props":784,"children":785},{"class":173,"line":250},[786],{"type":32,"tag":171,"props":787,"children":788},{"emptyLinePlaceholder":367},[789],{"type":37,"value":370},{"type":32,"tag":171,"props":791,"children":792},{"class":173,"line":26},[793,798],{"type":32,"tag":171,"props":794,"children":795},{"style":718},[796],{"type":37,"value":797},"tests",{"type":32,"tag":171,"props":799,"children":800},{"style":184},[801],{"type":37,"value":726},{"type":32,"tag":171,"props":803,"children":804},{"class":173,"line":280},[805,809,814],{"type":32,"tag":171,"props":806,"children":807},{"style":184},[808],{"type":37,"value":734},{"type":32,"tag":171,"props":810,"children":811},{"style":718},[812],{"type":37,"value":8
13},"vars",{"type":32,"tag":171,"props":815,"children":816},{"style":184},[817],{"type":37,"value":726},{"type":32,"tag":171,"props":819,"children":820},{"class":173,"line":289},[821,826,830],{"type":32,"tag":171,"props":822,"children":823},{"style":718},[824],{"type":37,"value":825},"      topic",{"type":32,"tag":171,"props":827,"children":828},{"style":184},[829],{"type":37,"value":539},{"type":32,"tag":171,"props":831,"children":832},{"style":208},[833],{"type":37,"value":834},"\"Server-side GTM\"\n",{"type":32,"tag":171,"props":836,"children":837},{"class":173,"line":618},[838,843,847],{"type":32,"tag":171,"props":839,"children":840},{"style":718},[841],{"type":37,"value":842},"      category",{"type":32,"tag":171,"props":844,"children":845},{"style":184},[846],{"type":37,"value":539},{"type":32,"tag":171,"props":848,"children":849},{"style":208},[850],{"type":37,"value":851},"\"tech\"\n",{"type":32,"tag":171,"props":853,"children":854},{"class":173,"line":636},[855,860],{"type":32,"tag":171,"props":856,"children":857},{"style":718},[858],{"type":37,"value":859},"    assert",{"type":32,"tag":171,"props":861,"children":862},{"style":184},[863],{"type":37,"value":726},{"type":32,"tag":171,"props":865,"children":867},{"class":173,"line":866},13,[868,873,878,882],{"type":32,"tag":171,"props":869,"children":870},{"style":184},[871],{"type":37,"value":872},"      - 
",{"type":32,"tag":171,"props":874,"children":875},{"style":718},[876],{"type":37,"value":877},"type",{"type":32,"tag":171,"props":879,"children":880},{"style":184},[881],{"type":37,"value":539},{"type":32,"tag":171,"props":883,"children":884},{"style":208},[885],{"type":37,"value":886},"is-json\n",{"type":32,"tag":171,"props":888,"children":890},{"class":173,"line":889},14,[891,895,899,903],{"type":32,"tag":171,"props":892,"children":893},{"style":184},[894],{"type":37,"value":872},{"type":32,"tag":171,"props":896,"children":897},{"style":718},[898],{"type":37,"value":877},{"type":32,"tag":171,"props":900,"children":901},{"style":184},[902],{"type":37,"value":539},{"type":32,"tag":171,"props":904,"children":905},{"style":208},[906],{"type":37,"value":907},"javascript\n",{"type":32,"tag":171,"props":909,"children":911},{"class":173,"line":910},15,[912,917,921],{"type":32,"tag":171,"props":913,"children":914},{"style":718},[915],{"type":37,"value":916},"        value",{"type":32,"tag":171,"props":918,"children":919},{"style":184},[920],{"type":37,"value":539},{"type":32,"tag":171,"props":922,"children":923},{"style":208},[924],{"type":37,"value":925},"\"JSON.parse(output).title.length \u003C= 
60\"\n",{"type":32,"tag":171,"props":927,"children":929},{"class":173,"line":928},16,[930,934,938,942],{"type":32,"tag":171,"props":931,"children":932},{"style":184},[933],{"type":37,"value":872},{"type":32,"tag":171,"props":935,"children":936},{"style":718},[937],{"type":37,"value":877},{"type":32,"tag":171,"props":939,"children":940},{"style":184},[941],{"type":37,"value":539},{"type":32,"tag":171,"props":943,"children":944},{"style":208},[945],{"type":37,"value":946},"similar\n",{"type":32,"tag":171,"props":948,"children":950},{"class":173,"line":949},17,[951,955,959],{"type":32,"tag":171,"props":952,"children":953},{"style":718},[954],{"type":37,"value":916},{"type":32,"tag":171,"props":956,"children":957},{"style":184},[958],{"type":37,"value":539},{"type":32,"tag":171,"props":960,"children":961},{"style":208},[962],{"type":37,"value":963},"\"server-side tracking architecture\"\n",{"type":32,"tag":171,"props":965,"children":967},{"class":173,"line":966},18,[968,973,977],{"type":32,"tag":171,"props":969,"children":970},{"style":718},[971],{"type":37,"value":972},"        
threshold",{"type":32,"tag":171,"props":974,"children":975},{"style":184},[976],{"type":37,"value":539},{"type":32,"tag":171,"props":978,"children":979},{"style":274},[980],{"type":37,"value":981},"0.8\n",{"type":32,"tag":171,"props":983,"children":985},{"class":173,"line":984},19,[986,990,994,998],{"type":32,"tag":171,"props":987,"children":988},{"style":184},[989],{"type":37,"value":872},{"type":32,"tag":171,"props":991,"children":992},{"style":718},[993],{"type":37,"value":877},{"type":32,"tag":171,"props":995,"children":996},{"style":184},[997],{"type":37,"value":539},{"type":32,"tag":171,"props":999,"children":1000},{"style":208},[1001],{"type":37,"value":1002},"not-contains\n",{"type":32,"tag":171,"props":1004,"children":1006},{"class":173,"line":1005},20,[1007,1011,1015],{"type":32,"tag":171,"props":1008,"children":1009},{"style":718},[1010],{"type":37,"value":916},{"type":32,"tag":171,"props":1012,"children":1013},{"style":184},[1014],{"type":37,"value":539},{"type":32,"tag":171,"props":1016,"children":1017},{"style":208},[1018],{"type":37,"value":1019},"\"rivoluzionario\"\n",{"type":32,"tag":33,"props":1021,"children":1022},{},[1023,1025,1031],{"type":37,"value":1024},"Con il comando ",{"type":32,"tag":62,"props":1026,"children":1028},{"className":1027},[],[1029],{"type":37,"value":1030},"promptfoo eval",{"type":37,"value":1032},", tutte le varianti vengono testate e la tabella delle metriche viene restituita:",{"type":32,"tag":1034,"props":1035,"children":1036},"table",{},[1037,1066],{"type":32,"tag":1038,"props":1039,"children":1040},"thead",{},[1041],{"type":32,"tag":1042,"props":1043,"children":1044},"tr",{},[1045,1051,1056,1061],{"type":32,"tag":1046,"props":1047,"children":1048},"th",{},[1049],{"type":37,"value":1050},"Prompt",{"type":32,"tag":1046,"props":1052,"children":1053},{},[1054],{"type":37,"value":1055},"Pass Rate",{"type":32,"tag":1046,"props":1057,"children":1058},{},[1059],{"type":37,"value":1060},"Latenza 
Media",{"type":32,"tag":1046,"props":1062,"children":1063},{},[1064],{"type":37,"value":1065},"Costo",{"type":32,"tag":1067,"props":1068,"children":1069},"tbody",{},[1070,1094],{"type":32,"tag":1042,"props":1071,"children":1072},{},[1073,1079,1084,1089],{"type":32,"tag":1074,"props":1075,"children":1076},"td",{},[1077],{"type":37,"value":1078},"v1",{"type":32,"tag":1074,"props":1080,"children":1081},{},[1082],{"type":37,"value":1083},"92%",{"type":32,"tag":1074,"props":1085,"children":1086},{},[1087],{"type":37,"value":1088},"2,3s",{"type":32,"tag":1074,"props":1090,"children":1091},{},[1092],{"type":37,"value":1093},"$0,012",{"type":32,"tag":1042,"props":1095,"children":1096},{},[1097,1102,1107,1112],{"type":32,"tag":1074,"props":1098,"children":1099},{},[1100],{"type":37,"value":1101},"v2",{"type":32,"tag":1074,"props":1103,"children":1104},{},[1105],{"type":37,"value":1106},"98%",{"type":32,"tag":1074,"props":1108,"children":1109},{},[1110],{"type":37,"value":1111},"2,1s",{"type":32,"tag":1074,"props":1113,"children":1114},{},[1115],{"type":37,"value":1116},"$0,014",{"type":32,"tag":33,"props":1118,"children":1119},{},[1120],{"type":37,"value":1121},"In v2 il pass rate è aumentato ma il costo è salito del 17% — il conteggio dei token aumenta, è necessario controllare nel dettaglio. Senza vedere questo tradeoff, il deploy avrebbe fatto esplodere il budget mensile.",{"type":32,"tag":45,"props":1123,"children":1125},{"id":1124},"ab-test-confrontare-le-varianti-dei-prompt-in-production",[1126],{"type":37,"value":1127},"A\u002FB Test: Confrontare le Varianti dei Prompt in Production",{"type":32,"tag":33,"props":1129,"children":1130},{},[1131],{"type":37,"value":1132},"La suite di evaluation è verde, ora servono dati di traffico reali. 
L'A\u002FB test in un sistema LLM funziona così:",{"type":32,"tag":1134,"props":1135,"children":1136},"ol",{},[1137,1147,1165,1175],{"type":32,"tag":88,"props":1138,"children":1139},{},[1140,1145],{"type":32,"tag":123,"props":1141,"children":1142},{},[1143],{"type":37,"value":1144},"Variant routing",{"type":37,"value":1146}," — scegli la versione del prompt in base all'ID utente\u002Fsessione (% split)",{"type":32,"tag":88,"props":1148,"children":1149},{},[1150,1155,1157,1163],{"type":32,"tag":123,"props":1151,"children":1152},{},[1153],{"type":37,"value":1154},"Metadata tagging",{"type":37,"value":1156}," — aggiungi ",{"type":32,"tag":62,"props":1158,"children":1160},{"className":1159},[],[1161],{"type":37,"value":1162},"prompt_version",{"type":37,"value":1164}," a ogni API call",{"type":32,"tag":88,"props":1166,"children":1167},{},[1168,1173],{"type":32,"tag":123,"props":1169,"children":1170},{},[1171],{"type":37,"value":1172},"Metric tracking",{"type":37,"value":1174}," — mantieni le informazioni sulla variante negli eventi downstream",{"type":32,"tag":88,"props":1176,"children":1177},{},[1178,1183],{"type":32,"tag":123,"props":1179,"children":1180},{},[1181],{"type":37,"value":1182},"Significatività statistica",{"type":37,"value":1184}," — quando viene raccolta una quantità sufficiente di campioni (385 osservazioni per variante bastano per un margine di errore di ±5% al 95% di confidenza; per rilevare differenze piccole tra varianti serve un campione più ampio), prendi una decisione",{"type":32,"tag":33,"props":1186,"children":1187},{},[1188],{"type":37,"value":1189},"Esempio di workflow n8n:",{"type":32,"tag":162,"props":1191,"children":1193},{"className":164,"code":1192,"language":158,"meta":16,"style":16},"\u002F\u002F Selezione variante A\u002FB\nconst userId = $json.user_id;\nconst variant = (userId % 100 \u003C 50) ? 
'v1' : 'v2';\nconst promptUrl = `https:\u002F\u002Fraw.githubusercontent.com\u002Froibase\u002Fprompts\u002Fmain\u002F${variant}.md`;\n\n\u002F\u002F Aggiungi metadati alla API call\nreturn {\n  json: {\n    prompt: await fetch(promptUrl).then(r => r.text()),\n    metadata: {\n      prompt_version: variant,\n      experiment_id: 'blog_tone_test_2026_05'\n    }\n  }\n};\n",[1194],{"type":32,"tag":62,"props":1195,"children":1196},{"__ignoreMap":16},[1197,1206,1229,1300,1335,1342,1350,1363,1371,1428,1436,1444,1457,1465,1472],{"type":32,"tag":171,"props":1198,"children":1199},{"class":173,"line":174},[1200],{"type":32,"tag":171,"props":1201,"children":1203},{"style":1202},"--shiki-default:#6A737D",[1204],{"type":37,"value":1205},"\u002F\u002F Selezione variante A\u002FB\n",{"type":32,"tag":171,"props":1207,"children":1208},{"class":173,"line":190},[1209,1214,1219,1224],{"type":32,"tag":171,"props":1210,"children":1211},{"style":343},[1212],{"type":37,"value":1213},"const",{"type":32,"tag":171,"props":1215,"children":1216},{"style":274},[1217],{"type":37,"value":1218}," userId",{"type":32,"tag":171,"props":1220,"children":1221},{"style":343},[1222],{"type":37,"value":1223}," =",{"type":32,"tag":171,"props":1225,"children":1226},{"style":184},[1227],{"type":37,"value":1228}," $json.user_id;\n",{"type":32,"tag":171,"props":1230,"children":1231},{"class":173,"line":199},[1232,1236,1241,1245,1250,1255,1260,1265,1270,1275,1280,1285,1290,1295],{"type":32,"tag":171,"props":1233,"children":1234},{"style":343},[1235],{"type":37,"value":1213},{"type":32,"tag":171,"props":1237,"children":1238},{"style":274},[1239],{"type":37,"value":1240}," variant",{"type":32,"tag":171,"props":1242,"children":1243},{"style":343},[1244],{"type":37,"value":1223},{"type":32,"tag":171,"props":1246,"children":1247},{"style":184},[1248],{"type":37,"value":1249}," (userId 
",{"type":32,"tag":171,"props":1251,"children":1252},{"style":343},[1253],{"type":37,"value":1254},"%",{"type":32,"tag":171,"props":1256,"children":1257},{"style":274},[1258],{"type":37,"value":1259}," 100",{"type":32,"tag":171,"props":1261,"children":1262},{"style":343},[1263],{"type":37,"value":1264}," \u003C",{"type":32,"tag":171,"props":1266,"children":1267},{"style":274},[1268],{"type":37,"value":1269}," 50",{"type":32,"tag":171,"props":1271,"children":1272},{"style":184},[1273],{"type":37,"value":1274},") ",{"type":32,"tag":171,"props":1276,"children":1277},{"style":343},[1278],{"type":37,"value":1279},"?",{"type":32,"tag":171,"props":1281,"children":1282},{"style":208},[1283],{"type":37,"value":1284}," 'v1'",{"type":32,"tag":171,"props":1286,"children":1287},{"style":343},[1288],{"type":37,"value":1289}," :",{"type":32,"tag":171,"props":1291,"children":1292},{"style":208},[1293],{"type":37,"value":1294}," 'v2'",{"type":32,"tag":171,"props":1296,"children":1297},{"style":184},[1298],{"type":37,"value":1299},";\n",{"type":32,"tag":171,"props":1301,"children":1302},{"class":173,"line":219},[1303,1307,1312,1316,1321,1326,1331],{"type":32,"tag":171,"props":1304,"children":1305},{"style":343},[1306],{"type":37,"value":1213},{"type":32,"tag":171,"props":1308,"children":1309},{"style":274},[1310],{"type":37,"value":1311}," promptUrl",{"type":32,"tag":171,"props":1313,"children":1314},{"style":343},[1315],{"type":37,"value":1223},{"type":32,"tag":171,"props":1317,"children":1318},{"style":208},[1319],{"type":37,"value":1320}," 
`https:\u002F\u002Fraw.githubusercontent.com\u002Froibase\u002Fprompts\u002Fmain\u002F${",{"type":32,"tag":171,"props":1322,"children":1323},{"style":184},[1324],{"type":37,"value":1325},"variant",{"type":32,"tag":171,"props":1327,"children":1328},{"style":208},[1329],{"type":37,"value":1330},"}.md`",{"type":32,"tag":171,"props":1332,"children":1333},{"style":184},[1334],{"type":37,"value":1299},{"type":32,"tag":171,"props":1336,"children":1337},{"class":173,"line":233},[1338],{"type":32,"tag":171,"props":1339,"children":1340},{"emptyLinePlaceholder":367},[1341],{"type":37,"value":370},{"type":32,"tag":171,"props":1343,"children":1344},{"class":173,"line":242},[1345],{"type":32,"tag":171,"props":1346,"children":1347},{"style":1202},[1348],{"type":37,"value":1349},"\u002F\u002F Aggiungi metadati alla API call\n",{"type":32,"tag":171,"props":1351,"children":1352},{"class":173,"line":250},[1353,1358],{"type":32,"tag":171,"props":1354,"children":1355},{"style":343},[1356],{"type":37,"value":1357},"return",{"type":32,"tag":171,"props":1359,"children":1360},{"style":184},[1361],{"type":37,"value":1362}," {\n",{"type":32,"tag":171,"props":1364,"children":1365},{"class":173,"line":26},[1366],{"type":32,"tag":171,"props":1367,"children":1368},{"style":184},[1369],{"type":37,"value":1370},"  json: {\n",{"type":32,"tag":171,"props":1372,"children":1373},{"class":173,"line":280},[1374,1379,1384,1389,1394,1399,1404,1409,1414,1419,1423],{"type":32,"tag":171,"props":1375,"children":1376},{"style":184},[1377],{"type":37,"value":1378},"    prompt: ",{"type":32,"tag":171,"props":1380,"children":1381},{"style":343},[1382],{"type":37,"value":1383},"await",{"type":32,"tag":171,"props":1385,"children":1386},{"style":178},[1387],{"type":37,"value":1388}," 
fetch",{"type":32,"tag":171,"props":1390,"children":1391},{"style":184},[1392],{"type":37,"value":1393},"(promptUrl).",{"type":32,"tag":171,"props":1395,"children":1396},{"style":178},[1397],{"type":37,"value":1398},"then",{"type":32,"tag":171,"props":1400,"children":1401},{"style":184},[1402],{"type":37,"value":1403},"(",{"type":32,"tag":171,"props":1405,"children":1406},{"style":599},[1407],{"type":37,"value":1408},"r",{"type":32,"tag":171,"props":1410,"children":1411},{"style":343},[1412],{"type":37,"value":1413}," =>",{"type":32,"tag":171,"props":1415,"children":1416},{"style":184},[1417],{"type":37,"value":1418}," r.",{"type":32,"tag":171,"props":1420,"children":1421},{"style":178},[1422],{"type":37,"value":37},{"type":32,"tag":171,"props":1424,"children":1425},{"style":184},[1426],{"type":37,"value":1427},"()),\n",{"type":32,"tag":171,"props":1429,"children":1430},{"class":173,"line":289},[1431],{"type":32,"tag":171,"props":1432,"children":1433},{"style":184},[1434],{"type":37,"value":1435},"    metadata: {\n",{"type":32,"tag":171,"props":1437,"children":1438},{"class":173,"line":618},[1439],{"type":32,"tag":171,"props":1440,"children":1441},{"style":184},[1442],{"type":37,"value":1443},"      prompt_version: variant,\n",{"type":32,"tag":171,"props":1445,"children":1446},{"class":173,"line":636},[1447,1452],{"type":32,"tag":171,"props":1448,"children":1449},{"style":184},[1450],{"type":37,"value":1451},"      experiment_id: ",{"type":32,"tag":171,"props":1453,"children":1454},{"style":208},[1455],{"type":37,"value":1456},"'blog_tone_test_2026_05'\n",{"type":32,"tag":171,"props":1458,"children":1459},{"class":173,"line":866},[1460],{"type":32,"tag":171,"props":1461,"children":1462},{"style":184},[1463],{"type":37,"value":1464},"    
}\n",{"type":32,"tag":171,"props":1466,"children":1467},{"class":173,"line":889},[1468],{"type":32,"tag":171,"props":1469,"children":1470},{"style":184},[1471],{"type":37,"value":286},{"type":32,"tag":171,"props":1473,"children":1474},{"class":173,"line":910},[1475],{"type":32,"tag":171,"props":1476,"children":1477},{"style":184},[1478],{"type":37,"value":1479},"};\n",{"type":32,"tag":33,"props":1481,"children":1482},{},[1483],{"type":37,"value":1484},"Analysis in BigQuery (adjust the metadata field access to match your schema):",{"type":32,"tag":162,"props":1486,"children":1490},{"className":1487,"code":1488,"language":1489,"meta":16,"style":16},"language-sql shiki shiki-themes github-dark","SELECT\n  metadata.value:prompt_version AS variant,\n  COUNT(DISTINCT user_id) AS users,\n  AVG(session_duration_sec) AS avg_duration,\n  SUM(conversion) \u002F COUNT(*) AS cvr\nFROM events\nWHERE experiment_id = 'blog_tone_test_2026_05'\n  AND event_date >= '2026-05-01'\nGROUP BY 1\n","sql",[1491],{"type":32,"tag":62,"props":1492,"children":1493},{"__ignoreMap":16},[1494,1502,1535,1566,1588,1633,1646,1668,1691],{"type":32,"tag":171,"props":1495,"children":1496},{"class":173,"line":174},[1497],{"type":32,"tag":171,"props":1498,"children":1499},{"style":343},[1500],{"type":37,"value":1501},"SELECT\n",{"type":32,"tag":171,"props":1503,"children":1504},{"class":173,"line":190},[1505,1510,1515,1520,1525,1530],{"type":32,"tag":171,"props":1506,"children":1507},{"style":274},[1508],{"type":37,"value":1509},"  metadata",{"type":32,"tag":171,"props":1511,"children":1512},{"style":184},[1513],{"type":37,"value":1514},".",{"type":32,"tag":171,"props":1516,"children":1517},{"style":274},[1518],{"type":37,"value":1519},"value",{"type":32,"tag":171,"props":1521,"children":1522},{"style":184},[1523],{"type":37,"value":1524},":prompt_version ",{"type":32,"tag":171,"props":1526,"children":1527},{"style":343},[1528],{"type":37,"value":1529},"AS",{"type":32,"tag":171,"props":1531,"children":1532},{"style":184},[1533],{"type":37,"value":1534}," 
variant,\n",{"type":32,"tag":171,"props":1536,"children":1537},{"class":173,"line":199},[1538,1543,1547,1552,1557,1561],{"type":32,"tag":171,"props":1539,"children":1540},{"style":274},[1541],{"type":37,"value":1542},"  COUNT",{"type":32,"tag":171,"props":1544,"children":1545},{"style":184},[1546],{"type":37,"value":1403},{"type":32,"tag":171,"props":1548,"children":1549},{"style":343},[1550],{"type":37,"value":1551},"DISTINCT",{"type":32,"tag":171,"props":1553,"children":1554},{"style":184},[1555],{"type":37,"value":1556}," user_id) ",{"type":32,"tag":171,"props":1558,"children":1559},{"style":343},[1560],{"type":37,"value":1529},{"type":32,"tag":171,"props":1562,"children":1563},{"style":184},[1564],{"type":37,"value":1565}," users,\n",{"type":32,"tag":171,"props":1567,"children":1568},{"class":173,"line":219},[1569,1574,1579,1583],{"type":32,"tag":171,"props":1570,"children":1571},{"style":274},[1572],{"type":37,"value":1573},"  AVG",{"type":32,"tag":171,"props":1575,"children":1576},{"style":184},[1577],{"type":37,"value":1578},"(session_duration_sec) ",{"type":32,"tag":171,"props":1580,"children":1581},{"style":343},[1582],{"type":37,"value":1529},{"type":32,"tag":171,"props":1584,"children":1585},{"style":184},[1586],{"type":37,"value":1587}," avg_duration,\n",{"type":32,"tag":171,"props":1589,"children":1590},{"class":173,"line":233},[1591,1596,1601,1606,1611,1615,1620,1624,1628],{"type":32,"tag":171,"props":1592,"children":1593},{"style":274},[1594],{"type":37,"value":1595},"  SUM",{"type":32,"tag":171,"props":1597,"children":1598},{"style":184},[1599],{"type":37,"value":1600},"(conversion) ",{"type":32,"tag":171,"props":1602,"children":1603},{"style":343},[1604],{"type":37,"value":1605},"\u002F",{"type":32,"tag":171,"props":1607,"children":1608},{"style":274},[1609],{"type":37,"value":1610}," 
COUNT",{"type":32,"tag":171,"props":1612,"children":1613},{"style":184},[1614],{"type":37,"value":1403},{"type":32,"tag":171,"props":1616,"children":1617},{"style":343},[1618],{"type":37,"value":1619},"*",{"type":32,"tag":171,"props":1621,"children":1622},{"style":184},[1623],{"type":37,"value":1274},{"type":32,"tag":171,"props":1625,"children":1626},{"style":343},[1627],{"type":37,"value":1529},{"type":32,"tag":171,"props":1629,"children":1630},{"style":184},[1631],{"type":37,"value":1632}," cvr\n",{"type":32,"tag":171,"props":1634,"children":1635},{"class":173,"line":242},[1636,1641],{"type":32,"tag":171,"props":1637,"children":1638},{"style":343},[1639],{"type":37,"value":1640},"FROM",{"type":32,"tag":171,"props":1642,"children":1643},{"style":184},[1644],{"type":37,"value":1645}," events\n",{"type":32,"tag":171,"props":1647,"children":1648},{"class":173,"line":250},[1649,1654,1659,1663],{"type":32,"tag":171,"props":1650,"children":1651},{"style":343},[1652],{"type":37,"value":1653},"WHERE",{"type":32,"tag":171,"props":1655,"children":1656},{"style":184},[1657],{"type":37,"value":1658}," experiment_id ",{"type":32,"tag":171,"props":1660,"children":1661},{"style":343},[1662],{"type":37,"value":401},{"type":32,"tag":171,"props":1664,"children":1665},{"style":208},[1666],{"type":37,"value":1667}," 'blog_tone_test_2026_05'\n",{"type":32,"tag":171,"props":1669,"children":1670},{"class":173,"line":26},[1671,1676,1681,1686],{"type":32,"tag":171,"props":1672,"children":1673},{"style":343},[1674],{"type":37,"value":1675},"  AND",{"type":32,"tag":171,"props":1677,"children":1678},{"style":184},[1679],{"type":37,"value":1680}," event_date ",{"type":32,"tag":171,"props":1682,"children":1683},{"style":343},[1684],{"type":37,"value":1685},">=",{"type":32,"tag":171,"props":1687,"children":1688},{"style":208},[1689],{"type":37,"value":1690}," 
'2026-05-01'\n",{"type":32,"tag":171,"props":1692,"children":1693},{"class":173,"line":280},[1694,1699],{"type":32,"tag":171,"props":1695,"children":1696},{"style":343},[1697],{"type":37,"value":1698},"GROUP BY",{"type":32,"tag":171,"props":1700,"children":1701},{"style":274},[1702],{"type":37,"value":1703}," 1\n",{"type":32,"tag":33,"props":1705,"children":1706},{},[1707],{"type":37,"value":1708},"Result: variant v2 raised CVR from 0.042 to 0.051 (+21%) with a p-value of 0.003, so it can be promoted to production with confidence.",{"type":32,"tag":45,"props":1710,"children":1712},{"id":1711},"langsmith-observability-e-rilevamento-di-regressioni-a-lungo-termine",[1713],{"type":37,"value":1714},"LangSmith: Observability and Long-Term Regression Detection",{"type":32,"tag":33,"props":1716,"children":1717},{},[1718],{"type":37,"value":1719},"Promptfoo handles local testing; LangSmith provides observability in production. Every LLM call is traced: input, output, latency, token count, model version, prompt version.",{"type":32,"tag":33,"props":1721,"children":1722},{},[1723,1725,1730],{"type":37,"value":1724},"LangSmith's advantage is ",{"type":32,"tag":123,"props":1726,"children":1727},{},[1728],{"type":37,"value":1729},"long-term metric tracking",{"type":37,"value":1731},". 
If a bug in a prompt version from three months ago surfaces today through feedback, you can go back to the trace, compare input and output, find which version was live that day, and roll back.",{"type":32,"tag":33,"props":1733,"children":1734},{},[1735],{"type":37,"value":1736},"Example trace:",{"type":32,"tag":162,"props":1738,"children":1742},{"className":1739,"code":1740,"language":1741,"meta":16,"style":16},"language-json shiki shiki-themes github-dark","{\n  \"run_id\": \"abc123\",\n  \"prompt_version\": \"v2.1\",\n  \"model\": \"claude-3-5-sonnet-20241022\",\n  \"input\": {\"topic\": \"Server-side GTM\", \"category\": \"tech\"},\n  \"output\": \"---\\ntitle: \\\"Server-Side GTM...\\\"\",\n  \"latency_ms\": 2341,\n  \"tokens\": {\"input\": 1842, \"output\": 1523},\n  \"cost_usd\": 0.0137,\n  \"feedback\": {\"score\": 4, \"comment\": \"il titolo è troppo lungo\"}\n}\n","json",[1743],{"type":32,"tag":62,"props":1744,"children":1745},{"__ignoreMap":16},[1746,1754,1775,1796,1817,1867,1917,1938,1986,2007,2055],{"type":32,"tag":171,"props":1747,"children":1748},{"class":173,"line":174},[1749],{"type":32,"tag":171,"props":1750,"children":1751},{"style":184},[1752],{"type":37,"value":1753},"{\n",{"type":32,"tag":171,"props":1755,"children":1756},{"class":173,"line":190},[1757,1762,1766,1771],{"type":32,"tag":171,"props":1758,"children":1759},{"style":274},[1760],{"type":37,"value":1761},"  \"run_id\"",{"type":32,"tag":171,"props":1763,"children":1764},{"style":184},[1765],{"type":37,"value":539},{"type":32,"tag":171,"props":1767,"children":1768},{"style":208},[1769],{"type":37,"value":1770},"\"abc123\"",{"type":32,"tag":171,"props":1772,"children":1773},{"style":184},[1774],{"type":37,"value":216},{"type":32,"tag":171,"props":1776,"children":1777},{"class":173,"line":199},[1778,1783,1787,1792],{"type":32,"tag":171,"props":1779,"children":1780},{"style":274},[1781],{"type":37,"value":1782},"  
\"prompt_version\"",{"type":32,"tag":171,"props":1784,"children":1785},{"style":184},[1786],{"type":37,"value":539},{"type":32,"tag":171,"props":1788,"children":1789},{"style":208},[1790],{"type":37,"value":1791},"\"v2.1\"",{"type":32,"tag":171,"props":1793,"children":1794},{"style":184},[1795],{"type":37,"value":216},{"type":32,"tag":171,"props":1797,"children":1798},{"class":173,"line":219},[1799,1804,1808,1813],{"type":32,"tag":171,"props":1800,"children":1801},{"style":274},[1802],{"type":37,"value":1803},"  \"model\"",{"type":32,"tag":171,"props":1805,"children":1806},{"style":184},[1807],{"type":37,"value":539},{"type":32,"tag":171,"props":1809,"children":1810},{"style":208},[1811],{"type":37,"value":1812},"\"claude-3-5-sonnet-20241022\"",{"type":32,"tag":171,"props":1814,"children":1815},{"style":184},[1816],{"type":37,"value":216},{"type":32,"tag":171,"props":1818,"children":1819},{"class":173,"line":233},[1820,1825,1830,1835,1839,1844,1848,1853,1857,1862],{"type":32,"tag":171,"props":1821,"children":1822},{"style":274},[1823],{"type":37,"value":1824},"  \"input\"",{"type":32,"tag":171,"props":1826,"children":1827},{"style":184},[1828],{"type":37,"value":1829},": {",{"type":32,"tag":171,"props":1831,"children":1832},{"style":274},[1833],{"type":37,"value":1834},"\"topic\"",{"type":32,"tag":171,"props":1836,"children":1837},{"style":184},[1838],{"type":37,"value":539},{"type":32,"tag":171,"props":1840,"children":1841},{"style":208},[1842],{"type":37,"value":1843},"\"Server-side 
GTM\"",{"type":32,"tag":171,"props":1845,"children":1846},{"style":184},[1847],{"type":37,"value":416},{"type":32,"tag":171,"props":1849,"children":1850},{"style":274},[1851],{"type":37,"value":1852},"\"category\"",{"type":32,"tag":171,"props":1854,"children":1855},{"style":184},[1856],{"type":37,"value":539},{"type":32,"tag":171,"props":1858,"children":1859},{"style":208},[1860],{"type":37,"value":1861},"\"tech\"",{"type":32,"tag":171,"props":1863,"children":1864},{"style":184},[1865],{"type":37,"value":1866},"},\n",{"type":32,"tag":171,"props":1868,"children":1869},{"class":173,"line":242},[1870,1875,1879,1884,1889,1894,1899,1904,1908,1913],{"type":32,"tag":171,"props":1871,"children":1872},{"style":274},[1873],{"type":37,"value":1874},"  \"output\"",{"type":32,"tag":171,"props":1876,"children":1877},{"style":184},[1878],{"type":37,"value":539},{"type":32,"tag":171,"props":1880,"children":1881},{"style":208},[1882],{"type":37,"value":1883},"\"---",{"type":32,"tag":171,"props":1885,"children":1886},{"style":274},[1887],{"type":37,"value":1888},"\\n",{"type":32,"tag":171,"props":1890,"children":1891},{"style":208},[1892],{"type":37,"value":1893},"title: ",{"type":32,"tag":171,"props":1895,"children":1896},{"style":274},[1897],{"type":37,"value":1898},"\\\"",{"type":32,"tag":171,"props":1900,"children":1901},{"style":208},[1902],{"type":37,"value":1903},"Server-Side GTM...",{"type":32,"tag":171,"props":1905,"children":1906},{"style":274},[1907],{"type":37,"value":1898},{"type":32,"tag":171,"props":1909,"children":1910},{"style":208},[1911],{"type":37,"value":1912},"\"",{"type":32,"tag":171,"props":1914,"children":1915},{"style":184},[1916],{"type":37,"value":216},{"type":32,"tag":171,"props":1918,"children":1919},{"class":173,"line":250},[1920,1925,1929,1934],{"type":32,"tag":171,"props":1921,"children":1922},{"style":274},[1923],{"type":37,"value":1924},"  
\"latency_ms\"",{"type":32,"tag":171,"props":1926,"children":1927},{"style":184},[1928],{"type":37,"value":539},{"type":32,"tag":171,"props":1930,"children":1931},{"style":274},[1932],{"type":37,"value":1933},"2341",{"type":32,"tag":171,"props":1935,"children":1936},{"style":184},[1937],{"type":37,"value":216},{"type":32,"tag":171,"props":1939,"children":1940},{"class":173,"line":26},[1941,1946,1950,1955,1959,1964,1968,1973,1977,1982],{"type":32,"tag":171,"props":1942,"children":1943},{"style":274},[1944],{"type":37,"value":1945},"  \"tokens\"",{"type":32,"tag":171,"props":1947,"children":1948},{"style":184},[1949],{"type":37,"value":1829},{"type":32,"tag":171,"props":1951,"children":1952},{"style":274},[1953],{"type":37,"value":1954},"\"input\"",{"type":32,"tag":171,"props":1956,"children":1957},{"style":184},[1958],{"type":37,"value":539},{"type":32,"tag":171,"props":1960,"children":1961},{"style":274},[1962],{"type":37,"value":1963},"1842",{"type":32,"tag":171,"props":1965,"children":1966},{"style":184},[1967],{"type":37,"value":416},{"type":32,"tag":171,"props":1969,"children":1970},{"style":274},[1971],{"type":37,"value":1972},"\"output\"",{"type":32,"tag":171,"props":1974,"children":1975},{"style":184},[1976],{"type":37,"value":539},{"type":32,"tag":171,"props":1978,"children":1979},{"style":274},[1980],{"type":37,"value":1981},"1523",{"type":32,"tag":171,"props":1983,"children":1984},{"style":184},[1985],{"type":37,"value":1866},{"type":32,"tag":171,"props":1987,"children":1988},{"class":173,"line":280},[1989,1994,1998,2003],{"type":32,"tag":171,"props":1990,"children":1991},{"style":274},[1992],{"type":37,"value":1993},"  
\"cost_usd\"",{"type":32,"tag":171,"props":1995,"children":1996},{"style":184},[1997],{"type":37,"value":539},{"type":32,"tag":171,"props":1999,"children":2000},{"style":274},[2001],{"type":37,"value":2002},"0.0137",{"type":32,"tag":171,"props":2004,"children":2005},{"style":184},[2006],{"type":37,"value":216},{"type":32,"tag":171,"props":2008,"children":2009},{"class":173,"line":289},[2010,2015,2019,2023,2027,2032,2036,2041,2045,2050],{"type":32,"tag":171,"props":2011,"children":2012},{"style":274},[2013],{"type":37,"value":2014},"  \"feedback\"",{"type":32,"tag":171,"props":2016,"children":2017},{"style":184},[2018],{"type":37,"value":1829},{"type":32,"tag":171,"props":2020,"children":2021},{"style":274},[2022],{"type":37,"value":534},{"type":32,"tag":171,"props":2024,"children":2025},{"style":184},[2026],{"type":37,"value":539},{"type":32,"tag":171,"props":2028,"children":2029},{"style":274},[2030],{"type":37,"value":2031},"4",{"type":32,"tag":171,"props":2033,"children":2034},{"style":184},[2035],{"type":37,"value":416},{"type":32,"tag":171,"props":2037,"children":2038},{"style":274},[2039],{"type":37,"value":2040},"\"comment\"",{"type":32,"tag":171,"props":2042,"children":2043},{"style":184},[2044],{"type":37,"value":539},{"type":32,"tag":171,"props":2046,"children":2047},{"style":208},[2048],{"type":37,"value":2049},"\"il titolo è troppo lungo\"",{"type":32,"tag":171,"props":2051,"children":2052},{"style":184},[2053],{"type":37,"value":2054},"}\n",{"type":32,"tag":171,"props":2056,"children":2057},{"class":173,"line":618},[2058],{"type":32,"tag":171,"props":2059,"children":2060},{"style":184},[2061],{"type":37,"value":2054},{"type":32,"tag":33,"props":2063,"children":2064},{},[2065],{"type":37,"value":2066},"Feedback loop: editors score every blog post from 1 to 5, LangSmith attaches those scores to the trace, and the weekly report warns that \"version v2.3 lowered the average score to 3.2\". 
Immediate rollback → prompt diff → identify the problem → fix.",{"type":32,"tag":688,"props":2068,"children":2070},{"id":2069},"gestione-dei-dataset-tenere-il-golden-set-sotto-controllo-di-versione",[2071],{"type":37,"value":2072},"Dataset Management: Keeping the Golden Set Under Version Control",{"type":32,"tag":33,"props":2074,"children":2075},{},[2076,2078,2083],{"type":37,"value":2077},"The heart of the evaluation pipeline is the ",{"type":32,"tag":123,"props":2079,"children":2080},{},[2081],{"type":37,"value":2082},"golden dataset",{"type":37,"value":2084},": known input\u002Foutput pairs that serve as the reference for expected behavior. Keeping this dataset in Notion or updating it by hand in Google Sheets is a regression risk.",{"type":32,"tag":33,"props":2086,"children":2087},{},[2088],{"type":37,"value":2089},"A LangSmith dataset under version control:",{"type":32,"tag":162,"props":2091,"children":2093},{"className":331,"code":2092,"language":333,"meta":16,"style":16},"from langsmith import Client\n\nclient = Client()\n\ndataset = client.create_dataset(\"marketing_blog_golden_v3\")\n\n# Aggiungi gli esempi golden\nexamples = [\n    {\n        \"inputs\": {\"topic\": \"Server-side GTM\", \"category\": \"tech\"},\n        \"outputs\": {\"title\": \"Server-Side GTM: Misurazione Dopo i Cookie\"},\n        \"metadata\": {\"expected_h2_count\": 5, \"expected_word_count\": 1500}\n    },\n    # 50+ esempi...\n]\n\nfor ex in examples:\n    client.create_example(**ex, 
dataset_id=dataset.id)\n",[2094],{"type":32,"tag":62,"props":2095,"children":2096},{"__ignoreMap":16},[2097,2117,2124,2141,2148,2174,2181,2189,2206,2214,2258,2288,2336,2344,2352,2359,2366,2387],{"type":32,"tag":171,"props":2098,"children":2099},{"class":173,"line":174},[2100,2104,2108,2112],{"type":32,"tag":171,"props":2101,"children":2102},{"style":343},[2103],{"type":37,"value":346},{"type":32,"tag":171,"props":2105,"children":2106},{"style":184},[2107],{"type":37,"value":351},{"type":32,"tag":171,"props":2109,"children":2110},{"style":343},[2111],{"type":37,"value":356},{"type":32,"tag":171,"props":2113,"children":2114},{"style":184},[2115],{"type":37,"value":2116}," Client\n",{"type":32,"tag":171,"props":2118,"children":2119},{"class":173,"line":190},[2120],{"type":32,"tag":171,"props":2121,"children":2122},{"emptyLinePlaceholder":367},[2123],{"type":37,"value":370},{"type":32,"tag":171,"props":2125,"children":2126},{"class":173,"line":199},[2127,2132,2136],{"type":32,"tag":171,"props":2128,"children":2129},{"style":184},[2130],{"type":37,"value":2131},"client ",{"type":32,"tag":171,"props":2133,"children":2134},{"style":343},[2135],{"type":37,"value":401},{"type":32,"tag":171,"props":2137,"children":2138},{"style":184},[2139],{"type":37,"value":2140}," Client()\n",{"type":32,"tag":171,"props":2142,"children":2143},{"class":173,"line":219},[2144],{"type":32,"tag":171,"props":2145,"children":2146},{"emptyLinePlaceholder":367},[2147],{"type":37,"value":370},{"type":32,"tag":171,"props":2149,"children":2150},{"class":173,"line":233},[2151,2156,2160,2165,2170],{"type":32,"tag":171,"props":2152,"children":2153},{"style":184},[2154],{"type":37,"value":2155},"dataset ",{"type":32,"tag":171,"props":2157,"children":2158},{"style":343},[2159],{"type":37,"value":401},{"type":32,"tag":171,"props":2161,"children":2162},{"style":184},[2163],{"type":37,"value":2164}," 
client.create_dataset(",{"type":32,"tag":171,"props":2166,"children":2167},{"style":208},[2168],{"type":37,"value":2169},"\"marketing_blog_golden_v3\"",{"type":32,"tag":171,"props":2171,"children":2172},{"style":184},[2173],{"type":37,"value":642},{"type":32,"tag":171,"props":2175,"children":2176},{"class":173,"line":242},[2177],{"type":32,"tag":171,"props":2178,"children":2179},{"emptyLinePlaceholder":367},[2180],{"type":37,"value":370},{"type":32,"tag":171,"props":2182,"children":2183},{"class":173,"line":250},[2184],{"type":32,"tag":171,"props":2185,"children":2186},{"style":1202},[2187],{"type":37,"value":2188},"# Aggiungi gli esempi golden\n",{"type":32,"tag":171,"props":2190,"children":2191},{"class":173,"line":26},[2192,2197,2201],{"type":32,"tag":171,"props":2193,"children":2194},{"style":184},[2195],{"type":37,"value":2196},"examples ",{"type":32,"tag":171,"props":2198,"children":2199},{"style":343},[2200],{"type":37,"value":401},{"type":32,"tag":171,"props":2202,"children":2203},{"style":184},[2204],{"type":37,"value":2205}," [\n",{"type":32,"tag":171,"props":2207,"children":2208},{"class":173,"line":280},[2209],{"type":32,"tag":171,"props":2210,"children":2211},{"style":184},[2212],{"type":37,"value":2213},"    {\n",{"type":32,"tag":171,"props":2215,"children":2216},{"class":173,"line":289},[2217,2222,2226,2230,2234,2238,2242,2246,2250,2254],{"type":32,"tag":171,"props":2218,"children":2219},{"style":208},[2220],{"type":37,"value":2221},"        
\"inputs\"",{"type":32,"tag":171,"props":2223,"children":2224},{"style":184},[2225],{"type":37,"value":1829},{"type":32,"tag":171,"props":2227,"children":2228},{"style":208},[2229],{"type":37,"value":1834},{"type":32,"tag":171,"props":2231,"children":2232},{"style":184},[2233],{"type":37,"value":539},{"type":32,"tag":171,"props":2235,"children":2236},{"style":208},[2237],{"type":37,"value":1843},{"type":32,"tag":171,"props":2239,"children":2240},{"style":184},[2241],{"type":37,"value":416},{"type":32,"tag":171,"props":2243,"children":2244},{"style":208},[2245],{"type":37,"value":1852},{"type":32,"tag":171,"props":2247,"children":2248},{"style":184},[2249],{"type":37,"value":539},{"type":32,"tag":171,"props":2251,"children":2252},{"style":208},[2253],{"type":37,"value":1861},{"type":32,"tag":171,"props":2255,"children":2256},{"style":184},[2257],{"type":37,"value":1866},{"type":32,"tag":171,"props":2259,"children":2260},{"class":173,"line":618},[2261,2266,2270,2275,2279,2284],{"type":32,"tag":171,"props":2262,"children":2263},{"style":208},[2264],{"type":37,"value":2265},"        \"outputs\"",{"type":32,"tag":171,"props":2267,"children":2268},{"style":184},[2269],{"type":37,"value":1829},{"type":32,"tag":171,"props":2271,"children":2272},{"style":208},[2273],{"type":37,"value":2274},"\"title\"",{"type":32,"tag":171,"props":2276,"children":2277},{"style":184},[2278],{"type":37,"value":539},{"type":32,"tag":171,"props":2280,"children":2281},{"style":208},[2282],{"type":37,"value":2283},"\"Server-Side GTM: Misurazione Dopo i Cookie\"",{"type":32,"tag":171,"props":2285,"children":2286},{"style":184},[2287],{"type":37,"value":1866},{"type":32,"tag":171,"props":2289,"children":2290},{"class":173,"line":636},[2291,2296,2300,2305,2309,2314,2318,2323,2327,2332],{"type":32,"tag":171,"props":2292,"children":2293},{"style":208},[2294],{"type":37,"value":2295},"        
\"metadata\"",{"type":32,"tag":171,"props":2297,"children":2298},{"style":184},[2299],{"type":37,"value":1829},{"type":32,"tag":171,"props":2301,"children":2302},{"style":208},[2303],{"type":37,"value":2304},"\"expected_h2_count\"",{"type":32,"tag":171,"props":2306,"children":2307},{"style":184},[2308],{"type":37,"value":539},{"type":32,"tag":171,"props":2310,"children":2311},{"style":274},[2312],{"type":37,"value":2313},"5",{"type":32,"tag":171,"props":2315,"children":2316},{"style":184},[2317],{"type":37,"value":416},{"type":32,"tag":171,"props":2319,"children":2320},{"style":208},[2321],{"type":37,"value":2322},"\"expected_word_count\"",{"type":32,"tag":171,"props":2324,"children":2325},{"style":184},[2326],{"type":37,"value":539},{"type":32,"tag":171,"props":2328,"children":2329},{"style":274},[2330],{"type":37,"value":2331},"1500",{"type":32,"tag":171,"props":2333,"children":2334},{"style":184},[2335],{"type":37,"value":2054},{"type":32,"tag":171,"props":2337,"children":2338},{"class":173,"line":866},[2339],{"type":32,"tag":171,"props":2340,"children":2341},{"style":184},[2342],{"type":37,"value":2343},"    },\n",{"type":32,"tag":171,"props":2345,"children":2346},{"class":173,"line":889},[2347],{"type":32,"tag":171,"props":2348,"children":2349},{"style":1202},[2350],{"type":37,"value":2351},"    # 50+ 
esempi...\n",{"type":32,"tag":171,"props":2353,"children":2354},{"class":173,"line":910},[2355],{"type":32,"tag":171,"props":2356,"children":2357},{"style":184},[2358],{"type":37,"value":295},{"type":32,"tag":171,"props":2360,"children":2361},{"class":173,"line":928},[2362],{"type":32,"tag":171,"props":2363,"children":2364},{"emptyLinePlaceholder":367},[2365],{"type":37,"value":370},{"type":32,"tag":171,"props":2367,"children":2368},{"class":173,"line":949},[2369,2373,2378,2382],{"type":32,"tag":171,"props":2370,"children":2371},{"style":343},[2372],{"type":37,"value":483},{"type":32,"tag":171,"props":2374,"children":2375},{"style":184},[2376],{"type":37,"value":2377}," ex ",{"type":32,"tag":171,"props":2379,"children":2380},{"style":343},[2381],{"type":37,"value":493},{"type":32,"tag":171,"props":2383,"children":2384},{"style":184},[2385],{"type":37,"value":2386}," examples:\n",{"type":32,"tag":171,"props":2388,"children":2389},{"class":173,"line":966},[2390,2395,2400,2405,2410,2414],{"type":32,"tag":171,"props":2391,"children":2392},{"style":184},[2393],{"type":37,"value":2394},"    client.create_example(",{"type":32,"tag":171,"props":2396,"children":2397},{"style":343},[2398],{"type":37,"value":2399},"**",{"type":32,"tag":171,"props":2401,"children":2402},{"style":184},[2403],{"type":37,"value":2404},"ex, ",{"type":32,"tag":171,"props":2406,"children":2407},{"style":599},[2408],{"type":37,"value":2409},"dataset_id",{"type":32,"tag":171,"props":2411,"children":2412},{"style":343},[2413],{"type":37,"value":401},{"type":32,"tag":171,"props":2415,"children":2416},{"style":184},[2417],{"type":37,"value":2418},"dataset.id)\n",{"type":32,"tag":33,"props":2420,"children":2421},{},[2422],{"type":37,"value":2423},"Test ogni modifica del prompt contro questo dataset. Se il pass rate scende, non fare deploy. 
Add new edge cases to the dataset (the bugs you find in production) to prevent regressions.",{"type":32,"tag":45,"props":2425,"children":2427},{"id":2426},"tradeoff-metriche-deterministiche-vs-output-creativo",[2428],{"type":37,"value":2429},"Tradeoff: Deterministic Metrics vs Creative Output",{"type":32,"tag":33,"props":2431,"children":2432},{},[2433],{"type":37,"value":2434},"An LLM's strength is its non-determinism: the same input produces different outputs. In a production system, though, that strength is a risk: the customer sees different Markdown every time they reload the page, and some of it is wrong.",{"type":32,"tag":33,"props":2436,"children":2437},{},[2438],{"type":37,"value":2439},"Temperature 0 increases determinism, but the output becomes monotonous. The tradeoff:",{"type":32,"tag":84,"props":2441,"children":2442},{},[2443,2453,2463],{"type":32,"tag":88,"props":2444,"children":2445},{},[2446,2451],{"type":32,"tag":123,"props":2447,"children":2448},{},[2449],{"type":37,"value":2450},"Temperature 0",{"type":37,"value":2452},": ideal for evaluation suites, monotonous in production",{"type":32,"tag":88,"props":2454,"children":2455},{},[2456,2461],{"type":32,"tag":123,"props":2457,"children":2458},{},[2459],{"type":37,"value":2460},"Temperature 0.3-0.5",{"type":37,"value":2462},": reasonable variety, still consistent",{"type":32,"tag":88,"props":2464,"children":2465},{},[2466,2471],{"type":32,"tag":123,"props":2467,"children":2468},{},[2469],{"type":37,"value":2470},"Temperature 0.7+",{"type":37,"value":2472},": creative, but expect surprises in production even when the evaluation is green",{"type":32,"tag":33,"props":2474,"children":2475},{},[2476],{"type":37,"value":2477},"Solution: use temperature 0 in evaluation and 0.4 in production, and keep 5 different acceptable outputs per input in the golden set (a range check).",{"type":32,"tag":33,"props":2479,"children":2480},{},[2481,2483,2488],{"type":37,"value":2482},"Another tradeoff: 
",{"type":32,"tag":123,"props":2484,"children":2485},{},[2486],{"type":37,"value":2487},"latency vs quality",{"type":37,"value":2489},". A longer prompt gives better output, but the input token cost increases and latency goes up. In Promptfoo, raise an alert if the latency metric exceeds 2.5s, so the user experience does not suffer.",{"type":32,"tag":45,"props":2491,"children":2493},{"id":2492},"checklist-di-production-prima-di-deployare-il-sistema-llm",[2494],{"type":37,"value":2495},"Production Checklist: Before Deploying the LLM System",{"type":32,"tag":33,"props":2497,"children":2498},{},[2499],{"type":37,"value":2500},"Checks to run before deploying:",{"type":32,"tag":84,"props":2502,"children":2505},{"className":2503},[2504],"contains-task-list",[2506,2518,2527,2536,2545,2554,2563,2572,2581],{"type":32,"tag":88,"props":2507,"children":2510},{"className":2508},[2509],"task-list-item",[2511,2516],{"type":32,"tag":2512,"props":2513,"children":2515},"input",{"disabled":367,"type":2514},"checkbox",[],{"type":37,"value":2517}," The prompt lives in a git repo and the commit history is clean",{"type":32,"tag":88,"props":2519,"children":2521},{"className":2520},[2509],[2522,2525],{"type":32,"tag":2512,"props":2523,"children":2524},{"disabled":367,"type":2514},[],{"type":37,"value":2526}," The Promptfoo evaluation suite has a pass rate > 95%",{"type":32,"tag":88,"props":2528,"children":2530},{"className":2529},[2509],[2531,2534],{"type":32,"tag":2512,"props":2532,"children":2533},{"disabled":367,"type":2514},[],{"type":37,"value":2535}," The golden dataset has at least 50 examples",{"type":32,"tag":88,"props":2537,"children":2539},{"className":2538},[2509],[2540,2543],{"type":32,"tag":2512,"props":2541,"children":2542},{"disabled":367,"type":2514},[],{"type":37,"value":2544}," The A\u002FB test plan is ready and the sample size is 
calculated",{"type":32,"tag":88,"props":2546,"children":2548},{"className":2547},[2509],[2549,2552],{"type":32,"tag":2512,"props":2550,"children":2551},{"disabled":367,"type":2514},[],{"type":37,"value":2553}," LangSmith tracing is active and the API key is set in production",{"type":32,"tag":88,"props":2555,"children":2557},{"className":2556},[2509],[2558,2561],{"type":32,"tag":2512,"props":2559,"children":2560},{"disabled":367,"type":2514},[],{"type":37,"value":2562}," The feedback loop is implemented (editor scoring, BigQuery join)",{"type":32,"tag":88,"props":2564,"children":2566},{"className":2565},[2509],[2567,2570],{"type":32,"tag":2512,"props":2568,"children":2569},{"disabled":367,"type":2514},[],{"type":37,"value":2571}," The rollback procedure is defined (which metric drop triggers an automatic rollback)",{"type":32,"tag":88,"props":2573,"children":2575},{"className":2574},[2509],[2576,2579],{"type":32,"tag":2512,"props":2577,"children":2578},{"disabled":367,"type":2514},[],{"type":37,"value":2580}," Cost monitoring is in place: daily token spend threshold $X",{"type":32,"tag":88,"props":2582,"children":2584},{"className":2583},[2509],[2585,2588],{"type":32,"tag":2512,"props":2586,"children":2587},{"disabled":367,"type":2514},[],{"type":37,"value":2589}," Latency SLA: p95 \u003C 3s",{"type":32,"tag":33,"props":2591,"children":2592},{},[2593],{"type":37,"value":2594},"If you have not completed this list, you are not yet providing an \"AI service\"; it is still too early. Without versioning, evaluation, and observability, LLM operations in production are not an engineering discipline, they are controlled chaos.",{"type":32,"tag":2596,"props":2597,"children":2598},"hr",{},[],{"type":32,"tag":33,"props":2600,"children":2601},{},[2602,2604,2611],{"type":37,"value":2603},"Prompt versioning is a matter of discipline: not for speed, but for reliability. 
In tactics such as ",{"type":32,"tag":677,"props":2605,"children":2608},{"href":2606,"rel":2607},"https:\u002F\u002Fwww.roibase.com.tr\u002Ffr\u002Fgeo",[681],[2609],{"type":37,"value":2610},"Generative Engine Optimization",{"type":37,"value":2612},", output quality is directly tied to the business outcome. Without an evaluation pipeline, every deployment puts previous performance at risk. Promptfoo provides local safety, LangSmith provides visibility in production. Together, they bring LLM operations up to software engineering standards.",{"type":32,"tag":2614,"props":2615,"children":2616},"style",{},[2617],{"type":37,"value":2618},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}",{"title":16,"searchDepth":199,"depth":199,"links":2620},[2621,2622,2625,2626,2629,2630],{"id":47,"depth":190,"text":50},{"id":110,"depth":190,"text":113,"children":2623},[2624],{"id":690,"depth":199,"text":693},{"id":1124,"depth":190,"text":1127},{"id":1711,"depth":190,"text":1714,"children":2627},[2628],{"id":2069,"depth":199,"text":2072},{"id":2426,"depth":190,"text":2429},{"id":2492,"depth":190,"text":2495},"markdown","content:fr:ai:versionamento-prompt-ab-test.md","content","fr\u002Fai\u002Fversionamento-prompt-ab-test.md","fr\u002Fai\u002Fversionamento-prompt-ab-test","md",1778709809230]