[{"data":1,"prerenderedAt":2637},["ShallowReactive",2],{"article-alternates":3,"article-\u002Fit\u002Fai\u002Fversionamento-prompt-e-a-b-test-disciplina-llm-ops":13},{"i18nKey":4,"paths":5},"ai-004-2026-05",{"de":6,"en":7,"es":8,"fr":9,"it":10,"ru":11,"tr":12},"\u002Fde\u002Fai\u002Fprompt-versionierung-llm-evaluation","\u002Fen\u002Fai\u002Fllm-ops-prompt-versioning-ab-testing","\u002Fes\u002Fai\u002Fversionado-prompts-ab-testing-llm-ops","\u002Ffr\u002Fai\u002Fversionamento-prompt-ab-test","\u002Fit\u002Fai\u002Fversionamento-prompt-e-a-b-test-disciplina-llm-ops","\u002Fru\u002Fai\u002Fprompt-versionierung-und-ab-tests-llm-ops-disziplin","\u002Ftr\u002Fai\u002Fprompt-versiyonlama-ve-a-b-testi-llm-operasyonun-disiplini",{"_path":10,"_dir":14,"_draft":15,"_partial":15,"_locale":16,"title":17,"description":18,"publishedAt":19,"modifiedAt":19,"category":14,"i18nKey":4,"tags":20,"readingTime":26,"author":27,"body":28,"_type":2631,"_id":2632,"_source":2633,"_file":2634,"_stem":2635,"_extension":2636},"ai",false,"","Versionamento dei Prompt e A\u002FB Test: La Disciplina delle Operazioni LLM","Come costruire versioning dei prompt, pipeline di evaluation e controllo qualità deterministico con Promptfoo e LangSmith nei sistemi LLM di produzione.","2026-05-13",[21,22,23,24,25],"llm-ops","prompt-engineering","evaluation","mlops","ai-quality",9,"Roibase",{"type":29,"children":30,"toc":2619},"root",[31,39,44,51,56,78,83,103,108,114,119,130,148,161,296,306,324,329,643,653,671,687,694,699,704,1020,1033,1117,1122,1128,1133,1185,1190,1480,1485,1704,1709,1715,1720,1732,1737,2062,2067,2073,2085,2090,2419,2424,2430,2435,2440,2473,2478,2490,2496,2501,2590,2595,2599,2613],{"type":32,"tag":33,"props":34,"children":35},"element","p",{},[36],{"type":37,"value":38},"text","Nei sistemi che utilizzano LLM, tra \"funziona\" e \"affidabile in produzione\" ci sono 15 passi. 
L'automazione del marketing produce output in markdown con Claude API, la segmentazione del customer journey avviene con GPT — ma quando modifichi il prompt, come sei sicuro di non aver creato una regressione? Nell'ingegneria software il versioning, il test coverage, la CI\u002FCD sono standard; nelle operazioni LLM senza questa disciplina ogni deployment è una scommessa.",{"type":32,"tag":33,"props":40,"children":41},{},[42],{"type":37,"value":43},"Strumenti come Promptfoo e LangSmith forniscono questa disciplina: versionamento dei prompt, evaluation deterministici, A\u002FB test, tracking delle metriche. Questo articolo mostra come costruire il controllo qualità nei sistemi LLM di produzione — a livello di infrastruttura, non di codice sorgente.",{"type":32,"tag":45,"props":46,"children":48},"h2",{"id":47},"lillusione-che-il-prompt-non-sia-codice-software",[49],{"type":37,"value":50},"L'Illusione che il Prompt non sia Codice Software",{"type":32,"tag":33,"props":52,"children":53},{},[54],{"type":37,"value":55},"La maggior parte dei team vede il prompt come un \"file di configurazione\" — editor nell'UI, documentazione in Notion, testo hardcodato in un nodo di workflow n8n. In realtà, il prompt è una specifica eseguibile che definisce il comportamento del sistema. Eppure non c'è versionamento, non ci sono diff, non c'è rollback.",{"type":32,"tag":33,"props":57,"children":58},{},[59,61,68,70,76],{"type":37,"value":60},"Un commit su Git con messaggio \"fix typo\" può cambiare il tono dell'output del modello e far crollare le metriche. Soprattutto negli scenari di structured output (JSON schema, frontmatter markdown, query SQL), una sola parola fuori posto può rompere il formato e causare errori a cascata. 
Esempio: scrivere ",{"type":32,"tag":62,"props":63,"children":65},"code",{"className":64},[],[66],{"type":37,"value":67},"OUTPUT FORMAT: JSON",{"type":37,"value":69}," al posto di ",{"type":32,"tag":62,"props":71,"children":73},{"className":72},[],[74],{"type":37,"value":75},"OUTPUT FORMAT: Valid JSON",{"type":37,"value":77}," fa sì che il modello a volte aggiunga paragrafi di spiegazione — il parser downstream va in crash, gli alert si moltiplicano, tre ore di debug.",{"type":32,"tag":33,"props":79,"children":80},{},[81],{"type":37,"value":82},"La disciplina del versioning deve rispondere a queste domande:",{"type":32,"tag":84,"props":85,"children":86},"ul",{},[87,93,98],{"type":32,"tag":88,"props":89,"children":90},"li",{},[91],{"type":37,"value":92},"Quale versione del prompt è in produzione adesso?",{"type":32,"tag":88,"props":94,"children":95},{},[96],{"type":37,"value":97},"Qual è la differenza di performance tra la versione di due settimane fa e quella attuale?",{"type":32,"tag":88,"props":99,"children":100},{},[101],{"type":37,"value":102},"Quale variante dell'A\u002FB test ha aumentato la conversione dell'8%?",{"type":32,"tag":33,"props":104,"children":105},{},[106],{"type":37,"value":107},"Se non puoi rispondere a queste domande, non stai facendo \"operazioni AI\", stai conducendo esperimenti manuali.",{"type":32,"tag":45,"props":109,"children":111},{"id":110},"pipeline-di-evaluation-tre-livelli-per-misurare-loutput",[112],{"type":37,"value":113},"Pipeline di Evaluation: Tre Livelli per Misurare l'Output",{"type":32,"tag":33,"props":115,"children":116},{},[117],{"type":37,"value":118},"La valutazione dell'output dell'LLM sembra soggettiva, ma nei sistemi di produzione è possibile costruire metriche deterministiche. 
La valutazione funziona su tre livelli: sintassi, semantica, outcome di business.",{"type":32,"tag":33,"props":120,"children":121},{},[122,128],{"type":32,"tag":123,"props":124,"children":125},"strong",{},[126],{"type":37,"value":127},"Il livello di sintassi",{"type":37,"value":129}," — conformità del formato:",{"type":32,"tag":84,"props":131,"children":132},{},[133,138,143],{"type":32,"tag":88,"props":134,"children":135},{},[136],{"type":37,"value":137},"Il JSON viene parsato correttamente?",{"type":32,"tag":88,"props":139,"children":140},{},[141],{"type":37,"value":142},"Il frontmatter markdown è valido?",{"type":32,"tag":88,"props":144,"children":145},{},[146],{"type":37,"value":147},"Sono presenti tutti i field attesi?",{"type":32,"tag":33,"props":149,"children":150},{},[151,153,159],{"type":37,"value":152},"Con Promptfoo si controlla tramite asserzioni ",{"type":32,"tag":62,"props":154,"children":156},{"className":155},[],[157],{"type":37,"value":158},"javascript",{"type":37,"value":160},":",{"type":32,"tag":162,"props":163,"children":166},"pre",{"className":164,"code":165,"language":158,"meta":16,"style":16},"language-javascript shiki shiki-themes github-dark","assert: [\n  {\n    type: \"javascript\",\n    value: \"JSON.parse(output).title.length \u003C= 60\"\n  },\n  {\n    type: \"is-json\",\n    value: true\n  }\n]\n",[167],{"type":32,"tag":62,"props":168,"children":169},{"__ignoreMap":16},[170,188,197,217,231,240,248,265,279,287],{"type":32,"tag":171,"props":172,"children":175},"span",{"class":173,"line":174},"line",1,[176,182],{"type":32,"tag":171,"props":177,"children":179},{"style":178},"--shiki-default:#B392F0",[180],{"type":37,"value":181},"assert",{"type":32,"tag":171,"props":183,"children":185},{"style":184},"--shiki-default:#E1E4E8",[186],{"type":37,"value":187},": [\n",{"type":32,"tag":171,"props":189,"children":191},{"class":173,"line":190},2,[192],{"type":32,"tag":171,"props":193,"children":194},{"style":184},[195],{"type":37,"value":196},"  
{\n",{"type":32,"tag":171,"props":198,"children":200},{"class":173,"line":199},3,[201,206,212],{"type":32,"tag":171,"props":202,"children":203},{"style":184},[204],{"type":37,"value":205},"    type: ",{"type":32,"tag":171,"props":207,"children":209},{"style":208},"--shiki-default:#9ECBFF",[210],{"type":37,"value":211},"\"javascript\"",{"type":32,"tag":171,"props":213,"children":214},{"style":184},[215],{"type":37,"value":216},",\n",{"type":32,"tag":171,"props":218,"children":220},{"class":173,"line":219},4,[221,226],{"type":32,"tag":171,"props":222,"children":223},{"style":184},[224],{"type":37,"value":225},"    value: ",{"type":32,"tag":171,"props":227,"children":228},{"style":208},[229],{"type":37,"value":230},"\"JSON.parse(output).title.length \u003C= 60\"\n",{"type":32,"tag":171,"props":232,"children":234},{"class":173,"line":233},5,[235],{"type":32,"tag":171,"props":236,"children":237},{"style":184},[238],{"type":37,"value":239},"  },\n",{"type":32,"tag":171,"props":241,"children":243},{"class":173,"line":242},6,[244],{"type":32,"tag":171,"props":245,"children":246},{"style":184},[247],{"type":37,"value":196},{"type":32,"tag":171,"props":249,"children":251},{"class":173,"line":250},7,[252,256,261],{"type":32,"tag":171,"props":253,"children":254},{"style":184},[255],{"type":37,"value":205},{"type":32,"tag":171,"props":257,"children":258},{"style":208},[259],{"type":37,"value":260},"\"is-json\"",{"type":32,"tag":171,"props":262,"children":263},{"style":184},[264],{"type":37,"value":216},{"type":32,"tag":171,"props":266,"children":268},{"class":173,"line":267},8,[269,273],{"type":32,"tag":171,"props":270,"children":271},{"style":184},[272],{"type":37,"value":225},{"type":32,"tag":171,"props":274,"children":276},{"style":275},"--shiki-default:#79B8FF",[277],{"type":37,"value":278},"true\n",{"type":32,"tag":171,"props":280,"children":281},{"class":173,"line":26},[282],{"type":32,"tag":171,"props":283,"children":284},{"style":184},[285],{"type":37,"value":286},"  
}\n",{"type":32,"tag":171,"props":288,"children":290},{"class":173,"line":289},10,[291],{"type":32,"tag":171,"props":292,"children":293},{"style":184},[294],{"type":37,"value":295},"]\n",{"type":32,"tag":33,"props":297,"children":298},{},[299,304],{"type":32,"tag":123,"props":300,"children":301},{},[302],{"type":37,"value":303},"Il livello di semantica",{"type":37,"value":305},", ovvero la qualità del contenuto:",{"type":32,"tag":84,"props":307,"children":308},{},[309,314,319],{"type":32,"tag":88,"props":310,"children":311},{},[312],{"type":37,"value":313},"La risposta è pertinente al topic? (somiglianza di embedding, cosine similarity > 0.85)",{"type":32,"tag":88,"props":315,"children":316},{},[317],{"type":37,"value":318},"Contiene parole vietate? (regex, token filtering)",{"type":32,"tag":88,"props":320,"children":321},{},[322],{"type":37,"value":323},"Il tono è corretto? (modello classifier, sentiment score)",{"type":32,"tag":33,"props":325,"children":326},{},[327],{"type":37,"value":328},"Con LangSmith, un evaluator personalizzato:",{"type":32,"tag":162,"props":330,"children":334},{"className":331,"code":332,"language":333,"meta":16,"style":16},"language-python shiki shiki-themes github-dark","from langsmith import evaluate\n\ndef check_brand_compliance(run, example):\n    forbidden = [\"esperto\", \"leader\", \"rivoluzionario\"]\n    output = run.outputs[\"text\"].lower()\n    violations = [w for w in forbidden if w in output]\n    return {\"score\": 0 if violations else 1, \"violations\": violations}\n\nevaluate(\n    dataset_name=\"marketing_blog_posts\",\n    
evaluators=[check_brand_compliance]\n)\n","python",[335],{"type":32,"tag":62,"props":336,"children":337},{"__ignoreMap":16},[338,362,371,389,435,462,517,579,586,594,616,634],{"type":32,"tag":171,"props":339,"children":340},{"class":173,"line":174},[341,347,352,357],{"type":32,"tag":171,"props":342,"children":344},{"style":343},"--shiki-default:#F97583",[345],{"type":37,"value":346},"from",{"type":32,"tag":171,"props":348,"children":349},{"style":184},[350],{"type":37,"value":351}," langsmith ",{"type":32,"tag":171,"props":353,"children":354},{"style":343},[355],{"type":37,"value":356},"import",{"type":32,"tag":171,"props":358,"children":359},{"style":184},[360],{"type":37,"value":361}," evaluate\n",{"type":32,"tag":171,"props":363,"children":364},{"class":173,"line":190},[365],{"type":32,"tag":171,"props":366,"children":368},{"emptyLinePlaceholder":367},true,[369],{"type":37,"value":370},"\n",{"type":32,"tag":171,"props":372,"children":373},{"class":173,"line":199},[374,379,384],{"type":32,"tag":171,"props":375,"children":376},{"style":343},[377],{"type":37,"value":378},"def",{"type":32,"tag":171,"props":380,"children":381},{"style":178},[382],{"type":37,"value":383}," check_brand_compliance",{"type":32,"tag":171,"props":385,"children":386},{"style":184},[387],{"type":37,"value":388},"(run, example):\n",{"type":32,"tag":171,"props":390,"children":391},{"class":173,"line":219},[392,397,402,407,412,417,422,426,431],{"type":32,"tag":171,"props":393,"children":394},{"style":184},[395],{"type":37,"value":396},"    forbidden ",{"type":32,"tag":171,"props":398,"children":399},{"style":343},[400],{"type":37,"value":401},"=",{"type":32,"tag":171,"props":403,"children":404},{"style":184},[405],{"type":37,"value":406}," [",{"type":32,"tag":171,"props":408,"children":409},{"style":208},[410],{"type":37,"value":411},"\"esperto\"",{"type":32,"tag":171,"props":413,"children":414},{"style":184},[415],{"type":37,"value":416},", 
",{"type":32,"tag":171,"props":418,"children":419},{"style":208},[420],{"type":37,"value":421},"\"leader\"",{"type":32,"tag":171,"props":423,"children":424},{"style":184},[425],{"type":37,"value":416},{"type":32,"tag":171,"props":427,"children":428},{"style":208},[429],{"type":37,"value":430},"\"rivoluzionario\"",{"type":32,"tag":171,"props":432,"children":433},{"style":184},[434],{"type":37,"value":295},{"type":32,"tag":171,"props":436,"children":437},{"class":173,"line":233},[438,443,447,452,457],{"type":32,"tag":171,"props":439,"children":440},{"style":184},[441],{"type":37,"value":442},"    output ",{"type":32,"tag":171,"props":444,"children":445},{"style":343},[446],{"type":37,"value":401},{"type":32,"tag":171,"props":448,"children":449},{"style":184},[450],{"type":37,"value":451}," run.outputs[",{"type":32,"tag":171,"props":453,"children":454},{"style":208},[455],{"type":37,"value":456},"\"text\"",{"type":32,"tag":171,"props":458,"children":459},{"style":184},[460],{"type":37,"value":461},"].lower()\n",{"type":32,"tag":171,"props":463,"children":464},{"class":173,"line":242},[465,470,474,479,484,489,494,499,504,508,512],{"type":32,"tag":171,"props":466,"children":467},{"style":184},[468],{"type":37,"value":469},"    violations ",{"type":32,"tag":171,"props":471,"children":472},{"style":343},[473],{"type":37,"value":401},{"type":32,"tag":171,"props":475,"children":476},{"style":184},[477],{"type":37,"value":478}," [w ",{"type":32,"tag":171,"props":480,"children":481},{"style":343},[482],{"type":37,"value":483},"for",{"type":32,"tag":171,"props":485,"children":486},{"style":184},[487],{"type":37,"value":488}," w ",{"type":32,"tag":171,"props":490,"children":491},{"style":343},[492],{"type":37,"value":493},"in",{"type":32,"tag":171,"props":495,"children":496},{"style":184},[497],{"type":37,"value":498}," forbidden 
",{"type":32,"tag":171,"props":500,"children":501},{"style":343},[502],{"type":37,"value":503},"if",{"type":32,"tag":171,"props":505,"children":506},{"style":184},[507],{"type":37,"value":488},{"type":32,"tag":171,"props":509,"children":510},{"style":343},[511],{"type":37,"value":493},{"type":32,"tag":171,"props":513,"children":514},{"style":184},[515],{"type":37,"value":516}," output]\n",{"type":32,"tag":171,"props":518,"children":519},{"class":173,"line":250},[520,525,530,535,540,545,550,555,560,565,569,574],{"type":32,"tag":171,"props":521,"children":522},{"style":343},[523],{"type":37,"value":524},"    return",{"type":32,"tag":171,"props":526,"children":527},{"style":184},[528],{"type":37,"value":529}," {",{"type":32,"tag":171,"props":531,"children":532},{"style":208},[533],{"type":37,"value":534},"\"score\"",{"type":32,"tag":171,"props":536,"children":537},{"style":184},[538],{"type":37,"value":539},": ",{"type":32,"tag":171,"props":541,"children":542},{"style":275},[543],{"type":37,"value":544},"0",{"type":32,"tag":171,"props":546,"children":547},{"style":343},[548],{"type":37,"value":549}," if",{"type":32,"tag":171,"props":551,"children":552},{"style":184},[553],{"type":37,"value":554}," violations ",{"type":32,"tag":171,"props":556,"children":557},{"style":343},[558],{"type":37,"value":559},"else",{"type":32,"tag":171,"props":561,"children":562},{"style":275},[563],{"type":37,"value":564}," 1",{"type":32,"tag":171,"props":566,"children":567},{"style":184},[568],{"type":37,"value":416},{"type":32,"tag":171,"props":570,"children":571},{"style":208},[572],{"type":37,"value":573},"\"violations\"",{"type":32,"tag":171,"props":575,"children":576},{"style":184},[577],{"type":37,"value":578},": 
violations}\n",{"type":32,"tag":171,"props":580,"children":581},{"class":173,"line":267},[582],{"type":32,"tag":171,"props":583,"children":584},{"emptyLinePlaceholder":367},[585],{"type":37,"value":370},{"type":32,"tag":171,"props":587,"children":588},{"class":173,"line":26},[589],{"type":32,"tag":171,"props":590,"children":591},{"style":184},[592],{"type":37,"value":593},"evaluate(\n",{"type":32,"tag":171,"props":595,"children":596},{"class":173,"line":289},[597,603,607,612],{"type":32,"tag":171,"props":598,"children":600},{"style":599},"--shiki-default:#FFAB70",[601],{"type":37,"value":602},"    dataset_name",{"type":32,"tag":171,"props":604,"children":605},{"style":343},[606],{"type":37,"value":401},{"type":32,"tag":171,"props":608,"children":609},{"style":208},[610],{"type":37,"value":611},"\"marketing_blog_posts\"",{"type":32,"tag":171,"props":613,"children":614},{"style":184},[615],{"type":37,"value":216},{"type":32,"tag":171,"props":617,"children":619},{"class":173,"line":618},11,[620,625,629],{"type":32,"tag":171,"props":621,"children":622},{"style":599},[623],{"type":37,"value":624},"    evaluators",{"type":32,"tag":171,"props":626,"children":627},{"style":343},[628],{"type":37,"value":401},{"type":32,"tag":171,"props":630,"children":631},{"style":184},[632],{"type":37,"value":633},"[check_brand_compliance]\n",{"type":32,"tag":171,"props":635,"children":637},{"class":173,"line":636},12,[638],{"type":32,"tag":171,"props":639,"children":640},{"style":184},[641],{"type":37,"value":642},")\n",{"type":32,"tag":33,"props":644,"children":645},{},[646,651],{"type":32,"tag":123,"props":647,"children":648},{},[649],{"type":37,"value":650},"Il livello di outcome di business",{"type":37,"value":652}," — l'impatto reale:",{"type":32,"tag":84,"props":654,"children":655},{},[656,661,666],{"type":32,"tag":88,"props":657,"children":658},{},[659],{"type":37,"value":660},"È cambiato il CTR?",{"type":32,"tag":88,"props":662,"children":663},{},[664],{"type":37,"value":665},"È 
diminuita la conversione?",{"type":32,"tag":88,"props":667,"children":668},{},[669],{"type":37,"value":670},"È aumentato il bounce rate?",{"type":32,"tag":33,"props":672,"children":673},{},[674,676,685],{"type":37,"value":675},"Questo livello si connette alla telemetria di produzione — nel sistema di ",{"type":32,"tag":677,"props":678,"children":682},"a",{"href":679,"rel":680},"https:\u002F\u002Fwww.roibase.com.tr\u002Fit\u002Ffirstparty",[681],"nofollow",[683],{"type":37,"value":684},"Misurazione e Dati First-Party",{"type":37,"value":686},", la versione del prompt viene aggiunta ai metadati dell'evento, unita in BigQuery, e un modello dbt calcola il conversion rate per ogni versione.",{"type":32,"tag":688,"props":689,"children":691},"h3",{"id":690},"promptfoo-costruire-una-test-suite-deterministica",[692],{"type":37,"value":693},"Promptfoo: Costruire una Test Suite Deterministica",{"type":32,"tag":33,"props":695,"children":696},{},[697],{"type":37,"value":698},"Promptfoo è un framework di evaluation basato su YAML che gira in locale. 
L'obiettivo: convalidare ogni modifica del prompt attraverso un test di regressione prima di deployare.",{"type":32,"tag":33,"props":700,"children":701},{},[702],{"type":37,"value":703},"Un config semplice:",{"type":32,"tag":162,"props":705,"children":709},{"className":706,"code":707,"language":708,"meta":16,"style":16},"language-yaml shiki shiki-themes github-dark","prompts:\n  - file:\u002F\u002Fprompts\u002Fmarketing_blog_v1.md\n  - file:\u002F\u002Fprompts\u002Fmarketing_blog_v2.md\n\nproviders:\n  - anthropic:messages:claude-3-5-sonnet-20241022\n\ntests:\n  - vars:\n      topic: \"Server-side GTM\"\n      category: \"tech\"\n    assert:\n      - type: is-json\n      - type: javascript\n        value: \"output.title.length \u003C= 60\"\n      - type: similar\n        value: \"architettura di tracciamento server-side\"\n        threshold: 0.8\n      - type: not-contains\n        value: \"rivoluzionario\"\n","yaml",[710],{"type":32,"tag":62,"props":711,"children":712},{"__ignoreMap":16},[713,727,740,752,759,771,783,790,802,818,835,852,864,887,908,926,947,964,982,1003],{"type":32,"tag":171,"props":714,"children":715},{"class":173,"line":174},[716,722],{"type":32,"tag":171,"props":717,"children":719},{"style":718},"--shiki-default:#85E89D",[720],{"type":37,"value":721},"prompts",{"type":32,"tag":171,"props":723,"children":724},{"style":184},[725],{"type":37,"value":726},":\n",{"type":32,"tag":171,"props":728,"children":729},{"class":173,"line":190},[730,735],{"type":32,"tag":171,"props":731,"children":732},{"style":184},[733],{"type":37,"value":734},"  - 
",{"type":32,"tag":171,"props":736,"children":737},{"style":208},[738],{"type":37,"value":739},"file:\u002F\u002Fprompts\u002Fmarketing_blog_v1.md\n",{"type":32,"tag":171,"props":741,"children":742},{"class":173,"line":199},[743,747],{"type":32,"tag":171,"props":744,"children":745},{"style":184},[746],{"type":37,"value":734},{"type":32,"tag":171,"props":748,"children":749},{"style":208},[750],{"type":37,"value":751},"file:\u002F\u002Fprompts\u002Fmarketing_blog_v2.md\n",{"type":32,"tag":171,"props":753,"children":754},{"class":173,"line":219},[755],{"type":32,"tag":171,"props":756,"children":757},{"emptyLinePlaceholder":367},[758],{"type":37,"value":370},{"type":32,"tag":171,"props":760,"children":761},{"class":173,"line":233},[762,767],{"type":32,"tag":171,"props":763,"children":764},{"style":718},[765],{"type":37,"value":766},"providers",{"type":32,"tag":171,"props":768,"children":769},{"style":184},[770],{"type":37,"value":726},{"type":32,"tag":171,"props":772,"children":773},{"class":173,"line":242},[774,778],{"type":32,"tag":171,"props":775,"children":776},{"style":184},[777],{"type":37,"value":734},{"type":32,"tag":171,"props":779,"children":780},{"style":208},[781],{"type":37,"value":782},"anthropic:messages:claude-3-5-sonnet-20241022\n",{"type":32,"tag":171,"props":784,"children":785},{"class":173,"line":250},[786],{"type":32,"tag":171,"props":787,"children":788},{"emptyLinePlaceholder":367},[789],{"type":37,"value":370},{"type":32,"tag":171,"props":791,"children":792},{"class":173,"line":267},[793,798],{"type":32,"tag":171,"props":794,"children":795},{"style":718},[796],{"type":37,"value":797},"tests",{"type":32,"tag":171,"props":799,"children":800},{"style":184},[801],{"type":37,"value":726},{"type":32,"tag":171,"props":803,"children":804},{"class":173,"line":26},[805,809,814],{"type":32,"tag":171,"props":806,"children":807},{"style":184},[808],{"type":37,"value":734},{"type":32,"tag":171,"props":810,"children":811},{"style":718},[812],{"type":37,"value":8
13},"vars",{"type":32,"tag":171,"props":815,"children":816},{"style":184},[817],{"type":37,"value":726},{"type":32,"tag":171,"props":819,"children":820},{"class":173,"line":289},[821,826,830],{"type":32,"tag":171,"props":822,"children":823},{"style":718},[824],{"type":37,"value":825},"      topic",{"type":32,"tag":171,"props":827,"children":828},{"style":184},[829],{"type":37,"value":539},{"type":32,"tag":171,"props":831,"children":832},{"style":208},[833],{"type":37,"value":834},"\"Server-side GTM\"\n",{"type":32,"tag":171,"props":836,"children":837},{"class":173,"line":618},[838,843,847],{"type":32,"tag":171,"props":839,"children":840},{"style":718},[841],{"type":37,"value":842},"      category",{"type":32,"tag":171,"props":844,"children":845},{"style":184},[846],{"type":37,"value":539},{"type":32,"tag":171,"props":848,"children":849},{"style":208},[850],{"type":37,"value":851},"\"tech\"\n",{"type":32,"tag":171,"props":853,"children":854},{"class":173,"line":636},[855,860],{"type":32,"tag":171,"props":856,"children":857},{"style":718},[858],{"type":37,"value":859},"    assert",{"type":32,"tag":171,"props":861,"children":862},{"style":184},[863],{"type":37,"value":726},{"type":32,"tag":171,"props":865,"children":867},{"class":173,"line":866},13,[868,873,878,882],{"type":32,"tag":171,"props":869,"children":870},{"style":184},[871],{"type":37,"value":872},"      - 
",{"type":32,"tag":171,"props":874,"children":875},{"style":718},[876],{"type":37,"value":877},"type",{"type":32,"tag":171,"props":879,"children":880},{"style":184},[881],{"type":37,"value":539},{"type":32,"tag":171,"props":883,"children":884},{"style":208},[885],{"type":37,"value":886},"is-json\n",{"type":32,"tag":171,"props":888,"children":890},{"class":173,"line":889},14,[891,895,899,903],{"type":32,"tag":171,"props":892,"children":893},{"style":184},[894],{"type":37,"value":872},{"type":32,"tag":171,"props":896,"children":897},{"style":718},[898],{"type":37,"value":877},{"type":32,"tag":171,"props":900,"children":901},{"style":184},[902],{"type":37,"value":539},{"type":32,"tag":171,"props":904,"children":905},{"style":208},[906],{"type":37,"value":907},"javascript\n",{"type":32,"tag":171,"props":909,"children":911},{"class":173,"line":910},15,[912,917,921],{"type":32,"tag":171,"props":913,"children":914},{"style":718},[915],{"type":37,"value":916},"        value",{"type":32,"tag":171,"props":918,"children":919},{"style":184},[920],{"type":37,"value":539},{"type":32,"tag":171,"props":922,"children":923},{"style":208},[924],{"type":37,"value":925},"\"output.title.length \u003C= 
60\"\n",{"type":32,"tag":171,"props":927,"children":929},{"class":173,"line":928},16,[930,934,938,942],{"type":32,"tag":171,"props":931,"children":932},{"style":184},[933],{"type":37,"value":872},{"type":32,"tag":171,"props":935,"children":936},{"style":718},[937],{"type":37,"value":877},{"type":32,"tag":171,"props":939,"children":940},{"style":184},[941],{"type":37,"value":539},{"type":32,"tag":171,"props":943,"children":944},{"style":208},[945],{"type":37,"value":946},"similar\n",{"type":32,"tag":171,"props":948,"children":950},{"class":173,"line":949},17,[951,955,959],{"type":32,"tag":171,"props":952,"children":953},{"style":718},[954],{"type":37,"value":916},{"type":32,"tag":171,"props":956,"children":957},{"style":184},[958],{"type":37,"value":539},{"type":32,"tag":171,"props":960,"children":961},{"style":208},[962],{"type":37,"value":963},"\"architettura di tracciamento server-side\"\n",{"type":32,"tag":171,"props":965,"children":967},{"class":173,"line":966},18,[968,973,977],{"type":32,"tag":171,"props":969,"children":970},{"style":718},[971],{"type":37,"value":972},"        
threshold",{"type":32,"tag":171,"props":974,"children":975},{"style":184},[976],{"type":37,"value":539},{"type":32,"tag":171,"props":978,"children":979},{"style":275},[980],{"type":37,"value":981},"0.8\n",{"type":32,"tag":171,"props":983,"children":985},{"class":173,"line":984},19,[986,990,994,998],{"type":32,"tag":171,"props":987,"children":988},{"style":184},[989],{"type":37,"value":872},{"type":32,"tag":171,"props":991,"children":992},{"style":718},[993],{"type":37,"value":877},{"type":32,"tag":171,"props":995,"children":996},{"style":184},[997],{"type":37,"value":539},{"type":32,"tag":171,"props":999,"children":1000},{"style":208},[1001],{"type":37,"value":1002},"not-contains\n",{"type":32,"tag":171,"props":1004,"children":1006},{"class":173,"line":1005},20,[1007,1011,1015],{"type":32,"tag":171,"props":1008,"children":1009},{"style":718},[1010],{"type":37,"value":916},{"type":32,"tag":171,"props":1012,"children":1013},{"style":184},[1014],{"type":37,"value":539},{"type":32,"tag":171,"props":1016,"children":1017},{"style":208},[1018],{"type":37,"value":1019},"\"rivoluzionario\"\n",{"type":32,"tag":33,"props":1021,"children":1022},{},[1023,1025,1031],{"type":37,"value":1024},"Con il comando ",{"type":32,"tag":62,"props":1026,"children":1028},{"className":1027},[],[1029],{"type":37,"value":1030},"promptfoo eval",{"type":37,"value":1032},", tutte le varianti vengono testate e viene restituita una tabella di metriche:",{"type":32,"tag":1034,"props":1035,"children":1036},"table",{},[1037,1066],{"type":32,"tag":1038,"props":1039,"children":1040},"thead",{},[1041],{"type":32,"tag":1042,"props":1043,"children":1044},"tr",{},[1045,1051,1056,1061],{"type":32,"tag":1046,"props":1047,"children":1048},"th",{},[1049],{"type":37,"value":1050},"Prompt",{"type":32,"tag":1046,"props":1052,"children":1053},{},[1054],{"type":37,"value":1055},"Pass Rate",{"type":32,"tag":1046,"props":1057,"children":1058},{},[1059],{"type":37,"value":1060},"Avg 
Latency",{"type":32,"tag":1046,"props":1062,"children":1063},{},[1064],{"type":37,"value":1065},"Cost",{"type":32,"tag":1067,"props":1068,"children":1069},"tbody",{},[1070,1094],{"type":32,"tag":1042,"props":1071,"children":1072},{},[1073,1079,1084,1089],{"type":32,"tag":1074,"props":1075,"children":1076},"td",{},[1077],{"type":37,"value":1078},"v1",{"type":32,"tag":1074,"props":1080,"children":1081},{},[1082],{"type":37,"value":1083},"92%",{"type":32,"tag":1074,"props":1085,"children":1086},{},[1087],{"type":37,"value":1088},"2.3s",{"type":32,"tag":1074,"props":1090,"children":1091},{},[1092],{"type":37,"value":1093},"$0.012",{"type":32,"tag":1042,"props":1095,"children":1096},{},[1097,1102,1107,1112],{"type":32,"tag":1074,"props":1098,"children":1099},{},[1100],{"type":37,"value":1101},"v2",{"type":32,"tag":1074,"props":1103,"children":1104},{},[1105],{"type":37,"value":1106},"98%",{"type":32,"tag":1074,"props":1108,"children":1109},{},[1110],{"type":37,"value":1111},"2.1s",{"type":32,"tag":1074,"props":1113,"children":1114},{},[1115],{"type":37,"value":1116},"$0.014",{"type":32,"tag":33,"props":1118,"children":1119},{},[1120],{"type":37,"value":1121},"La versione v2 ha un pass rate migliore ma il costo è aumentato del 17% — il token count sta salendo, va investigato nel dettaglio. Se avessimo deployato senza vedere questo tradeoff, il budget mensile sarebbe esploso.",{"type":32,"tag":45,"props":1123,"children":1125},{"id":1124},"ab-test-confrontare-le-varianti-dei-prompt-in-produzione",[1126],{"type":37,"value":1127},"A\u002FB Test: Confrontare le Varianti dei Prompt in Produzione",{"type":32,"tag":33,"props":1129,"children":1130},{},[1131],{"type":37,"value":1132},"La test suite per l'evaluation è verde, ora serve il traffico reale. 
L'A\u002FB test in un sistema LLM funziona così:",{"type":32,"tag":1134,"props":1135,"children":1136},"ol",{},[1137,1147,1165,1175],{"type":32,"tag":88,"props":1138,"children":1139},{},[1140,1145],{"type":32,"tag":123,"props":1141,"children":1142},{},[1143],{"type":37,"value":1144},"Variant routing",{"type":37,"value":1146},": seleziona la versione del prompt in base all'ID utente\u002Fsessione (% split)",{"type":32,"tag":88,"props":1148,"children":1149},{},[1150,1155,1157,1163],{"type":32,"tag":123,"props":1151,"children":1152},{},[1153],{"type":37,"value":1154},"Metadata tagging",{"type":37,"value":1156},": aggiungi ",{"type":32,"tag":62,"props":1158,"children":1160},{"className":1159},[],[1161],{"type":37,"value":1162},"prompt_version",{"type":37,"value":1164}," a ogni call API",{"type":32,"tag":88,"props":1166,"children":1167},{},[1168,1173],{"type":32,"tag":123,"props":1169,"children":1170},{},[1171],{"type":37,"value":1172},"Metric tracking",{"type":37,"value":1174},": mantieni l'informazione della variante negli eventi downstream",{"type":32,"tag":88,"props":1176,"children":1177},{},[1178,1183],{"type":32,"tag":123,"props":1179,"children":1180},{},[1181],{"type":37,"value":1182},"Statistical significance",{"type":37,"value":1184},": prendi una decisione solo quando hai raccolto un numero sufficiente di sample. La dimensione del campione va calcolata con una power analysis sull'effetto minimo che vuoi rilevare; 385 osservazioni per variante bastano solo a stimare una singola proporzione con un margine di errore di ±5% al 95% di confidenza, non a rilevare piccole differenze tra due varianti",{"type":32,"tag":33,"props":1186,"children":1187},{},[1188],{"type":37,"value":1189},"Esempio di workflow n8n:",{"type":32,"tag":162,"props":1191,"children":1193},{"className":164,"code":1192,"language":158,"meta":16,"style":16},"\u002F\u002F Selezione della variante A\u002FB\nconst userId = $json.user_id;\nconst variant = (userId % 100 \u003C 50) ? 
'v1' : 'v2';\nconst promptUrl = `https:\u002F\u002Fraw.githubusercontent.com\u002Froibase\u002Fprompts\u002Fmain\u002F${variant}.md`;\n\n\u002F\u002F Aggiungi metadati alla call API\nreturn {\n  json: {\n    prompt: await fetch(promptUrl).then(r => r.text()),\n    metadata: {\n      prompt_version: variant,\n      experiment_id: 'blog_tone_test_2026_05'\n    }\n  }\n};\n",[1194],{"type":32,"tag":62,"props":1195,"children":1196},{"__ignoreMap":16},[1197,1206,1229,1300,1335,1342,1350,1363,1371,1428,1436,1444,1457,1465,1472],{"type":32,"tag":171,"props":1198,"children":1199},{"class":173,"line":174},[1200],{"type":32,"tag":171,"props":1201,"children":1203},{"style":1202},"--shiki-default:#6A737D",[1204],{"type":37,"value":1205},"\u002F\u002F Selezione della variante A\u002FB\n",{"type":32,"tag":171,"props":1207,"children":1208},{"class":173,"line":190},[1209,1214,1219,1224],{"type":32,"tag":171,"props":1210,"children":1211},{"style":343},[1212],{"type":37,"value":1213},"const",{"type":32,"tag":171,"props":1215,"children":1216},{"style":275},[1217],{"type":37,"value":1218}," userId",{"type":32,"tag":171,"props":1220,"children":1221},{"style":343},[1222],{"type":37,"value":1223}," =",{"type":32,"tag":171,"props":1225,"children":1226},{"style":184},[1227],{"type":37,"value":1228}," $json.user_id;\n",{"type":32,"tag":171,"props":1230,"children":1231},{"class":173,"line":199},[1232,1236,1241,1245,1250,1255,1260,1265,1270,1275,1280,1285,1290,1295],{"type":32,"tag":171,"props":1233,"children":1234},{"style":343},[1235],{"type":37,"value":1213},{"type":32,"tag":171,"props":1237,"children":1238},{"style":275},[1239],{"type":37,"value":1240}," variant",{"type":32,"tag":171,"props":1242,"children":1243},{"style":343},[1244],{"type":37,"value":1223},{"type":32,"tag":171,"props":1246,"children":1247},{"style":184},[1248],{"type":37,"value":1249}," (userId 
",{"type":32,"tag":171,"props":1251,"children":1252},{"style":343},[1253],{"type":37,"value":1254},"%",{"type":32,"tag":171,"props":1256,"children":1257},{"style":275},[1258],{"type":37,"value":1259}," 100",{"type":32,"tag":171,"props":1261,"children":1262},{"style":343},[1263],{"type":37,"value":1264}," \u003C",{"type":32,"tag":171,"props":1266,"children":1267},{"style":275},[1268],{"type":37,"value":1269}," 50",{"type":32,"tag":171,"props":1271,"children":1272},{"style":184},[1273],{"type":37,"value":1274},") ",{"type":32,"tag":171,"props":1276,"children":1277},{"style":343},[1278],{"type":37,"value":1279},"?",{"type":32,"tag":171,"props":1281,"children":1282},{"style":208},[1283],{"type":37,"value":1284}," 'v1'",{"type":32,"tag":171,"props":1286,"children":1287},{"style":343},[1288],{"type":37,"value":1289}," :",{"type":32,"tag":171,"props":1291,"children":1292},{"style":208},[1293],{"type":37,"value":1294}," 'v2'",{"type":32,"tag":171,"props":1296,"children":1297},{"style":184},[1298],{"type":37,"value":1299},";\n",{"type":32,"tag":171,"props":1301,"children":1302},{"class":173,"line":219},[1303,1307,1312,1316,1321,1326,1331],{"type":32,"tag":171,"props":1304,"children":1305},{"style":343},[1306],{"type":37,"value":1213},{"type":32,"tag":171,"props":1308,"children":1309},{"style":275},[1310],{"type":37,"value":1311}," promptUrl",{"type":32,"tag":171,"props":1313,"children":1314},{"style":343},[1315],{"type":37,"value":1223},{"type":32,"tag":171,"props":1317,"children":1318},{"style":208},[1319],{"type":37,"value":1320}," 
`https:\u002F\u002Fraw.githubusercontent.com\u002Froibase\u002Fprompts\u002Fmain\u002F${",{"type":32,"tag":171,"props":1322,"children":1323},{"style":184},[1324],{"type":37,"value":1325},"variant",{"type":32,"tag":171,"props":1327,"children":1328},{"style":208},[1329],{"type":37,"value":1330},"}.md`",{"type":32,"tag":171,"props":1332,"children":1333},{"style":184},[1334],{"type":37,"value":1299},{"type":32,"tag":171,"props":1336,"children":1337},{"class":173,"line":233},[1338],{"type":32,"tag":171,"props":1339,"children":1340},{"emptyLinePlaceholder":367},[1341],{"type":37,"value":370},{"type":32,"tag":171,"props":1343,"children":1344},{"class":173,"line":242},[1345],{"type":32,"tag":171,"props":1346,"children":1347},{"style":1202},[1348],{"type":37,"value":1349},"\u002F\u002F Aggiungi metadati alla call API\n",{"type":32,"tag":171,"props":1351,"children":1352},{"class":173,"line":250},[1353,1358],{"type":32,"tag":171,"props":1354,"children":1355},{"style":343},[1356],{"type":37,"value":1357},"return",{"type":32,"tag":171,"props":1359,"children":1360},{"style":184},[1361],{"type":37,"value":1362}," {\n",{"type":32,"tag":171,"props":1364,"children":1365},{"class":173,"line":267},[1366],{"type":32,"tag":171,"props":1367,"children":1368},{"style":184},[1369],{"type":37,"value":1370},"  json: {\n",{"type":32,"tag":171,"props":1372,"children":1373},{"class":173,"line":26},[1374,1379,1384,1389,1394,1399,1404,1409,1414,1419,1423],{"type":32,"tag":171,"props":1375,"children":1376},{"style":184},[1377],{"type":37,"value":1378},"    prompt: ",{"type":32,"tag":171,"props":1380,"children":1381},{"style":343},[1382],{"type":37,"value":1383},"await",{"type":32,"tag":171,"props":1385,"children":1386},{"style":178},[1387],{"type":37,"value":1388}," 
fetch",{"type":32,"tag":171,"props":1390,"children":1391},{"style":184},[1392],{"type":37,"value":1393},"(promptUrl).",{"type":32,"tag":171,"props":1395,"children":1396},{"style":178},[1397],{"type":37,"value":1398},"then",{"type":32,"tag":171,"props":1400,"children":1401},{"style":184},[1402],{"type":37,"value":1403},"(",{"type":32,"tag":171,"props":1405,"children":1406},{"style":599},[1407],{"type":37,"value":1408},"r",{"type":32,"tag":171,"props":1410,"children":1411},{"style":343},[1412],{"type":37,"value":1413}," =>",{"type":32,"tag":171,"props":1415,"children":1416},{"style":184},[1417],{"type":37,"value":1418}," r.",{"type":32,"tag":171,"props":1420,"children":1421},{"style":178},[1422],{"type":37,"value":37},{"type":32,"tag":171,"props":1424,"children":1425},{"style":184},[1426],{"type":37,"value":1427},"()),\n",{"type":32,"tag":171,"props":1429,"children":1430},{"class":173,"line":289},[1431],{"type":32,"tag":171,"props":1432,"children":1433},{"style":184},[1434],{"type":37,"value":1435},"    metadata: {\n",{"type":32,"tag":171,"props":1437,"children":1438},{"class":173,"line":618},[1439],{"type":32,"tag":171,"props":1440,"children":1441},{"style":184},[1442],{"type":37,"value":1443},"      prompt_version: variant,\n",{"type":32,"tag":171,"props":1445,"children":1446},{"class":173,"line":636},[1447,1452],{"type":32,"tag":171,"props":1448,"children":1449},{"style":184},[1450],{"type":37,"value":1451},"      experiment_id: ",{"type":32,"tag":171,"props":1453,"children":1454},{"style":208},[1455],{"type":37,"value":1456},"'blog_tone_test_2026_05'\n",{"type":32,"tag":171,"props":1458,"children":1459},{"class":173,"line":866},[1460],{"type":32,"tag":171,"props":1461,"children":1462},{"style":184},[1463],{"type":37,"value":1464},"    
}\n",{"type":32,"tag":171,"props":1466,"children":1467},{"class":173,"line":889},[1468],{"type":32,"tag":171,"props":1469,"children":1470},{"style":184},[1471],{"type":37,"value":286},{"type":32,"tag":171,"props":1473,"children":1474},{"class":173,"line":910},[1475],{"type":32,"tag":171,"props":1476,"children":1477},{"style":184},[1478],{"type":37,"value":1479},"};\n",{"type":32,"tag":33,"props":1481,"children":1482},{},[1483],{"type":37,"value":1484},"Analysis in BigQuery:",{"type":32,"tag":162,"props":1486,"children":1490},{"className":1487,"code":1488,"language":1489,"meta":16,"style":16},"language-sql shiki shiki-themes github-dark","SELECT\n  metadata.value.prompt_version AS variant,\n  COUNT(DISTINCT user_id) AS users,\n  AVG(session_duration_sec) AS avg_duration,\n  SUM(conversion) \u002F COUNT(*) AS cvr\nFROM events\nWHERE experiment_id = 'blog_tone_test_2026_05'\n  AND event_date >= '2026-05-01'\nGROUP BY 1\n","sql",[1491],{"type":32,"tag":62,"props":1492,"children":1493},{"__ignoreMap":16},[1494,1502,1535,1566,1588,1633,1646,1668,1691],{"type":32,"tag":171,"props":1495,"children":1496},{"class":173,"line":174},[1497],{"type":32,"tag":171,"props":1498,"children":1499},{"style":343},[1500],{"type":37,"value":1501},"SELECT\n",{"type":32,"tag":171,"props":1503,"children":1504},{"class":173,"line":190},[1505,1510,1515,1520,1525,1530],{"type":32,"tag":171,"props":1506,"children":1507},{"style":275},[1508],{"type":37,"value":1509},"  metadata",{"type":32,"tag":171,"props":1511,"children":1512},{"style":184},[1513],{"type":37,"value":1514},".",{"type":32,"tag":171,"props":1516,"children":1517},{"style":275},[1518],{"type":37,"value":1519},"value",{"type":32,"tag":171,"props":1521,"children":1522},{"style":184},[1523],{"type":37,"value":1524},".prompt_version ",{"type":32,"tag":171,"props":1526,"children":1527},{"style":343},[1528],{"type":37,"value":1529},"AS",{"type":32,"tag":171,"props":1531,"children":1532},{"style":184},[1533],{"type":37,"value":1534}," 
variant,\n",{"type":32,"tag":171,"props":1536,"children":1537},{"class":173,"line":199},[1538,1543,1547,1552,1557,1561],{"type":32,"tag":171,"props":1539,"children":1540},{"style":275},[1541],{"type":37,"value":1542},"  COUNT",{"type":32,"tag":171,"props":1544,"children":1545},{"style":184},[1546],{"type":37,"value":1403},{"type":32,"tag":171,"props":1548,"children":1549},{"style":343},[1550],{"type":37,"value":1551},"DISTINCT",{"type":32,"tag":171,"props":1553,"children":1554},{"style":184},[1555],{"type":37,"value":1556}," user_id) ",{"type":32,"tag":171,"props":1558,"children":1559},{"style":343},[1560],{"type":37,"value":1529},{"type":32,"tag":171,"props":1562,"children":1563},{"style":184},[1564],{"type":37,"value":1565}," users,\n",{"type":32,"tag":171,"props":1567,"children":1568},{"class":173,"line":219},[1569,1574,1579,1583],{"type":32,"tag":171,"props":1570,"children":1571},{"style":275},[1572],{"type":37,"value":1573},"  AVG",{"type":32,"tag":171,"props":1575,"children":1576},{"style":184},[1577],{"type":37,"value":1578},"(session_duration_sec) ",{"type":32,"tag":171,"props":1580,"children":1581},{"style":343},[1582],{"type":37,"value":1529},{"type":32,"tag":171,"props":1584,"children":1585},{"style":184},[1586],{"type":37,"value":1587}," avg_duration,\n",{"type":32,"tag":171,"props":1589,"children":1590},{"class":173,"line":233},[1591,1596,1601,1606,1611,1615,1620,1624,1628],{"type":32,"tag":171,"props":1592,"children":1593},{"style":275},[1594],{"type":37,"value":1595},"  SUM",{"type":32,"tag":171,"props":1597,"children":1598},{"style":184},[1599],{"type":37,"value":1600},"(conversion) ",{"type":32,"tag":171,"props":1602,"children":1603},{"style":343},[1604],{"type":37,"value":1605},"\u002F",{"type":32,"tag":171,"props":1607,"children":1608},{"style":275},[1609],{"type":37,"value":1610}," 
COUNT",{"type":32,"tag":171,"props":1612,"children":1613},{"style":184},[1614],{"type":37,"value":1403},{"type":32,"tag":171,"props":1616,"children":1617},{"style":343},[1618],{"type":37,"value":1619},"*",{"type":32,"tag":171,"props":1621,"children":1622},{"style":184},[1623],{"type":37,"value":1274},{"type":32,"tag":171,"props":1625,"children":1626},{"style":343},[1627],{"type":37,"value":1529},{"type":32,"tag":171,"props":1629,"children":1630},{"style":184},[1631],{"type":37,"value":1632}," cvr\n",{"type":32,"tag":171,"props":1634,"children":1635},{"class":173,"line":242},[1636,1641],{"type":32,"tag":171,"props":1637,"children":1638},{"style":343},[1639],{"type":37,"value":1640},"FROM",{"type":32,"tag":171,"props":1642,"children":1643},{"style":184},[1644],{"type":37,"value":1645}," events\n",{"type":32,"tag":171,"props":1647,"children":1648},{"class":173,"line":250},[1649,1654,1659,1663],{"type":32,"tag":171,"props":1650,"children":1651},{"style":343},[1652],{"type":37,"value":1653},"WHERE",{"type":32,"tag":171,"props":1655,"children":1656},{"style":184},[1657],{"type":37,"value":1658}," experiment_id ",{"type":32,"tag":171,"props":1660,"children":1661},{"style":343},[1662],{"type":37,"value":401},{"type":32,"tag":171,"props":1664,"children":1665},{"style":208},[1666],{"type":37,"value":1667}," 'blog_tone_test_2026_05'\n",{"type":32,"tag":171,"props":1669,"children":1670},{"class":173,"line":267},[1671,1676,1681,1686],{"type":32,"tag":171,"props":1672,"children":1673},{"style":343},[1674],{"type":37,"value":1675},"  AND",{"type":32,"tag":171,"props":1677,"children":1678},{"style":184},[1679],{"type":37,"value":1680}," event_date ",{"type":32,"tag":171,"props":1682,"children":1683},{"style":343},[1684],{"type":37,"value":1685},">=",{"type":32,"tag":171,"props":1687,"children":1688},{"style":208},[1689],{"type":37,"value":1690}," 
'2026-05-01'\n",{"type":32,"tag":171,"props":1692,"children":1693},{"class":173,"line":26},[1694,1699],{"type":32,"tag":171,"props":1695,"children":1696},{"style":343},[1697],{"type":37,"value":1698},"GROUP BY",{"type":32,"tag":171,"props":1700,"children":1701},{"style":275},[1702],{"type":37,"value":1703}," 1\n",{"type":32,"tag":33,"props":1705,"children":1706},{},[1707],{"type":37,"value":1708},"Result: variant v2 raised CVR from 0.042 to 0.051 (+21%) with a p-value of 0.003, so you can roll it out to production with confidence.",{"type":32,"tag":45,"props":1710,"children":1712},{"id":1711},"langsmith-observability-e-rilevamento-di-regressioni-long-term",[1713],{"type":37,"value":1714},"LangSmith: Observability and Long-Term Regression Detection",{"type":32,"tag":33,"props":1716,"children":1717},{},[1718],{"type":37,"value":1719},"Promptfoo runs the tests locally; LangSmith provides observability in production. Every LLM call is traced: input, output, latency, token count, model version, prompt version.",{"type":32,"tag":33,"props":1721,"children":1722},{},[1723,1725,1730],{"type":37,"value":1724},"The advantage of LangSmith is ",{"type":32,"tag":123,"props":1726,"children":1727},{},[1728],{"type":37,"value":1729},"long-term metric tracking",{"type":37,"value":1731},". 
If a bug in a prompt version from three months ago surfaces today through feedback, you can go back to the trace, compare input and output, find which version was in use that day, and roll back.",{"type":32,"tag":33,"props":1733,"children":1734},{},[1735],{"type":37,"value":1736},"Example trace:",{"type":32,"tag":162,"props":1738,"children":1742},{"className":1739,"code":1740,"language":1741,"meta":16,"style":16},"language-json shiki shiki-themes github-dark","{\n  \"run_id\": \"abc123\",\n  \"prompt_version\": \"v2.1\",\n  \"model\": \"claude-3-5-sonnet-20241022\",\n  \"input\": {\"topic\": \"Server-side GTM\", \"category\": \"tech\"},\n  \"output\": \"---\\ntitle: \\\"Server-Side GTM...\\\"\",\n  \"latency_ms\": 2341,\n  \"tokens\": {\"input\": 1842, \"output\": 1523},\n  \"cost_usd\": 0.0137,\n  \"feedback\": {\"score\": 4, \"comment\": \"titolo troppo lungo\"}\n}\n","json",[1743],{"type":32,"tag":62,"props":1744,"children":1745},{"__ignoreMap":16},[1746,1754,1775,1796,1817,1867,1917,1938,1986,2007,2055],{"type":32,"tag":171,"props":1747,"children":1748},{"class":173,"line":174},[1749],{"type":32,"tag":171,"props":1750,"children":1751},{"style":184},[1752],{"type":37,"value":1753},"{\n",{"type":32,"tag":171,"props":1755,"children":1756},{"class":173,"line":190},[1757,1762,1766,1771],{"type":32,"tag":171,"props":1758,"children":1759},{"style":275},[1760],{"type":37,"value":1761},"  \"run_id\"",{"type":32,"tag":171,"props":1763,"children":1764},{"style":184},[1765],{"type":37,"value":539},{"type":32,"tag":171,"props":1767,"children":1768},{"style":208},[1769],{"type":37,"value":1770},"\"abc123\"",{"type":32,"tag":171,"props":1772,"children":1773},{"style":184},[1774],{"type":37,"value":216},{"type":32,"tag":171,"props":1776,"children":1777},{"class":173,"line":199},[1778,1783,1787,1792],{"type":32,"tag":171,"props":1779,"children":1780},{"style":275},[1781],{"type":37,"value":1782},"  
\"prompt_version\"",{"type":32,"tag":171,"props":1784,"children":1785},{"style":184},[1786],{"type":37,"value":539},{"type":32,"tag":171,"props":1788,"children":1789},{"style":208},[1790],{"type":37,"value":1791},"\"v2.1\"",{"type":32,"tag":171,"props":1793,"children":1794},{"style":184},[1795],{"type":37,"value":216},{"type":32,"tag":171,"props":1797,"children":1798},{"class":173,"line":219},[1799,1804,1808,1813],{"type":32,"tag":171,"props":1800,"children":1801},{"style":275},[1802],{"type":37,"value":1803},"  \"model\"",{"type":32,"tag":171,"props":1805,"children":1806},{"style":184},[1807],{"type":37,"value":539},{"type":32,"tag":171,"props":1809,"children":1810},{"style":208},[1811],{"type":37,"value":1812},"\"claude-3-5-sonnet-20241022\"",{"type":32,"tag":171,"props":1814,"children":1815},{"style":184},[1816],{"type":37,"value":216},{"type":32,"tag":171,"props":1818,"children":1819},{"class":173,"line":233},[1820,1825,1830,1835,1839,1844,1848,1853,1857,1862],{"type":32,"tag":171,"props":1821,"children":1822},{"style":275},[1823],{"type":37,"value":1824},"  \"input\"",{"type":32,"tag":171,"props":1826,"children":1827},{"style":184},[1828],{"type":37,"value":1829},": {",{"type":32,"tag":171,"props":1831,"children":1832},{"style":275},[1833],{"type":37,"value":1834},"\"topic\"",{"type":32,"tag":171,"props":1836,"children":1837},{"style":184},[1838],{"type":37,"value":539},{"type":32,"tag":171,"props":1840,"children":1841},{"style":208},[1842],{"type":37,"value":1843},"\"Server-side 
GTM\"",{"type":32,"tag":171,"props":1845,"children":1846},{"style":184},[1847],{"type":37,"value":416},{"type":32,"tag":171,"props":1849,"children":1850},{"style":275},[1851],{"type":37,"value":1852},"\"category\"",{"type":32,"tag":171,"props":1854,"children":1855},{"style":184},[1856],{"type":37,"value":539},{"type":32,"tag":171,"props":1858,"children":1859},{"style":208},[1860],{"type":37,"value":1861},"\"tech\"",{"type":32,"tag":171,"props":1863,"children":1864},{"style":184},[1865],{"type":37,"value":1866},"},\n",{"type":32,"tag":171,"props":1868,"children":1869},{"class":173,"line":242},[1870,1875,1879,1884,1889,1894,1899,1904,1908,1913],{"type":32,"tag":171,"props":1871,"children":1872},{"style":275},[1873],{"type":37,"value":1874},"  \"output\"",{"type":32,"tag":171,"props":1876,"children":1877},{"style":184},[1878],{"type":37,"value":539},{"type":32,"tag":171,"props":1880,"children":1881},{"style":208},[1882],{"type":37,"value":1883},"\"---",{"type":32,"tag":171,"props":1885,"children":1886},{"style":275},[1887],{"type":37,"value":1888},"\\n",{"type":32,"tag":171,"props":1890,"children":1891},{"style":208},[1892],{"type":37,"value":1893},"title: ",{"type":32,"tag":171,"props":1895,"children":1896},{"style":275},[1897],{"type":37,"value":1898},"\\\"",{"type":32,"tag":171,"props":1900,"children":1901},{"style":208},[1902],{"type":37,"value":1903},"Server-Side GTM...",{"type":32,"tag":171,"props":1905,"children":1906},{"style":275},[1907],{"type":37,"value":1898},{"type":32,"tag":171,"props":1909,"children":1910},{"style":208},[1911],{"type":37,"value":1912},"\"",{"type":32,"tag":171,"props":1914,"children":1915},{"style":184},[1916],{"type":37,"value":216},{"type":32,"tag":171,"props":1918,"children":1919},{"class":173,"line":250},[1920,1925,1929,1934],{"type":32,"tag":171,"props":1921,"children":1922},{"style":275},[1923],{"type":37,"value":1924},"  
\"latency_ms\"",{"type":32,"tag":171,"props":1926,"children":1927},{"style":184},[1928],{"type":37,"value":539},{"type":32,"tag":171,"props":1930,"children":1931},{"style":275},[1932],{"type":37,"value":1933},"2341",{"type":32,"tag":171,"props":1935,"children":1936},{"style":184},[1937],{"type":37,"value":216},{"type":32,"tag":171,"props":1939,"children":1940},{"class":173,"line":267},[1941,1946,1950,1955,1959,1964,1968,1973,1977,1982],{"type":32,"tag":171,"props":1942,"children":1943},{"style":275},[1944],{"type":37,"value":1945},"  \"tokens\"",{"type":32,"tag":171,"props":1947,"children":1948},{"style":184},[1949],{"type":37,"value":1829},{"type":32,"tag":171,"props":1951,"children":1952},{"style":275},[1953],{"type":37,"value":1954},"\"input\"",{"type":32,"tag":171,"props":1956,"children":1957},{"style":184},[1958],{"type":37,"value":539},{"type":32,"tag":171,"props":1960,"children":1961},{"style":275},[1962],{"type":37,"value":1963},"1842",{"type":32,"tag":171,"props":1965,"children":1966},{"style":184},[1967],{"type":37,"value":416},{"type":32,"tag":171,"props":1969,"children":1970},{"style":275},[1971],{"type":37,"value":1972},"\"output\"",{"type":32,"tag":171,"props":1974,"children":1975},{"style":184},[1976],{"type":37,"value":539},{"type":32,"tag":171,"props":1978,"children":1979},{"style":275},[1980],{"type":37,"value":1981},"1523",{"type":32,"tag":171,"props":1983,"children":1984},{"style":184},[1985],{"type":37,"value":1866},{"type":32,"tag":171,"props":1987,"children":1988},{"class":173,"line":26},[1989,1994,1998,2003],{"type":32,"tag":171,"props":1990,"children":1991},{"style":275},[1992],{"type":37,"value":1993},"  
\"cost_usd\"",{"type":32,"tag":171,"props":1995,"children":1996},{"style":184},[1997],{"type":37,"value":539},{"type":32,"tag":171,"props":1999,"children":2000},{"style":275},[2001],{"type":37,"value":2002},"0.0137",{"type":32,"tag":171,"props":2004,"children":2005},{"style":184},[2006],{"type":37,"value":216},{"type":32,"tag":171,"props":2008,"children":2009},{"class":173,"line":289},[2010,2015,2019,2023,2027,2032,2036,2041,2045,2050],{"type":32,"tag":171,"props":2011,"children":2012},{"style":275},[2013],{"type":37,"value":2014},"  \"feedback\"",{"type":32,"tag":171,"props":2016,"children":2017},{"style":184},[2018],{"type":37,"value":1829},{"type":32,"tag":171,"props":2020,"children":2021},{"style":275},[2022],{"type":37,"value":534},{"type":32,"tag":171,"props":2024,"children":2025},{"style":184},[2026],{"type":37,"value":539},{"type":32,"tag":171,"props":2028,"children":2029},{"style":275},[2030],{"type":37,"value":2031},"4",{"type":32,"tag":171,"props":2033,"children":2034},{"style":184},[2035],{"type":37,"value":416},{"type":32,"tag":171,"props":2037,"children":2038},{"style":275},[2039],{"type":37,"value":2040},"\"comment\"",{"type":32,"tag":171,"props":2042,"children":2043},{"style":184},[2044],{"type":37,"value":539},{"type":32,"tag":171,"props":2046,"children":2047},{"style":208},[2048],{"type":37,"value":2049},"\"titolo troppo lungo\"",{"type":32,"tag":171,"props":2051,"children":2052},{"style":184},[2053],{"type":37,"value":2054},"}\n",{"type":32,"tag":171,"props":2056,"children":2057},{"class":173,"line":618},[2058],{"type":32,"tag":171,"props":2059,"children":2060},{"style":184},[2061],{"type":37,"value":2054},{"type":32,"tag":33,"props":2063,"children":2064},{},[2065],{"type":37,"value":2066},"Feedback loop: editors rate every blog post from 1 to 5, LangSmith links those ratings to the traces, and the weekly report warns that \"version v2.3's average score has dropped to 3.2\". 
Immediate rollback → prompt diff → find the problem → fix it.",{"type":32,"tag":688,"props":2068,"children":2070},{"id":2069},"dataset-management-mantenere-il-golden-set-sotto-controllo-di-versione",[2071],{"type":37,"value":2072},"Dataset Management: Keeping the Golden Set Under Version Control",{"type":32,"tag":33,"props":2074,"children":2075},{},[2076,2078,2083],{"type":37,"value":2077},"The heart of the evaluation pipeline is the ",{"type":32,"tag":123,"props":2079,"children":2080},{},[2081],{"type":37,"value":2082},"golden dataset",{"type":37,"value":2084},": known input\u002Foutput pairs that serve as the reference for expected behavior. Keeping this dataset in Notion or updating it by hand in Google Sheets is a regression risk.",{"type":32,"tag":33,"props":2086,"children":2087},{},[2088],{"type":37,"value":2089},"Keep the LangSmith dataset under version control:",{"type":32,"tag":162,"props":2091,"children":2093},{"className":331,"code":2092,"language":333,"meta":16,"style":16},"from langsmith import Client\n\nclient = Client()\n\ndataset = client.create_dataset(\"marketing_blog_golden_v3\")\n\n# Aggiungi gli esempi golden\nexamples = [\n    {\n        \"inputs\": {\"topic\": \"Server-side GTM\", \"category\": \"tech\"},\n        \"outputs\": {\"title\": \"Server-Side GTM: Misurazione Post-Cookie\"},\n        \"metadata\": {\"expected_h2_count\": 5, \"expected_word_count\": 1500}\n    },\n    # 50+ esempi...\n]\n\nfor ex in examples:\n    client.create_example(**ex, 
dataset_id=dataset.id)\n",[2094],{"type":32,"tag":62,"props":2095,"children":2096},{"__ignoreMap":16},[2097,2117,2124,2141,2148,2174,2181,2189,2206,2214,2258,2288,2336,2344,2352,2359,2366,2387],{"type":32,"tag":171,"props":2098,"children":2099},{"class":173,"line":174},[2100,2104,2108,2112],{"type":32,"tag":171,"props":2101,"children":2102},{"style":343},[2103],{"type":37,"value":346},{"type":32,"tag":171,"props":2105,"children":2106},{"style":184},[2107],{"type":37,"value":351},{"type":32,"tag":171,"props":2109,"children":2110},{"style":343},[2111],{"type":37,"value":356},{"type":32,"tag":171,"props":2113,"children":2114},{"style":184},[2115],{"type":37,"value":2116}," Client\n",{"type":32,"tag":171,"props":2118,"children":2119},{"class":173,"line":190},[2120],{"type":32,"tag":171,"props":2121,"children":2122},{"emptyLinePlaceholder":367},[2123],{"type":37,"value":370},{"type":32,"tag":171,"props":2125,"children":2126},{"class":173,"line":199},[2127,2132,2136],{"type":32,"tag":171,"props":2128,"children":2129},{"style":184},[2130],{"type":37,"value":2131},"client ",{"type":32,"tag":171,"props":2133,"children":2134},{"style":343},[2135],{"type":37,"value":401},{"type":32,"tag":171,"props":2137,"children":2138},{"style":184},[2139],{"type":37,"value":2140}," Client()\n",{"type":32,"tag":171,"props":2142,"children":2143},{"class":173,"line":219},[2144],{"type":32,"tag":171,"props":2145,"children":2146},{"emptyLinePlaceholder":367},[2147],{"type":37,"value":370},{"type":32,"tag":171,"props":2149,"children":2150},{"class":173,"line":233},[2151,2156,2160,2165,2170],{"type":32,"tag":171,"props":2152,"children":2153},{"style":184},[2154],{"type":37,"value":2155},"dataset ",{"type":32,"tag":171,"props":2157,"children":2158},{"style":343},[2159],{"type":37,"value":401},{"type":32,"tag":171,"props":2161,"children":2162},{"style":184},[2163],{"type":37,"value":2164}," 
client.create_dataset(",{"type":32,"tag":171,"props":2166,"children":2167},{"style":208},[2168],{"type":37,"value":2169},"\"marketing_blog_golden_v3\"",{"type":32,"tag":171,"props":2171,"children":2172},{"style":184},[2173],{"type":37,"value":642},{"type":32,"tag":171,"props":2175,"children":2176},{"class":173,"line":242},[2177],{"type":32,"tag":171,"props":2178,"children":2179},{"emptyLinePlaceholder":367},[2180],{"type":37,"value":370},{"type":32,"tag":171,"props":2182,"children":2183},{"class":173,"line":250},[2184],{"type":32,"tag":171,"props":2185,"children":2186},{"style":1202},[2187],{"type":37,"value":2188},"# Aggiungi gli esempi golden\n",{"type":32,"tag":171,"props":2190,"children":2191},{"class":173,"line":267},[2192,2197,2201],{"type":32,"tag":171,"props":2193,"children":2194},{"style":184},[2195],{"type":37,"value":2196},"examples ",{"type":32,"tag":171,"props":2198,"children":2199},{"style":343},[2200],{"type":37,"value":401},{"type":32,"tag":171,"props":2202,"children":2203},{"style":184},[2204],{"type":37,"value":2205}," [\n",{"type":32,"tag":171,"props":2207,"children":2208},{"class":173,"line":26},[2209],{"type":32,"tag":171,"props":2210,"children":2211},{"style":184},[2212],{"type":37,"value":2213},"    {\n",{"type":32,"tag":171,"props":2215,"children":2216},{"class":173,"line":289},[2217,2222,2226,2230,2234,2238,2242,2246,2250,2254],{"type":32,"tag":171,"props":2218,"children":2219},{"style":208},[2220],{"type":37,"value":2221},"        
\"inputs\"",{"type":32,"tag":171,"props":2223,"children":2224},{"style":184},[2225],{"type":37,"value":1829},{"type":32,"tag":171,"props":2227,"children":2228},{"style":208},[2229],{"type":37,"value":1834},{"type":32,"tag":171,"props":2231,"children":2232},{"style":184},[2233],{"type":37,"value":539},{"type":32,"tag":171,"props":2235,"children":2236},{"style":208},[2237],{"type":37,"value":1843},{"type":32,"tag":171,"props":2239,"children":2240},{"style":184},[2241],{"type":37,"value":416},{"type":32,"tag":171,"props":2243,"children":2244},{"style":208},[2245],{"type":37,"value":1852},{"type":32,"tag":171,"props":2247,"children":2248},{"style":184},[2249],{"type":37,"value":539},{"type":32,"tag":171,"props":2251,"children":2252},{"style":208},[2253],{"type":37,"value":1861},{"type":32,"tag":171,"props":2255,"children":2256},{"style":184},[2257],{"type":37,"value":1866},{"type":32,"tag":171,"props":2259,"children":2260},{"class":173,"line":618},[2261,2266,2270,2275,2279,2284],{"type":32,"tag":171,"props":2262,"children":2263},{"style":208},[2264],{"type":37,"value":2265},"        \"outputs\"",{"type":32,"tag":171,"props":2267,"children":2268},{"style":184},[2269],{"type":37,"value":1829},{"type":32,"tag":171,"props":2271,"children":2272},{"style":208},[2273],{"type":37,"value":2274},"\"title\"",{"type":32,"tag":171,"props":2276,"children":2277},{"style":184},[2278],{"type":37,"value":539},{"type":32,"tag":171,"props":2280,"children":2281},{"style":208},[2282],{"type":37,"value":2283},"\"Server-Side GTM: Misurazione Post-Cookie\"",{"type":32,"tag":171,"props":2285,"children":2286},{"style":184},[2287],{"type":37,"value":1866},{"type":32,"tag":171,"props":2289,"children":2290},{"class":173,"line":636},[2291,2296,2300,2305,2309,2314,2318,2323,2327,2332],{"type":32,"tag":171,"props":2292,"children":2293},{"style":208},[2294],{"type":37,"value":2295},"        
\"metadata\"",{"type":32,"tag":171,"props":2297,"children":2298},{"style":184},[2299],{"type":37,"value":1829},{"type":32,"tag":171,"props":2301,"children":2302},{"style":208},[2303],{"type":37,"value":2304},"\"expected_h2_count\"",{"type":32,"tag":171,"props":2306,"children":2307},{"style":184},[2308],{"type":37,"value":539},{"type":32,"tag":171,"props":2310,"children":2311},{"style":275},[2312],{"type":37,"value":2313},"5",{"type":32,"tag":171,"props":2315,"children":2316},{"style":184},[2317],{"type":37,"value":416},{"type":32,"tag":171,"props":2319,"children":2320},{"style":208},[2321],{"type":37,"value":2322},"\"expected_word_count\"",{"type":32,"tag":171,"props":2324,"children":2325},{"style":184},[2326],{"type":37,"value":539},{"type":32,"tag":171,"props":2328,"children":2329},{"style":275},[2330],{"type":37,"value":2331},"1500",{"type":32,"tag":171,"props":2333,"children":2334},{"style":184},[2335],{"type":37,"value":2054},{"type":32,"tag":171,"props":2337,"children":2338},{"class":173,"line":866},[2339],{"type":32,"tag":171,"props":2340,"children":2341},{"style":184},[2342],{"type":37,"value":2343},"    },\n",{"type":32,"tag":171,"props":2345,"children":2346},{"class":173,"line":889},[2347],{"type":32,"tag":171,"props":2348,"children":2349},{"style":1202},[2350],{"type":37,"value":2351},"    # 50+ 
esempi...\n",{"type":32,"tag":171,"props":2353,"children":2354},{"class":173,"line":910},[2355],{"type":32,"tag":171,"props":2356,"children":2357},{"style":184},[2358],{"type":37,"value":295},{"type":32,"tag":171,"props":2360,"children":2361},{"class":173,"line":928},[2362],{"type":32,"tag":171,"props":2363,"children":2364},{"emptyLinePlaceholder":367},[2365],{"type":37,"value":370},{"type":32,"tag":171,"props":2367,"children":2368},{"class":173,"line":949},[2369,2373,2378,2382],{"type":32,"tag":171,"props":2370,"children":2371},{"style":343},[2372],{"type":37,"value":483},{"type":32,"tag":171,"props":2374,"children":2375},{"style":184},[2376],{"type":37,"value":2377}," ex ",{"type":32,"tag":171,"props":2379,"children":2380},{"style":343},[2381],{"type":37,"value":493},{"type":32,"tag":171,"props":2383,"children":2384},{"style":184},[2385],{"type":37,"value":2386}," examples:\n",{"type":32,"tag":171,"props":2388,"children":2389},{"class":173,"line":966},[2390,2395,2400,2405,2410,2414],{"type":32,"tag":171,"props":2391,"children":2392},{"style":184},[2393],{"type":37,"value":2394},"    client.create_example(",{"type":32,"tag":171,"props":2396,"children":2397},{"style":343},[2398],{"type":37,"value":2399},"**",{"type":32,"tag":171,"props":2401,"children":2402},{"style":184},[2403],{"type":37,"value":2404},"ex, ",{"type":32,"tag":171,"props":2406,"children":2407},{"style":599},[2408],{"type":37,"value":2409},"dataset_id",{"type":32,"tag":171,"props":2411,"children":2412},{"style":343},[2413],{"type":37,"value":401},{"type":32,"tag":171,"props":2415,"children":2416},{"style":184},[2417],{"type":37,"value":2418},"dataset.id)\n",{"type":32,"tag":33,"props":2420,"children":2421},{},[2422],{"type":37,"value":2423},"On every prompt change, test against this dataset. If the pass rate drops, do not deploy. 
Quando trovi un edge case in produzione (un bug che non era nel golden set), aggiungilo, così non regredisci.",{"type":32,"tag":45,"props":2425,"children":2427},{"id":2426},"tradeoff-metriche-deterministiche-vs-output-creativo",[2428],{"type":37,"value":2429},"Tradeoff: Metriche Deterministiche vs Output Creativo",{"type":32,"tag":33,"props":2431,"children":2432},{},[2433],{"type":37,"value":2434},"La forza degli LLM è il non-determinismo: lo stesso input produce output diversi. Ma in un sistema di produzione questa caratteristica diventa un rischio: il cliente vede un markdown diverso ad ogni refresh della pagina e alcune versioni contengono errori.",{"type":32,"tag":33,"props":2436,"children":2437},{},[2438],{"type":37,"value":2439},"La temperature 0 aumenta il determinismo, ma l'output diventa monotono. Tradeoff:",{"type":32,"tag":84,"props":2441,"children":2442},{},[2443,2453,2463],{"type":32,"tag":88,"props":2444,"children":2445},{},[2446,2451],{"type":32,"tag":123,"props":2447,"children":2448},{},[2449],{"type":37,"value":2450},"Temperature 0",{"type":37,"value":2452},": ideale per la test suite, ma monotona in produzione",{"type":32,"tag":88,"props":2454,"children":2455},{},[2456,2461],{"type":32,"tag":123,"props":2457,"children":2458},{},[2459],{"type":37,"value":2460},"Temperature 0.3-0.5",{"type":37,"value":2462},": varietà ragionevole, output comunque coerente",{"type":32,"tag":88,"props":2464,"children":2465},{},[2466,2471],{"type":32,"tag":123,"props":2467,"children":2468},{},[2469],{"type":37,"value":2470},"Temperature 0.7+",{"type":37,"value":2472},": creativa, ma con sorprese in produzione anche quando la test suite è verde",{"type":32,"tag":33,"props":2474,"children":2475},{},[2476],{"type":37,"value":2477},"La soluzione: usa temperature 0 negli eval e 0.4 in produzione; nel golden set salva 5 output accettabili diversi per ogni input (controllo di range).",{"type":32,"tag":33,"props":2479,"children":2480},{},[2481,2483,2488],{"type":37,"value":2482},"Un
altro tradeoff: ",{"type":32,"tag":123,"props":2484,"children":2485},{},[2486],{"type":37,"value":2487},"latency vs qualità",{"type":37,"value":2489},". Un prompt più lungo produce output migliore, ma il costo dei token di input aumenta e la latency cresce. In Promptfoo, se la metrica di latency supera 2.5s, lancia un alert: non rovinare l'esperienza utente.",{"type":32,"tag":45,"props":2491,"children":2493},{"id":2492},"checklist-di-produzione-prima-di-deployare-il-tuo-sistema-llm",[2494],{"type":37,"value":2495},"Checklist di Produzione: Prima di Deployare il Tuo Sistema LLM",{"type":32,"tag":33,"props":2497,"children":2498},{},[2499],{"type":37,"value":2500},"Lista di controllo pre-deployment:",{"type":32,"tag":84,"props":2502,"children":2505},{"className":2503},[2504],"contains-task-list",[2506,2518,2527,2536,2545,2554,2563,2572,2581],{"type":32,"tag":88,"props":2507,"children":2510},{"className":2508},[2509],"task-list-item",[2511,2516],{"type":32,"tag":2512,"props":2513,"children":2515},"input",{"disabled":367,"type":2514},"checkbox",[],{"type":37,"value":2517}," Il prompt è in un repo git, la storia dei commit è pulita",{"type":32,"tag":88,"props":2519,"children":2521},{"className":2520},[2509],[2522,2525],{"type":32,"tag":2512,"props":2523,"children":2524},{"disabled":367,"type":2514},[],{"type":37,"value":2526}," La test suite Promptfoo ha pass rate > 95%",{"type":32,"tag":88,"props":2528,"children":2530},{"className":2529},[2509],[2531,2534],{"type":32,"tag":2512,"props":2532,"children":2533},{"disabled":367,"type":2514},[],{"type":37,"value":2535}," Il golden dataset contiene almeno 50 esempi",{"type":32,"tag":88,"props":2537,"children":2539},{"className":2538},[2509],[2540,2543],{"type":32,"tag":2512,"props":2541,"children":2542},{"disabled":367,"type":2514},[],{"type":37,"value":2544}," Il piano dell'A\u002FB test è pronto, sample size
calcolata",{"type":32,"tag":88,"props":2546,"children":2548},{"className":2547},[2509],[2549,2552],{"type":32,"tag":2512,"props":2550,"children":2551},{"disabled":367,"type":2514},[],{"type":37,"value":2553}," Il tracing di LangSmith è attivo, API key configurata in produzione",{"type":32,"tag":88,"props":2555,"children":2557},{"className":2556},[2509],[2558,2561],{"type":32,"tag":2512,"props":2559,"children":2560},{"disabled":367,"type":2514},[],{"type":37,"value":2562}," Il loop di feedback è configurato (scoring degli editor, join in BigQuery)",{"type":32,"tag":88,"props":2564,"children":2566},{"className":2565},[2509],[2567,2570],{"type":32,"tag":2512,"props":2568,"children":2569},{"disabled":367,"type":2514},[],{"type":37,"value":2571}," La procedura di rollback è definita (quale metrica in calo fa scattare il rollback automatico)",{"type":32,"tag":88,"props":2573,"children":2575},{"className":2574},[2509],[2576,2579],{"type":32,"tag":2512,"props":2577,"children":2578},{"disabled":367,"type":2514},[],{"type":37,"value":2580}," Monitoring dei costi: soglia di spend giornaliero $X",{"type":32,"tag":88,"props":2582,"children":2584},{"className":2583},[2509],[2585,2588],{"type":32,"tag":2512,"props":2586,"children":2587},{"disabled":367,"type":2514},[],{"type":37,"value":2589}," SLA di latency: p95 \u003C 3s",{"type":32,"tag":33,"props":2591,"children":2592},{},[2593],{"type":37,"value":2594},"Se non completi questa lista, non puoi dire di fornire un \"servizio AI\". Senza versioning, evaluation e observability, il deployment di un LLM in produzione non è un'operazione controllata, è caos controllato.",{"type":32,"tag":2596,"props":2597,"children":2598},"hr",{},[],{"type":32,"tag":33,"props":2600,"children":2601},{},[2602,2604,2611],{"type":37,"value":2603},"Il versionamento dei prompt è una questione di disciplina: non per velocità, ma per affidabilità.
In tattiche come ",{"type":32,"tag":677,"props":2605,"children":2608},{"href":2606,"rel":2607},"https:\u002F\u002Fwww.roibase.com.tr\u002Fit\u002Fgeo",[681],[2609],{"type":37,"value":2610},"Generative Engine Optimization",{"type":37,"value":2612},", la qualità dell'output si collega direttamente all'outcome di business. Senza una pipeline di evaluation, ogni deployment rischia le performance precedenti. Promptfoo fornisce garanzie locali, LangSmith visibilità in produzione. Insieme, portano le operazioni LLM allo standard dell'ingegneria software.",{"type":32,"tag":2614,"props":2615,"children":2616},"style",{},[2617],{"type":37,"value":2618},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}",{"title":16,"searchDepth":199,"depth":199,"links":2620},[2621,2622,2625,2626,2629,2630],{"id":47,"depth":190,"text":50},{"id":110,"depth":190,"text":113,"children":2623},[2624],{"id":690,"depth":199,"text":693},{"id":1124,"depth":190,"text":1127},{"id":1711,"depth":190,"text":1714,"children":2627},[2628],{"id":2069,"depth":199,"text":2072},{"id":2426,"depth":190,"text":2429},{"id":2492,"depth":190,"text":2495},"markdown","content:it:ai:versionamento-prompt-e-a-b-test-disciplina-llm-ops.md","content","it\u002Fai\u002Fversionamento-prompt-e-a-b-test-disciplina-llm-ops.md","it\u002Fai\u002Fversionamento-prompt-e-a-b-test-disciplina-llm-ops","md",1778709809702]