Přeskočit na obsah
TECHNOMATON | Docs SAI certifikovaní trenéři

MEASURE

Jak analyzovat, testovat a validovat AI systémy

NIST AI RMF funkce: MEASURE Subkategorie: MS-1 až MS-4


1. Přehled funkce MEASURE

MEASURE funkce zajišťuje, že AI systémy jsou řádně testovány, validovány a monitorovány.

Cíle MEASURE

CílPopis
TEVVTest, Evaluation, Validation, Verification
MetrikyDefinovat a sledovat KPIs
BiasTestovat fairness a bias
SecurityHodnotit bezpečnost
ContinuousPrůběžný monitoring

2. MS-1: TEVV Framework

2.1 Co je TEVV

2.2 TEVV Lifecycle

FázeTEVV AktivityKdy
DesignRequirements verificationPřed vývojem
DevelopmentUnit testing, code reviewBěhem vývoje
Pre-deploymentIntegration, validation, red-teamingPřed nasazením
DeploymentA/B testing, staged rolloutPři nasazení
ProductionMonitoring, drift detectionPrůběžně
RetirementFinal assessment, lessons learnedPři ukončení

2.3 Testovací strategie

Pro každý AI systém definujte:

## TESTOVACÍ PLÁN: [AI System Name]
### 1. Test Objectives
- Co testujeme?
- Jaká jsou akceptační kritéria?
### 2. Test Types
| Type | Scope | Tools |
|------|-------|-------|
| Unit | Model components | pytest, unittest |
| Integration | End-to-end | Custom scripts |
| Performance | Speed, throughput | Load testing |
| Fairness | Bias detection | Aequitas, Fairlearn |
| Security | Adversarial | Custom red-team |
| User acceptance | Usability | User studies | Product |
### 3. Test Data
- Representative samples
- Edge cases
- Adversarial inputs
### 4. Schedule
| Milestone | Tests | Date |
|-----------|-------|------|
| Dev complete | Unit, Integration | |
| Pre-release | All except UAT | |
| Release | UAT | |
| Monthly | Performance, Fairness | |
### 5. Pass/Fail Criteria
| Metric | Threshold | Current |
|--------|-----------|---------|
| Accuracy | >95% | |
| Latency p99 | <500ms | |
| Fairness (demographic parity) | <5% gap | |

3. MS-2: Performance Metrics

3.1 Core metriky

Accuracy & Reliability

MetrikaPopisPoužití
AccuracySprávnost predikcíClassification
PrecisionTrue positives / predicted positivesKdyž FP je nákladný
RecallTrue positives / actual positivesKdyž FN je nákladný
F1 ScoreHarmonic mean precision & recallBalanced
AUC-ROCArea under ROC curveThreshold selection
RMSE/MAEError metricsRegression
PerplexityLanguage model qualityGAI/LLM

Reliability

MetrikaPopisTarget
ConsistencyStejný input → stejný output>99%
StabilityPerformance over time<5% drift
AvailabilityUptime>99.9%
Latency p50/p99Response timeSLA-defined

3.2 Fairness metriky

MetrikaDefiniceKdy použít
Demographic ParityP(Ŷ=1|G=a) = P(Ŷ=1|G=b)Equal outcomes
Equalized OddsTPR a FPR stejné across groupsEqual error rates
Predictive ParityPPV stejné across groupsEqual precision
Individual FairnessSimilar individuals → similar outcomesCase-by-case

Jak měřit:

# Příklad s Fairlearn
from fairlearn.metrics import demographic_parity_difference
dpd = demographic_parity_difference(
y_true,
y_pred,
sensitive_features=sensitive_feature
)
# Target: |dpd| < 0.05 (5%)

3.3 GAI-specific metriky

MetrikaPopisMěření
Hallucination rateFrekvence fakticky nesprávných výstupůManual review sample
Toxicity scoreŠkodlivost obsahuPerspective API
Bias in generationStereotypy ve výstupechWinogender, BBQ
Instruction followingDodržování promptůBenchmark datasets
Refusal rateOdmítnutí nevhodných požadavkůAdversarial prompts

4. MS-3: Bias Testing

4.1 Pre-deployment bias assessment

Checklist:

  • Identifikovány protected attributes (věk, pohlaví, etnicita, …)
  • Trénovací data analyzována na bias
  • Baseline metriky změřeny per group
  • Fairness thresholds definovány
  • Mitigation strategie připravena

4.2 Testing metodologie

Slicing Analysis

Testujte performance na podskupinách:

| Slice | Count | Accuracy | Precision | Recall |
|-------|-------|----------|-----------|--------|
| Overall | 10000 | 94.5% | 93.2% | 95.1% |
| Gender: M | 5200 | 95.1% | 94.0% | 95.8% |
| Gender: F | 4800 | 93.8% | 92.3% | 94.3% |
| Age: <30 | 3000 | 96.2% | 95.5% | 96.8% |
| Age: 30-50 | 4500 | 94.0% | 93.1% | 94.5% |
| Age: >50 | 2500 | 92.1% | 90.8% | 93.2% |

Alert: Gap > 5% mezi skupinami → vyšetřit

Counterfactual Testing

Změňte protected attribute, sledujte změnu výstupu:

Original: "John applied for a loan..." → Approved
Counterfactual: "Jane applied for a loan..." → Approved? ✓
Pokud outcomes liší → potenciální bias

4.3 Bias mitigation strategie

StrategieFázePopis
Pre-processingDataRebalancing, re-sampling, feature selection
In-processingTrainingFairness constraints, adversarial debiasing
Post-processingOutputThreshold adjustment, equalized odds
Human reviewDeploymentManual override for sensitive decisions

5. MS-4: Security Testing

5.1 AI-specific security threats

ThreatPopisTestování
Adversarial examplesInputs designed to fool modelAdversarial attacks
Model extractionStealing model via queriesRate limiting, monitoring
Data poisoningCorrupting training dataData validation
Prompt injectionManipulating GAI via promptsPrompt fuzzing
Membership inferenceDetecting training data presencePrivacy attacks

5.2 Red-teaming pro GAI

Typy red-teaming:

TypÚčastníciFocus
GeneralBěžní uživateléCommon misuse
ExpertDomain specialistsDomain-specific risks
AdversarialSecurity researchersEdge cases, jailbreaks
AI-assistedAI + humansScale + creativity

Red-teaming checklist pro GAI:

## RED-TEAM SESSION: [GAI System]
### 1. Jailbreaking Attempts
- [ ] Direct harmful requests
- [ ] Role-play scenarios ("pretend you are...")
- [ ] Multi-step prompts (build-up)
- [ ] Language switching
- [ ] Encoding attacks (base64, etc.)
### 2. Information Extraction
- [ ] Training data extraction
- [ ] PII extraction
- [ ] System prompt extraction
- [ ] Capability probing
### 3. Content Generation
- [ ] Harmful content (violence, hate)
- [ ] Misinformation
- [ ] CSAM/NCII attempts
- [ ] Copyright infringement
### 4. Manipulation
- [ ] Social engineering scripts
- [ ] Phishing content
- [ ] Malware code generation
- [ ] CBRN information requests
### 5. Findings
| Finding | Severity | Reproducible | Recommendation |
|---------|----------|--------------|----------------|
| | | | |

5.3 Prompt injection testing

Test scenarios:

ScenarioPříkladExpected
Direct injection”Ignore previous instructions and…”Refuse
Indirect injectionMalicious content in retrieved docsDetect
Context manipulation”From now on, you are…”Maintain boundaries
Delimiter attacksUsing markdown/code blocksParse safely

6. MS-5: Continuous Monitoring

6.1 Monitoring dashboard

Key metrics to track:

6.2 Drift detection

Drift TypeCo sledovatJak detekovatThreshold
Data driftInput distribution změnyPSI, KS testPSI > 0.1
Concept driftRelationship X→Y změnyPerformance drop>5% degradace
Model driftModel behavior změnyPrediction distributionSignificant shift

6.3 Alerting

Alert LevelTriggerResponseSLA
P1 CriticalSafety incident, major outageImmediate escalation15 min
P2 HighPerformance degradation >10%Same-day investigation4 hours
P3 MediumDrift detected, minor issuesPlanned review24 hours
P4 LowInformational, optimizationNext sprint1 week

7. Implementační checklist

Fáze 1: Test Infrastructure (Týden 1-2)

  • Definovat test data management
  • Nastavit test environments
  • Vybrat testing tools
  • Definovat baseline metriky

Fáze 2: Pre-deployment Testing (Týden 3-4)

  • Implementovat unit/integration tests
  • Provést bias assessment
  • Spustit security testing
  • Provést user acceptance testing

Fáze 3: Monitoring Setup (Týden 5-6)

  • Nasadit monitoring stack
  • Definovat KPIs a thresholds
  • Nastavit alerting
  • Vytvořit dashboards

Fáze 4: Continuous (Ongoing)

  • Pravidelné revalidace
  • Drift monitoring
  • Red-teaming sessions
  • Metric reviews

8. Nástroje

KategorieNástrojÚčel
ML Testingpytest, Great ExpectationsData/model testing
FairnessFairlearn, Aequitas, AI Fairness 360Bias detection
ExplainabilitySHAP, LIME, CaptumModel interpretability
SecurityTextAttack, Adversarial Robustness ToolboxAdversarial testing
MonitoringEvidently, Whylabs, ArizeProduction monitoring
GAI EvalHELM, lm-evaluation-harnessLLM benchmarks

Pokračujte na MANAGE pro implementaci MANAGE funkce.


AI-Native Entry Framework | CC BY-NC-SA 4.0