Validate your ML model to estimate its generalization performance and identify training gaps before deploying. Go to Natural Language > Training > Validate Model to access validation options.

K-fold Cross-Validation

Partitions the training utterances into k subsets (folds), trains on k-1 folds, and tests on the remaining fold. The process repeats k times so that each fold serves as the test set once and every utterance is tested.
K-fold Cross-Validation is not available for Few-shot and Zero-shot models. Minimum requirements: 5 utterances per intent and 250 utterances in total.
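
The partitioning described above can be sketched in plain Python. This is only an illustration of the general k-fold technique, not the platform's implementation: the function names, the shuffling, and the round-robin fold assignment are assumptions, and the platform may split utterances differently (for example, per intent).

```python
import random
from collections import Counter


def meets_minimums(labels, per_intent=5, total=250):
    """Check the stated minimums: 5 utterances per intent, 250 overall.

    `labels` holds one intent name per training utterance (hypothetical input).
    """
    counts = Counter(labels)
    return len(labels) >= total and all(c >= per_intent for c in counts.values())


def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each index lands in exactly one test fold."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Round-robin slicing gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Iterating over `k_fold_indices(len(utterances), k)` gives one train/test split per fold; a model would be trained and scored once per split.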

Configure K-fold

Set the k parameter at NLU Config > Engine Tuning > K Fold (value between 2 and 10; default: 2).
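
To see how the k value affects the amount of data per fold, here is a small sketch that assumes an even split of the corpus. The `fold_sizes` helper is hypothetical; the platform's actual fold sizes may differ slightly.

```python
def fold_sizes(total_utterances, k):
    """Return (train_size, test_size) for each fold, assuming an even split."""
    if not 2 <= k <= 10:
        raise ValueError("k must be between 2 and 10")
    base, extra = divmod(total_utterances, k)
    sizes = []
    for i in range(k):
        # The first `extra` folds take one additional test utterance.
        test = base + (1 if i < extra else 0)
        sizes.append((total_utterances - test, test))
    return sizes
```

With 250 utterances, k = 2 trains on 125 and tests on 125 in each fold, while k = 10 trains on 225 and tests on 25. A larger k leaves more data for training in each round but requires more training runs.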

Generate the Report

  1. Go to Natural Language > Training.
  2. Click Validate Model > K-fold Cross-Validation.
  3. Click Generate (first time) or Re-generate (subsequent runs).

Metrics

Metric | Description
Precision | Ratio of true positives to total predicted positives. Measures how often a predicted intent is correct.
Recall | Ratio of true positives to actual positives. Measures coverage.
F1 Score | Harmonic mean of precision and recall.
Mean | Average of each metric (precision, recall, F1) across all folds.
Additional context:
  • Total Utterances — total training corpus size.
  • Number of Intents — intents in the app.
  • Number of Folds — k value used.
  • Test/Training Data per Fold — utterances in each subset.
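
The metrics above follow the standard definitions, which can be sketched as below. The function names and the per-fold `(tp, fp, fn)` input are assumptions for illustration. F1 is computed here as the harmonic mean of precision and recall, and the mean is a simple average over folds; how the platform aggregates across intents is not specified here.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_across_folds(fold_counts):
    """Average each metric over folds.

    `fold_counts` is a non-empty list of (tp, fp, fn) tuples, one per fold.
    """
    scores = [precision_recall_f1(*counts) for counts in fold_counts]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))
```

For example, a fold with 8 true positives, 2 false positives, and 2 false negatives has a precision and recall of 0.8 each.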

Export

Click the Export icon on the K-fold page, then click Proceed. The exported file is named in the format Kfold_BotName_YYYYMMDDHHmmSS.csv.
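
If you collect several exports, the file name can be parsed to recover the bot name and export time. This is a sketch that assumes the timestamp is a 14-digit year, month, day, hour, minute, second sequence; `parse_kfold_filename` is a hypothetical helper, not a platform API.

```python
import re
from datetime import datetime

# Assumed layout: Kfold_<BotName>_<YYYYMMDDHHmmSS>.csv
_KFOLD_NAME = re.compile(r"^Kfold_(?P<bot>.+)_(?P<ts>\d{14})\.csv$")


def parse_kfold_filename(name):
    """Return (bot_name, export_time), or None if the name does not match."""
    match = _KFOLD_NAME.match(name)
    if match is None:
        return None
    export_time = datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")
    return match.group("bot"), export_time
```

Sorting a list of parsed results by `export_time` gives the reports in the order they were generated.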

Confusion Matrix

Visualizes how well trained utterances match their intended tasks. Each dot represents an utterance, plotted in one of four quadrants per task.
Requires at least one utterance. Only includes intents with utterances in the training dataset.

Quadrants

Quadrant | Meaning | Action
True Positive | Utterance matches its trained intent. | Favorable. Higher position = more confidence.
True Negative | Utterance correctly does not match an unrelated intent. | Favorable. Lower position = better separation.
False Positive | Utterance incorrectly matches an unrelated intent. | Fix: retrain the utterance or the competing intent.
False Negative | Utterance fails to match its trained intent. | Fix: retrain the utterance or the intent.
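
The quadrant logic can be expressed as a small function. This is a sketch of the definitions in the table, with hypothetical parameter names; it does not model the confidence score that determines a dot's vertical position.

```python
def quadrant(expected_intent, intent, matched):
    """Classify one (utterance, intent) pair into a quadrant.

    expected_intent: the intent the utterance was trained for
    intent: the intent (task) whose chart is being drawn
    matched: True if the model matched the utterance to `intent`
    """
    belongs = expected_intent == intent
    if matched:
        return "True Positive" if belongs else "False Positive"
    return "False Negative" if belongs else "True Negative"
```

For instance, an utterance trained for a hypothetical "Book Flight" intent that the model matches to "Cancel Flight" is a False Positive on the "Cancel Flight" chart.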

Reading the Graph

  • Utterances in the top of True quadrants — best match quality.
  • Utterances in False quadrants — require immediate attention.
  • An utterance in the True Positive quadrant of multiple tasks — overlapping intents; must be resolved.
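
The overlap check in the last point can be sketched as follows, assuming you have the matrix results as (utterance, intent, quadrant) records. Both the record layout and the function name are hypothetical.

```python
from collections import defaultdict


def overlapping_utterances(records):
    """Map each utterance that is a True Positive for several intents to those intents.

    `records` is an iterable of (utterance, intent, quadrant) tuples.
    """
    tp_intents = defaultdict(set)
    for utterance, intent, quad in records:
        if quad == "True Positive":
            tp_intents[utterance].add(intent)
    return {u: sorted(intents) for u, intents in tp_intents.items() if len(intents) > 1}
```

Any utterance returned by this function points to intents whose training data overlaps and should be reviewed.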

Edit and Reassign Utterances

  1. Click a quadrant to open the quadrant view.
  2. Click the Edit icon on an utterance row.
  3. Update the text or reassign to a different intent using Expected Task.

Filter the Graph

Filter by In Development / Published, Weak to Strong or Strong to Weak score order, specific intents, or specific utterances.

Re-run Model

After making any changes, click Re-Run Model to generate the latest matrix. The platform prompts you to Train and Regenerate (if there are unsaved changes) or Regenerate (if the model is current).