Model Validation

Validate your ML model to estimate its generalization performance and identify training gaps before deploying. Go to Natural Language > Training > Validate Model to access validation options.

K-fold Cross-Validation

Partitions the training utterances into k subsets (folds), trains on k-1 folds, and tests on the remaining fold. The cycle repeats until every fold has served as the test set, so each utterance is tested once.
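The partitioning described above can be sketched in a few lines of Python. This is only an illustration of the general k-fold technique; the function name k_fold_splits is hypothetical, and the platform's own splitting logic (for example, whether it balances folds by intent) is not documented here.

```python
import random

def k_fold_splits(utterances, k, seed=0):
    """Yield (train, test) pairs; each utterance lands in exactly one test fold."""
    if not 2 <= k <= len(utterances):
        raise ValueError("k must be between 2 and the number of utterances")
    shuffled = list(utterances)
    random.Random(seed).shuffle(shuffled)
    # Deal utterances round-robin so fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Training data is everything outside the current test fold.
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, test

utterances = [f"utterance {n}" for n in range(10)]
for train, test in k_fold_splits(utterances, k=5):
    print(len(train), "training /", len(test), "test")
```

With 10 utterances and k=5, each round trains on 8 utterances and tests on 2.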

K-fold Cross-Validation is not available for Few-shot and Zero-shot models. Minimum requirements: 5 utterances per intent, 250 utterances total.
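To check a training set against these minimums before opening the report, something like the following could be used offline. It is a sketch that assumes the utterances are available as (text, intent) pairs; the function name and data layout are hypothetical and not part of the platform.

```python
from collections import Counter

MIN_PER_INTENT = 5   # minimum utterances per intent, as stated above
MIN_TOTAL = 250      # minimum utterances overall, as stated above

def kfold_eligibility(labeled_utterances):
    """Return a list of problems; an empty list means both minimums are met."""
    counts = Counter(intent for _, intent in labeled_utterances)
    total = sum(counts.values())
    problems = []
    if total < MIN_TOTAL:
        problems.append(f"only {total} utterances in total; need {MIN_TOTAL}")
    for intent, n in sorted(counts.items()):
        if n < MIN_PER_INTENT:
            problems.append(f"intent '{intent}' has {n} utterances; need {MIN_PER_INTENT}")
    return problems

sample = [(f"book flight {i}", "Book Flight") for i in range(248)]
sample += [("cancel it", "Cancel Booking"), ("drop my trip", "Cancel Booking")]
print(kfold_eligibility(sample))
```

Intents with no utterances at all never appear in the input, so they are not reported by this check.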

Configure K-fold

Set the k parameter at NLU Config > Engine Tuning > K Fold (value between 2 and 10; default: 2).

Generate the Report

Go to Natural Language > Training.
Click Validate Model > K-fold Cross-Validation.
Click Generate (first time) or Re-generate (subsequent runs).

Metrics

Metric	Description
Precision	Ratio of true positives to total predicted positives. Measures accuracy.
Recall	Ratio of true positives to actual positives. Measures coverage.
F1 Score	Harmonic mean of precision and recall.
Mean	Average of precision, recall, and F1 across all folds.
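As a worked illustration of these metrics, the sketch below computes precision, recall, and F1 for one intent from lists of expected and predicted intents, then averages per-fold results. The standard textbook definitions are used, with F1 as the harmonic mean of precision and recall; the function names are hypothetical, and the report may aggregate across intents differently.

```python
from statistics import mean

def precision_recall_f1(expected, predicted, intent):
    """Compute the three metrics for a single intent."""
    pairs = list(zip(expected, predicted))
    tp = sum(e == intent and p == intent for e, p in pairs)  # true positives
    fp = sum(e != intent and p == intent for e, p in pairs)  # false positives
    fn = sum(e == intent and p != intent for e, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def mean_across_folds(per_fold):
    """Average (precision, recall, f1) tuples column by column."""
    return tuple(mean(column) for column in zip(*per_fold))

expected = ["A", "A", "A", "B", "B"]
predicted = ["A", "A", "B", "A", "B"]
print(precision_recall_f1(expected, predicted, "A"))
```

In this example, intent A has 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all equal 2/3.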

Additional context:

Total Utterances — total training corpus size.
Number of Intents — intents in the app.
Number of Folds — k value used.
Test/Training Data per Fold — utterances in each subset.

Export

Click the Export icon on the K-fold page, then click Proceed. File name format: Kfold_BotName_YYYYMMDDHHmmSS.csv.
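When several exports accumulate, the bot name and timestamp can be recovered from the file name. The sketch below assumes the timestamp is a 14-digit year, month, day, 24-hour, minute, second sequence; that reading of the pattern and the function name are assumptions, and the CSV columns are not covered.

```python
import re
from datetime import datetime

# Assumed layout: Kfold_<BotName>_<YYYYMMDDHHmmSS>.csv
_EXPORT_NAME = re.compile(r"^Kfold_(?P<bot>.+)_(?P<stamp>\d{14})\.csv$")

def parse_kfold_export_name(filename):
    """Split an export file name into (bot_name, export_time)."""
    match = _EXPORT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a K-fold export name: {filename!r}")
    exported_at = datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S")
    return match.group("bot"), exported_at

print(parse_kfold_export_name("Kfold_TravelBot_20240305140709.csv"))
```

Bot names that contain underscores still parse, because only the final underscore before the 14 digits is treated as the separator.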

Confusion Matrix

Visualizes how well trained utterances match their intended tasks. Each dot represents an utterance, plotted in one of four quadrants per task.

Requires at least one utterance. Only includes intents with utterances in the training dataset.

Quadrants

Quadrant	Meaning	Action
True Positive	Utterance matches its trained intent.	Favorable. Higher position = more confidence.
True Negative	Utterance correctly does not match an unrelated intent.	Favorable. Lower position = better separation.
False Positive	Utterance incorrectly matches an unrelated intent.	Fix: retrain the utterance or the competing intent.
False Negative	Utterance fails to match its trained intent.	Fix: retrain the utterance or the intent.
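The quadrant logic in the table can be expressed as a small function. This sketch assumes that, for a given utterance, you know the intent it was trained under and the set of intents the model matched; the names are hypothetical, and the vertical position (confidence score) is not modeled.

```python
def quadrant(expected_intent, matched_intents, task):
    """Place one utterance in a quadrant for one task."""
    trained_for_task = expected_intent == task
    matched_task = task in matched_intents
    if trained_for_task:
        # The utterance belongs to this task: a match is good, a miss is not.
        return "True Positive" if matched_task else "False Negative"
    # The utterance belongs elsewhere: a match here is a wrong hit.
    return "False Positive" if matched_task else "True Negative"

# An utterance trained under "Book Flight" that the model matched to "Book Hotel".
for task in ("Book Flight", "Book Hotel", "Cancel Booking"):
    print(task, "->", quadrant("Book Flight", {"Book Hotel"}, task))
```

The same utterance lands in a different quadrant for each task, which is why the matrix is drawn per task.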

Reading the Graph

Utterances in the top of True quadrants — best match quality.
Utterances in False quadrants — require immediate attention.
An utterance in the True Positive quadrant of multiple tasks — overlapping intents; must be resolved.
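The reading rules above can also be applied to tabular data. The sketch below assumes a list of (utterance, task, quadrant) rows that you have assembled yourself; this row layout and both function names are hypothetical, not a platform export format.

```python
from collections import defaultdict

def needs_attention(rows):
    """Rows that fall in either False quadrant."""
    return [row for row in rows if row[2].startswith("False")]

def overlapping_utterances(rows):
    """Utterances that are a True Positive for more than one task."""
    tasks_by_utterance = defaultdict(set)
    for utterance, task, quad in rows:
        if quad == "True Positive":
            tasks_by_utterance[utterance].add(task)
    return {u: sorted(t) for u, t in tasks_by_utterance.items() if len(t) > 1}

rows = [
    ("book a room", "Book Hotel", "True Positive"),
    ("book a room", "Book Stay", "True Positive"),
    ("cancel my trip", "Book Hotel", "False Positive"),
    ("cancel my trip", "Cancel Booking", "False Negative"),
]
print(overlapping_utterances(rows))
print(needs_attention(rows))
```

Here "book a room" is flagged as overlapping between two tasks, and both "cancel my trip" rows are flagged for attention.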

Edit and Reassign Utterances

Click a quadrant to open the quadrant view.
Click the Edit icon on an utterance row.
Update the text or reassign to a different intent using Expected Task.

Filter the Graph

Filter by In Development / Published, Weak to Strong or Strong to Weak score order, specific intents, or specific utterances.

Re-run Model

After any changes, click Re-Run Model to generate the latest matrix. The platform prompts you to Train and Regenerate (if unsaved changes) or Regenerate (if model is current).
