Understanding AI Document Classification
CaseFlow's AI classification engine is the core of our document intelligence platform. Understanding how it works helps you trust its recommendations and know when to override them.
How AI Classification Works
When you upload a document, CaseFlow's AI performs multiple analyses simultaneously:
Document Type Recognition: Is this a complaint, motion, order, subpoena, evidence file, or something else?
Content Extraction: What are the case numbers, party names, dates, and other key facts?
Jurisdiction Detection: What rules and procedures apply based on the court and case type?
Compliance Checking: Are there any deadlines, requirements, or red flags to note?
This happens in seconds, even for complex multi-page documents, because the AI has been trained on millions of legal documents across hundreds of jurisdictions.
The Training Data
CaseFlow's models are trained on publicly available court documents, legal filings, and anonymized case files (with permission). This training data spans federal, state, and municipal courts across diverse case types: criminal, civil, family, probate, traffic, and more. The AI learns patterns in document structure, legal terminology, and procedural requirements.
Jurisdiction-Specific Learning
Beyond general training, CaseFlow adapts to your specific jurisdiction. As you use the platform and correct any misclassifications, the AI learns your court's unique terminology, local rules, and document formats. Over time, accuracy improves for your jurisdiction specifically—this is why CaseFlow gets better the more you use it.
Confidence Scores
Every classification comes with a confidence score (0-100%). High confidence (90%+) means the AI is very certain about its classification. Low confidence (below 70%) means the AI detected ambiguity and flags the document for your review. You'll see these confidence scores in your dashboard, and you can set thresholds for automatic vs. manual review.
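The threshold behavior described above can be sketched in a few lines of Python. This is an illustration only, not CaseFlow's implementation: the names `ReviewThresholds` and `route` are hypothetical, and the "suggested" band between 70% and 90% is an assumption, since the text defines only the high and low ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewThresholds:
    """Hypothetical settings object; defaults mirror the 90% / 70% cutoffs above."""
    auto_accept: float = 90.0   # at or above: treated as high confidence
    needs_review: float = 70.0  # below: flagged for manual review


def route(confidence: float, thresholds: ReviewThresholds = ReviewThresholds()) -> str:
    """Map a 0-100 confidence score to a review path."""
    if not 0.0 <= confidence <= 100.0:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= thresholds.auto_accept:
        return "auto"
    if confidence < thresholds.needs_review:
        return "manual_review"
    # Assumption: scores between the two cutoffs are shown as suggestions.
    return "suggested"
```

Raising `auto_accept` sends more documents to a person; lowering `needs_review` sends fewer. Which trade-off is right depends on how costly a misfiled document is for your court.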
What the AI Looks For
Document classification relies on multiple signals:
Headers and Titles: "Motion to Dismiss", "Complaint for Damages", etc.
Structural Patterns: Where case numbers appear, how parties are listed, signature blocks
Legal Language: Specific phrases like "Plaintiff alleges" or "Court hereby orders"
Formatting Cues: Font choices, spacing, section numbering common to certain document types
The AI doesn't just look for keywords—it understands context and structure, which is why it can distinguish between a motion to dismiss and a response to a motion to dismiss, even though both contain similar words.
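To see why keyword matching alone falls short, consider the toy comparison below. It is not how CaseFlow's models work; both functions are hypothetical and deliberately simple. The first looks only for a phrase, while the second looks at where the phrase sits in the title, a crude stand-in for the structural signals listed above.

```python
import re


def keyword_guess(text: str) -> str:
    """Naive approach: any mention of the phrase counts as a motion."""
    return "motion" if "motion to dismiss" in text.lower() else "unknown"


def title_guess(title: str) -> str:
    """Structure-aware toy: what the title starts with matters."""
    t = title.strip().lower()
    # A title that opens with "response", "opposition", or "reply" is
    # answering a motion rather than making one.
    if re.match(r"^(response|opposition|reply)\b.*\bmotion to dismiss\b", t):
        return "response_to_motion"
    if re.match(r"^motion to dismiss\b", t):
        return "motion"
    return "unknown"
```

With the title "Response to Motion to Dismiss", `keyword_guess` reports a motion, while `title_guess` reports a response.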
Extracting Key Information
Beyond classification, the AI extracts structured data:
Party Names: Plaintiff, defendant, petitioner, respondent, and their attorneys
Case Numbers: Including docket numbers, filing numbers, and jurisdiction identifiers
Dates: Filing dates, hearing dates, deadlines, statute of limitations calculations
Court Information: Division, judge assignment, courthouse location
Financial Data: Damages sought, filing fees, bond amounts
This extracted data populates your case management dashboard automatically, eliminating manual data entry.
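One way to picture the extracted data is as a structured record with a slot for each kind of field listed above. The sketch below is an assumption for illustration: `ExtractedRecord` and its field names are hypothetical and do not describe CaseFlow's actual schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class ExtractedRecord:
    """Hypothetical container for data pulled from one document."""
    document_type: str
    case_number: Optional[str] = None
    parties: dict[str, str] = field(default_factory=dict)  # role -> name
    filing_date: Optional[date] = None
    hearing_date: Optional[date] = None
    court_division: Optional[str] = None
    judge: Optional[str] = None
    damages_sought: Optional[Decimal] = None

    def missing_fields(self) -> list[str]:
        """Names of fields that were not found, in declaration order."""
        return [name for name, value in asdict(self).items()
                if value is None or value == {}]
```

A record with empty slots is still useful: the list returned by `missing_fields` shows a reviewer exactly what still needs to be entered by hand.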
Handling Ambiguity
Sometimes documents are genuinely ambiguous. A document might be both a motion and a notice, or it might be a type the AI hasn't seen before. In these cases, CaseFlow flags the document as "Requires Review" rather than guessing. This is intentional: we'd rather ask for your input than auto-classify incorrectly.
Your Role in Classification
You are always the final authority. CaseFlow's classifications are suggestions, not commands. Every classified document shows "Confirm" and "Correct" options. If the AI got it right, one click confirms. If not, you select the correct classification from a dropdown, and CaseFlow remembers your correction for similar documents in the future.
Continuous Improvement
Every correction you make trains the AI to be more accurate. If you tell CaseFlow that a document labeled "Motion" is actually a "Notice," it updates its understanding of how your jurisdiction uses these terms. This learning is isolated to your account—your corrections don't affect other jurisdictions unless you choose to share anonymized feedback with us for general model improvement.
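The account isolation described above can be illustrated with a small lookup table keyed by account. This is a simplified sketch, not the actual learning mechanism: `CorrectionStore` is a hypothetical name, and a real system would adjust a model rather than memorize titles.

```python
from collections import defaultdict
from typing import Optional


class CorrectionStore:
    """Toy per-account memory of corrected labels, keyed by document title."""

    def __init__(self) -> None:
        # account_id -> {normalized title -> corrected label}
        self._by_account: dict[str, dict[str, str]] = defaultdict(dict)

    @staticmethod
    def _normalize(title: str) -> str:
        return " ".join(title.lower().split())

    def record(self, account_id: str, title: str, corrected_label: str) -> None:
        """Remember a user's correction for this account only."""
        self._by_account[account_id][self._normalize(title)] = corrected_label

    def lookup(self, account_id: str, title: str) -> Optional[str]:
        """Return this account's remembered label, or None if there is none."""
        return self._by_account.get(account_id, {}).get(self._normalize(title))
```

Because every lookup passes through `account_id`, a correction recorded by one court never changes the result for another.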
Privacy and Data Usage
Your documents are never used to train our public AI models without explicit permission. Document analysis happens in your isolated environment, and extracted data stays within your jurisdiction's account. If you choose to participate in our model improvement program (opt-in only), documents are anonymized and stripped of personally identifiable information before being added to training data.
Accuracy Benchmarks
CaseFlow maintains 90%+ classification accuracy across most document types and jurisdictions. For common documents like complaints and motions, accuracy typically exceeds 95%. For rare or highly specialized documents, accuracy may be lower initially but improves quickly as you provide corrections.
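If you want to check accuracy for your own document mix, the calculation is straightforward once each reviewed document has both the AI's label and the label you confirmed. The sketch below assumes you can export those pairs; the function name and input shape are hypothetical, not a CaseFlow API.

```python
from collections import Counter
from typing import Iterable


def accuracy_by_type(reviews: Iterable[tuple[str, str]]) -> dict[str, float]:
    """Share of correct predictions, grouped by the confirmed label.

    Each review is a (predicted_label, confirmed_label) pair.
    """
    total: Counter = Counter()
    correct: Counter = Counter()
    for predicted, confirmed in reviews:
        total[confirmed] += 1
        if predicted == confirmed:
            correct[confirmed] += 1
    return {label: correct[label] / total[label] for label in total}
```

Grouping by the confirmed label shows how often each real document type is recognized, which makes it easy to spot the rare types that need more corrections.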