distillcore

API Reference

Complete API reference for the distillcore library.

Standalone Chunking

| Function | Description |
| --- | --- |
| chunk() | Split text into chunks using paragraph, sentence, fixed, or LLM strategy |
| achunk() | Async version of chunk() |
| estimate_tokens() | Estimate token count for a string |

See Chunking for details.
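
To make the paragraph strategy concrete, here is a minimal, self-contained sketch of greedy paragraph packing under a token budget. It does not call distillcore and is not the library's implementation: `chunk_by_paragraph_sketch`, `estimate_tokens_sketch`, the `max_tokens` parameter, and the four-characters-per-token heuristic are all assumptions made for illustration.

```python
import re


def estimate_tokens_sketch(text: str) -> int:
    """Rough token estimate, assuming about four characters per token."""
    if not text:
        return 0
    return max(1, len(text) // 4)


def chunk_by_paragraph_sketch(text: str, max_tokens: int = 50) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks.

    A chunk is closed as soon as adding the next paragraph would push its
    estimated token count over max_tokens. A single oversized paragraph is
    kept whole rather than split.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        candidate = "\n\n".join(current + [para])
        if current and estimate_tokens_sketch(candidate) > max_tokens:
            chunks.append("\n\n".join(current))
            current = [para]
        else:
            current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A real chunker would typically also handle overlap between chunks and split paragraphs that exceed the budget on their own; this sketch leaves both out to keep the packing logic visible.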

Pipeline Entry Points

| Function | Description |
| --- | --- |
| process_document() | Process a file through the full pipeline |
| process_text() | Process raw text through the full pipeline |
| extract() | Extract text from a file |
| Store | SQLite storage with search |

Async Variants

| Function | Description |
| --- | --- |
| process_document_async() | Async version of process_document |
| process_text_async() | Async version of process_text |
| process_batch() | Process multiple files concurrently |
| process_batch_sync() | Synchronous batch processing |

See Async & Batch for details.
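
The relationship between a concurrent batch function and its synchronous wrapper can be sketched with the standard library alone. The code below is an illustration of the general pattern, not distillcore's code: `process_one_sketch` is a hypothetical stand-in for a per-file pipeline call, and the `max_concurrency` parameter is an assumption.

```python
import asyncio


async def process_one_sketch(path: str) -> dict:
    """Hypothetical stand-in for an async per-file pipeline call."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"path": path, "ok": True}


async def process_batch_sketch(paths: list[str], max_concurrency: int = 4) -> list[dict]:
    """Process all paths concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: str) -> dict:
        async with semaphore:
            return await process_one_sketch(path)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(p) for p in paths))


def process_batch_sync_sketch(paths: list[str], max_concurrency: int = 4) -> list[dict]:
    """Blocking wrapper: runs the async batch on a fresh event loop."""
    return asyncio.run(process_batch_sketch(paths, max_concurrency))
```

A wrapper built on `asyncio.run` cannot be called from code that is already running inside an event loop; in that situation the async variant should be awaited directly.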

Configuration

| Class | Description |
| --- | --- |
| DistillConfig | Top-level configuration |
| ChunkConfig | Chunk size, overlap, and strategy |
| EmbeddingConfig | Embedding model selection |
| DomainConfig | Domain-specific LLM prompts |

See Configuration for details.
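
As a rough picture of how a top-level configuration might nest the other three, the sketch below uses plain dataclasses. Every class name, field name, and default here is hypothetical and chosen only to mirror the descriptions in the table; consult the Configuration page for the actual fields.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkConfigSketch:
    # Hypothetical fields mirroring "chunk size, overlap, and strategy".
    chunk_size: int = 500
    overlap: int = 50
    strategy: str = "paragraph"

    def __post_init__(self) -> None:
        if self.strategy not in {"paragraph", "sentence", "fixed", "llm"}:
            raise ValueError(f"unknown strategy: {self.strategy!r}")
        if not 0 <= self.overlap < self.chunk_size:
            raise ValueError("overlap must be >= 0 and smaller than chunk_size")


@dataclass
class EmbeddingConfigSketch:
    # Placeholder model name, not a real model identifier.
    model: str = "example-embedding-model"


@dataclass
class DomainConfigSketch:
    # Maps a hypothetical prompt name to its domain-specific prompt text.
    prompts: dict[str, str] = field(default_factory=dict)


@dataclass
class DistillConfigSketch:
    chunk: ChunkConfigSketch = field(default_factory=ChunkConfigSketch)
    embedding: EmbeddingConfigSketch = field(default_factory=EmbeddingConfigSketch)
    domain: DomainConfigSketch = field(default_factory=DomainConfigSketch)
```

Validating in `__post_init__` means an inconsistent combination, such as an overlap as large as the chunk size, fails when the configuration is built rather than partway through a pipeline run.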

Utilities

| Function | Description |
| --- | --- |
| load_preset(name) | Load a domain preset |
| register_extractor(ext) | Register a custom extractor |
| compute_coverage(original, derived) | Word-level coverage metric |
| find_missing_segments(original, derived) | Find text segments lost during processing |
| safe_parse(raw) | Parse JSON with truncation repair fallback |
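
One plausible reading of a word-level coverage metric, and of finding segments lost between an original and a derived text, is sketched below using only the standard library. This is an assumption about what such functions could compute, not distillcore's algorithm; the names `coverage_sketch` and `missing_segments_sketch` are hypothetical.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def _words(text: str) -> list[str]:
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def coverage_sketch(original: str, derived: str) -> float:
    """Fraction of the original's words (counted with multiplicity)
    that also appear in the derived text."""
    orig_counts = Counter(_words(original))
    if not orig_counts:
        return 1.0  # nothing to cover
    derived_counts = Counter(_words(derived))
    matched = sum(min(n, derived_counts[w]) for w, n in orig_counts.items())
    return matched / sum(orig_counts.values())


def missing_segments_sketch(original: str, derived: str) -> list[str]:
    """Runs of original words that have no aligned counterpart in derived.

    Segments are returned lowercased, because alignment is done on the
    normalized word lists.
    """
    a, b = _words(original), _words(derived)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    segments = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            segments.append(" ".join(a[i1:i2]))
    return segments
```

For example, `coverage_sketch("the cat sat on the mat", "the cat sat")` returns `0.5`, since three of the six original words are matched.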