Does Extract Text work on scanned PDFs?

No — folio reads the PDF text layer only. Scanned-image PDFs contain pictures of text with no embedded text layer, so the output is empty. Scanned documents need OCR (optical character recognition), which folio does not ship in this version.

Will tables and columns be preserved?

No. The output is plain text, in roughly the reading order the PDF declares. Complex tables and multi-column layouts often lose their structure. The output is optimised for searching, indexing and programmatic processing, not for reading.

Can I extract text from a single page?

Not directly — every page's text is concatenated with `--- Page N ---` markers. Use the Split tool first to extract a single page, then run text extraction on the smaller PDF.
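If you already have the full `.txt` export, the page markers also make it possible to pull out a single page afterwards with a short script. The sketch below is a minimal illustration, not part of folio. It assumes each marker sits on its own line in exactly the form `--- Page N ---`; the function name `split_pages` is hypothetical.

```python
import re

# Assumption: every page starts with a line containing only "--- Page N ---".
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---[ \t]*$", re.MULTILINE)


def split_pages(text: str) -> dict[int, str]:
    """Map each page number to the text that follows its marker."""
    pages: dict[int, str] = {}
    matches = list(PAGE_MARKER.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A page ends where the next marker begins, or at the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(match.group(1))] = text[start:end].strip()
    return pages


sample = "--- Page 1 ---\nHello\n\n--- Page 2 ---\nWorld\n"
pages = split_pages(sample)
print(pages[2])
```

Any text that appears before the first marker is ignored, and input without markers yields an empty dictionary.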

Is this PDF toolkit free?

Yes. folio is free to use, no account is required, and there are no paid plans.

Are my PDFs uploaded to a server?

No. folio runs entirely in your browser. Your PDFs are never uploaded, never seen by our infrastructure, and never stored anywhere outside your device. When you close the tab, every file is gone.

Which operations are supported?

Merge PDFs, split a PDF into ranges, compress a PDF, convert PDF pages to JPG or PNG, build a PDF from images, rotate all or selected pages, add a watermark, and extract plain text. All of this happens in your browser.

Is there a file size limit?

Yes. Each file may be up to 100 MB. Browser memory is the real ceiling — very large scanned PDFs may struggle on lower-end devices.

Can I work with multiple PDFs at once?

Yes. Merge and Images→PDF accept up to 20 files in a single batch. Split, Compress, Rotate, Watermark and Extract Text take one PDF at a time.

Will my converted images / PDFs lose quality?

Compress and PDF→JPG/PNG re-render pages at the DPI/quality you pick, so there is a small generational loss like any image re-encode. Merge, Split, Rotate, Watermark and Extract Text do not re-render — they only edit the existing PDF objects, so quality is preserved.

Can folio open password-protected PDFs?

Not yet. If your PDF requires a password to open, please unlock it first in a dedicated app. We deliberately skip password handling to keep folio fully client-side and free of legal grey areas.

Does folio support OCR?

Not in this version. Extract Text only works for PDFs that already have selectable text. Scanned image-only PDFs need OCR, which is a separate, much heavier project we may add later.

Why does the first PDF→Image take a moment?

When you first use an operation that renders PDF pages (PDF→JPG, PDF→PNG, or split previews), folio downloads the rendering engine (~1 MB). After that it stays cached and subsequent operations start immediately.

Are ads required to use the tool?

No. Ads are isolated from the converter. The tool works completely even if ads fail to load or are blocked.

PDF → Plain .txt

Extract text from a PDF

Drop a PDF below and folio reads every page's selectable text using pdf.js, then offers it as a plain `.txt` file. Everything runs in your browser — your document never leaves your device.

One PDF up to 100 MB. Selectable text only — no OCR.

Every operation runs entirely in your browser. Your files never leave your device.

About Extract text

What kind of text can folio extract?

folio captures the selectable text stored inside the PDF — the same content you'd get by highlighting and copying inside a PDF viewer. Page breaks are marked with `--- Page N ---` headers so you can stitch context back together.

Why is the output empty?

Scanned PDFs are typically just images of text — the document has no embedded text layer, so there's nothing to extract. You'd need OCR (optical character recognition) to recognise the words, which folio doesn't ship yet.
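When processing many exports with a script, it can help to flag the ones that came back empty so they can be routed to an OCR tool instead. The check below is a sketch under assumptions: it is not known whether folio still writes `--- Page N ---` markers for pages without text, so the hypothetical helper `has_extractable_text` treats a file that contains only markers and whitespace as empty too.

```python
import re

# Assumption: markers sit on their own line in the form "--- Page N ---".
PAGE_MARKER = re.compile(r"^--- Page \d+ ---[ \t]*$", re.MULTILINE)


def has_extractable_text(text: str) -> bool:
    """Return True if anything besides page markers and whitespace remains."""
    return bool(PAGE_MARKER.sub("", text).strip())


print(has_extractable_text("--- Page 1 ---\n\n--- Page 2 ---\n"))
print(has_extractable_text("--- Page 1 ---\nInvoice 42\n"))
```

An export that fails this check most likely came from an image-only PDF.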

Does folio preserve formatting?

No. The output is plain text, optimised for searching, indexing, or feeding into another tool. Headings, columns, tables and inline styles are flattened — that's by design for a `.txt` export.

About this operation

PDF → TXT

What it does

folio walks every page with pdf.js and dumps the selectable text into a single `.txt` file. Page breaks are marked with `--- Page N ---` headers so you can stitch context back together. This is exactly the content you'd get by highlighting and copying inside a PDF viewer — useful for search indexing, programmatic processing, or pasting into another tool. Scanned PDFs (image-only with no text layer) return empty output because there is nothing to extract; that needs OCR, which folio does not ship.

When to use it

Feed a PDF's words into a search index
Quickly grep a long document for a phrase
Paste a transcript's content into a doc editor
Process PDF content with a script
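
As an illustration of the search and scripting use cases above, the following sketch looks for a phrase in an exported `.txt` and reports the page each hit is on. It assumes the markers sit on their own line as `--- Page N ---`; `find_phrase` is a hypothetical helper, not something folio provides.

```python
import re

# Assumption: each page starts with a line containing only "--- Page N ---".
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---[ \t]*$")


def find_phrase(text: str, phrase: str) -> list[tuple[int, str]]:
    """Return (page number, line) pairs for lines containing the phrase.

    Matching is case-insensitive. Page 0 means the line appeared before
    the first page marker.
    """
    hits: list[tuple[int, str]] = []
    page = 0
    needle = phrase.casefold()
    for line in text.splitlines():
        marker = PAGE_MARKER.match(line)
        if marker:
            page = int(marker.group(1))
            continue
        if needle in line.casefold():
            hits.append((page, line.strip()))
    return hits


sample = (
    "--- Page 1 ---\nTerms of service\n"
    "--- Page 2 ---\nPayment terms apply\nSignature\n"
)
for page, line in find_phrase(sample, "terms"):
    print(f"page {page}: {line}")
```

The search works line by line, so a phrase that wraps across two lines of the export will not be found.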

Limitations — what it doesn't do

Text layer only — does not OCR scanned-image PDFs
Does not preserve formatting, tables or column structure
Does not extract images or vector graphics
Cannot extract text from password-protected PDFs
Output is one big .txt per input PDF — no per-page splitting (use Split first)

Frequently asked questions

Your PDFs never leave your device

folio is a static page. Every operation runs inside your browser via pdf-lib (editing) and pdf.js (rendering). There is no server-side processing, no upload, no temporary file, and no cache of your documents. When you close this tab, every file is gone.

No account required.
No server processing. Your PDFs stay on your device.
No caching of your files, no Service Worker, no IndexedDB persistence.
pdfjs-dist (lazy-loaded for rendering) is fetched from the same origin as this page; nothing else is sent.