epitometool

PDF to Word (DOCX)

PDF tools

Convert a PDF with a text layer to an editable .docx, locally.

Up to 200 MB. The PDF must already contain a text layer (i.e. selectable text). For scans, run /tools/pdf-searchable first to add one.

Quick start

How to convert a PDF to Word (.docx)

Extract text from a PDF and write a Word document, entirely in your browser.

  1. Drop or pick a PDF

    Drag a PDF onto the drop zone, click to pick it, or paste from the clipboard. The file stays on your device.

  2. Pick pages (optional)

    Convert every page or pick a range like 1-3,5,7-9 to extract just a section.

  3. Convert and download

    Hit Convert. Text is extracted with pdf.js, paragraphs are inferred from layout gaps, the docx library writes the file, and you download <basename>.docx.

In-depth guide

Convert PDF to Word (.docx) in your browser — full guide

This tool extracts text from a PDF that already has a real text layer and writes it out as a Microsoft Word .docx file. Mozilla's pdf.js handles the text extraction; the docx npm package handles the .docx writing. Everything runs locally — your PDF never leaves the page.

Honest scope: we produce paragraph-flow text, not a pixel-perfect copy. See "What this tool does NOT do" below for the unvarnished version.

What this tool actually does

The conversion has three stages:

  1. Extract the text layer — pdf.js walks every page and returns the individual text items along with their position on the page.
  2. Group items into paragraphs — we detect vertical gaps between items: a small gap means same paragraph, a bigger gap means new paragraph. Adjacent runs on the same line are joined with a space.
  3. Write a DOCX — each paragraph becomes a Word paragraph; a page break is inserted between source PDF pages.

The result is a normal Word document you can open and edit in any DOCX-aware app.
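
Stage 2 can be sketched roughly as follows. This is a minimal TypeScript illustration, not the tool's actual code: the PositionedText shape, the groupIntoParagraphs name, and the 1.5x line-height threshold are all assumptions made for the example. With pdf.js, the y value would typically be read from the last entry of each text item's transform array.

```typescript
// Hypothetical, simplified view of a text item: only what the grouping needs.
interface PositionedText {
  str: string;    // the text of the item
  y: number;      // baseline position in PDF units (grows upward on the page)
  height: number; // approximate font height in PDF units
}

// Group items (already in reading order) into paragraphs by vertical gap.
// gapFactor is an assumed threshold: a jump of more than
// gapFactor * line height between consecutive items starts a new paragraph.
function groupIntoParagraphs(items: PositionedText[], gapFactor = 1.5): string[] {
  const paragraphs: string[] = [];
  let current: string[] = [];
  let prev: PositionedText | undefined;

  for (const item of items) {
    const text = item.str.trim();
    if (text === "") continue; // skip whitespace-only items

    if (prev !== undefined) {
      const lineHeight = Math.max(prev.height, item.height, 1);
      const gap = Math.abs(prev.y - item.y);
      // A big vertical gap closes the current paragraph.
      if (gap > gapFactor * lineHeight) {
        paragraphs.push(current.join(" "));
        current = [];
      }
    }

    // Same line or a small gap: keep joining with a space.
    current.push(text);
    prev = item;
  }

  if (current.length > 0) paragraphs.push(current.join(" "));
  return paragraphs;
}
```

With 12-unit text, a normal line step of about 14 units stays inside the paragraph, while a step of about 29 units (one blank line) starts a new one. Real documents vary in leading, so any fixed threshold is a trade-off.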

What this tool does NOT do

To set expectations honestly:

  • No table reconstruction. Cells come out as flowing text.
  • No multi-column reflow. Two-column pages may interleave: pdf.js emits items in render order, which on column layouts often alternates by line.
  • No images. Embedded images and vector graphics are ignored.
  • No font / styling preservation. All text gets a default style.
  • No heading detection. Without parsing PDF tagging or doing font analysis, we'd produce a lot of false positives. Better to leave that to a human pass in Word.

If you need any of those things, the most reliable free path is desktop LibreOffice (open the PDF, Save As → .docx). It handles roughly 80% of those cases well, but it is a 700 MB install, which is out of reach for a browser-only tool.

Scanned / image-only PDFs

Quick test: open your PDF in any reader and try to select a sentence. If selection works, this tool will work. If you can only select the whole page as a single block, or nothing at all, you have a scan.

If you try to convert a scanned PDF you'll get "no text could be extracted", because there's no text layer for pdf.js to read. The fix is to add a text layer first:

  1. Run the PDF through PDF searchable (/tools/pdf-searchable). That tool OCRs each page and embeds the recognised text invisibly back into the PDF.
  2. Then come back and run that output PDF through this converter.

If you only need the text (not a DOCX), PDF OCR (/tools/pdf-ocr) is one step instead of two.
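
The "no text" check can be pictured as a simple test on the extracted strings. The sketch below is an assumption about how such a check might look, not the tool's actual logic; hasTextLayer and minChars are hypothetical names, and pageTexts stands for the text already extracted from each page.

```typescript
// Hypothetical check: does the extracted text contain anything besides whitespace?
// pageTexts holds one string per page; minChars is an assumed minimum.
function hasTextLayer(pageTexts: string[], minChars = 1): boolean {
  // Count non-whitespace characters across all pages.
  const total = pageTexts.reduce(
    (count, text) => count + text.replace(/\s+/g, "").length,
    0,
  );
  return total >= minChars;
}
```

A scan yields empty strings for every page, so the check fails and the converter can point you to the OCR tools instead of writing an empty document.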

Selecting specific pages

For a long PDF where you only want the introduction, executive summary, or a specific chapter, pick Selected pages and enter ranges:

  • 1-5 → just the first five pages
  • 3,7,11 → those three pages, in that order
  • 1-3,9,15-18 → mixed ranges

This is also a good way to sanity-check the result on a small sample before converting a long document.
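
For the curious, a range string like the ones above could be parsed along these lines. This is an illustrative sketch only: parsePageRanges is a hypothetical name, and the choices to keep pages in the order entered, drop duplicates, and reject reversed or out-of-bounds ranges are assumptions rather than documented behaviour of this tool.

```typescript
// Hypothetical parser for specs such as "1-3,9,15-18".
// Returns 1-based page numbers in the order entered, without duplicates.
function parsePageRanges(spec: string, pageCount: number): number[] {
  const pages: number[] = [];

  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas

    // Accept "N" or "N-M" with optional spaces around the dash.
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(token);
    if (match === null) {
      throw new Error(`Invalid page range: "${token}"`);
    }

    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start || end > pageCount) {
      throw new Error(`Page range out of bounds: "${token}"`);
    }

    for (let page = start; page <= end; page++) {
      if (!pages.includes(page)) pages.push(page);
    }
  }

  return pages;
}
```

For a 20-page PDF, parsePageRanges("1-3,9,15-18", 20) yields pages 1, 2, 3, 9, 15, 16, 17 and 18.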

Frequently asked questions

Is my PDF uploaded anywhere?

No. pdf.js extracts the text inside your browser and the `docx` library writes the Word document locally. Open DevTools → Network while converting — you'll see zero outbound requests for your file.

Will the DOCX look identical to my PDF?

No. True layout-preserving PDF → DOCX (multi-column reflow, table reconstruction, image positioning, vector graphics → DrawingML) is a problem that needs 50+ MB of WebAssembly (LibreOffice, MuPDF) or a commercial server engine. We don't bundle that. This tool gives you faithful paragraph-flow text — exactly what most users actually need from a PDF → Word conversion.

What does "paragraph-flow text" mean exactly?

Each page's text layer is read in render order, items are grouped into paragraphs by detecting vertical gaps, and the result is written as Word paragraphs. Page breaks between source PDF pages are preserved. Headings, bold, italics, tables, columns and inline images are not preserved.

Why is it telling me my PDF has no text?

The PDF is image-only — typically a scanned document where each page is just a picture of text. There's no underlying text layer to extract. Run /tools/pdf-ocr to extract the text as plain text, or /tools/pdf-searchable to embed an OCR text layer back into the PDF (then you can run this tool on the result).

Can I convert only some pages?

Yes — pick "Selected pages" and enter a range like 1-3,5,7-9. Only those pages will be extracted.

Multi-column pages come out interleaved. Why?

pdf.js returns text items in their rendering order, which on a two-column page typically alternates between columns line by line. Real column detection is a separate hard problem. If you have a column-heavy document and need clean column-by-column output, the most reliable path today is still LibreOffice (open the PDF, save as DOCX), but it can't run in a browser at any reasonable bundle size.

Can I edit the DOCX afterwards?

Of course — it's a regular Word document. Open it in Word, Pages, LibreOffice, Google Docs or any other DOCX-aware editor and clean up anything that came out imperfect.

Why is the file size so different from the PDF?

DOCX stores text + lightweight structure as zipped XML. A 20 MB image-heavy PDF will become a tiny DOCX (because we don't carry images across); a text-only PDF and its DOCX will be of a similar order of magnitude.

Can I convert password-protected PDFs?

No. Decrypt the file first using /tools/pdf-unlock, then convert it here.
