What this tool actually does
The conversion has three stages:
- Extract the text layer: pdf.js walks every page and returns the individual text items along with their positions on the page.
- Group items into paragraphs: we compare the vertical gap between consecutive items. A small gap keeps them in the same paragraph; a larger gap starts a new one. Adjacent runs on the same line are joined with a space.
- Write a DOCX: each paragraph becomes a Word paragraph, and a page break is inserted between source PDF pages.
The result is a normal Word document you can open and edit in any DOCX-aware app.
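To make the grouping stage concrete, here is a minimal sketch of gap-based paragraph detection. It is an illustration, not the tool's actual code. The `TextItem` shape is a hypothetical, trimmed-down stand-in for what pdf.js returns, and the `PARAGRAPH_GAP_FACTOR` threshold is an assumed value. It also assumes the items arrive in reading order and that wrapped lines within a paragraph are joined with a space, just like runs on the same line.

```typescript
// Hypothetical, simplified text item (not the real pdf.js type).
interface TextItem {
  str: string;    // text of the run
  x: number;      // left edge, in PDF units
  y: number;      // baseline, in PDF units
  height: number; // approximate font height
}

// Assumed threshold: a vertical gap larger than this multiple of the
// previous item's height is treated as a paragraph break.
const PARAGRAPH_GAP_FACTOR = 1.8;

function groupIntoParagraphs(items: TextItem[]): string[] {
  const paragraphs: string[] = [];
  let current = "";
  let prev: TextItem | undefined;

  for (const item of items) {
    const text = item.str.trim();
    if (text === "") continue; // skip whitespace-only runs

    if (prev === undefined) {
      current = text; // first item starts the first paragraph
    } else {
      const gap = Math.abs(prev.y - item.y);
      if (gap > PARAGRAPH_GAP_FACTOR * prev.height) {
        // Big gap: close the current paragraph and start a new one.
        paragraphs.push(current);
        current = text;
      } else {
        // Small gap (same line or next line): join with a space.
        current += " " + text;
      }
    }
    prev = item;
  }

  if (current !== "") paragraphs.push(current);
  return paragraphs;
}

// Usage with made-up coordinates for a single page.
const sample: TextItem[] = [
  { str: "Hello", x: 72, y: 700, height: 12 },
  { str: "world.", x: 110, y: 700, height: 12 },
  { str: "Second line.", x: 72, y: 686, height: 12 },
  { str: "New paragraph.", x: 72, y: 650, height: 12 },
];
const paragraphs = groupIntoParagraphs(sample);
// paragraphs is ["Hello world. Second line.", "New paragraph."]
```

A relative threshold (a multiple of the font height) is used here so that the same rule works for body text and headings alike. A fixed gap in PDF units would also work, but would need tuning per document.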