epitometool

Whitespace cleaner

Text utilities

Strip BOMs, zero-width chars and trailing whitespace; normalise EOLs, indentation and final newlines.



Quick start

How to clean whitespace from text

Paste text, tick the cleanup options you need, copy a normalised result.

  1. Paste the source text

    Any text: code, prose, log output, copied HTML. Invisible characters and mixed line endings are detected automatically.

  2. Pick cleanup steps

    Strip BOM, remove zero-width chars, trim trailing spaces, collapse blanks, normalise EOL, convert tabs ↔ spaces.

  3. Copy the normalised text

    Hit Copy or ⌘/Ctrl+Enter. The output is safe to paste into source control, CSV pipelines or production data stores.

In-depth guide

Whitespace cleaner: tabs, trailing spaces, BOM and invisible characters

Real-world text is full of invisible noise: BOM headers, zero-width characters from emoji apps, non-breaking spaces from Word, ragged trailing whitespace, mixed CRLF/LF line endings, tabs where there should be spaces. This tool composes the standard cleanups into a single predictable pipeline you can replay on any input.

Pipeline order

Cleanup steps run in a deterministic order so the result doesn't depend on which checkbox you toggled first:

  1. Strip BOM from the start of the buffer.
  2. Strip zero-width characters (U+200B/C/D, U+2060) wherever they occur.
  3. Convert NBSP (U+00A0) to ordinary space.
  4. Normalise line endings to LF internally.
  5. Trim trailing spaces/tabs on each line, if enabled.
  6. Trim leading spaces/tabs on each line, if enabled.
  7. Collapse runs of spaces/tabs to one space, if enabled.
  8. Collapse three+ blank lines to one, if enabled.
  9. Tab ↔ space conversion (mutually exclusive).
  10. Final-newline policy: ensure / remove / preserve.
  11. Re-emit CRLF if you chose CRLF; LF otherwise.
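
The step order above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's actual code (which runs as JavaScript in the browser): the function name `clean` and every option name are hypothetical, and only the tabs-to-spaces direction of step 9 is shown.

```python
import re

# Zero-width characters named in step 2.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060]")


def clean(text, *, trim_trailing=True, trim_leading=False, collapse_runs=False,
          collapse_blank=False, tabs_to_spaces=None, final_newline="ensure",
          eol="lf"):
    """Hypothetical re-implementation of the documented step order."""
    if text.startswith("\ufeff"):                            # 1. strip BOM
        text = text[1:]
    text = ZERO_WIDTH.sub("", text)                          # 2. zero-width chars
    text = text.replace("\u00a0", " ")                       # 3. NBSP -> space
    text = text.replace("\r\n", "\n").replace("\r", "\n")    # 4. LF internally

    lines = text.split("\n")
    if trim_trailing:                                        # 5. trailing spaces/tabs
        lines = [ln.rstrip(" \t") for ln in lines]
    if trim_leading:                                         # 6. leading spaces/tabs
        lines = [ln.lstrip(" \t") for ln in lines]
    if collapse_runs:                                        # 7. runs -> one space
        lines = [re.sub(r"[ \t]+", " ", ln) for ln in lines]
    if tabs_to_spaces:                                       # 9. tabs -> spaces only
        lines = [ln.expandtabs(tabs_to_spaces) for ln in lines]
    text = "\n".join(lines)

    if collapse_blank:                                       # 8. 3+ blank lines -> 1
        # Three blank lines are four consecutive newlines.
        text = re.sub(r"\n{4,}", "\n\n", text)

    if final_newline == "ensure" and not text.endswith("\n"):  # 10. final newline
        text += "\n"
    elif final_newline == "remove":
        text = text.rstrip("\n")

    if eol == "crlf":                                        # 11. re-emit CRLF
        text = text.replace("\n", "\r\n")
    return text


print(repr(clean("\ufeffa\u200bb\u00a0c  \r\nd\t\r\n")))  # 'ab c\nd\n'
```

Because every step is a pure function of the text and the options, the result is the same no matter the order in which the options were toggled.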

The invisible-character problem

Three kinds of invisible Unicode character routinely cause silent bugs:

  • BOM (U+FEFF): UTF-8 files saved by Windows tools sometimes carry a leading BOM. PHP sends it to the response body before any code runs, which breaks later header() calls. A shebang line that starts with a BOM is not recognised, because the file no longer begins with the bytes #!. CSV libraries that don't strip it leave a stray U+FEFF (often displayed as ï»¿) at the start of the first cell.
  • Zero-width characters (U+200B–U+200D, U+2060): copy-pasted from messaging apps, emoji pickers and some marketing emails. They render as nothing, so "foo" with a hidden ZWJ between the letters looks identical to "foo" but fails every string comparison against it.
  • NBSP (U+00A0): common in HTML-pasted text and Word documents. \s in a JavaScript regex matches it; [ \t] does not. Splitting on a plain space leaves NBSP-glued tokens intact.

The tool's defaults remove all three because the upside (clean strings) almost always outweighs the very rare case where you needed them for visual layout.
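
To make these failure modes concrete, here is a small Python sketch that locates the characters listed above and shows the comparison and splitting problems. The helper name `find_invisibles` is hypothetical and is not part of the tool.

```python
import unicodedata

# The characters discussed above: BOM, zero-width characters, NBSP.
INVISIBLES = {"\ufeff", "\u200b", "\u200c", "\u200d", "\u2060", "\u00a0"}


def find_invisibles(text):
    """Return (index, code point, Unicode name) for each character of interest."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


tainted = "f\u200doo"                # "foo" with a hidden zero-width joiner
print(tainted == "foo")              # False
print(len(tainted))                  # 4
print(find_invisibles(tainted))      # [(1, 'U+200D', 'ZERO WIDTH JOINER')]

# Splitting on a plain space leaves the NBSP-glued token in one piece.
print("10\u00a0kg".split(" "))       # ['10\xa0kg']
```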

Line-ending conventions

Style   Bytes       Used by
LF      0x0A        Linux, macOS, Git internal format, most web content (JSON, source files).
CRLF    0x0D 0x0A   Windows native, HTTP headers, RFC 5322 email.
CR      0x0D        Classic Mac OS (pre-OS X). Rare today; the tool collapses it to LF.

For source control, Git repositories conventionally store text files with LF. With core.autocrlf=true, a setting typically used on Windows, Git converts LF to CRLF on checkout and back to LF on commit. Picking LF here aligns with that and avoids "the whole file changed" noise in pull requests.
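
A file with mixed endings is easy to detect before normalising. The following Python sketch counts each style and rewrites them all to one target; the function names `count_line_endings` and `normalise_eol` are hypothetical, and the tool's own detection logic may differ.

```python
import re

# CRLF must come first in the alternation so it is matched as one ending.
EOL = re.compile(r"\r\n|\r|\n")
NAMES = {"\r\n": "CRLF", "\n": "LF", "\r": "CR"}


def count_line_endings(text):
    """Count how many line endings of each style the text contains."""
    counts = {"CRLF": 0, "LF": 0, "CR": 0}
    for match in EOL.finditer(text):
        counts[NAMES[match.group()]] += 1
    return counts


def normalise_eol(text, target="\n"):
    """Rewrite every line ending to `target` ('\\n' or '\\r\\n')."""
    return EOL.sub(lambda _match: target, text)


mixed = "a\r\nb\nc\rd"
print(count_line_endings(mixed))            # {'CRLF': 1, 'LF': 1, 'CR': 1}
print(repr(normalise_eol(mixed)))           # 'a\nb\nc\nd'
print(repr(normalise_eol(mixed, "\r\n")))   # 'a\r\nb\r\nc\r\nd'
```

When reading a file for this kind of check in Python, open it with newline="" so the runtime does not translate line endings before you count them.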

Tips and pitfalls

  • Tab width matters for the round-trip. If you flip tabs-to-spaces with width 4, then spaces-to-tabs with width 8, the indentation will be wrong. Pick the width that matches your editor's tab-size setting and stick to it.
  • Don't collapse spaces on poetry, ASCII art or code aligned by spaces. Collapse runs eats deliberate vertical alignment. Off by default for that reason.
  • YAML and Makefile gotchas. YAML cares about leading-space depth (don't trim leading whitespace blindly). Makefiles require literal tabs at the start of recipe lines (don't run tabs-to-spaces on a Makefile).
  • POSIX defines a line as ending in a newline, so a text file's last line should end in \n. Leave "Ensure final newline" on for any file that goes into version control.
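
The tab-width round-trip pitfall from the first tip can be reproduced in a few lines of Python. The helper `to_tabs` is a hypothetical stand-in for a spaces-to-tabs step and assumes that leftover spaces narrower than one tab stay as spaces.

```python
def to_tabs(line, width):
    """Convert leading spaces to tabs; a remainder below `width` stays as spaces."""
    stripped = line.lstrip(" ")
    n = len(line) - len(stripped)
    return "\t" * (n // width) + " " * (n % width) + stripped


original = "\t\treturn x"                 # two levels of indentation
as_spaces = original.expandtabs(4)        # 8 leading spaces

mismatched = to_tabs(as_spaces, 8)        # wrong width on the way back
matched = to_tabs(as_spaces, 4)           # same width both ways

print(repr(mismatched))                   # '\treturn x'
print(repr(matched))                      # '\t\treturn x'
```

With mismatched widths the line comes back with one tab instead of two, which silently halves the nesting depth in the file.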

When to use it vs alternatives

Use this tool for quick, one-off cleanup or inspection of pasted text without opening a heavier application. Use a project script or test suite when the same transformation must be repeated automatically.

Privacy and security

Browser-first by design. The tool page explains any exception before you use it.

The text you paste is handled in the browser tab and is not sent to a third-party API by EpitomeTool. Close or refresh the tab when you are done with sensitive snippets.

Frequently asked questions

Does this tool send my text anywhere?

No. Every transform is a pure JavaScript string operation in your browser. Open DevTools → Network and you'll see zero traffic while you paste and clean.

What's a BOM, and why strip it?

The Byte Order Mark (U+FEFF) is an invisible character that some UTF-8 files start with. It's harmless in browsers but breaks PHP scripts, shebang lines in shell scripts, and CSV parsers that don't expect it. The Unicode standard neither requires nor recommends a BOM for UTF-8, so strip it on output.
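
The CSV failure is easy to reproduce outside the browser. This Python sketch uses made-up sample data: it decodes the same bytes once with the plain utf-8 codec and once with utf-8-sig, the standard-library codec that drops a leading BOM.

```python
import codecs
import csv
import io

# Illustrative sample data: a UTF-8 CSV saved with a leading BOM.
raw = codecs.BOM_UTF8 + b"name,qty\nwidget,3\n"

# Plain utf-8 keeps the BOM, so it ends up glued to the first header cell.
naive_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))
print(naive_header)    # ['\ufeffname', 'qty']

# utf-8-sig strips a leading BOM if present and is a no-op otherwise.
clean_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))
print(clean_header)    # ['name', 'qty']
```

A lookup such as row["name"] fails against the first variant even though the header looks correct when printed to most terminals.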

What counts as a 'zero-width' character?

U+200B (zero-width space), U+200C (zero-width non-joiner), U+200D (zero-width joiner), and U+2060 (word joiner). They render as nothing but still count toward string length and affect search comparisons and word boundaries. They typically arrive when text is pasted from emoji-rich apps, Discord messages or some PDF copy operations.

Why convert non-breaking space (U+00A0) to a regular space?

NBSP is what many word processors and websites insert between numbers and units, or between a person's first and last name, so the two aren't split across lines. When you copy that text into code or a CSV, NBSP looks identical to a space but doesn't match `\s` in ASCII-only regex flavours (JavaScript's `\s` does match it) and isn't treated as a separator by command-line parsers expecting ASCII. Converting it normalises the data.
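
How `\s` treats NBSP depends on the regex flavour and its flags. As an illustration, Python's re module matches NBSP with `\s` on text patterns by default, but not when the re.ASCII flag is set; other engines may behave like either case. The sample string is made up.

```python
import re

s = "10\u00a0kg"   # "10 kg" with a non-breaking space

print(bool(re.search(r"\s", s)))              # True  (Unicode-aware \s)
print(bool(re.search(r"\s", s, re.ASCII)))    # False (ASCII-only \s)
print(bool(re.search(r"[ \t]", s)))           # False (explicit space/tab class)

# After conversion, ordinary space handling works again.
normalised = s.replace("\u00a0", " ")
print(normalised.split(" "))                  # ['10', 'kg']
```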

What's the difference between 'Trim trailing whitespace per line' and 'Collapse blank lines'?

Trim trailing strips spaces and tabs at the end of every non-empty line — common style rule for source files. Collapse blank lines reduces runs of three or more empty lines down to one empty line, useful for pasting code that came with excessive vertical spacing.
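
The two options also interact: a line that holds only spaces or tabs is not empty until it has been trimmed. A minimal Python sketch with hypothetical function names, assuming "blank" means a fully empty line:

```python
import re


def trim_trailing(text):
    """Strip spaces and tabs from the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def collapse_blank_lines(text):
    """Reduce three or more consecutive empty lines to a single empty line."""
    # Three empty lines in a row are four consecutive newlines.
    return re.sub(r"\n{4,}", "\n\n", text)


sample = "a\n \n\t\n\nb\n"        # three "blank" lines, two of them whitespace-only

print(repr(collapse_blank_lines(sample)))                  # 'a\n \n\t\n\nb\n'
print(repr(collapse_blank_lines(trim_trailing(sample))))   # 'a\n\nb\n'
```

Collapsing alone leaves the sample unchanged, which is why the pipeline trims before it collapses.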

How does the tabs-to-spaces conversion handle column alignment?

It expands each tab to the number of spaces needed to reach the next multiple of the tab width — exactly like a terminal would. So with width = 4, a tab in column 0 expands to 4 spaces, a tab in column 6 expands to 2 spaces. This preserves columnar layouts (e.g. tabular log output).
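
The tab-stop arithmetic can be written out directly. This Python sketch is a hand-rolled illustration with a hypothetical function name; it counts columns per code point, so wide or combining characters would need extra handling. Python's built-in str.expandtabs applies the same rule and is used here as a cross-check.

```python
def expand_tabs(line, width=4):
    """Replace each tab with spaces up to the next multiple of `width`."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = width - (col % width)   # distance to the next tab stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


print(repr(expand_tabs("\tx")))          # '    x'      (column 0 -> 4 spaces)
print(repr(expand_tabs("abcdef\tx")))    # 'abcdef  x'  (column 6 -> 2 spaces)
print(expand_tabs("abcdef\tx") == "abcdef\tx".expandtabs(4))   # True
```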

Will 'spaces to tabs' replace spaces in the middle of a line?

No. Only leading spaces (the indentation portion) are converted. In-line spaces between words and trailing spaces are left alone, because converting them would break the meaning of natural-language text.
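
A sketch of that rule in Python, touching only the run of spaces at the start of the line. The function name is hypothetical, and keeping a remainder narrower than one tab as spaces is an assumption; the tool may treat the remainder differently.

```python
import re


def leading_spaces_to_tabs(line, width=4):
    """Convert only the leading run of spaces; the rest of the line is untouched."""
    indent = re.match(r" *", line).group()
    tabs, rest = divmod(len(indent), width)
    return "\t" * tabs + " " * rest + line[len(indent):]


print(repr(leading_spaces_to_tabs("        if x:  y    ")))   # '\t\tif x:  y    '
print(repr(leading_spaces_to_tabs("      z")))                # '\t  z'
```

The double space inside "if x:  y" and the trailing spaces survive unchanged, which matches the behaviour described above.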

Why is 'Preserve' for line endings the same as 'LF' in the output?

All passes operate on an internal buffer in which CRLF and CR have already been normalised to LF (this keeps the regex semantics simple), so by the time the tool emits output, each line's original ending is lost. Preserve therefore means 'don't force CRLF on me': the LF representation is left in place. Use LF or CRLF explicitly when you need a specific platform's convention.
