Check your character encoding declaration

A missing charset declaration can turn special characters into garbled symbols. SiteCurl checks that every page declares its encoding correctly.

What this check does

SiteCurl looks for a <meta charset='utf-8'> tag (or its matching Content-Type header) on each page in your scan. If the tag is missing, the browser guesses the encoding. That guess is often wrong, and the result is garbled text: question marks, black diamonds, or random symbols where your content should be.

The check confirms both that the tag exists and that it appears early in the HTML head. Browsers look for the charset declaration within the first 1024 bytes of the page, so a tag buried deep in the head may be missed.
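
As a rough illustration of that rule, the Python sketch below finds the first charset declaration in a page and reports it only if it ends within the first 1024 bytes. It is not SiteCurl's implementation: the name `early_charset` and the regular expression are assumptions made for this example, and a real HTML parser would handle more edge cases.

```python
import re

# Matches <meta charset="..."> as well as the older http-equiv form,
# which carries "charset=..." inside its content attribute.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)

def early_charset(html_bytes, limit=1024):
    """Return the declared charset (lowercased) if the first declaration
    ends within the first `limit` bytes of the page, else None."""
    match = META_CHARSET.search(html_bytes)
    if match and match.end() <= limit:
        return match.group(1).decode("ascii").lower()
    return None
```

Calling `early_charset(page_bytes)` on the raw bytes of a page returns a value such as `"utf-8"` when the declaration is early enough, and `None` when it is missing or too late.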

UTF-8 is the standard encoding for the modern web. SiteCurl flags pages that are missing the tag or that declare a legacy encoding such as ISO-8859-1 or Windows-1252.

How this shows up in the real world

Encoding tells the browser how to turn raw bytes of your HTML into readable text. Each letter, number, and symbol has a number code. The encoding says which number maps to which letter. If the browser uses the wrong encoding, the map breaks and text turns to noise.
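
To make the broken map concrete, here is a small Python sketch that decodes the same bytes twice: once as UTF-8 and once as Windows-1252 (`cp1252` in Python). The sample string is made up for this example.

```python
text = "café costs 5 €"
raw = text.encode("utf-8")    # the bytes a UTF-8 file would contain

right = raw.decode("utf-8")   # read with the correct map
wrong = raw.decode("cp1252")  # same bytes, read with the Windows-1252 map

print(right)  # café costs 5 €
print(wrong)  # cafÃ© costs 5 â‚¬
```

The bytes never change. Only the map used to read them does, which is why a correct declaration is enough to fix the display.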

UTF-8 covers nearly every character in every language, including accented letters, emoji, currency signs, and CJK characters. It has been the most widely used encoding on the web since 2008 and is now used by over 98% of all sites.

With no charset tag, browsers fall back to guessing. They look at the bytes and try to pick the encoding. This works most of the time for plain English text, but fails as soon as you use a curly quote, an accented name, or a currency sign like the euro symbol. The browser picks the wrong map and shows garbage text.

The tag needs to appear early in the page. Browsers start parsing HTML right away. If the charset tag comes after the first 1024 bytes, the browser may have already locked in the wrong encoding and have to re-parse the page. This adds delay and can cause a visible flash.

Why it matters

Garbled text makes your site look broken. A user who sees question marks or diamond symbols instead of proper letters loses trust in your content. On a pricing page, a bad currency symbol can make prices hard to read.

Google also relies on correct encoding to index your content. If Google reads your page with the wrong encoding, your indexed content may contain garbled characters. This can stop your pages from showing in search results for queries that include special characters.

The fix is one line of HTML. Adding the charset meta tag takes seconds and stops a whole class of display issues that can hit each page on your site.

Who this impacts most

Multilingual sites are most at risk. Sites serving content in French, German, Spanish, or any language with accented letters will show broken text if the charset is wrong or missing.

Online stores using currency symbols such as €, £, and ¥ need correct encoding so prices display correctly. A garbled price is worse than no price at all.

Sites with user-posted content (comments, reviews, forums) are at risk because users paste text from word processors that includes curly quotes and other special characters.

How to fix it

Step 1: Add the charset meta tag. Place <meta charset='utf-8'> as the first element inside your <head> element. It should come before any other tags, including the title tag.

Step 2: Check your server headers. Make sure your server sends Content-Type: text/html; charset=utf-8 in its headers. When the header includes a charset, browsers use it in preference to the meta tag, so the two should agree. Many servers and frameworks send it by default, but not all do, so verify rather than assume.

Step 3: Save files as UTF-8. Make sure your HTML files are saved in UTF-8, not ISO-8859-1 or Windows-1252. Most modern text editors default to UTF-8, but older files may use old encodings. Re-save them as UTF-8 if needed.

Step 4: Check your CMS layout. In WordPress, the charset is typically set in the theme's header.php file. In most CMS tools, the main layout controls the charset for all pages. Fix it once in the layout and every page gets the fix.
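
For Step 3, re-saving many files by hand is tedious, so a short script can help. The Python sketch below rewrites one file as UTF-8. It assumes the legacy encoding is Windows-1252 unless told otherwise; the function name `resave_as_utf8` and that default are choices made for this example. Back up your files first, because a wrong guess about the source encoding will produce wrong characters.

```python
from pathlib import Path

def resave_as_utf8(path, source_encoding="cp1252"):
    """Rewrite the file at `path` as UTF-8, assuming it is currently stored
    in `source_encoding`. Returns True if the file was rewritten, False if
    it was already valid UTF-8 and left untouched."""
    file = Path(path)
    raw = file.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    # Decode with the assumed legacy encoding, then write back as UTF-8.
    # Raises UnicodeDecodeError if the bytes are not valid in that encoding.
    file.write_bytes(raw.decode(source_encoding).encode("utf-8"))
    return True
```

Working on raw bytes rather than text mode avoids any newline translation, so only the encoding changes.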

Common mistakes when fixing this

Placing the charset tag too late. The meta charset tag must appear within the first 1024 bytes of the HTML. If it comes after many large meta tags, stylesheets, or scripts, the browser may miss it. Put it first in the head.

Setting one encoding but saving the file in another. If your meta tag says UTF-8 but the file is saved as ISO-8859-1, text will still break. Make sure the tag matches the real file encoding.

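Relying on the server header alone. The Content-Type header works, and browsers prefer it when it includes a charset, but it can be changed or dropped by proxies or CDNs, and it is not there at all when a page is saved and opened from disk. Include the meta tag in your HTML as a safe fallback.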
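
The second mistake above, a tag that says UTF-8 on a file saved in another encoding, can be caught with a strict decode. The Python sketch below is one way to do that; the function name `declared_matches_bytes` is an assumption made for this example.

```python
def declared_matches_bytes(html_bytes, declared="utf-8"):
    """Return True if the bytes decode cleanly under the declared encoding."""
    try:
        html_bytes.decode(declared)
        return True
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: the bytes are not valid in that encoding.
        # LookupError: Python does not recognize the encoding name.
        return False
```

This check is only meaningful for UTF-8 and other strict encodings. A single-byte encoding such as ISO-8859-1 accepts any byte sequence, so a clean decode there proves nothing.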

How to verify the fix

After adding the charset tag, run a new SiteCurl scan. The charset issue should no longer be reported. For a manual check, view the page source (Ctrl+U on Windows and Linux, Cmd+Option+U on macOS) and confirm that <meta charset='utf-8'> appears at the top of the head block.

To check the server header, run curl -sI https://yoursite.com | grep -i content-type. You should see charset=utf-8 in the output.
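
If you would rather script this check than read curl output, the Python sketch below pulls the charset out of a Content-Type header value using the standard library. The function name `header_charset` is an assumption made for this example.

```python
from email.message import Message

def header_charset(content_type):
    """Return the lowercased charset from a Content-Type header value,
    or None if the header carries no charset parameter."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()
```

For example, `header_charset("text/html; charset=UTF-8")` returns `"utf-8"`, while `header_charset("text/html")` returns `None`. On a live response opened with `urllib.request.urlopen`, the `headers` object offers the same `get_content_charset()` method.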

The bottom line

Declaring your encoding is a one-line fix that stops garbled text across your whole site. Add <meta charset='utf-8'> as the first tag in your head block. It takes seconds and eliminates a whole class of display problems.

Example findings from a scan

charset declaration found: UTF-8

Missing charset declaration on /blog/post

Outdated charset encoding: ISO-8859-1

Frequently asked questions

What is character encoding?

Encoding tells the browser how to turn the raw bytes of your HTML file into readable text. UTF-8 is the standard for the modern web and covers nearly every character in every language.

What happens if the charset declaration is missing?

The browser guesses the encoding. For plain English text, the guess is often right. But special characters like curly quotes, accented letters, emoji, and currency symbols often show as question marks or garbled symbols.

Can I check charset without signing up?

Yes. The free audit checks your home page for charset as part of a full seven-part scan. No signup needed. Results in under 60 seconds.

Is UTF-8 the only correct encoding?

It is not the only valid encoding, but UTF-8 is the best choice for all new sites. Over 98% of the web uses it. Older encodings like ISO-8859-1 still work for basic Latin letters, but UTF-8 covers all languages and symbol sets. There is no reason to use anything else for new content.
