Check your character encoding declaration

A missing charset declaration can turn special characters into garbled symbols. SiteCurl checks that every page declares its encoding correctly.

No signup required. Results in under 60 seconds.

What this check does

SiteCurl looks for a <meta charset='utf-8'> tag (or the equivalent charset parameter in the Content-Type header) on every page in your scan. If the declaration is missing, the browser guesses the encoding. That guess is often wrong, and the result is garbled text: question marks, black diamonds, or random symbols where your content should be.

The check verifies both the presence of the declaration and that it appears early in the HTML head. Browsers need to see the charset within the first 1024 bytes of the page. A declaration buried deep in the head may be ignored.
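As a rough illustration of the test described above, the following Python sketch looks for a meta charset tag inside the first 1024 bytes of a page. The function name find_early_charset and the regular expression are assumptions made for this example, not SiteCurl's actual implementation, and the pattern deliberately ignores the older http-equiv form.

```python
import re

# Simplified pattern: matches <meta charset=...> with or without quotes.
# It does not cover <meta http-equiv="Content-Type" ...>.
META_CHARSET = re.compile(
    rb"""<meta\s+charset\s*=\s*["']?\s*([A-Za-z0-9_\-]+)""",
    re.IGNORECASE,
)


def find_early_charset(html_bytes, limit=1024):
    """Return the declared charset if it appears within `limit` bytes, else None."""
    match = META_CHARSET.search(html_bytes[:limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


page = b"<!doctype html><html><head><meta charset='utf-8'><title>Hi</title></head></html>"
print(find_early_charset(page))
```

A declaration that sits after more than 1024 bytes of other markup would make this function return None, even though the tag exists further down the page.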

UTF-8 is the standard encoding for the modern web. SiteCurl flags pages that are missing the declaration or using an outdated encoding like ISO-8859-1 or Windows-1252.

How this shows up in the real world

Character encoding tells the browser how to translate the raw bytes of your HTML into readable text. Every letter, digit, and symbol is stored as one or more bytes, and the encoding defines which byte sequence maps to which character. If the browser uses the wrong encoding, the mapping breaks and text turns into nonsense.

UTF-8 covers virtually every character in every language, including accented letters, emoji, currency symbols, and CJK characters. It has been the dominant encoding on the web since 2008 and is now used by over 98% of all websites.

Without a charset declaration, browsers fall back to heuristics. They look at the bytes and try to guess the encoding. This works most of the time for plain English text, but fails as soon as you use a curly quote, an em dash, an accented name, or a currency symbol like the euro sign. The browser picks the wrong mapping and shows garbage characters.
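The effect of a wrong guess is easy to reproduce. The short Python sketch below encodes a sample string as UTF-8 and then decodes the same bytes as Windows-1252, which is one plausible wrong guess. The sample text is made up for this example.

```python
text = "Café costs €5"

# The bytes a server would send for a page saved as UTF-8.
raw_bytes = text.encode("utf-8")

# Decoding with the matching encoding round-trips cleanly.
correct = raw_bytes.decode("utf-8")

# Decoding the same bytes as Windows-1252 turns each byte of a
# multi-byte UTF-8 sequence into a separate, unrelated character.
garbled = raw_bytes.decode("windows-1252")

print(correct)
print(garbled)
```

The plain ASCII letters survive in both cases, which is why a missing declaration can go unnoticed on a page until the first accented letter or currency symbol appears.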

The declaration needs to appear early in the document. Browsers start parsing HTML immediately. If the charset declaration comes after the first 1024 bytes, the browser may have already committed to the wrong encoding, and it may need to restart parsing once it finds the declaration. This adds delay and can cause a visible flash.

Why it matters

Garbled text makes your site look broken. A visitor who sees question marks or diamond symbols instead of proper characters loses trust in your content. On a pricing page, a corrupted currency symbol can make prices unreadable.

Search engines also rely on correct encoding to index your content. If Google reads your page with the wrong encoding, your indexed content may contain garbage characters. This can prevent your pages from appearing in search results for queries that include special characters.

The fix is a single line of HTML. Adding the charset meta tag takes seconds and prevents a class of display issues that can affect every page on your site.

Who this impacts most

Multilingual sites are most vulnerable. Sites serving content in French, German, Spanish, or any language with accented characters will show corrupted text if the charset is wrong or missing.

E-commerce sites using currency symbols (€, £, ¥) need correct encoding so prices display properly. A garbled price is worse than no price at all.

Sites with user-generated content (comments, reviews, forums) are at risk because users paste text from word processors that include curly quotes and special characters.

How to fix it

Step 1: Add the charset meta tag. Place <meta charset='utf-8'> as the first element inside your <head> tag. Putting it before every other element, including the title tag, guarantees that it falls within the first 1024 bytes.

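Step 2: Verify your server headers. Check that your server sends Content-Type: text/html; charset=utf-8 in its response headers. Browsers give this header priority over the meta tag, so it works even if the meta tag is missing, and it must not name a different encoding than the tag does. Many web servers and frameworks set it by default, but confirm this rather than assume it.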

Step 3: Save files as UTF-8. Make sure your HTML files are actually saved in UTF-8 encoding, not ISO-8859-1 or Windows-1252. Most modern text editors default to UTF-8, but older files may use legacy encodings. Re-save them as UTF-8 if needed.

Step 4: Check your CMS template. In WordPress, the charset is set in the theme's header.php file. In most CMS platforms, the main layout template controls the charset for all pages. Fix it once in the template and every page gets the correction.
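For Step 3, re-saving a handful of legacy files by hand is fine, but a script can help with a larger batch. The following Python sketch rewrites a file as UTF-8. The function name reencode_to_utf8 is hypothetical, and the default source encoding of Windows-1252 is an assumption: the script cannot know the original encoding, so you have to supply the right one.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="windows-1252"):
    """Rewrite the file at `path` as UTF-8.

    Returns False if the file already decodes as UTF-8 (left untouched),
    True if it was converted from `source_encoding`.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()

    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass

    # Assumption: the file really is in `source_encoding`.
    text = raw.decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

Back up the files before running anything like this. If the assumed source encoding is wrong, the conversion will succeed without an error but bake the garbled characters into the file.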

Common mistakes when fixing this

Placing the charset declaration too late. The meta charset tag must appear within the first 1024 bytes of the HTML. If it comes after several large meta tags, stylesheets, or scripts, the browser may miss it. Put it first in the head.

Declaring one encoding but saving the file in another. If your meta tag says UTF-8 but the file is saved as ISO-8859-1, characters will still break. Make sure the declaration matches the actual file encoding.

Relying on the server header alone. While the Content-Type header works, it can be changed or stripped by proxies or CDNs, and it is lost entirely when someone saves the page and opens it from disk. Include the meta tag in your HTML as a reliable fallback.
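The second mistake above, a declaration that does not match the bytes on disk, can be caught with a strict decode. This Python sketch is one way to do it; the function name matches_declared_encoding is hypothetical.

```python
def matches_declared_encoding(data, declared="utf-8"):
    """Return True if `data` decodes cleanly under the declared encoding."""
    try:
        # Strict decoding (the default) raises on invalid byte sequences.
        data.decode(declared)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers encoding names Python does not recognize.
        return False
    return True


legacy_bytes = "café".encode("iso-8859-1")
print(matches_declared_encoding(legacy_bytes, "utf-8"))
```

This check is only meaningful when the declared encoding is UTF-8. Single-byte encodings such as ISO-8859-1 accept any byte sequence, so a UTF-8 file declared as ISO-8859-1 would pass the decode and still display incorrectly.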

How to verify the fix

After adding the charset declaration, run another SiteCurl scan. The charset issue should disappear. For a manual check, view the page source (Ctrl+U or Cmd+Option+U) and verify that <meta charset='utf-8'> appears at the top of the head section.

To check the server header, run curl -sI https://yoursite.com | grep -i content-type. You should see charset=utf-8 in the output.
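If you want to script this check rather than read the curl output by eye, the charset can be pulled out of the header value with Python's standard library. This is a minimal sketch. The function name charset_from_content_type is an assumption for this example, and the header strings are samples rather than output from a real server.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

A result of None means the server sent a Content-Type without a charset parameter, in which case the browser falls back to the meta tag or to guessing.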

The bottom line

Character encoding is a one-line fix that prevents garbled text across your entire site. Add <meta charset='utf-8'> as the first element in your head tag. It takes seconds and eliminates an entire class of display problems.

Example findings from a scan

charset declaration found: UTF-8

Missing charset declaration on /blog/post

Outdated charset encoding: ISO-8859-1

Frequently asked questions

What is character encoding?

Character encoding tells the browser how to convert the raw bytes of your HTML file into readable text. UTF-8 is the standard encoding for the modern web and covers virtually every character in every language.

What happens if the charset declaration is missing?

The browser guesses the encoding. For plain English text, the guess is usually correct. But special characters like curly quotes, accented letters, emoji, and currency symbols often display as question marks or garbled symbols.

Can I check charset without signing up?

Yes. The free audit checks your home page for a charset declaration as part of a full seven-category scan. No signup needed. Results in under 60 seconds.

Is UTF-8 the only correct encoding?

UTF-8 is the recommended encoding for all new websites. Over 98% of the web uses it. Older encodings like ISO-8859-1 still work for basic Latin characters, but UTF-8 covers every language and symbol set. There is no reason to use anything else for new content.
