UTF-8
UTF-8 is a way to store text in computers as numbers, because computers can only understand numbers. It tells the computer how to save and read characters, including English letters, emoji, and characters from any language (like Urdu, Arabic, Chinese, etc.).
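To make the "text as numbers" idea concrete, here is a small Python sketch. The sample string is only an illustration; `str.encode` and `bytes.decode` are standard Python methods.

```python
# Turn text into UTF-8 bytes (numbers) and back again.
text = "Hi 😄"  # sample string, chosen only for illustration

encoded = text.encode("utf-8")     # str -> bytes
print(list(encoded))               # each byte shown as a number from 0 to 255

decoded = encoded.decode("utf-8")  # bytes -> str
print(decoded == text)             # the round trip gives back the same text
```

The three ASCII characters take one byte each, and the emoji takes four, so the list holds seven numbers.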
🔸 What does UTF-8 stand for?
Unicode Transformation Format – 8-bit
🔍 Why UTF-8 is special:
- It can store every character defined in Unicode, which covers practically every written language
- It uses 1 to 4 bytes to represent each character:
  - English letters = 1 byte ✅
  - Urdu/Arabic letters = usually 2 bytes ✅
  - Chinese characters = usually 3 bytes ✅
  - Emoji = usually 4 bytes ✅
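The byte counts above can be checked with a short Python sketch. The helper name `utf8_length` and the sample characters are made up for this example.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for the given text."""
    return len(ch.encode("utf-8"))

# One sample character per byte length (hypothetical picks for illustration).
samples = ["A", "ب", "中", "😄"]

for ch in samples:
    print(ch, f"U+{ord(ch):04X}", utf8_length(ch), "byte(s)")
```

The four samples need 1, 2, 3, and 4 bytes respectively.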
How many characters can UTF-8 handle?
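UTF-8 covers the full Unicode range of 1,114,112 code points (from U+0000 to U+10FFFF), which is more than a million! Of these, 2,048 are reserved as surrogates for UTF-16 and cannot be encoded, leaving 1,112,064 usable values.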
That covers:
- All languages 🌍
- Emojis 😄
- Ancient scripts 🏺
- Math symbols ➕
- And more
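The size of the range from U+0000 to U+10FFFF can be worked out with a few lines of Python. The variable names are made up for this sketch; the surrogate range U+D800 to U+DFFF is reserved by Unicode and is not valid in UTF-8.

```python
MAX_CODE_POINT = 0x10FFFF

# Code points run from U+0000 to U+10FFFF inclusive.
total_code_points = MAX_CODE_POINT + 1

# U+D800..U+DFFF are surrogates, reserved for UTF-16 and not valid in UTF-8.
surrogate_count = 0xDFFF - 0xD800 + 1

encodable = total_code_points - surrogate_count

print(f"{total_code_points:,}")  # all code points in the range
print(f"{surrogate_count:,}")    # reserved surrogates
print(f"{encodable:,}")          # values UTF-8 can actually encode

# Python refuses to encode a lone surrogate as UTF-8.
try:
    chr(0xD800).encode("utf-8")
    surrogate_rejected = False
except UnicodeEncodeError:
    surrogate_rejected = True
print(surrogate_rejected)
```

Only a fraction of these values are assigned to actual characters today; the rest are room for future additions.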