cybershark5886
Is it possible to decode Unicode symbols (In *.DAT files) from a UTF-8 format?
I have looked through certain .DAT files on my computer and I noticed a lot of nonsense characters (like control characters, numbers, and a few intelligible alphabet characters), and some even show up as blocks (open rectangles). I assume that there used to be code there in their place and that what I see is the result of encryption... right?
I've been reading up on certain UTF formats (I think Windows 98 runs off of UTF-8, right?) and their limitations when they try to process Unicode characters outside of their character mapping range.
For instance, a Hebrew or Chinese character might be ᣮ, and since it's too big a number to fit into the ASCII/UTF-8 (what's the difference?) character set, it breaks the number down into several readable pieces (which can number up to 4), and they appear as (4) individual (ASCII-readable) characters instead of the single Chinese character. So "1 Chinese character" (might) = (4 characters) A7£æ. A7£æ doesn't make sense on its own, but a decoder (I assume) might combine these 4 characters, recognize that the original had been out of range of the character set, and translate it back into the Chinese character.
That's what I'm guessing anyway. What I DON'T know is where to find a decoder that might make sense out of "A7£æ" and turn it into its original character.
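To make the guess above concrete, here is a minimal Python sketch of that round trip. It assumes the garbling happened because UTF-8 bytes were displayed through a single-byte code page (Latin-1 is used here as an assumption; Windows-1252 is another common candidate). The function names `to_mojibake` and `from_mojibake` are made up for illustration, and the sample characters are arbitrary.

```python
# Sketch only: assumes the "nonsense" is UTF-8 bytes that were displayed
# one byte at a time through a single-byte code page (Latin-1 here).

def to_mojibake(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then misread each byte as one character."""
    return text.encode("utf-8").decode(wrong_encoding)

def from_mojibake(garbled, wrong_encoding="latin-1"):
    """Undo the misreading: recover the raw bytes, then decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")

chinese = "中"                          # U+4E2D, an arbitrary sample character
raw = chinese.encode("utf-8")           # three bytes: E4 B8 AD
garbled = to_mojibake(chinese)          # three unrelated-looking characters
restored = from_mojibake(garbled)       # back to the single character

hebrew_len = len("א".encode("utf-8"))   # Hebrew alef, U+05D0, takes two bytes

print(raw.hex(" "), repr(garbled), restored, hebrew_len)
```

UTF-8 uses between 1 and 4 bytes per character, and plain ASCII characters stay as single bytes. This only works when the garbled text really did start out as UTF-8: the made-up sample "A7£æ" is not a valid UTF-8 byte sequence under Latin-1, so `from_mojibake("A7£æ")` raises `UnicodeDecodeError` instead of returning a character.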
I'm not suggesting that Windows programmers wrote their .DAT files in Chinese characters, though. But why else would a .DAT file hold all those blocks and strings of nonsense? What is really behind those characters? Is it really just control characters setting parameters for the file, or is it an encryption of something else? Do I even need a decoder to find out what was behind that string of nonsense characters?
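One way to look into this without guessing at an encoding is to view the raw bytes. The sketch below is a rough check, not a definitive test: it prints a hex dump and applies a simple heuristic for whether the bytes could be UTF-8 text at all. It assumes the file is small enough to read a chunk into memory; `hex_dump`, `looks_like_utf8_text` and the file name `example.dat` are hypothetical names used only for illustration.

```python
# Sketch only: a rough way to tell "text in some encoding" from raw binary data.

def hex_dump(data, width=16):
    """Return lines of offset, hex bytes, and printable ASCII (dots otherwise)."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return lines

def looks_like_utf8_text(data):
    """Heuristic: valid UTF-8 with no control characters besides tab/CR/LF."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return not any(ord(c) < 32 and c not in "\t\r\n" for c in text)

# Usage with a real file (the file name is a placeholder):
#     with open("example.dat", "rb") as f:
#         data = f.read(256)
#     print("\n".join(hex_dump(data)))
#     print(looks_like_utf8_text(data))
```

If the check returns False, the file is more likely binary data in a program-specific layout than text in any character set. ".DAT" is a generic extension that many programs use for their own data, so a text editor showing blocks and control characters does not by itself indicate encryption.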
I'd appreciate any speculation on why .DAT files look the way they do. It's bugging me to death. Thanks.