Base 64 tutorial

Itsme

I apologise for how it is split - there is a 10000 character limit, so this would not fit into one post.

Part I of IV
Well, I was having a lot of trouble learning b64 (mainly because all of the tutorials that I found weren't correct or thorough), so I decided (after learning what I believe the correct way is) to write a tutorial to help those who were in the same position as I was a while back….

Table of contents:

Intro
Beforehand
Things you should know

-Encoding b64
1-1 Writing the sentence(s)
1-2 Convert the text to 8-bit binary
1-3 Convert to 6-bit binary
1-4 Convert to decimal
1-5 Convert the decimal to ASCII, and put it all together
1-6 What's with the ='s?

-Decoding b64
Beforehand
2-1 Get rid of those pesky equal signs
2-2 Get the b64 decimal value
2-3 Convert to 6-bit binary
2-4 Convert 6-bit to bit stream, then convert to 8-bit
2-5 Convert to decimal number
2-6 Look it up


. . . . .

Base64 Encoding and Decoding Tutorials
By: Itsme


<intro_to_the_tutorial>

Well, I have been meaning to make this tutorial for some time, but I decided to write a tutorial to teach binary first, because knowledge of binary is needed for base64 encoding/decoding. Then, after breaking my wrist, I was really bored, so I decided to start on it. Then, after realizing that the whole tutorial was incorrect (after writing it, of course), I had to start over. And yes, it took a long time to type with one hand ;) Now you're probably thinking: “Just teach me base64! I have a life you know, and I don't care about your wrist!” So here it is. Enjoy!

</intro_to_the_tutorial>



Beforehand: (Before doing either tutorial…)
Well, you don't really have to do any of this, but it may help you…a lot. I would recommend that you get out a piece of paper and pen/pencil, or open up your favorite text editor on your computer. You are going to need to write down, type, or remember a whole lot of information, so that's why I recommend writing/typing it. Trust me, it makes it a lot easier. Also, you should know that if I use the term “b64”, that means “base64”. In all of the examples, I was using a text editor, though it is not required.


Some things to know (about base64):
Base64 uses 64 (how did you guess?) characters to encode strings. 'A-Z', 'a-z', '0-9', '+', and '/' are the 64 real characters (without the quotes, of course), and the '=' sign is the padding, but we will get into that a little later. I just want to show you the chart of those characters and their b64 values, but I don't want you to be too confused before the tutorial actually starts…anyway, here is the chart:

Code:
0  A    17 R    34 i    51 z
1  B    18 S    35 j    52 0
2  C    19 T    36 k    53 1
3  D    20 U    37 l    54 2
4  E    21 V    38 m    55 3
5  F    22 W    39 n    56 4
6  G    23 X    40 o    57 5
7  H    24 Y    41 p    58 6
8  I    25 Z    42 q    59 7
9  J    26 a    43 r    60 8
10 K    27 b    44 s    61 9
11 L    28 c    45 t    62 +
12 M    29 d    46 u    63 /
13 N    30 e    47 v    (pad) =
14 O    31 f    48 w
15 P    32 g    49 x    
16 Q    33 h    50 y

You'll be using this a few times, so if I tell you to look at the b64 chart (yeah, what an original name), you'll know where to look. Right there.
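If you would rather let a computer hold the chart for you, here is a minimal Python sketch that rebuilds it. The name B64_CHART is just one I made up for illustration; it is not part of any library.

```python
import string

# The 64 "real" characters in chart order: A-Z, a-z, 0-9, then + and /
# (B64_CHART is an illustrative name, not a library constant)
B64_CHART = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Indexing turns a b64 value into its character; .index() goes the other way
print(len(B64_CHART))        # 64
print(B64_CHART[25])         # Z
print(B64_CHART.index("c"))  # 28
```

The position of each character in the string is its b64 value, which is why the order matters.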

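-Another semi-interesting fact is that b64 is used to encode many e-mail attachments etc. when sent. This is done because older mail systems were only built to carry 7-bit ASCII text, so raw 8-bit data could get mangled along the way (the top bit of each byte might be dropped, for example). When encoded into b64, the data is carried in 6-bit groups (you'll learn about that later) written out as ordinary printable characters, which those systems pass along untouched.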

-Sometimes b64 is not used because of one big reason: size. Every 3 characters of the original turn into 4 characters of b64 (a 4/3 ratio), so the encoded message is about a third bigger, which can slow down the time it takes the message to be sent. And now for the tutorial…

-b64 is NOT encryption, which is what most people think. You are actually encoding when you use b64. You're probably thinking, “What's the difference?” Well, the difference is that encryption takes a key, and encoding does not. That means anyone can decode b64 without knowing any secret, so it doesn't protect anything. At least you can now correct people when they say they encrypted something in b64.
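To put a number on the 4/3 ratio mentioned above, here is a small Python sketch. The function name b64_length is hypothetical (I made it up for this example), and it assumes the padding characters are counted as part of the encoded length.

```python
import math

def b64_length(num_bytes):
    # Every group of 3 bytes (or part of one) becomes 4 characters,
    # padding included. b64_length is an illustrative name only.
    return 4 * math.ceil(num_bytes / 3)

print(b64_length(3))     # 4
print(b64_length(3000))  # 4000
```

So a 3000-character message grows to 4000 characters once encoded.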

. . . . .
 
Part II of IV
Encoding Base64


Step 1-1: Writing the sentence(s)
Alright, I really hope that you can figure this one out yourself, but I'll go over it just in case. The first thing that you should do is write the sentence(s) or whatever string(s) you want to convert. A little note when you're writing it is to leave a little space between all the letters/symbols/etc. That way you will be able to write the other stuff (you'll know what that is later) below it and have enough room. In this tutorial, we will NOT be using the normal boring “hello” as the test word. Instead we'll use the word “food,” because I'm hungry. (Note “food” is without the quotes, and it has a lowercase “f”. And yes, that stuff does matter.) This is what my text editor looks like so far (It's best if you use a monospace font [e.g. Lucida Console] in the text editor of your choice):

Code:
f         o         o         d

-There are 9 spaces after each letter. That usually will work great (in monospace).



Step 1-2: Convert the text to 8-bit binary
Alright. For this step, you will need to know 8-bit binary (that is the normal kind with 8 digits, if you didn't know). I am not going to teach you how in this tutorial, so you either need to visit my other tutorial, or learn it from somewhere else. I wrote a tutorial about this, and it can be found ***here*** IF YOU USE THIS TUTORIAL, YOU HAVE TO MAKE THIS LINK TO MY BINARY TUTORIAL THAT YOU ALSO MUST HAVE! If you already know binary, go ahead and convert the word (food), write in the conversion, then go on to the next step. My text editor now looks like this:

Code:
f         o         o         d
01100110  01101111  01101111  01100100
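If you want to double-check your binary with a computer, this step can be sketched in a few lines of Python. It assumes the text is plain ASCII (one byte per character), and the variable names are just ones I picked.

```python
text = "food"

# ord() gives the ASCII number of a character;
# "08b" formats that number as exactly 8 binary digits
bits_8 = [format(ord(ch), "08b") for ch in text]

print("  ".join(bits_8))  # 01100110  01101111  01101111  01100100
```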



Step 1-3: Convert to 6-bit binary
This is a fairly simple step. There are only two easy sub-steps:

-First, take all of the sets of 8-bit binary that you got from the characters above, and put them together. My text editor now looks like this:

Code:
f         o         o         d
01100110  01101111  01101111  01100100
01100110011011110110111101100100

-Now split that into 6-bit binary, which is (you guessed it) binary in sets of 6 digits. Now I really hope that you don't need to hear this, but yes, there is a certain way to do this. You can't just randomly take digits out. Sorry. You have to start at the beginning of the whole long string of binary (bit stream, as it's called in computer terms), and split it from there. Yeah, you can copy and paste it, and that is actually a good idea to do before you break it up. By doing that you know that the code is exactly the same; if it is really long and you retype it, you may make a little mistake, which is impossible with copy and paste (well, not impossible, but let's not go there). As you probably know, one little mistake in binary could screw up the whole thing. So just copy and paste. My text editor looks like this now: (Note that the comments [beginning with the //] are there for your information only. You don't need them in your conversion. This is true for all examples in my tutorial.)

Code:
f         o         o         d
01100110  01101111  01101111  01100100      //8-bit Binary
01100110011011110110111101100100            //Bit Stream
011001  100110  111101  101111  011001  00  //6-bit Binary

-Don't worry if there are any extra numbers that don't fit into a group of 6. Those will be dealt with later.
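The same split can be sketched in Python with string slicing. This is only an illustration of the step above, with variable names I chose myself.

```python
bit_stream = "01100110011011110110111101100100"  # "food" as one long bit stream

# Start at the beginning and take 6 digits at a time;
# the last slice simply comes out shorter if the digits run out
groups = [bit_stream[i:i + 6] for i in range(0, len(bit_stream), 6)]

print("  ".join(groups))  # 011001  100110  111101  101111  011001  00
```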



Step 1-4: Convert to decimal
-Now what we have to do is take those sets of 6-bit binary and convert them back into decimal numbers. You do that the same way as you would with 8-bit binary, but it only has 6 digits, so the largest place value is 32 rather than 128 (which makes the biggest possible number 63). You do this the exact opposite way that you would convert to binary; you take the place values of all the 1s and add them up. For example, the first set in the example is 011001. To convert, you do this (you should know what I'm doing if you know binary and/or read my binary tutorial):

Code:
011001  //One set of 6-bit binary

16
 8      //Add together
+1
--
25      //Answer

-Convert all of the 6-bit binary into decimal numbers. But wait, what if you have a last group that doesn't consist of six digits? Well, you add zeros to the back of them. Not to the front, but to the back. You add the number of zeros needed to make that group a group of six. Like I have been saying, add them to the back – to the right. Now, in our example it doesn't matter because the last two numbers are zeros, so no matter where you put the other four that need to be added, it will stay zero, but with anything but zeros this will not be true, which is why I stress putting them on the correct side, the right. After this is done, my text editor looks like this (and your paper or text editor should as well…obviously):

Code:
f         o         o         d
01100110  01101111  01101111  01100100
01100110011011110110111101100100
011001  100110  111101  101111  011001  000000
25      38      61      47      25      0

-Don't forget the 0 at the end on the fifth line! It is easy to forget when it's a zero, but it is still a needed digit!
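Here is a rough Python sketch of this step, padding on the right and then converting. The variable names are my own, and the last two lines are only there to show why the side you pad on matters.

```python
groups = ["011001", "100110", "111101", "101111", "011001", "00"]

# ljust() pads on the RIGHT with zeros until the group is 6 digits long
groups[-1] = groups[-1].ljust(6, "0")

# int(..., 2) reads a string of binary digits as a number
values = [int(g, 2) for g in groups]
print(values)  # [25, 38, 61, 47, 25, 0]

# Why the side matters: a leftover "10" padded on the right versus the left
print(int("10".ljust(6, "0"), 2))  # 32
print(int("10".rjust(6, "0"), 2))  # 2
```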



Step 1-5: Convert the decimal to ASCII, and put it all together
You're almost done, but you have to convert the last set of numbers that you got after converting to decimal from 6-bit binary back into ASCII text. The easiest way to do this (unless you memorized the b64 chart [that's the one above, and yes, I would consider memorizing that somewhat weird, only somewhat because it goes in order starting with A, and because I know it ;)]) is to go to the top of this tutorial where the chart is and look at it. I think that the chart is pretty self-explanatory, but for those of you who can't tell, there are four columns, each of which has a number and then a letter next to it. (The letter is to the right of the number. Remember that so you don't accidentally pick the wrong one!)

-Convert the last decimal numbers into ASCII text. To convert, you look up the decimal number that you got from the 6-bit binary (it should be a number 0-63) and then write the letter (or number or symbol) that is to the right of it on the b64 chart above. After the final conversions, my text editor looks like this:

Code:
f         o         o         d
01100110  01101111  01101111  01100100
01100110011011110110111101100100
011001  100110  111101  101111  011001  000000
25      38      61      47      25      0
Z       m       9       v       Z       A

-There are a few things to remember. First of all, it DOES matter if you have capital letters versus lowercase. For example, a capital “C” has the b64 value of 2, while a lowercase “c” has the decimal value of 28.

-Next we will pretty much finish it off. We just take out all of the extra spaces (we don't want the decoder to spit out weird and incorrect answers, do we?) and see what it looks like. Once done, my string looks like this:

Code:
f         o         o         d                //Original String
01100110  01101111  01101111  01100100         //8-bit Binary
01100110011011110110111101100100               //Bit Stream
011001  100110  111101  101111  011001  000000 //6-bit Binary
25      38      61      47      25      0      //b64 Dec Value
Z       m       9       v       Z       A      //b64 Char Value
Zm9vZA                                         //String

-The very last line is the encoded string in b64, except for a small thing that will be discussed in the next “step”. Wow…look at that, food = Zm9vZA. So now you can go around your house and tell someone that you want some Zm9vZA because you are hungry. That will get some looks…or maybe you'll get a thermometer in your mouth…I don't know. And I don't take any responsibility…but wait. Does food really = “Zm9vZA”? No…not exactly. There is still one thing you must do…
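The chart lookup can also be sketched in Python. B64_CHART is a name I made up for the chart from the top of the tutorial; it is not a library constant.

```python
import string

# The b64 chart as one string: position = b64 value (illustrative name)
B64_CHART = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

values = [25, 38, 61, 47, 25, 0]

# Each decimal value is a position in the chart
encoded = "".join(B64_CHART[v] for v in values)

print(encoded)  # Zm9vZA
```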



Step 1-6: What's with the ='s?
OK, so you have the conversion of the word “food”, except for one thing that you keep hearing about. This “thing” is the use of one or more equal signs at the end of the converted string. This part confuses a lot of people, so I am going to explain it as well as I can. OK. b64 works in groups of four, meaning that if the string that you get at the end doesn't have a multiple of four characters in it, you need to add the padding character until it does. (Remember what that is? The equal sign!) I think that this may be a little easier to show with examples, so here goes:

-If you have the b64 encoded string: H4cK
Then you have exactly 4 characters in it (H, 4, c, and K). Therefore you don't need to add any padding characters (=), because it is a multiple of four. The same would be true if you had 8, 12, 16, etc., etc., etc. characters.

-If you have the b64 encoded string: Hax0rS
Then you have only 6 characters in it (H, a, x, 0, r and S). Therefore you need to add two padding characters (=) so that it looks like this (because the next multiple of four is eight): Hax0rS==

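-Note that the above examples are NOT real. You can't decode them and get anything that would make sense, because they aren't real…

-Now back to our real example. “Zm9vZA” has 6 characters, so it needs two padding characters as well. That means food really = “Zm9vZA==”.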
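The padding rule can be sketched in Python like this. add_padding is a hypothetical helper name, and it assumes the string came out of the steps above. The last line cross-checks “food” against Python's built-in base64 module.

```python
import base64

def add_padding(encoded):
    # -len(encoded) % 4 is how many characters are missing
    # to reach the next multiple of four (illustrative helper)
    return encoded + "=" * (-len(encoded) % 4)

print(add_padding("H4cK"))    # H4cK
print(add_padding("Hax0rS"))  # Hax0rS==
print(add_padding("Zm9vZA"))  # Zm9vZA==

# Cross-check the "food" result against the built-in encoder
print(base64.b64encode(b"food").decode("ascii"))  # Zm9vZA==
```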


That's it…you now know how to encode in b64!



. . . . .
 
Part III of IV
. . . . .



Decoding Base64


Beforehand: (Before the decoding tutorial…)
Beforehand you must either know binary (how to encode and decode things), or have read the encoding tutorial above, which should have shown you how to use binary. In simpler terms: learn binary! Also, you should know how to encode base64 (from either reading the tutorial above or from prior knowledge), because I am not going to go into as much detail in this tutorial as I did in the last one. You should also know that decoding b64 is just encoding it backwards! If you were to follow the encoding steps in reverse order, doing the opposite of what each one says, you would have decoded it. But I'm going through it anyway…because I'm bored…



Step 2-1: Get rid of those pesky equal signs
Well, by now you should know fairly well how to encode strings into base64, but if you don't, you should go above and read the encoding tutorial. Anyway, to decode b64 code, the first thing that you have to do is get rid of the equal sign(s) that may or may not be at the end of the base64 code. If you have a program that decodes b64, you would have to leave those annoying equal signs or else the program would go crazy and spit out a really weird answer. BUT, you are not a computer program. Therefore, you can get rid of them, and unless you have some kind of eye addiction to equal signs, you will be fine. No, really, you'll be fine.

-Take the encoded string, and if you are using a text editor – here comes the hard part – go to the end of the line and press the backspace key until there are no more equal signs. The example that we are going to be decoding in the decoding tutorial is: bW9uZXk=
My text editor looks like this:

Code:
bW9uZXk=
b         W         9         u         Z         X         k

-I really hope you see what I did – deleted the equal sign. That's it! Well, actually I also left space so that I can fit everything again…but that's it.
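For the record, the backspace trick looks like this in Python (a tiny sketch, with variable names of my own choosing):

```python
encoded = "bW9uZXk="

# rstrip("=") removes equal signs from the right-hand end only
stripped = encoded.rstrip("=")

print(stripped)  # bW9uZXk
```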



Step 2-2: Get the b64 decimal value
All you have to do now is get the b64 decimal value from the b64 chart (above). If you have read the encoding tutorial, then this is pretty self-explanatory.

-Get all the decimal values. Remember that the letter is to the right of the number. My text editor now looks like this:

Code:
bW9uZXk=
b         W         9         u         Z         X         k
27        22        61        46        25        23        36
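Looking up the values can be sketched in Python too. B64_CHART is again just an illustrative name for the chart from the top of the tutorial.

```python
import string

# The b64 chart as one string: position = b64 value (illustrative name)
B64_CHART = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# .index() finds the position of each character, which is its b64 value
values = [B64_CHART.index(ch) for ch in "bW9uZXk"]

print(values)  # [27, 22, 61, 46, 25, 23, 36]
```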



Step 2-3: Convert to 6-bit binary
To do this you need to know how to encode into base64 and you need to know binary. If you don't know how to encode, please read the tutorial above. If you don't know binary, please go to step 1-2 and click the link that leads to my other tutorial. You can learn binary through that tutorial. If you do know binary and how to encode b64, continue reading.

-First take each b64 decimal number and convert it to 6-bit binary. You must remember, however, to convert it into 6-bit and NOT 8-bit binary. If you were to convert it into 8-bit binary, you would get some weird answer, and it is possible that you wouldn't even be able to fully convert the b64 string back into normal text. When converting to 6-bit, you would do pretty much the same thing as you would for 8-bit, but leave off the top two digits. If the encoding and first 2 decoding steps have been done correctly, the first two digits of 8-bit binary shouldn't contain a 1; they should both be zeros, so it wouldn't matter when you leave them off anyway. So just don't convert it to 8-bit, and then you won't have to worry about anything that I said in the last (almost run-on) sentence.

-Convert each b64 decimal number back into 6-bit binary. Here is an example of converting the first number:

Code:
b
27
011011

-Once all of the conversions have been made, my text editor looks like this:

Code:
bW9uZXk=
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
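A minimal Python sketch of the same conversion, using "06b" to force exactly 6 digits (the variable names are mine):

```python
values = [27, 22, 61, 46, 25, 23, 36]

# "06b" formats a number as binary, padded on the left to 6 digits
bits_6 = [format(v, "06b") for v in values]

print("  ".join(bits_6))
# 011011  010110  111101  101110  011001  010111  100100
```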



Step 2-4: Convert 6-bit to bit stream, then convert to 8-bit
Alright. I'm really going for it this time. I'm doing two things in this step. (Just like I did above). Anyway, the first thing that we're going to do is convert the 6-bit binary into one bit stream. After doing so, my text editor looks like this:

Code:
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
011011010110111101101110011001010111100100

-Once again I copied and pasted and just got rid of the spaces with the delete button. This way, like I said above, you are sure that (unless your computer is really screwy) the two lines are the same.

-Now we are going to do the second thing. Split the bit stream into 8-bit binary. All that means is put a space or two (I prefer two) between every eighth digit. Remember, you should know that there might be extra digits left over. They are the ones that had to be put on when the string was encoded. If none were left over, that means none were added. If there are any left over, they should only be zeros and they should come off the end. These can be deleted. (However, if there are any ones left over, you did something wrong when encoding or decoding.) My text editor now looks like this:

Code:
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
011011010110111101101110011001010111100100
01101101  01101111  01101110  01100101  01111001

-Note that I got rid of the two extra zeros on the last line.
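Here is a rough Python sketch of both halves of this step: joining the 6-bit groups, then cutting the stream into whole groups of 8 and setting the leftovers aside. The names whole and leftover are my own.

```python
bits_6 = ["011011", "010110", "111101", "101110", "011001", "010111", "100100"]

# First thing: glue the 6-bit groups into one bit stream
bit_stream = "".join(bits_6)

# Second thing: keep only complete groups of 8; the rest is leftover padding
whole = len(bit_stream) // 8 * 8
bits_8 = [bit_stream[i:i + 8] for i in range(0, whole, 8)]
leftover = bit_stream[whole:]

print("  ".join(bits_8))  # 01101101  01101111  01101110  01100101  01111001
print(repr(leftover))     # '00'
```

If leftover ever contains a 1, something went wrong earlier on.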



Step 2-5: Convert to decimal number
OK. You're almost done. Now you're going to get the ASCII decimal number from the 8-bit binary. You should already know how to do this, but if not, then click the link in step 1-2 to see my binary tutorial. Anyway, I am going to show in the next example how to find the decimal value of the first set of 8-bit binary:

Code:
01101101
64
32
 8
 4
+1
--
109

-Once I find the ASCII decimal number for each 8-bit binary number, my text editor looks like this:

Code:
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
011011010110111101101110011001010111100100
01101101  01101111  01101110  01100101  01111001
109       111       110       101       121
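The adding-up can be checked with Python's int(), which reads a string of binary digits when you pass 2 as the base (a small sketch, variable names mine):

```python
bits_8 = ["01101101", "01101111", "01101110", "01100101", "01111001"]

# int(..., 2) adds up the place values of all the 1s for you
codes = [int(b, 2) for b in bits_8]

print(codes)  # [109, 111, 110, 101, 121]
```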



Step 2-6: Look it up
This is the last and easiest step of all. You need to go to www.asciitable.com (unless you memorized the ASCII table…which I hope you haven't) and look all the decimal numbers up.

-The ASCII table has four large columns which each contain five mini-column things. You will be using the first and last mini-columns in all of the large columns. What you do is first take the number that you got after you converted the 8-bit binary. Then look it up under the mini-column called “Dec” (that's the first mini-column in each big column). Follow the row until you get to the last mini-column that is labeled “Chr”. “Chr” stands for character, and “Dec” stands for decimal. So see what you're doing now? You're changing the DECimal into a ChaRacter. Wow! After this is done for the five decimal numbers that we have, my text editor looks like this:

Code:
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
011011010110111101101110011001010111100100
01101101  01101111  01101110  01100101  01111001
109       111       110       101       121
m         o         n         e         y

-Now, to make it look a bit nicer:

Code:
b        W        9        u        Z        X        k
27       22       61       46       25       23       36
011011   010110   111101   101110   011001   010111   100100
011011010110111101101110011001010111100100
01101101  01101111  01101110  01100101  01111001
109       111       110       101       121
m         o         n         e         y
money

-That's it! The encoded word was decoded to reveal money. And whenever money is revealed and it's yours, that's a good thing. So now you not only know how to decode b64, but you made money in the process!
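If you don't feel like using the ASCII table, the last lookup can be sketched in Python with chr(). The final line cross-checks the whole thing against Python's built-in base64 module, which (being a program) wants the equal sign left on.

```python
import base64

codes = [109, 111, 110, 101, 121]

# chr() turns an ASCII decimal number into its character
word = "".join(chr(c) for c in codes)
print(word)  # money

# Cross-check with the built-in decoder (note the padding is kept)
print(base64.b64decode("bW9uZXk=").decode("ascii"))  # money
```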

. . . . .

Part IV of IV



<challenges>

I decided last minute to add a couple of challenges for you to try. There is no prize, and I'm not watching you, so I won't know (and really won't care) if you cheat on these. They are for you to try to make sure that you know what you are doing, and to show you what a good teacher I am. (Yeah, right!) Anyway, the answers can be found right below my name below in font size one and a font color that is somewhat hard to see. You should highlight it and copy and paste it into a text editor so that you can read it. I just don't want you to accidentally see it and then not try the challenges. (This way, you'll probably only see it if you copy it into a text editor and make the font bigger.) Now for the challenges:

1. Encode “Tutorials” (Without the “ and ”, of course)
2. Encode “Monkeys rule!!!” (Without the “ and ”, of course)
3. Decode “Y29tcHV0ZXI=” (Without the “ and ”, of course)
4. Decode “SSBhbSB0aXJlZA==” (Without the “ and ”, of course)

</challenges>


<extra_notes>

Well, now it looks like you can go and brag to all of your friends about how you can encode and decode base64! Or you can give them messages in b64 that say, “Decode it” and when they ask what it says, you can truthfully reply, “Decode it”. Or not.

Please feel free to contact me through PM, or post (e-mail isn't the best, because I might accidentally throw it out; however, if you tell me that you want to e-mail me, I will look out for it). Contact me with any errors found in either tutorial so that I can fix them promptly. Also feel free to send any suggestions, constructive criticism, or maybe a “Good Job!” to that address. Thanks a lot.

Also, please tell me what you think of this tutorial. Was it informative? Could you understand it easily? Was it funny? Boring? What would you rate it on a scale of 1-10? How could I make it better? Etc. Thanks again.

</extra_notes>

Thank you for your time, and happy encoding,
Itsme










Answers:
1. VHV0b3JpYWxz 2. TW9ua2V5cyBydWxlISEh
3. computer 4. I am tired




*EDIT(S):
Whoops…found a little error in the tutorial and fixed it
 
Thanks :). I was working on a python b64 encoder...but I never finished it...lol. Oh well. Maybe you'll finish yours :)

Itsme
 
>>> from base64 import b64encode  # Python 3; the old encodestring() was removed in Python 3.9
>>> b64encode(b"Tutorials")
b'VHV0b3JpYWxz'

im nothing but a filthy cheat :'(

damn pythons easy :D
 
well...that worked well, huh? you could sell it and prolly make money. lol
Just add an input to it so that someone can enter a string when the proggie is running ;) that way its at least. heh

Itsme
 