identifying file types

BobLewiston

Within a C# program, is there any way to tell if a file is a text file or not? I mean a real way, not just basing your conclusion on a file name extension.
 
That depends on how much complexity you want in your program. Even Windows doesn't inspect a file's contents to determine its type; it simply takes the extension and maps it to an application via the registry. You could develop patterns for certain file types, or convert the file to base64 and look for known strings at the start of the data. For example, the base64 of a JPG typically starts with "/9j/", which is the encoding of the leading bytes FF D8 FF.
Here is an example of part of a base64 string for a JPG:
Code:
/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKCgkJChQODwwQFxQYGBcU
FhYaHSUfGhsjHBYWICwgIyYnKSopGR8tMC0oMCUoKSj/2wBDAQcHBwoIChMKChMoGhYaKCgoKCgo
KCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCj/wAARCAHgAoADASIA
AhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQA
AAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3
ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWm
p6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEA
AwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSEx
BhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElK
U1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3
uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwDqGYgg
YHQdh6U3cfQfkKV+o+g/lTa7TnHbj6L+Qpdx9F/IU2lpiF3H0H5CjcfQfkKSikA4MfRfyFG4+g/I
U2lpi3F3H0H5Cl3H0H5Cm0tAChj6D8hS7j6L+QptAoC47cfRfyFLuPoPyFNooAduPov5CjcfQfkK

The complete base64 string is quite large, and it would be difficult to match on anything beyond the first several characters.
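
As a rough sketch of the prefix check described above, something like the following could work in C#. The class and method names (JpegSniffer, LooksLikeJpeg) are made up for illustration, and it only reads the first three bytes rather than encoding the whole file. It tells you the file starts like a JPG, not that it is a valid image.

```csharp
using System;
using System.IO;

static class JpegSniffer
{
    // Hypothetical helper: returns true if the file starts with the
    // bytes FF D8 FF, whose base64 encoding is "/9j/".
    public static bool LooksLikeJpeg(string path)
    {
        byte[] header = new byte[3];
        int total = 0;

        using (FileStream stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop
            // until the header is full or the file ends.
            while (total < header.Length)
            {
                int read = stream.Read(header, total, header.Length - total);
                if (read == 0)
                {
                    return false; // file is shorter than three bytes
                }
                total += read;
            }
        }

        // Three bytes encode to exactly four base64 characters.
        return Convert.ToBase64String(header) == "/9j/";
    }
}
```

You would call it as JpegSniffer.LooksLikeJpeg(somePath). Comparing the raw bytes directly would work just as well; the base64 comparison is kept here only to match the example above.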

I'm not sure about other file types, but unless the difference is REALLY important, I would just stick with the extension. Alternatively, you could check for multiple extensions, in case someone tries to circumvent a file upload filter by renaming bad.exe to bad.exe.tmp or bad.tmp. Hope this helps.
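
For the multiple-extension check, a minimal sketch might look like this. The block list and the names ExtensionCheck and HasBlockedExtension are assumptions for illustration only. Note that it catches bad.exe.tmp but not bad.tmp, since in the second case the original extension is gone entirely and only looking at the file contents would reveal it.

```csharp
using System;
using System.IO;
using System.Linq;

static class ExtensionCheck
{
    // Illustrative block list; adjust to whatever you need to reject.
    static readonly string[] Blocked = { ".exe", ".bat", ".cmd", ".scr" };

    // Hypothetical helper: returns true if ANY dot-separated segment
    // after the base name is a blocked extension, not just the last one.
    public static bool HasBlockedExtension(string fileName)
    {
        string name = Path.GetFileName(fileName);
        string[] parts = name.Split('.');

        // parts[0] is the base name, so skip it and test the rest.
        return parts
            .Skip(1)
            .Any(p => Blocked.Contains("." + p, StringComparer.OrdinalIgnoreCase));
    }
}
```

With this, "bad.exe.tmp" and "bad.EXE" are flagged, while "report.txt" and "bad.tmp" are not.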
 