Data Server at Lowest Cost

We Multiply You

Storage Server at Lowest Cost

Hi,

I posted this in the "Hardware" section of this forum but didn't get any answers, which made me think I might have posted it in the wrong place, so I'm trying again here:

I'm looking for a technical answer to a need of mine, which I hope members of this forum can provide.

I would like to build a huge "hard drive" at the lowest possible cost per unit of storage. Each of these modules will serve as a big external hard drive for a server and exceed a capacity of 20 TB (there is no upper limit on capacity; it will be determined by the optimal storage/$ ratio).

The only technical restriction is that the configuration should not limit the transfer rate of each physical hard drive. For example, the processor should be able to sustain maximum simultaneous transfers to every hard drive in the computer, and the number of hard drives should not exceed the computer's capacity to transfer data: if this "big drive" is connected by Gigabit Ethernet, there should be no more drives than 1000 divided by their individual transfer speeds in Mb/s.
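To put rough numbers on the network limit described above, here is a minimal Python sketch. The 100 MB/s per-drive figure is an assumption for illustration, not a measured value, and the function name is made up for this example. Note that Gigabit Ethernet carries 1000 megabits per second, which is 125 megabytes per second before protocol overhead.

```python
def max_drives_for_link(link_mbit_per_s: float, drive_mbyte_per_s: float) -> int:
    """How many drives can stream at full speed over one network link."""
    link_mbyte_per_s = link_mbit_per_s / 8  # 8 bits per byte
    return int(link_mbyte_per_s // drive_mbyte_per_s)


GIGABIT_ETHERNET = 1000      # megabits per second
ASSUMED_DRIVE_SPEED = 100    # megabytes per second, illustrative assumption

print(max_drives_for_link(GIGABIT_ETHERNET, ASSUMED_DRIVE_SPEED))
```

Under that assumption a single drive reading sequentially comes close to filling one gigabit link, so the network, not the number of drives, would be the first bottleneck.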

As you may understand, I have absolutely no computer knowledge at all and have no idea whether what I'm looking for is even possible in a cost-efficient manner. The big picture is building a big computer with as many internal drives as possible (or some other arrangement, as long as it doesn't limit performance) and using this computer as one big hard drive that is put on the network or plugged in via FireWire 800, for example.
I have absolutely no idea what OS to use for this. The only OS I know is Snow Leopard Server, which I know would do fine but which I cannot use because of the cost it would imply: I would need an Apple computer with a limited number of internal drives, which would cost me about ten times more per unit of storage than what I am looking for.

As for which internal drives to use, I am open to any suggestion, but my guess would be the WD 2 TB Caviar Green, which I believe is the second cheapest drive per GB after Seagate's counterpart, which sadly has very bad reliability compared with other Seagate drives.

The idea is to have as much storage as possible at a price per GB only slightly above that of the internal hard drives themselves, without limiting their individual transfer speeds. As for cost, energy efficiency matters only if it makes a difference to the final cost within one year of 24/7 use.

Thank you in advance for your contributions. Feel free to make any suggestion and ask any questions if you don't really understand what I'm looking for.
 
Re: Storage Server at Lowest Cost

Look into a network attached storage (NAS) device. It's basically an external hard drive bay with network capabilities.
 
Re: Storage Server at Lowest Cost

I know, but it ends up being about twice as expensive per unit of storage as just building a big PC with a lot of internal hard drives in it.

I have yet another question, by the way: is it possible to have more than 26 internal drives in a PC? What name is given to the drive that comes after "Z"?

How much power should I have with respect to the number of drives I have?

What OS should I use?
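On the question of what comes after "Z": drive letters are a Windows convention. Linux names SCSI/SATA disks sda through sdz and then continues with sdaa, sdab and so on, so 26 is not a hard limit there. The sketch below only reproduces that naming pattern; the function name is made up for illustration.

```python
def linux_disk_name(index: int) -> str:
    """Return the sd* device name for a zero-based disk index (0 -> sda)."""
    letters = ""
    n = index + 1  # the naming scheme has no "zero" letter, so count from 1
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("a") + remainder) + letters
    return "sd" + letters


# The 26th disk is sdz; the 27th and 28th continue with two letters.
for i in (0, 25, 26, 27):
    print(i, linux_disk_name(i))
```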
 
Re: Storage Server at Lowest Cost

I doubt you would be able to fit even 26 hard drives into any case short of a server rack, which seems beyond your price point. You'd probably be better off with a pedestal server chassis. This chassis can fit 8 drives but you could probably jury-rig it to hold 14. As for power, probably not too much if you use Caviar Greens. I have no clue about the operating system, though there's most likely a Linux distro for this application.
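For the power question, a rough sizing sketch follows. Every wattage in it is a placeholder assumption (check the drive datasheet and measure the actual system), and the function names are invented for this example. The point is only that the brief spin-up peak, not steady running, tends to set the power supply size unless the controller staggers spin-up.

```python
def psu_watts_needed(n_drives, spinup_w_per_drive=25.0,
                     base_system_w=100.0, headroom=1.2):
    """Worst case: every drive spins up at once, plus a safety margin."""
    peak = n_drives * spinup_w_per_drive + base_system_w
    return peak * headroom


def running_watts(n_drives, active_w_per_drive=6.0, base_system_w=100.0):
    """Steady-state draw once all drives are spinning."""
    return n_drives * active_w_per_drive + base_system_w


print(round(psu_watts_needed(14)), "W supply for 14 drives (assumed figures)")
print(running_watts(14), "W while running (assumed figures)")
```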
 
Re: Storage Server at Lowest Cost

"As you may understand, I have absolutely no computer knowledge at all"
First off - this is a very non-standard project, are you sure you want to attempt it? I'd have a crack at something like this if I needed to, but only because I know how to set up a custom Linux distro and tweak it to deal with this sort of setup, I know the ins and outs of RAID configurations and what's best used where in terms of performance and redundancy, and if the whole thing packs up I can have a good crack at fixing it. If you really have no idea about computers, make sure you've got someone on hand who does. There's only a limited amount of help that can physically be given over a forum.

Regardless, I'll continue with my recommendations. What sort of budget are you working with here? You've kind of danced around hard figures which would be very useful. Building this sort of a system isn't impossible at all, but you're not going to do it for a couple of hundred quid. In fact with 20TB of storage the hard drives alone will likely cost over a grand, there's no way around that!

The approach I'd recommend would be to grab a good RAID card with at least 8 ports and then whack some high capacity drives in there. You could use 2TB drives but if you go for the 8 port card that won't give you close to 20TB of storage. If you want it to be expandable beyond its original capacity, I'd spend a bit more, grab a 16 port RAID controller and fill it up with 3TB drives. Don't forget you'll want to run this configuration in RAID 5 or 6 for redundancy (otherwise you'll lose everything if one drive packs up) so you'll lose the capacity of a single drive (or 2 drives if you go for 6 - this has the advantage that 2 drives can fail without any data loss though.)

Going for a 16 port controller with RAID 6 (what I'd recommend) with 3TB drives would give you a maximum of around 40TB of storage after formatting overheads, which is around double your needs. This won't be that cheap though. The controllers themselves cost the best part of a grand (the 3ware 9650SE-16ML 16 port controller on Ebuyer.com, for instance) and that many 3TB hard drives will likely set you back around 3 grand (though obviously you don't need all the drives to start with.) You'll want to run the OS off an SSD and put plenty of memory in there.
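The capacity figures above can be checked with a short sketch. It assumes the usual rule that RAID 5 gives up one drive's worth of capacity to parity and RAID 6 gives up two; the function names are made up for this example. It also converts decimal terabytes (as printed on the drive box) to the binary units most operating systems report, which accounts for much of the apparent "formatting overhead".

```python
PARITY_DRIVES = {5: 1, 6: 2}  # drives' worth of capacity used for parity


def usable_tb(n_drives: int, drive_tb: float, raid_level: int) -> float:
    """Usable capacity in decimal TB for a single RAID 5 or RAID 6 array."""
    return (n_drives - PARITY_DRIVES[raid_level]) * drive_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


print(usable_tb(8, 2, 5))                          # 8 x 2TB in RAID 5
print(usable_tb(16, 3, 6))                         # 16 x 3TB in RAID 6
print(round(tb_to_tib(usable_tb(16, 3, 6)), 1))    # same array in TiB
```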

As for a case, the cheapest, quickest and perhaps easiest way would be to build one yourself. If you're handy with metal then great, otherwise wood will be fine. Just make sure everything is adequately cooled - the hard drives are green yes, but that many will need proper cooling otherwise you'll significantly decrease their lifespan.

One final note, with this much storage, even with RAID, be sure to back all your important stuff up elsewhere and look hard at the physical security of the build. The thing that people tend to forget in terms of backups when dealing with this much data is that if someone steals the box, or a fire occurs then it's game over - however resilient your hard drives happen to be!

Good luck with the project - ask if you want advice or clarification on anything else :)
 
Hello "We Multiply You"

Indeed what you're asking for does sound...(to be respectful)...different. But hey, go big or go home. lol

The machine you want to build sounds like a server with a hard drive farm installed in it. It's simply a cage with drive after drive filled with data. The largest server tower I know of that supports a lot of drives and is fairly inexpensive is an equipment rack server case. It holds up to 11 full-sized 3.5" hard drives. I have built a couple of servers and it all depends on what your network needs, so if you would like my suggestions I need some input...

How many clients or computers on the network?

Is it a Local Area Network (LAN - briefly computers connected to a switch including the server) or a Wide Area Network (WAN - briefly switches and computers connected to routers connected to other routers)?

What's your estimated budget on the entire project?

Which OS are you most comfortable using? (Windows, Linux/Unix, Solaris, Novell, Macintosh)

What sort of environment are you in? (Business/enterprise or home multimedia/home lab)

What computers and/or equipment do you already possess?

Is it just a "Big Ole" server that you want or an entire network infrastructure to go with it since you mentioned performance?

Other than that I have a decent idea of what you're looking for. It's closer to a SAN than to just one "stand-alone server". A SAN is a Storage Area Network: a dedicated network that connects multiple servers to shared pools of drives.
 
Re: Storage Server at Lowest Cost

Thanks a lot for your reply Berry 120.

I was thinking of using two "Supermicro AOC-SAT2-MV8 8 Channel 300MB/S Per Channel 64 Bit Pci-X Interface Serial ATA Adapter" cards along with the standard ports on the motherboard, which would make 20-22 SATA drives per unit (40-44 TB with 2 TB drives). I would use software-based RAID 1.

I have no idea what the PCI bus transfer speed limits are, which I would need to take into account when choosing a motherboard.
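On the bus question, the theoretical peak of a parallel bus is its width in bytes multiplied by its clock rate. The sketch below applies that to a 64-bit PCI-X slot, assuming it runs at 133 MHz, and to plain 32-bit/33 MHz PCI for comparison. The 100 MB/s per-drive figure is an illustrative assumption, not a measurement, and the function names are invented for this example. Real throughput is below the theoretical peak, and PCI-X slots on the same bus segment share that bandwidth.

```python
def bus_peak_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus in MB/s: bytes per cycle x MHz."""
    return width_bits / 8 * clock_mhz


def drives_at_full_speed(bus_mb_s: float, drive_mb_s: float) -> int:
    """How many drives can stream flat out before the bus saturates."""
    return int(bus_mb_s // drive_mb_s)


ASSUMED_DRIVE_SPEED = 100  # MB/s, illustrative assumption

pcix = bus_peak_mb_s(64, 133)  # 64-bit PCI-X at 133 MHz
pci = bus_peak_mb_s(32, 33)    # classic 32-bit PCI at 33 MHz

print(pcix, drives_at_full_speed(pcix, ASSUMED_DRIVE_SPEED))
print(pci, drives_at_full_speed(pci, ASSUMED_DRIVE_SPEED))
```

Under these assumptions one 8-port card fits within a single PCI-X bus, but two such cards sharing one bus segment would not be able to run all 16 drives at full speed at once.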

I also don't know what power supply to choose for such a configuration, nor what kind of processor.
The AMD Athlon II X4 quad-core looks like a processor with a very good price/computing power ratio; however, even allowing for the computing requirements of software-based RAID, it might be overkill. If it is, the question becomes whether I should stick with it and try to increase the number of drives served per unit, or choose a smaller processor and build a larger number of systems. In financial terms, it seems better to use the most cost-efficient processor and saturate it with as much storage as it can handle than to create more systems with smaller processors. In technical terms, however, this might not be a good solution, as creating a system with more than 100 TB of storage is likely to come with a few issues. First, I might not find a motherboard supporting four, five or more of the SATA adapter cards mentioned above. Second, I wonder whether systems are designed (on either the software or the hardware side) to handle that many drives. Finally, I would need more than Gigabit Ethernet to satisfy the peak networking needs of such a system. I could use two links, but I don't know whether that is possible and, if it is, how data transfer is split between the two.

The case will be custom built to fit into an energy-efficient cooling arrangement, and won't cost more than $30 per unit (or $50 for units of 40-50 drives).

My concerns are more about profitability and cost per unit of storage than about raw costs. Spending $10K on a server is not a problem for me; spending $100 or more per TB is. The cost of building these systems will be financed by the income they generate. All I need is good prospects in terms of final cost per unit of storage.
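Since the target is expressed as cost per TB, a small calculator makes it easy to compare configurations. All prices below are placeholders, not quotes, and the function name is made up for this example. The `usable_fraction` parameter accounts for redundancy: 0.5 for RAID 1, since every drive is mirrored.

```python
def cost_per_usable_tb(n_drives, drive_tb, drive_price,
                       overhead_cost, usable_fraction=1.0):
    """Total system cost divided by usable capacity in TB.

    overhead_cost covers everything that is not a drive
    (motherboard, CPU, controller cards, power supply, case).
    """
    total_cost = n_drives * drive_price + overhead_cost
    usable_capacity = n_drives * drive_tb * usable_fraction
    return total_cost / usable_capacity


# Placeholder figures: 20 x 2 TB drives at $90 each, $600 for the rest.
print(cost_per_usable_tb(20, 2, 90, 600))                       # no redundancy
print(cost_per_usable_tb(20, 2, 90, 600, usable_fraction=0.5))  # RAID 1
```

Whatever the real prices turn out to be, mirroring doubles the cost per usable TB, which is worth weighing against the per-TB ceiling stated above.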

I don't have any experience with Linux at all, but I learn fast, so if that is the solution I should use, I will.

These storage systems are to be used by different servers. Those other servers will be designed for very high speed and computing power, and will access the storage systems on behalf of their applications. The storage might be accessed simultaneously by thousands of users, with the processes through which they access it handled by the high-performance servers. The storage systems themselves are designed only to serve all their hard drives at their maximum transfer speeds.

Another question: Is it possible to overlay several software based RAID arrays?

I would need RAID 1 arrays for data redundancy; however, I would also need to spread every user's data across all the drives of the system in order to maximize performance. I would like to split the load equally among all drives, to avoid the random case where several users whose data sits on the same drive access it at the same time, each getting 1/n of that drive's transfer speed while other drives sit idle. Ideally, all users would get approximately the same transfer speed at any given moment (not a limit that I impose, but the effective speed they actually get), assuming they all have equal systems and network speeds. The transfer speed available to each user would then depend on the overall load of the data server rather than on the random allocation of user data to drives.
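Striping data across mirrored pairs is commonly called RAID 10 (or RAID 1+0). The sketch below is a simplified model of the idea, not the layout of any particular implementation: data is cut into fixed-size chunks, and consecutive chunks go to consecutive mirror pairs, so a large file ends up spread over every pair. The function names are invented for this example.

```python
from collections import Counter


def locate_chunk(chunk: int, n_pairs: int) -> tuple:
    """Map a logical chunk number to (mirror pair, chunk offset in that pair)."""
    return chunk % n_pairs, chunk // n_pairs


def load_per_pair(chunks, n_pairs: int) -> Counter:
    """Count how many of the given chunks land on each mirror pair."""
    return Counter(locate_chunk(c, n_pairs)[0] for c in chunks)


# A file occupying chunks 0..99 on a system with 10 mirrored pairs:
load = load_per_pair(range(100), n_pairs=10)
print(dict(load))  # each pair holds the same share of the file
```

Because each user's data touches every pair, two users reading at once share all drives rather than colliding on one.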

I would also like to know whether it is possible to mirror the data on different servers at different locations without it costing me more bandwidth than I normally need.
For example, instead of having a RAID 1 at location 1, I could simply have a server X at location 1 and a server X' (an exact copy of server X) at location 2. This way I would have the same protection against data loss from drive failure, plus protection against physical threats like fire, theft or power failure, for the same price. I would also maximize network efficiency. If the network load on server X is very high at some point compared with the load on the other servers, I will be limited by the network speed of location 1 while the servers at location 2 might be using only a small share of that location's network speed. With mirrored server locations, my total bandwidth at any moment is the sum of the bandwidth at each location and, just as with the drives, I am limited by my total network capacity rather than by the random placement of data on servers.
This mirroring, however, would need to be designed to use exactly (or almost exactly) the same bandwidth as would normally be used. It cannot be designed so that data is received and written at location 1 and then mirrored to location 2, as this would cost me double the bandwidth and bring no performance benefit. I would need the data to be read and written at both locations at the same time. That means that if a user at location 3 wants to save data on the servers, he would write the data to locations 1 and 2 at the same time, and read it from both at the same time. This would give him transfer speeds equivalent to writing to a RAID 1 array over the network. However, it cannot be the user who sends the data to the two locations, as this would mean sending the information twice and thus halving his own transfer speed. The transfer would need to be dispatched in a way that does not divide bandwidth at any location (if that is possible).
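A simple bookkeeping sketch helps frame the bandwidth question. It compares the two write schemes described above and counts how much data crosses each party's link. The scheme names and the function are made up for illustration, and the model ignores protocol overhead. What it shows is that both locations must receive one full copy in either case, so the only choice is whose uplink carries the second copy.

```python
def write_traffic(data_gb: float, scheme: str) -> dict:
    """GB sent or received by each party when data_gb is written to two sites."""
    if scheme == "client_sends_twice":
        # The user at location 3 uploads one copy to each location.
        return {"client_out": 2 * data_gb, "site1_in": data_gb,
                "site1_out": 0.0, "site2_in": data_gb}
    if scheme == "site1_forwards":
        # The user uploads once; location 1 relays the copy to location 2.
        return {"client_out": data_gb, "site1_in": data_gb,
                "site1_out": data_gb, "site2_in": data_gb}
    raise ValueError(f"unknown scheme: {scheme}")


for scheme in ("client_sends_twice", "site1_forwards"):
    print(scheme, write_traffic(10.0, scheme))
```

Reads are different: since both locations hold a full copy, a read can be served by either one, or half by each, without any extra traffic.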

Hi supernerd,

Thank you very much for your answer. Because I didn't get any answer in this part of the forum, I thought I might have posted in the wrong place and opened a similar topic here: http://www.computerforums.org/server-administration/storage-server-lowest-cost-104220.html

You'll find most of the answers to your questions in my other thread, but I will answer them here as well.

How many clients or computers on the network?
Thousands. The storage server will be used by high performance servers, which themselves will be accessed by thousands of clients via the internet.

What's your estimated budget on the entire project?
There is no fixed budget for the project; it is rather a question of profitability. In large volumes I would like to end up with a total cost of less than $40/TB.

Which OS are you most comfortable using?
The only OS I know is Snow Leopard Server, which I like a lot (though I haven't used all its functionalities yet).

What sort of environment are you in?
I am starting a project which will grow to a large business size if successful and finance its own development along the way.

What computers and/or equipment do you already possess?
I have a MacBook Air, a Mac Mini Server, a 17" iMac i7 and an ASUS F3J PC with Windows 7.

Is it just a "Big Ole" server that you want or an entire network infrastructure to go with it since you mentioned performance?
It's an entire infrastructure. What I want to build here is just one of several units of storage accessed by several different servers, several different applications, and many different clients (both from the local network and the internet).

Thanks again for your answer and for trying to help me. I will post this on my second thread to bring this information there as well.
 
Re: Storage Server at Lowest Cost

As a heads up, don't post the same thing more than once - it just makes things annoying as the information is spread all over the place. Perhaps the other thread can be closed? If you've posted in the wrong section a mod will come along and move it.

You say you would use RAID 1 - is this just because you've heard it's good for redundancy or because you've made an informed choice? You lose HALF the capacity of your drives with this setup and it only provides resilient protection against losing one drive. Compare that with RAID 6 which gives you foolproof protection against losing 2 drives whilst only losing the capacity of 2 drives. It's a much better option IMO.

I'd also question your controller choice - you realise that PCI-X is not the same as PCI Express, right? And that most motherboards nowadays don't support the former?

A last point - you seem to be going in the direction of creating several RAID arrays and trying to join them together. This is messy at best and potentially lethal at worst (what happens when the controller on the motherboard dies and you can't get a replacement because it's 10 years down the line and no-one makes it anymore?)
 
Re: Storage Server at Lowest Cost

Mac OS X Server doesn't support software RAID 6, so I don't have any experience with it. I don't understand: if every piece of data has a duplicate somewhere, then no matter how it is duplicated the system has to use twice the capacity, doesn't it?

Is there a similar SATA adapter working on PCI Express?

I don't want to use hardware based RAID so if I do overlay RAID arrays it will be software based.
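On the capacity question: parity-based levels such as RAID 5 do not store a second copy of the data. They store an XOR of the data blocks, from which any single missing block can be recomputed. The toy sketch below shows this with three data blocks and one parity block, so the overhead is one block in four rather than one in two. RAID 6 adds a second, differently computed parity block, which is not shown here. The helper name and block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if striped across three drives.
data = [b"AAAA", b"BBBB", b"CCCC"]

# A fourth drive holds only the parity block, not a copy of the data.
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild it from the survivors.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)

print(recovered == data[1])
```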
 