The guestbook is open.
http://users.skynet.be/binabik
External memory from my trainings. I used to teach Business Objects, Internet Development and Hardware at Xylos NV (http://www.xylos.com)
I put a small website online for our son Jarne. There are some pictures from his birthcard there, plus a photogallery. Enjoy.
So as it turns out, Raid 5 is a very good Raid level, but it is not very good at writing. Raid 1+0 is very good, but it is rather expensive. Is that all there is to it ? No. If you have read one of my previous articles, you might remember that a hard drive’s speed is influenced by Latency and Seek. If you combine that story with this one, you will find out there is more to it.
The two main ways of communicating with a disk are Sequential and Random. In a sequential environment, large chunks of data are transferred (e.g. 64K, 128K or bigger). This occurs mostly on file and print servers. In a random environment, large numbers of small chunks of data are transferred to the disk (e.g. 4K). This occurs mostly on a database server. If you think a little further on this, you might consider that Raid 5 has to do about twice the work of Raid 1+0 to write a chunk of data: Raid 5 has to read the existing data, read the existing parity, calculate the new parity, write the new data and write the new parity, which makes four disk operations where Raid 1+0 needs only two writes.
In that case you might come to the same conclusion as database manufacturers and database gurus : Raid 5 is not a good thing to combine with a database. A database floods the Raid 5 with small chunks of data, and that poor Raid 5 has to do twice the work a Raid 1+0 has to do when writing.
A clear conclusion is : Avoid Raid 5 in a database environment. But most people forget that Exchange is also a database.
My recommendation : If you are going to install a server, and you are considering installing 3 disks in Raid 5 and putting your Exchange server on it, why not think a little further, add 1 extra disk, and use Raid 1+0 instead ? The result is remarkable : twice the speed, twice the fault tolerance, twice the speed at reconstruction, and more than twice the speed during a disk failure.
So you want to use popups ? Of course you do, opening a window is a right for everybody. But you might have discovered that there are things around called popup blockers.
Both Internet Explorer and Mozilla Firefox include tools to block popups. Furthermore, software exists that blocks all popups. As a programmer, it is impossible to take that last kind into account.
So, what is the difference between good and bad popups ?
When an html document is opened in the browser, much depends on the zone where a page is opened. If a document is opened in the zone of the local computer, there is a small difference between Internet Explorer and Firefox. Internet Explorer will consider all scripting in a page that is launched in the Local Computer Zone as a potential hazard, and will show a warning.
Firefox does not do this. Once the page is put on the internet, this problem does not exist (example).
A good popup is launched by a click event or a doubleClick event.
A bad popup is launched by an onLoad event, onMouseOver event, onContextMenu event or onMouseOut event.
This means that most “accidental” calls of popups are blocked. The only one that surprised me somewhat is the onContextMenu. But then again, right clicking a link is something we do to force the opening of a link in a different target window, so it makes sense.
Ask 10 system engineers, which is the best Raid level and 8 will answer : Raid 5. The other two will ask the smart question :“for which application ?”
This is the start of a somewhat long article I think. The discussion is long and interesting.
When you try to evaluate the different Raid levels, you should be concerned with the kind of data transfers that will occur on it. Is it going to be a system with mostly Sequential transfers like a File-Server ? or is it going to be a system with mostly Random transfers like a Database Server ? Or is it going to be a mixture, and in that mixture, which kind of transfer occurs most ?
First, let’s assume we are going to perform 400 Writes and 600 Reads on a series of Raidsets. This means that the system will be working with mostly Reads, but still has quite some Writes to deal with as well.
What would be the result for a Raid 0 ? (Afterwards we can compare with Raid 1+0 and Raid 5.)
Let’s assume we have three Raid 0 Raidsets, each totalling 72 GB of net storage.
In a Raid 0, every Read and every Write costs exactly one IO, so the total number of IO’s (for 400 Writes and 600 Reads) is 1000.
When you only have 2 disks, you split those IO’s across the 2 disks, giving a total of 500 IO’s / disk.
When we have 4 disks, we gain some speed because the total number of IO’s can be split across more disks. This now gives you 250 IO’s / disk.
With 8 disks, the result will be 125 IO’s / disk.
2 disks of 36GB = 1000/2 = 500 IO’s per disk
4 disks of 18GB = 1000/4 = 250 IO’s per disk
8 disks of 9GB = 1000/8 = 125 IO’s per disk
And what about Raid 1+0 ?
In a Raid 1+0, everything that has to be written has to be written twice. Reads can be performed across all disks.
Since we now have to take the double writing into account, the number of IO’s to process is (400*2)+600 = 1400 IO’s. With 4 disks, that is 1400/4 = 350 IO’s / disk.
With 8 disks, this will give you 175 IO’s / disk. With 16 disks : about 87 IO’s / disk.
4 disks of 36GB = 1400 /4 = 350 IO’s per disk
8 disks of 18GB = 1400/8 = 175 IO’s per disk
16 disks of 9GB = 1400/16 = 87.5, so about 87 IO’s per disk
And Raid 5 ?
Raid 5 is a different story. In Raid 5, each time you write something, the parity has to be calculated and written to the disks. But more importantly, to be able to write the new data and the new parity, a Raid 5 first has to read the data and parity that already exist on the disk, recalculate the parity, and only then write both back. That makes four IO’s for every Write (read data, read parity, write data, write parity), so the number of IO’s on Raid 5 for this same workload is much higher.
For 600 Reads and 400 writes, Raid 5 will take (4 * 400) + 600 = 2200 IO’s
3 disks of 36GB = 2200 / 3 = 733 IO’s / disk
5 disks of 18GB = 2200/5 = 440 IO’s / disk
9 disks of 9 GB = 2200/9 = 244 IO’s / disk
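The arithmetic above can be checked with a few lines of Python. This is only a sketch of the article's own simplified model (one IO per Read, and 1, 2 or 4 IO’s per Write depending on the Raid level); the function and dictionary names are made up for the illustration, and the model ignores controller caches and full-stripe writes.

```python
# Back-end IOs per Write for each Raid level, as used in the article:
# Raid 0 writes once, Raid 1+0 writes twice, Raid 5 needs four IOs
# (read data, read parity, write data, write parity).
# The names below are hypothetical and only serve this illustration.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4}


def ios_per_disk(level: str, reads: int, writes: int, disks: int) -> float:
    """Return the average number of IOs each member disk has to handle."""
    total_ios = reads + writes * WRITE_PENALTY[level]
    return total_ios / disks


if __name__ == "__main__":
    reads, writes = 600, 400
    # Disk counts per level, matching the 72 GB Raidsets in the text.
    layouts = {"raid0": [2, 4, 8], "raid10": [4, 8, 16], "raid5": [3, 5, 9]}
    for level, disk_counts in layouts.items():
        for disks in disk_counts:
            load = ios_per_disk(level, reads, writes, disks)
            print(f"{level:6} {disks:2} disks: {load:6.1f} IOs/disk")
```

Changing `reads` and `writes` shows how quickly Raid 5 falls behind once the share of Writes goes up.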
So, it is true that 3 disks in Raid 5 is not an expensive way to get fault tolerance… but is it really cheap ? If I add 1 extra disk to it and create a Raid 1+0 instead of a Raid 5, I get the same capacity, I double my speed and I double my fault tolerance. (In a Raid 1+0 with 4 disks, when you have a second disk failing, you still have a 2/3 chance that your system is up and running. With Raid 5, your system is down.)
More on a next issue.
For years I’ve been trying to explain the difference between Raid 0+1 and Raid 1+0, only to find a lot of people thinking it is the same thing. Here is the difference:
In a Raid 0+1 system, first a Raid 0 is applied on the disks. The result of this operation is speed. Next, on top of the Raid 0 (usually done by software) a second Raid is applied, which is a Raid 1. This second Raid is there for security. When you look at this configuration, you might think everything is nice and dandy, but this is actually the worst kind of config, and it is never put on hardware Raid controllers. This is very bad. First : Security. When one disk fails, the system has 2 disks down (both halves of a Raid 0), which means that only the remaining Raid 0 is still on-line. If a disk fails in that remaining Raid 0, you are in deep trouble. So only the second disk of the already failed Raid 0 can fail (1 chance out of 3) with your system still working. Second : Speed. Again, if the system goes bad and a disk fails, you are working at half the speed. And when you restore, the system has to copy the Raidset from one Raid 0 onto the other Raid 0 (all disks have to work).
A much cleaner way to do this is Raid 1 + 0 :
Two Raid 1 sets are created. Across those two Raid 1 sets, a Raid 0 is created. The result may not be obvious at first, but if a disk fails, the system still works on 3 disks (faster), and if a second disk fails, you have 2 chances out of 3 that your system is still ok. When you are rebuilding, it just copies 1 disk (the surviving disk of the failed mirror) to the other.
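The "1 chance out of 3" and "2 chances out of 3" figures can be verified by simply enumerating every possible pair of failing disks in a four-disk array. The sketch below does that in Python; the disk labels and function names are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels: A and B are the two halves of a stripe,
# 1 and 2 are the two copies of each half.
# Raid 1+0: mirrors (A1, A2) and (B1, B2), striped together.
# Raid 0+1: stripes (A1, B1) and (A2, B2), mirrored together.
DISKS = ("A1", "A2", "B1", "B2")


def raid10_alive(failed: set) -> bool:
    # Every mirror pair needs at least one surviving disk.
    return not {"A1", "A2"} <= failed and not {"B1", "B2"} <= failed


def raid01_alive(failed: set) -> bool:
    # At least one complete stripe set must be untouched.
    return not ({"A1", "B1"} & failed) or not ({"A2", "B2"} & failed)


def survival_after_two_failures(alive) -> Fraction:
    """Chance that the array is still up after two different disks fail."""
    cases = list(permutations(DISKS, 2))
    survived = sum(1 for first, second in cases if alive({first, second}))
    return Fraction(survived, len(cases))


if __name__ == "__main__":
    print("Raid 1+0:", survival_after_two_failures(raid10_alive))
    print("Raid 0+1:", survival_after_two_failures(raid01_alive))
```

The enumeration treats every disk as equally likely to fail, which is the same assumption the text makes.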
This is the kind of Raidset which is usually implemented in hardware Raid controllers (if yours does not… get another).
HP (Compaq) has been advertising for years that they were using 0+1 whereas they were really using 1+0 which is far superior.
You can tell very easily if your system is using 0+1 or 1+0 : remove a disk and put it back. If only 2 disks are working to perform the restore, it is 1+0. Otherwise it is 0+1.
In Body and In Report.
A nice quirk in Business Objects, is the calculation in %. It adds an extra column to your table, which contains a formula :
=<Sales revenue>/Sum(<Sales revenue>) ForAll <Year>
which is based on the ForAll operator. The result is the following table :
When you add a dimension to this table (let’s say the Quarter)
I don’t know, but I think my counting still works.. it now says 100 % where it should be closer to 450 %. This is due to the fact that the context operator is still the same. It should now be
=<Sales revenue>/Sum(<Sales revenue>) ForAll (<Year> , <Quarter>). But Business Objects does not update the formula. What you can do now is remove the column with the calculation and reinsert it. But hey, there is a better way :
Insert the following formula :
=<Sales revenue> In Body /Sum(<Sales revenue>) In Report
This means that BO will now take the sales revenue in the Body of the table, which is 2.6 million for the first item, 2.27 million for the second item etc., and divide that number by the total Sales Revenue of the whole report. I wonder why the people who built this program did not use this particular formula. I found it in their own documentation. Oh.. before I forget.. here is the result.
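For readers without Business Objects at hand, the arithmetic behind In Body / In Report can be imitated in a few lines of Python. This is only an analogy, not how Business Objects works internally, and the revenue figures are invented for the illustration (they are not the numbers from my report).

```python
# Hypothetical rows: (year, quarter, sales revenue).
ROWS = [
    ("2001", "Q1", 250.0),
    ("2001", "Q2", 150.0),
    ("2002", "Q1", 350.0),
    ("2002", "Q2", 250.0),
]


def share_of_report_total(rows):
    """Divide each row's revenue (the 'In Body' value) by the grand total
    of all rows (the 'In Report' value)."""
    report_total = sum(revenue for _, _, revenue in rows)
    return [(year, quarter, revenue / report_total)
            for year, quarter, revenue in rows]


if __name__ == "__main__":
    for year, quarter, share in share_of_report_total(ROWS):
        print(f"{year} {quarter}: {share:.0%}")
```

Because the divisor is always the total of the whole report, the shares add up to 100 % no matter how many dimensions the table contains.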
More on a next item !
ForEach and ForAll allow you to add dimensions to, or remove dimensions from, a calculation.
When you look at the table to the left, you will see that the third column shows the same as the column next to it. The third column contains Min(<Sales revenue>), yet it returns the same value, which is logical. The context of the calculation is just “Year”, so the minimum Sales Revenue is based on a single number and returns only that number.
With the ForEach operator, we can now add a dimension to the context. The formula is now : Min(<Sales revenue> foreach <Quarter>). The minimum is now calculated per Quarter as well. Returning the following table :
This means, that (for the calculation) the quarter has been included in the calculation.
ForAll is the exact opposite. If you have a table containing Year, Quarter, Month and Sales Revenue, then Min(<Sales revenue> ForAll <Month>) will remove Month from the equation and return the minimum by Year and Quarter only.
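The idea of adding or removing a dimension from the calculation context can be imitated in Python. What follows is a rough analogy under my own assumptions, not the way Business Objects evaluates contexts internally; the data and the function names are invented for the illustration.

```python
from collections import defaultdict

# Hypothetical data, one row per month.
ROWS = [
    {"year": 2001, "quarter": "Q1", "month": "Jan", "revenue": 10.0},
    {"year": 2001, "quarter": "Q1", "month": "Feb", "revenue": 20.0},
    {"year": 2001, "quarter": "Q2", "month": "Apr", "revenue": 5.0},
    {"year": 2001, "quarter": "Q2", "month": "May", "revenue": 15.0},
    {"year": 2002, "quarter": "Q1", "month": "Jan", "revenue": 30.0},
    {"year": 2002, "quarter": "Q1", "month": "Feb", "revenue": 10.0},
    {"year": 2002, "quarter": "Q2", "month": "Apr", "revenue": 25.0},
    {"year": 2002, "quarter": "Q2", "month": "May", "revenue": 35.0},
]


def revenue_by(rows, dims):
    """Sum the revenue for every combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["revenue"]
    return dict(totals)


def min_foreach(rows, table_dims, extra_dim):
    """Rough analogue of Min(<Sales revenue> ForEach <extra_dim>):
    sum at the finer level first, then keep the minimum per table row."""
    result = {}
    for key, total in revenue_by(rows, table_dims + [extra_dim]).items():
        outer = key[:len(table_dims)]
        result[outer] = min(total, result.get(outer, total))
    return result


def forall(rows, table_dims, dropped_dim):
    """Rough analogue of a ForAll context: one dimension is left out."""
    return revenue_by(rows, [d for d in table_dims if d != dropped_dim])


if __name__ == "__main__":
    print(revenue_by(ROWS, ["year"]))              # context is just Year
    print(min_foreach(ROWS, ["year"], "quarter"))  # Quarter added
    print(forall(ROWS, ["year", "quarter", "month"], "month"))  # Month removed
```

With only Year in the context there is a single number per year, so a minimum changes nothing. Adding Quarter gives the smallest quarter of each year, and dropping Month leaves one value per Year and Quarter.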
Next issue :
In Body and In Report, which are interesting when you use percentages.