Carl Stalhood

Wednesday 22 May 2013

Data store & data collector

http://blog.srinfotec.com/?p=653
What is Data Store?
The data store holds the farm's static information, which changes only when you make a configuration change to the farm, such as:
  • Farm configuration information
  • Published application configurations
  • Server configurations
  • Citrix administrator accounts
  • Printer configurations
What is Data Collector?
A data collector is a server that hosts an in-memory database that maintains dynamic information about the servers in the zone, such as:
  • Server loads
  • Session status
  • Published applications
  • Connected users
  • License usage
Data collectors receive incremental data updates and queries from servers within the zone. Data collectors relay information to all other data collectors in the farm. By default, the first server in the farm functions as the data collector.
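To make the idea concrete, here is a minimal toy model of a data collector, assuming nothing about how Citrix actually implements it. The class name DataCollector, its fields, and the method names are all hypothetical; the sketch only illustrates an in-memory store that accepts incremental updates, relays them to the other data collectors, and answers queries from memory.

```python
from dataclasses import dataclass, field


@dataclass
class DataCollector:
    """Toy stand-in for a zone data collector (hypothetical, not Citrix code)."""
    zone: str
    # Dynamic, in-memory state keyed by server name.
    state: dict = field(default_factory=dict)
    # Data collectors of the other zones in the farm.
    peers: list = field(default_factory=list, repr=False)

    def receive_update(self, server: str, changes: dict, relay: bool = True) -> None:
        # Merge only the fields that changed (an incremental update).
        self.state.setdefault(server, {}).update(changes)
        if relay:
            # Pass the change on to every other data collector exactly once.
            for peer in self.peers:
                peer.receive_update(server, changes, relay=False)

    def least_loaded(self) -> str:
        # Answer a query from the in-memory state.
        return min(self.state, key=lambda s: self.state[s].get("load", 0))


east = DataCollector("east")
west = DataCollector("west")
east.peers.append(west)
west.peers.append(east)

east.receive_update("srv1", {"load": 70, "sessions": 12})
east.receive_update("srv2", {"load": 30, "sessions": 4})
east.receive_update("srv1", {"load": 20})  # only the load changed
```

After the last call, both collectors hold the merged record for srv1, because the update to the load field did not overwrite the session count and was relayed to the peer.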

How big can a zone get?

There's really no technical limit on the number of Presentation Servers that can be in one zone. In fact the product will support up to 512 servers in a single zone right out of the box, and a quick registry key change can let you go higher than that. In the real world, you'll probably have other reasons (multiple sites, etc.) to split such a large group of servers into multiple zones long before that.
Then again, if you have 1000 or 1500 servers in the same data center, there's no real reason you can't have them all in one or two zones. It's just a matter of looking at the traffic patterns. For instance, do you want one single data collector updating 1000 servers whenever you make a change to the environment (one zone), or do you want two data collectors to each update only 500 servers (two zones)?

Zone Strategy

Now that we've discussed what zones are, how they work, and the mechanics of what happens when you create multiple zones, let's talk about strategy. You'll have to decide:
  • How and where you break up your farm into zones
  • Whether you create dedicated data collectors

How many zones?

The main decision you'll need to make is how many zones you'll have. (Or, put another way, where your zone boundaries are going to be.) Like everything else in your environment, there are several things to take into consideration when making this decision:
  • Where your users are
  • How your users connect
  • How your farm database is set up
Remember, the primary purpose of creating a new zone is to create an additional data collector. Additional data collectors mean that you can put a data collector closer to your users, essentially "pre-caching" the session information that users need to start their sessions near those users.
That said, keep in mind that data collectors also help to distribute configuration changes throughout your farm, as each data collector sends changes to all of the servers in its zone. Imagine you have a single farm across two sites. Each site has about 50 servers.
Figure xx [one farm, two sites, 50 servers on each site]
If you create a single zone, whenever a configuration change is made to your farm, the zone data collector will push that change (via the IMA protocol on TCP port 2512) to all of the servers in the farm, meaning that change will go across the WAN 50 times (once to each server at the remote site).
Figure xx [show the same environment as the previous figure, with ZDC on one side pushing the change out to all the servers individually]
On the other hand, if you split this environment into two zones, any configuration change would only traverse the WAN once, since the data collector in the zone that made the change would only have to push it out to the data collector in the other zone. Then it's up to that data collector to push the change out to the servers in its zone.
Figure xx [ show this ]
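The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a measurement of real IMA traffic; the function name and the site labels are made up for the example.

```python
def wan_crossings_per_change(remote_servers_by_site: dict, zone_per_site: bool) -> int:
    """Count how many times one configuration change crosses the WAN.

    remote_servers_by_site maps each remote site to its server count;
    the site where the change originates is excluded.
    """
    if zone_per_site:
        # One copy to each remote zone's data collector, which fans it out locally.
        return len(remote_servers_by_site)
    # Single zone: the data collector pushes to every remote server itself.
    return sum(remote_servers_by_site.values())


remote = {"site_b": 50}
single_zone = wan_crossings_per_change(remote, zone_per_site=False)
two_zones = wan_crossings_per_change(remote, zone_per_site=True)
print(single_zone, two_zones)
```

For the two-site farm described above, this gives 50 crossings with a single zone and 1 crossing with a zone per site.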
In the previous example, it's pretty easy to see that you'd want to split this farm into two zones. In fact, even if you had only five servers at each site, you'd still probably want to divide this farm into two zones. (Although with only five servers in each site, you probably wouldn't have dedicated data collectors.)
Advantages of each site being its own zone
  • User connections could be faster
  • Updates only go across the WAN once (to the data collector), instead of many times (once to each server).
With several advantages to splitting your farm into multiple zones, why not just split everything? Unfortunately, there's a point at which this stops being feasible. Remember how this "zones" section began, with a quick historical look at the Program Neighborhood service from the MetaFrame 1.x days? That was not scalable because every server updated every other server, an architecture which quickly led to too many connections between servers. The same thing can happen today between zones. Every session event must be transmitted by one data collector to the data collectors in every other zone. So if you have a Presentation Server farm that spans 40 physical sites, you can't realistically make 40 separate zones, because every time a user logged on, logged off, connected, or disconnected, the data collector from their zone would have to update all 39 data collectors in the other zones!
Disadvantages of making a lot of zones
  • Data collectors need to send all session events to all other data collectors. More zones means more data collectors. More data collectors means more traffic for every session event.
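To get a feel for how this grows, here is a rough sketch that counts the messages sent between data collectors, assuming each session event is sent once to the data collector of every other zone. The function name and the figure of 1000 events are illustrative only.

```python
def interzone_updates(zones: int, session_events: int) -> int:
    """Messages sent between data collectors for a number of session events.

    Each event is sent by the originating data collector to the data
    collector of every other zone, i.e. (zones - 1) messages per event.
    """
    if zones < 1:
        raise ValueError("a farm has at least one zone")
    return session_events * (zones - 1)


for zones in (1, 4, 10, 40):
    print(zones, interzone_updates(zones, session_events=1000))
```

With 40 zones, every single logon, logoff, connect, or disconnect turns into 39 inter-zone messages, which is the scaling problem described above.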
From a strategic standpoint, you have to balance the number of zones that you create. Citrix has never gone on the record with a hard recommendation of how many zones you can have. In all honesty it depends on your environment. (It depends on the same factors discussed earlier for deciding whether you should build a dedicated data collector: number of user connections, number of applications, etc.)
That said, it's probably safe to say that you can build three or four zones no problem. And it's probably also safe to say that once you get above ten zones, you need to really do some testing to make sure your environment can handle it. Anything in-between is probably doable, but again you'd want to test to make sure.
Remember, one of the main testing points needs to be whether your servers and the architecture of the Citrix products can handle what you want to do. None of this stuff puts huge amounts of bits on the WAN. It's more about whether you can build an environment that's responsive and scalable enough for your needs.