The business case for Exchange 2007 – part IV

Another installment in a series of posts outlining the case for going to Exchange 2007. Previous articles can be found here.

GOAL: Make flexible working easier

“Flexible Working” might mean different things to different organisations – some might think of mobile staff who turn up at any office with a laptop, sit at any free desk and start working; others might imagine groups of workers who can work from home part- or even full-time. Whatever your definition, there’s no doubt that the technology which can enable these scenarios has come on in great strides in recent years.

RPC Over HTTP – magic technology, even if the name isn’t

The “Wave 2003” of Exchange Server 2003/Outlook 2003/Windows XP SP2/Windows Server 2003 brought to the fore a technology which wasn’t really new, but needed the coordination of server OS, server application, client OS and client application to make it available: if you’ve been using or deploying RPC/HTTP, you’ll know exactly what it does and why it’s cool. If you haven’t deployed it, the name might mean nothing to you… in short, the way in which Outlook talks to Exchange Server when you’re on the internal network can be wrapped up within a secure channel that is more friendly to firewalls – “tunneling” that protocol (RPC) inside a stream of data which your firewall can receive (HTTP, or more correctly, HTTPS).

What this means in practice is that your users can connect to your environment using a widely-supported network mechanism (ie HTTPS), without requiring a Virtual Private Network connection to be established first. As soon as a user’s PC finds a connection to the internet, Outlook will attempt to connect to your network using HTTPS; if it succeeds, it will become “online” with Exchange and (if they’re using the default “cached mode” of Outlook) will synchronise any changes between Outlook and Exchange since the client was last online.


A sometimes overlooked benefit of using regular internet protocols to connect the client and servers together is that the communication can leave one protected network, traverse the unprotected internet within a secure channel, then enter a second protected network. This means that (for example) your users could be connected to a customer or partner’s own internal network, yet still be able to go through that network’s firewall to reach your Exchange server. If you required a VPN to connect Outlook and Exchange, it almost certainly wouldn’t be possible to use a protected network as your starting point, since the owners of that network will not typically allow the outbound connections that VPN clients use – but they will allow outbound connections on HTTPS.

Now, RPC/HTTP was part of Outlook and Exchange 2003; however, it’s been improved in Exchange 2007 and is easier to get up and running. If you’re also using Outlook 2007, the client configuration is a whole lot simpler – even if it’s the first time a user has ever connected to Exchange, all they may need to know is their email address and password, and Outlook will find the Exchange server and configure itself using whatever defaults you’ve set. The technology behind this ease of configuration is called the Autodiscover service, and the whole area of “connecting over the internet” functionality has been given a more descriptive (to the non-techies, anyway) name: Outlook Anywhere.
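For what it’s worth, the server-side setup boils down to a couple of Exchange Management Shell commands – a rough sketch only (the server and host names are examples, and exact parameter names vary a little between RTM and SP1):

Enable-OutlookAnywhere -Server "CAS01" -ExternalHostname "mail.contoso.com" -DefaultAuthenticationMethod "Basic" -SSLOffloading $false
# Tell Autodiscover where domain-joined clients should look when they're on the internal network
Set-ClientAccessServer -Identity "CAS01" -AutoDiscoverServiceInternalUri "https://mail.contoso.com/Autodiscover/Autodiscover.xml"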

From an end-user point of view, this technology is almost silent – remote laptop users working at home often just start up their laptop, which connects automatically to a home wireless network and out to the internet, then Outlook goes straight to Exchange and they’re online. Deploying this technology within Microsoft saw the volume of VPN traffic drop dramatically, and calls to the help desk concerning remote access fell significantly too.

NET: Using Outlook 2007 and Exchange 2007 together simplifies the provision of remote access to users, particularly when Outlook runs in “cached mode”. This configuration reduces, or even removes, the need to provide Virtual Private Network access, which can make the user experience better and save management overhead and expense.

Web client access instead of Outlook

Another element of flexible or remote working might be using the web to get to email – maybe your remote users just want to quickly check email or calendar on their home PC, rather than using a laptop. Maybe there are workers who want to keep abreast of things when they’re on holiday, and have access to a kiosk or internet-cafe type PC. Or perhaps your users are in their normal place of work, but don’t use email much, or don’t log in to their own PC?

Outlook Web Access has been around for a number of versions of Exchange, and just gets better with every release. The 2007 version has added large areas of functionality (like support for the Unified Messaging capability in Exchange, or huge improvements in handling the address book), meaning that for a good number of users, it’s as functional as they’d need Outlook to be. It’s increasingly feasible to have users accessing OWA as their primary means of getting to Exchange. One possible side benefit here is a licensing one – although you’d still be required to buy an Exchange Client Access License (which gives the user or the device the right to connect to the server), you wouldn’t need to buy Outlook or the Microsoft Office suite.

Outlook Web Access not only gives the web user the ability to use email, calendar etc, but it can also provide access to internal file shares and/or Sharepoint document libraries – the Exchange server will fetch data from internal sources and display it to the reader within their browser. It can also take Office documents and render them as HTML – so a spreadsheet or document can be read on a PC with no copy of Office installed, or simply read without needing to download a copy of that document for client-side rendering in an application.

It’s possible to control what happens to attachments within OWA – some organisations don’t want people to be able to download attached files, in case they leave copies of them on public PCs such as those in internet cafes – how many users would just save the document to the desktop, and maybe forget to delete it? Using server-side rendering of documents, all traces of the document are removed when the user logs out or has their connection timed out.
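The attachment behaviour is controlled per OWA virtual directory – a hedged example of the sort of settings described above (the server and directory names are illustrative, and you’d want to test the effect on users before rolling it out):

Set-OwaVirtualDirectory -Identity "CAS01\owa (Default Web Site)" `
    -DirectFileAccessOnPublicComputersEnabled $false `
    -WebReadyDocumentViewingOnPublicComputersEnabled $true `
    -ForceWebReadyDocumentViewingFirstOnPublicComputers $true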

Even for predominantly office-based users, OWA can provide a good way of getting to mail from some other PC, without needing to configure anything or log in to the machine – in that respect, it’s just like Hotmail, where you go to a machine and enter your username and password to access the mail, rather than having to log in to the whole PC as a given user.

If you deploy Outlook Anywhere (aka RPC/HTTP), you’ll already have all the infrastructure you need to enable Outlook Web Access – it uses the same Exchange Client Access server role (in fact, in Microsoft’s own deployment, “Outlook Anywhere” accounts for about 3/4 of all the remote traffic, with the rest being made up of OWA and Exchange Activesync).

NET: Outlook Web Access gives a functionally rich yet easy-to-use means of getting to data held on Exchange, and possibly elsewhere on the internal network, over a secure channel to an external web browser. OWA 2007 has replicated more of Outlook’s functionality (such as great improvements to accessing address books), such that users familiar with Outlook will need little or no training, and users who don’t have Outlook may be able to rely on OWA as their primary means of accessing mail.

Mobile mail with ActiveSync

Exchange 2003 SP2 and an update to Windows Mobile 5 introduced the first out of the box “push mail” capability for Exchange, which forms part of the Microsoft Exchange Activesync protocol that’s also licensed to a number of other mobile device vendors. This allows Exchange to use the same infrastructure that’s already in place for Web access and for Outlook Anywhere, to push mail to mobile devices and to synchronise other content with them (like calendar updates or contact information). The Exchange Activesync capability in Exchange 2007 has been enhanced further, along with parallel improvements in the new Windows Mobile 6 client software for mobile devices.

Now it’s possible to flag messages for follow-up, read email in HTML format, set Out of Office status, and use a whole ton of other functional enhancements which build on the same infrastructure described above. There’s no subscription to an external service required, and no additional servers or other software – reducing the cost of acquisition and deployment, and (potentially) the TCO. Analyst firm Wipro published some research, updated in June 2007, looking into TCO for mobile device platforms, in which they conclude that Windows Mobile 5 and Exchange Activesync would be 20-28% lower in cost (over 3 years) than an equivalent Blackberry infrastructure.

NET: Continuing improvements in Exchange 2007 and Windows Mobile 6 will further enhance the user experience of mobile access to mail, calendar, contacts & tasks. Overall costs of ownership may be significantly lower than alternative mobile infrastructures, especially since the Microsoft server requirements may already be in place to service Outlook Anywhere and Outlook Web Access.

A last word on security

Of course, if you’re going to publish an Exchange server – which sits on your internal network, and has access to your internal Active Directory – to the outside world, you’ll need to make sure you follow good security practice. You probably don’t want inbound connections from what are (at the outset) anonymous clients coming through your firewall and connecting straight to Exchange – for one thing, they’ll have come through the firewall within an encrypted SSL session (the S part of HTTPS), and since you don’t yet know who the end user is, an outsider could be using that connection as a way of mounting a denial of service attack or similar.

Microsoft’s ISA Server is a certified firewall which can be the endpoint for the inbound SSL session (so it decrypts that connection), can challenge the client to authenticate, and can inspect the session to check that what’s going on is a legitimate protocol (and not an attacker trying to flood your server with traffic). The “client” could be a PC running Outlook, a mobile device using Activesync, or a web browser trying to access Outlook Web Access. See this whitepaper for more information on publishing Exchange 2007 to the internet using ISA.

The Return of Exchange Unplugged

In late 2005, to prepare for Exchange 5.5 going out of support (and to help customers understand what was involved in moving up to Exchange 2003), we did a really well-received tour of the country arranged around the theme of “Exchange Unplugged”.

We all wore “tour T-shirts” (in fact, every attendee got one), and keeping with the theme, I even carried my acoustic guitar and provided musical accompaniment at the start of each session. The nearest I’ll ever get to being paid to play music, I don’t doubt.

Anyway: we’re doing it all again! With 8 “gigs” and session topics titled:

  • Warm up act & welcome
  • Architecture Acapella
  • Migration Medley
  • Email & Voicemail Duet
  • Mobility Manoeuvres in the Dark
  • Y.O.C.S. (that’s about Office Communication Server).

… it’s clearly no ordinary event. Come along and see Jason try to squeeze into the tour shirt without looking like Right Said Fred, or find out if the YOCS session is presented wearing a stick-on handlebar moustache and leather hat.

Dates:

The business case for Exchange 2007 – part III

This is a continuation of an occasional series of articles about how specific capabilities of Exchange 2007 can be mapped to business challenges. The other parts, and other related topics, can be found here.

GOAL: Lower the risk of being non-compliant

Now here’s a can of worms. What is “compliance”?

There are all sorts of industry- or geography-specific rules around both data retention and data destruction, and knowing which ones apply to you and what you should do about them is pretty much a black art for many organisations.

The US Sarbanes-Oxley Act of 2002 came about to make corporate governance and accounting information more robust, in the wake of various financial scandals (such as the collapse of Enron). Although SOX is a piece of US legislation, it applies not just to American companies, but to any foreign company which has a US stock market listing or is a subsidiary of a US parent.

The Securities and Exchange Commission defines a 7-year retention period for financial information, and for other associated information which forms part of the audit or review of that financial information. Arguably, any email or document which discusses a major issue for the company, even if it doesn’t make specific reference to the impact on corporate finances, could be required to be retained.

These requirements can understandably cause IT managers and CIOs to worry that they might not be compliant with whatever rules they are expected to follow, especially since the rules vary hugely in different parts of the world and, for any global company, can be highly confusing.

So, for anyone worried about being non-compliant, the first thing they’ll need to do is figure out what it would take for them to be compliant, and how they can measure up to that. This is far from an easy task, and a whole industry has sprung up to try to reassure the frazzled executive that if they buy this product/engage these consultants, then all will be well.

NET: Nobody can sell you out-of-the-box compliance solutions. They will sell you tools which can be used to implement a regime of compliance, but the trick is knowing what that looks like.

Now, Exchange can be used as part of the compliance toolset, in conjunction with whatever policies and processes the business has in place, to ensure appropriate data retention and to provide a proper discovery process that can prove that something either exists or does not.

There are a few things to look out for, though…

Keeping “everything” just delays the impact of the problem – it doesn’t solve it

I’ve seen so many companies implement archiving solutions where they just keep every document or every email message. I think this is storing up big trouble for the future: it might solve the immediate problem of ticking the box to say everything is archived, but management of that archive is going to become a problem further down the line.

Any reasonable retention policy will specify that documents or other pieces of information of a particular type or topic need to be kept for a period of time. They don’t say that every single piece of paper or electronic information must be kept.

NET: Keep everything you need to keep, decide (if you can) what isn’t required, and throw the rest away. See a previous post on using Managed Folders & policy to implement this on Exchange.
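For reference, the basic shape of that approach in the Exchange 2007 management shell looks something like this – a minimal sketch only, where the folder name, retention period, policy name and OU are all purely illustrative:

# Create a managed folder that will appear in users' mailboxes
New-ManagedFolder -Name "Keep 7 Years" -FolderName "Keep 7 Years"
# Retain items in it for roughly 7 years, then delete (but allow recovery via the dumpster)
New-ManagedContentSettings -Name "Retain 7 Years" -FolderName "Keep 7 Years" `
    -MessageClass * -RetentionEnabled $true `
    -AgeLimitForRetention "2555.00:00:00" -RetentionAction DeleteAndAllowRecovery
# Bundle it into a policy and apply it to a group of mailboxes
New-ManagedFolderMailboxPolicy -Name "Finance Retention" -ManagedFolderLinks "Keep 7 Years"
Get-Mailbox -OrganizationalUnit "Finance" | Set-Mailbox -ManagedFolderMailboxPolicy "Finance Retention"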

Knowing where the data is kept is the only way you’ll be able to find it again

It seems obvious, but if you’re going to get to the point where you need to retain information, you’d better know where it’s kept – otherwise you’ll never be able to prove that the information was indeed retained (or, sometimes even more importantly, prove that the information doesn’t exist… even if it maybe did at one time).

From an email perspective, this means not keeping data squirreled away on the hard disks of users’ PCs, or in the form of email archives which can only be opened via a laborious and time consuming process.

NET: PST files on users’ PCs or on network shares are bad news for any compliance regime. See my previous related post on the mailbox quota paradox of thrift.

Exchange 2007 introduced a powerful search capability which allows end users to run searches against everything in their mailbox, be it from Outlook, a web client, or even a mobile device. The search technology makes it so easy for an individual to find emails and other content that a lot of people have pretty much stopped filing emails and just let them pile up, knowing they can find the content again, quickly.

The same search technology offers an administrator (and this would likely not be the email admins: more likely a security officer or director of compliance) the ability to search across mailboxes for specific content, carrying out a discovery process.

Outsourcing the problem could be a solution

Here’s something that might be of interest, even if you’re not running Exchange 2007 – having someone else store your compliance archive for you. Microsoft’s Exchange Hosted Services came about as part of the company’s acquisition of Frontbridge a few years ago.

Much attention has been paid to the Hosted Filtering service, where all inbound mail for your organisation is delivered first to the EHS datacentre, scanned for potentially malicious content, then the clean stuff is delivered down to your own mail systems.

Hosted Archive is a companion technology which runs on top of the filtering: since all inbound (and outbound) email is routed through the EHS datacentre, it’s a good place to keep a long-term archive of it. And if you add journaling into the mix (where every message internal to your Exchange world is also copied up to the EHS datacentre), then you could tick the box of having kept a copy of all your mail, without really having to do much. Once you’ve got the filtering up & running anyway, enabling archiving is a phone call away and all you need to know at your end is how to enable journaling.

NET: Using hosted filtering reduces the risk of inbound malicious email infecting your systems, and of you spreading infected email to other external parties. Hosting your archive in the same place makes a lot of sense, and is a snap to set up.

Exchange 2007 does add a little to this mix, though, in the shape of per-user journaling. In this instance, you could decide you don’t need to archive every email from every user, but only those belonging to certain roles or levels of employee (eg HR and legal departments, plus board members & executives).
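If you go down that route, the rule itself is a one-liner in the Exchange Management Shell – something along these lines (the names and addresses are examples, and this “premium” per-recipient journaling needs the Exchange Enterprise CAL):

New-JournalRule -Name "Journal the board" -Recipient "board@contoso.com" `
    -JournalEmailAddress "archive@hostedarchive.example.com" -Scope Global -Enabled $true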

Now, using Hosted Archive does go against what I said earlier about keeping everything – except that in this instance, you don’t need to worry about how to do the keeping… that’s someone else’s problem…

Further information on using Exchange in a compliance regime can be seen in a series of video demos, whitepapers and case studies at the Compliance with Exchange 2007 page on Microsoft.com.

Keep the Item count in your mailbox low!

I’ve been doing a little digging today, following a query from a partner company who’re helping out one of their customers with some performance problems on Exchange. Said customer is running Exchange 2000, and has some frankly amazing statistics…

… 1000 or so mailboxes, some of which run to over 20Gb in size, with an average size of nearly 3Gb. To make matters even worse, some users have very large numbers of items in their mailbox folders – 60,000 or more. Oh, and all the users are running Outlook in Online mode (ie not cached).

Now, seasoned Exchange professionals the world over would either shrug and say that these kinds of horror stories are second nature to them, or faint at the thought of this one – but it’s not really obvious to the average IT admin *why* this kind of story is bad news.

When I used to work for the Exchange product group (back when I could say I was still moderately technical), I posted on the Exchange Team blog (How does your Exchange garden grow?) with some scary stories about how people unknowingly abused their Exchange systems (like the CEO of a company who had a nice clean inbox with 37 items, totalling just over 100kb in size… but a Deleted Items folder that was 7.4Gb in size with nearly 150,000 items).

Just like it’s easy to get sucked into looking at disk size/capacity when planning big Exchange deployments (in reality, it’s IO performance that counts more than storage space), it’s easy to blame big mailboxes for bad performance when in fact it could be too many items causing the trouble.

So what’s too many?

Nicole Allen posted a while back on the EHLO! blog, recommending a maximum of 2,500-5,000 items in the “critical path” folders (Calendar, Contacts, Inbox, Sent Items), and ideally keeping the Inbox to fewer than 1,000 items. Some more detail on the reasoning behind this comes from the Optimizing Storage for Exchange 2003 whitepaper…

Number of Items in a Folder

As the number of items in the core Exchange 2003 folders increase, the physical disk cost to perform some tasks will also increase for users of Outlook in online mode. Indexes and searches are performed on the client when using Outlook in cached mode. Sorting your Inbox by size for the first time requires the creation of a new index, which will require many disk I/Os. Future sorts of the Inbox by size will be very inexpensive. There is a static number of indexes that you can have, so folks that often sort their folders in many different ways could exceed this limit and cause additional disk I/O.

One potentially important point here is that any folder is going to take longer to process as it fills up with items. Sorting or any other view-related activity will take longer, and even retrieving items out of the folder will slow down (and hammer the server at the same time).
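If you want to see who’s heading for trouble, here’s a quick sketch against an Exchange 2007 mailbox (the identity is just an example) which lists the ten folders with the most items in them:

Get-MailboxFolderStatistics -Identity "ewand" |
    Sort-Object ItemsInFolder -Descending |
    Select-Object -First 10 Name, ItemsInFolder, FolderSize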

Oh, and be careful with archiving systems which leave stubs behind too – you might have reduced the mailbox size, but performance could still be negatively affected if the folders have lots of items left.

Plain text, RTF or HTML mail?

Here’s an interesting question that I was asked earlier today; I can’t offer a definitive answer, but these are my thoughts. If you have any contradictory or complementary comments, please comment or let me know.

“Can RTF/HTML Mail be as safe as plain text with regard to viruses/malware etc?”

Theoretically, I think plain text will always be safer since there’s less work for the server to do, and there’s no encoding of the content other than the real basics of wrapping up the envelope of the message (eg taking the various to/from/subject fields, encapsulating the blurb of body text, and turning it into an SMTP-formatted message).

Where things could get interesting is that plain text still allows for encoding of attachments (using, say, MIME or UUENCODE), which could still be infected or badly formed – so the risk level of attachments is technically the same (although in an RTF or HTML mail, the attachment can be inline with the text, which might mean the user is more likely to be lured into opening it, if it’s malicious).
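To make that point concrete, here’s a little sketch using the .NET System.Net.Mail classes from PowerShell (addresses, server name and file path are all made up) – the attachment gets MIME-encoded and carried whether the body is plain text or HTML:

$msg = New-Object System.Net.Mail.MailMessage -ArgumentList "alice@contoso.com", "bob@contoso.com", "Monthly report", "Report attached."
$msg.IsBodyHtml = $false    # flip to $true for an HTML body; the attachment risk is unchanged
$msg.Attachments.Add((New-Object System.Net.Mail.Attachment("C:\temp\report.xls")))
# (New-Object System.Net.Mail.SmtpClient("smtp.contoso.com")).Send($msg)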

There may be some risks from a server perspective in handling HTML mail which mean that a badly formed message might be used to stage a denial of service on the server itself. I heard tell of a case a few years ago when a national newsletter was sent out with a badly formed HTML section, and when the Exchange server was processing the mail to receive it, the store process crashed (bringing Exchange to its knees in an instant).

The downsides with that scenario were:

  • The message was still in the inbound queue, so when the store came back online, it started processing the message again and <boom>
  • This newsletter was sent to thousands of people, meaning that any company that had at least one person receiving that mail, had some instant server-down until they identified the offending message and fished it out of the queue.

This bug in Exchange was identified & fixed, but there’s always the theoretical possibility that since the formatting of an HTML message is more complex, there could be glitches in handling the message (in any email system).

Plain text mail is ugly and so lowest-common-denominator that it’d be like telling everyone to save their documents as .TXT rather than .DOC or .PDF.

RTF mail works OK internally, but doesn’t always traverse gateways between Exchange systems, and isn’t supported by anything other than Outlook (ie if you mail a user on Domino, they won’t see the rich text).

HTML mail may be slightly larger (encoding the same content as you would in RTF takes a bit more space), but it’s much more compatible with other clients & servers, offers much better control of layout, and traverses other email systems more smoothly.

I’d say HTML mail is the obvious way to go. Anyone disagree?

Living the dream with Office Communicator 2007

I’ve been a long-time fan of instant messaging and pervasive “presence”, especially the cultural changes it allows organisations to make in order to communicate and collaborate better. As a result, I’ve been really interested to see what’s been happening with Office Communications Server (the soon-to-be-released successor to Live Communications Server).

Around 6 weeks ago, I joined an internal MS deployment of full-voice OCS, meaning that my phone number was moved onto the OCS platform so now I’m not using the PBX at all. It’s been a remarkably cool experience in a whole lot of ways, but it really hits home just how different the true UC world might be, when you start to use it in anger.

I’ve been working from home today, and because my laptop is on the internet (regardless of whether I’m VPNed into the company network), the OCS server will route calls to my PC and simultaneously to my mobile, so I can pick them up wherever I am. As more and more people are using OCS internally, it’s increasingly the norm to just hit the “Call” button from within Office Communicator (the OCS client) or from Outlook, and not really care which number is going to be called.

(photo: brettjo on a Catalina)

Here, I was having a chat with Brett and since we both have video cameras, I just made a video call – I was at home so just talked to the laptop in a speakerphone type mode, Brett was in the office so used his wired phone, which was plugged into the PC:

(this device is known internally as a “Catalina” and functions mainly as a USB speaker/microphone, but also has some additional capabilities like a message waiting light, a few hard-buttons, and a status light that shows the presence as currently set on OCS).

It’s a bit weird when you start using the phone and realise that you’re not actually going near a traditional PBX environment for a lot of the interaction. Calling up voice mail, as delivered by Exchange Unified Messaging, is as easy as pressing the “call voice mail” button in Communicator – no need to provide a PIN or an extension number, since the system already knows who I am and I’ve already authenticated by logging in to the PC.

When I use this, the “call” goes from my PC to OCS, then from the OCS server directly to the Exchange server, all as an IP data stream and without touching the traditional TDM PBX that we still have here. A third party voice gateway allows for me to use OCS to call other internal people who are still homed on the PBX system, and to make outbound calls.

Microsoft’s voice strategy of “VoIP As You Are” starts to make a lot of sense in this environment – I could deploy technology like OCS and Exchange UM and start getting immediate benefit, without needing to rip & replace the traditional phone system, at least not until it’s ready for obsolescence.

Here’s an idea of what kind of system is in place – for more information, check out Paul Duffy’s interview with ZDNet’s David Berlind.

The business case for Exchange 2007 – part II

(This is a follow-on to the previous post on measuring business impact and the first post on the business case for Exchange 2007, and these are my own thoughts on the case for moving to Exchange 2007. It’s part of a series of posts which I’m trying to keep succinct, though they tend to be a bit longer than usual. If you find them useful, please let me know…)

GOAL: Reduce the backup burden

Now I’m going to start by putting on the misty rose-tinted specs and thinking back to the good old days of Exchange 4.0/5.x. When server memory was measured in megabytes and hard disk capacity in the low Gbs, performance bottlenecks were hit far sooner than they are today.

Lots of people deployed Exchange servers with their own idea of how many users they would “fit” onto each box – in some cases, it would be the whole organisation; in others, it would be as many users as that physical site had (since good practice was then to deploy a server at every major location); some would be determined by how many mailboxes that server could handle before it ran out of puff. As wide area networks got faster, more reliable and less expensive, and as server hardware got better and cheaper, the bottleneck for lots of organisations stopped being how many users the server could handle, and became how many users IT was comfortable having the server handle.

On closer inspection, this “comfort” level would typically come about for 2 reasons:

  • Spread the active workload – If the server goes down (either planned or unplanned), I only want it to affect a percentage of the users rather than everyone. This way, I’d maybe have 2 medium-sized servers and put 250 users on each, rather than 500 users on one big server.
  • Time to Recovery is lower – If I had to recover the server because of a disaster, I only have so many hours (as the SLA might state) to get everything back up and running, and it will take too long to restore that much data from tape. If I split the users across multiple servers, then the likelihood of a disaster affecting more than one server may be lower, and,  in the event of total site failure, the recovery of multiple servers can at least be done in parallel.

(Of course, there were other reasons, initially – maybe people didn’t believe the servers would handle the load, so played safe and deployed more than they really needed… or third party software, like Blackberry Enterprise Server, might have added extra load so they’d need to split the population across more servers).

So the ultimate bottleneck is the time it takes for a single database or single server’s data to be brought back online in the event of total failure. How long you’re allowed is often referred to in mumbo-jumbo whitepaper speak as the “RTO” or Recovery Time Objective, and whether you can meet it is largely a function of how fast the backup media is (older DAT-type tape backup systems might struggle to do 10Gb/hr, whereas a straight-to-disk backup might do 10 or 20 times that rate). If you’ve only got 6 hours before you need to have the data back online, and you can recover data from your backup media at 20Gb/hr, then you could only afford to have a maximum of 120Gb to recover and still have a hope of meeting the SLA.
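The arithmetic is trivial, but worth keeping somewhere reusable – a sketch using the illustrative figures above (they’re not recommendations):

$rtoHours    = 6     # time allowed before the data must be back online (the SLA)
$restoreRate = 20    # Gb per hour your backup media can actually restore at
"Maximum data you can afford to restore in time: {0} Gb" -f ($rtoHours * $restoreRate)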

There are a few things that can be done to mitigate this requirement:

  • Agree a more forgiving RTO.
  • Accept a lower RPO (Recovery Point Objective is, in essence, the state you need to get back to – eg have all the data back up and running, or possibly have service restored but with no historical data, such as with dial-tone recovery in Exchange).
  • Reduce the volume of data which will need to be recovered in series – by separating out into multiple databases per server, or by having multiple servers.

Set realistic expectations

Now, it might sound like a non-starter to say that the RTO should be longer, or the RPO less functional – after all, the whole point of backup & disaster recovery is to carry on running even when bad stuff happens, right?

It’s important to think about why data is being backed up in the first place: it’s a similar argument to using clustering for high availability. You need to really know whether you’re looking for availability or recoverability. Availability means that you can maintain a higher level of service, by continuing to serve users even when a physical server or other piece of infrastructure is no longer available, for whatever reason. Recoverability, on the other hand, is the ease and speed with which service and/or data can be brought back online following a more severe failure.

I’ve spoken with lots of customers over the years who think they want clustering, but in reality they don’t know how to operate a single server in a well-managed and controlled fashion, so adding clusters would make things less reliable, not more. I’ve also spoken with customers who think they need site resilience, so if they lose their entire datacenter, they can carry on running from a backup site.

Since all but the largest organisations tend to run their datacenters in the same place as their users (whether that “datacenter” is a cupboard under the stairs or the whole basement of their head office), in the event that the entire datacenter is wiped out, it’s quite likely that they’ll have lots of other things to worry about – like where are the users going to sit? How is the helpdesk going to function, and communicate effectively with all those now-stranded users? What about all the other, really mission-critical applications? Is email really as important as the sales order processing system, or the customer-facing call centre?

In many cases, I think it is acceptable to have a recovery point objective of delivering, within a reasonable time, a service that will enable users to find each other and to send & receive mail. I don’t believe it’s always worth the effort and expense that would be required to bring all the users’ email online at the same time – I’d rather see mail service restored within an hour, even if it takes 5 days for the historical data to come back, than wait 8 hours for restoring any kind of service which included all the old data.

How much data to fit on each server in the first place

Microsoft’s best practice advice has been to limit the size of each Exchange database to 50Gb (in Exchange 2003), to make the backup & recovery process more manageable. If you built Exchange 2003 servers with the maximum number of databases, this would set the size “limit” of each server to 1Tb of data. In Exchange 2007, this advisory “limit” has been raised to 100Gb maximum per database, unless the server is replicating the data elsewhere (using the Continuous Replication technology), in which case it’s 200Gb per database. Oh, and Exchange 2007 raises the total number of databases to 50, so in theory, each server could now support 10Tb of data and still be recoverable within a reasonable time.
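If you want to see how your own databases measure up against those guidelines, something along these lines should give a rough answer – run it in the Exchange Management Shell on the mailbox server itself, since the .edb path is local (the server name is an example):

Get-MailboxDatabase -Server "MBX01" | ForEach-Object {
    $edb = Get-Item "$($_.EdbFilePath)"    # the path to each database's .edb file
    "{0,-40} {1,8:N1} Gb" -f $_.Name, ($edb.Length / 1GB)
}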

The total amount of data that can be accommodated on a single server is often used to make a decision about how many mailboxes to host there, and how big they should be – it’s pretty common to see sizes limited to 200Mb or thereabouts, though it does vary hugely (see the post on the Exchange Team blog from a couple of years ago to get a flavour). Exchange 2007 now defaults to having a mailbox quota of 10 times that size: 2Gb, made possible through some fundamental changes to the way Exchange handles and stores data.

Much of this storage efficiency derives from Exchange 2007 running on 64-bit (x64) servers, meaning there’s potentially a lot more memory available for the server to cache disk contents in. A busy Exchange 2003 server (with, say, 4,000 users) might only have enough memory to cache 250Kb of data for each user – probably not even enough for caching the index of the user’s mailbox, let alone any of the data. In Exchange 2007, the standard recommendation would be to size the server so as to have 5Mb or even 10Mb of memory for every user, resulting in dramatically more efficient use of the storage subsystem. This means that the storage subsystem’s I/O throughput – traditionally the performance bottleneck on Exchange – becomes much less of a constraint.
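As a back-of-the-envelope illustration of that cache arithmetic (the RAM figures are illustrative only):

$users = 4000
"Exchange 2003 (~1Gb of usable cache): {0:N0} Kb per user" -f ((1GB / $users) / 1KB)
"Exchange 2007 (~32Gb of RAM):         {0:N1} Mb per user" -f ((32GB / $users) / 1MB)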

NET: Improvements in the underlying storage technology within Exchange 2007 mean that it is feasible to store a lot more data on each server, without performance suffering and without falling foul of your RTO/SLA goals.

I’ve posted before about Sizing Exchange 2007 environments.

What to back up and how?

When looking at backup and recovery strategies, it’s important to consider exactly what is being backed up, how often, and why.

Arguably, if you have a 2nd or 3rd online (or near-online) copy of a piece of data, then it’s less important to back it up in a more traditional fashion, since the primary point of recovery will be another of the online copies. The payoff for this approach is that it no longer matters as much if it takes a whole weekend to complete writing the backup to whatever medium you’re using (assuming some optical or magnetic media is still in play, of course), and that slower backup is likely to be used only for long-term archival or for recovery in a true catastrophe when all replicas of the data are gone.

Many organisations have sought to reduce the volume of data on Exchange for the purposes of meeting their SLAs, or because keeping large volumes of data on Exchange was traditionally more expensive due to the requirements for high-speed (and often shared) storage. With the extra memory available in a 64-bit Exchange server, the hit on I/O performance can be much lower, meaning that a 2007 server could host more data on the same set of disks than an equivalent 2003 server would (working on the assumption that Exchange has historically hit disk I/O throughput bottlenecks before running out of disk space). The simplest way to reduce the volume of data stored on Exchange (and therefore data which needs to be backed up and recovered on Exchange) is to reduce the mailbox quota of the end users.

In the post Exchange mailbox quotas and ‘a paradox of thrift’, I talked about the downside of trying too hard to reduce mailbox sizes – the temptation is for users to stuff everything into a PST file, which then gets backed up (or risks being lost!) outside of Exchange. Maybe it’s better to invest in keeping more data online on Exchange, where it’s always accessible from any client – unlike some archiving systems which require client-side software (rendering the data inaccessible to non-Outlook clients), don’t replicate it to users’ PCs when running in Cached Mode, and don’t have it indexed for easy retrieval by either the Exchange server or the client PC.

NET: Taking data off Exchange and into either users’ PST archive files or a centralised archiving system may reduce the utility of the information by making it less easy to find and access, and could introduce more complex data management procedures as well as potential additional costs of ownership.

Coming to a datacenter near you

An interesting piece of “sleeper” technology may help reduce the discussions of backup technique: known simply as DPM, or System Center Data Protection Manager to give it its full title. DPM has been available for a while, targeted at backing up and restoring file server data, but the second release (DPM 2007) is due soon, and adds support for Exchange (as well as Sharepoint and SQL databases). In essence, DPM is an application which runs on Windows Server and is used to manage snapshots of the data source(s) it’s been assigned to protect. The server will happily take snapshots at regular intervals and can keep them in a near-line state, or spool them off to offline (ie tape) storage for long-term retention.


With very low cost but high-capacity disks (such as Serial-Attached SCSI arrays or even SATA disks deployed in fault-tolerant configurations), it could be possible to have DPM servers capable of backing up many Tbs of data as the first or second line of backup, before spooling off to tapes on an occasional basis for offsite storage. A lot of this technology has been around in some form for years (with storage vendors typically having their own proprietary mechanisms to create & manage the snapshots), but with a combination of Windows’ Volume Shadowcopy Services (VSS), Exchange’s support for VSS, and DPM’s provision of the back-end to the whole process, the cost of entry could be significantly lower.

NET: Keeping online snapshots of important systems doesn’t need to be as expensive as in the past, and can provide a better RTO and RPO than alternatives.

So, it’s important to think about how you back up and restore the Exchange servers in your organisation, but by using Exchange 2007, you could give the users a lot more quota than they’ve had before. Using Managed Folders in Exchange, you could cajole the users into clearing out the stuff they don’t need to keep, and more easily keeping the stuff they do. All the while, it’s now possible to make sure the data is backed up quickly and at much lower cost than would previously have been possible with such volumes of data.

Exchange mailbox quotas and a ‘paradox of thrift’

The study of economics throws up some fantastic names for concepts or economic models, some of which have become part of the standard lexicon, such as the Law of Diminishing Returns, or the concept of opportunity cost, which I’ve written about before.


Though it sounds like it might be something out of Doctor Who, The Paradox of Thrift is a Keynesian concept which basically says that, contrary to what might seem obvious, saving money (as in people putting money into savings accounts) might be bad for the economy (in essence, if people saved more and spent or invested less, it would reduce the amount of money in circulation and cause an economic system to deflate). There’s a similar paradox in managing mailbox sizes in Exchange – from an IT perspective it seems like a good thing to reduce the total volume of mail on the server, since it costs less to manage all the disks and there’s less to back up and restore.


Ask the end users, however, and it’s probably a different story. I’ve lost count of how many times I’ve heard people grumble that they can’t send email because their mailbox has filled up (especially if they’ve been away from the office). End users might argue they just don’t have time to keep their mailbox size low through carefully ditching mail that they don’t need to keep, and filing the stuff that they do.



I guess it’s like another principle in economics – the idea that we have unlimited wants, but a limited set of resources with which to fulfil those wants & needs. The whole point of economics is to make best use of these limited resources to best satisfy the unlimited wants. Many people (with a few exceptions) would agree that they never have enough money – there’ll always be other, more expensive ways to get rid of it.


It’s important to have a sensible mailbox quota or the paradox of being too stingy may come back and bite you. Some organisations will take mail off their Exchange servers and drop it into a central archive, an approach which solves the problem somewhat but introduces an overhead of managing that archive (not to mention the cost of procurement). I’d argue that it’s better to use Managed Folders facilities in Exchange to manage the data.


The true paradox of mailbox quota thrift kicks in when the users have to archive everything to PST files: then you’ve just got the problem of how to make sure that’s backed up… especially since it’s not supported to have them stored on a network drive (though that doesn’t stop people from doing it… Personal folder files are unsupported over a LAN or over a WAN link). Even worse (from a backup perspective), Outlook opens all the PST files configured in its profile for read/write. This means that every one of the PST files in your Outlook profile gets its date/time stamp updated every time you run Outlook.


This of course means that if you’re storing your PSTs on a network share (tsk, tsk), and that file share is being backed up every night (as many are), then your PSTs will be backed up every night, regardless of whether the job is incremental/differential or full. I’ve seen large customers (eg a 100,000+ user bank) who estimate that over 50% of the total data they back up, every day, is PST files. Since PSTs are used as archives by most people, by definition the contents don’t change much, but that’s irrelevant – the date/time stamp is still updated every time they’re opened.


So as well as losing the single-instance storage benefit you’d get by leaving the data in Exchange (or getting the users to delete it properly), you’re consuming possibly massive amounts of disk space on file servers, and having to deal with huge amounts of data being backed up every night, even though it doesn’t change.


If you had an Exchange server with 1,000 users and set the mailbox quota at 200Mb, you might end up with 75% quota usage; with a 10% single-instance saving, you’d have about 135Gb of data on that server, which would be backed up in full every week, with incremental or differential backups every night in between (and those will be a good bit smaller, since not all that much data changes day to day).


If each of those users had 1Gb of PST files (not at all extraordinary – I currently have nearly 15Gb of PSTs loaded into Outlook, even with a 2Gb quota on the mailbox which is only 30% full), then you could be adding 1Tb of data to the file servers, hurting LAN performance by having those PSTs locked open over the network, and backing it all up every day… Give those users a 2Gb mailbox quota and stop them from using PSTs altogether, and they’d be putting 1.2Tb worth of data onto Exchange – which might be more expensive to keep online than 1Tb+ of dumb filestore, but it’s being backed up more appropriately and can be controlled much better.
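Here’s the arithmetic from those two scenarios as a quick sketch (all figures are the illustrative ones above):

$users = 1000
$exchangeGb = $users * 200 * 0.75 * 0.9 / 1000    # 200Mb quota, 75% used, ~10% single-instance saving
"Data held on Exchange: {0:N0} Gb - full backup weekly, incrementals nightly" -f $exchangeGb
$pstGb = $users * 1                               # ~1Gb of PSTs per user sitting on file shares
"PST data on file servers: {0:N0} Gb - backed up in full every night" -f $pstGb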


So: don’t be miserly with your users’ mailbox quotas. Or be miserly, and stop them from using PSTs altogether (in Outlook 2003) or stop the PSTs from getting any bigger (in Outlook 2007).
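For the “don’t be miserly” option, raising quotas on Exchange 2007 is a one-liner in the management shell – a hedged example, where the database name and quota values are purely illustrative:

Get-Mailbox -Database "MBX01\SG1\DB1" | Set-Mailbox -UseDatabaseQuotaDefaults $false `
    -IssueWarningQuota "1.7GB" -ProhibitSendQuota "1.9GB" -ProhibitSendReceiveQuota "2GB"
# Stopping PST use or growth is an Outlook-side group policy setting (eg DisablePST in Outlook 2003,
# PSTDisableGrow in Outlook 2007) rather than anything you configure on the Exchange server.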

The business case for Exchange 2007

(This is a follow-on to the previous post on measuring business impact, and these are my own thoughts on the case for moving to Exchange 2007.)

There are plenty of resources already published which talk about the top reasons to deploy, upgrade or migrate to Exchange 2007 – the top 10 reasons page would be a good place to start. I’d like to draw out some tangible benefits which are maybe less obvious than the headline-grabbing “reduce costs”/”make everyone’s life better” type reasons. I’ll approach these reasons over a number of posts, otherwise this blog will end up reading like a whitepaper (and nobody will read it…)

GOAL: Be more available at a realistic price

High availability is one of those aims which is often harder to achieve than it first appears. If you want a really highly-available system, you need to think hard not only about which bits need to be procured and deployed (eg clustered hardware and the appropriate software that works with it), but also about how the systems management and operations teams need to be structured so that they can actually deliver the promised availability. Also, a bit like disaster recovery, high availability is always easier to justify following an event where not having it is sorely missed… eg if a failure happens and knocks the systems out of production for a while, it’ll be easier to go cap-in-hand to the budget holder and ask for more money to stop it happening again.

Example: Mrs Dalton runs her own business, and like many SMBs, money was tight when the company was starting up – to the extent that they used hand-me-down PC hardware to run their main file/print/mail server. I had always said that this needed to be temporary only, and that they really should buy something better, but it was always something that was going to happen in the future.

Since I do all the IT in the business (and I don’t claim to do it well – only well enough that it stops being a burden for me… another characteristic of small businesses, I think), and Mrs D is the 1st line support for anyone in the office if/when things go wrong, it can be a house of cards if we’re both away. A year or two after they started, the (temporary) server blew its power supply whilst we were abroad on holiday, meaning there were no IT services at all – no internal network or internet access (since the DHCP server was now offline), which ultimately meant no networked printers, no file shares with all the client docs, no mail (obviously) – basically everything stopped.

A local PC repair company was called in and managed to replace the PSU and restore working order (at a predictably high degree of expense), restoring normal service after 2 days of almost complete downtime.

Guess what? When we got back, the order went in for a nice shiny server with redundant PSU, redundant disks etc etc. No more questions asked…

Now a historical approach to making Exchange highly available would be to cluster the servers – something I’ve talked about previously in a Clustering & High Availability post.

The principal downside to the traditional Exchange 2003-style cluster (now known as a Single Copy Cluster) was that it required a Storage Area Network (at least if you wanted more than 2 nodes), which could be expensive compared to the kind of high-capacity local disk drives that might be the choice for a stand-alone server. Managing a SAN can be a costly and complex activity, especially if all you want to do with it is to use it with Exchange.

Also, with the Single-Copy model, there’s still a single point of failure – if the data on the SAN got corrupted (or worst case, the SAN itself goes boom), then everything is lost and you have to go back to the last backup, which could have been hours or even days old.

NET: Clustering Exchange, in the traditional sense, can help you deliver a better quality of service. Downtime through routine maintenance is reduced and fault tolerance of servers is automatically provided (to a point).

Now accepting that a single copy cluster (SCC) solution might be fine for reducing downtime due to more minor hardware failure or for managing the service uptime during routine maintenance, it doesn’t provide a true disaster-tolerant solution. Tragic events like the Sept 11th attacks, or the public transport bombs in cities such as London and Madrid, made a lot of organisations take the threat of total loss of their service more seriously … meaning more started looking at meaningful ways of providing a lights-out disaster recovery datacenter. In some industries, this is even a regulatory requirement.

Replication, Replication, Replication

Thinking about true site-tolerant DR just makes everything more complex by multiples – in the SCC environment, the only supported way to replicate data to the DR site is to do it synchronously: the Exchange servers in site A write data to their SAN, which replicates that write to the SAN in site B, which acknowledges that it has received that data, all before the SAN in site A can acknowledge to the servers that the data has successfully been written. All this adds huge latency to the process, and can consume large amounts of high-speed bandwidth, not to mention requiring duplication of hardware and typically expensive software (to manage the replication) on both sides.

If you plan to shortcut this approach and use some other piece of replication software (which is installed on the Exchange servers at both ends) to manage the process, be careful – there are some clear supportability boundaries which you need to be aware of. Ask yourself – is taking a short cut to save money in a high availability solution, just a false economy? Check out the Deployment Guidelines for multi-site replication in Exchange 2003.

There are other approaches which could be relevant to you for site-loss resilience. In most cases, were you to completely lose a site (for a period of time measured at least in days and possibly indefinitely), there will be other applications which need to be brought online more quickly than your email system – critical business systems on which your organisation depends. Also, if you lost a site entirely, there are the logistics of working out where all the people are going to go – work from home? Sit in temporary offices?

One practical solution here is to use something in Exchange 2003 or 2007 called Dial-tone recovery. In essence, it’s a way of bringing up Exchange service at a remote location without having immediate access to all the Exchange data. So your users can at least log in and receive mail, and be able to use email to communicate during the time of adjustment, with the premise that at some point in the near future (once all the other important systems are up & running), their previous Exchange mailbox data will be brought back online and they can access it again. Maybe that data isn’t going to be complete, though – it could be simply a copy of the last night’s backup which can be restored onto the servers at the secondary site.

Using Dial-tone (and an associated model called Standby clustering, where manual activation of standby servers in a secondary datacenter can bring service – and maybe data – online) can provide you with a way of keeping service availability high (albeit with a temporary lowering of quality, since all the historic data isn’t there) at a time when you might really need that service (ie in a true disaster). Both of these approaches can be achieved without the complexity and expense of sharing disk storage, and without having to replicate the data in real-time to a secondary location.

Exchange 2007 can help you solve this problem, out of the box

Exchange 2007 introduced a new model called Cluster Continuous Replication (CCR), which provides a near-real-time replication process (see the Cluster Continuous Replication Architecture diagram). CCR is modelled in such a way that you have a pair of Exchange mailbox servers (and they can only be running the mailbox role, meaning you’re going to need other servers to take care of servicing web clients, performing mail delivery etc), with one of the servers “active” at any time. CCR takes care of keeping the copy of the data up to date, and provides the mechanism to automatically (or manually) fail over between the two nodes, and the two copies of the data.

What’s perhaps most significant about CCR (apart from the fact that it’s in the box and therefore fully supported by Microsoft) is that there is no longer a requirement for the cluster nodes to access shared disk resources… meaning you don’t need a SAN (now, you may still have reasons for wanting a SAN, but it’s just not a requirement any more).

NET: Cluster Continuous Replication in Exchange 2007 can deliver a 2-node shared-nothing cluster architecture, where total failure of all components on one side can be automatically dealt with. Since there’s no requirement to share disk resources between the nodes, it may be possible to use high-speed, dedicated disks for each node, reducing the cost of procurement and the cost & complexity of managing the storage.

Exchange 2007 also offers Local Continuous Replication (LCR), designed for stand-alone servers to keep 2 copies of their databases on different sets of disks. LCR could be used to provide a low-cost way of keeping a copy of the data in a different place, ready to be brought online through a manual process. It is only applicable in a disaster recovery scenario, since it will not offer any form of failover in the event of a server failure or planned downtime.
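For completeness, keeping an eye on a CCR or LCR copy (and manually activating an LCR copy in a disaster) is done from the management shell – a hedged sketch, with illustrative identities:

Get-StorageGroupCopyStatus -Identity "MBX01\First Storage Group" |
    Format-List CopyQueueLength, ReplayQueueLength
# In a disaster, an LCR copy is activated by hand, along the lines of:
# Restore-StorageGroupCopy -Identity "MBX01\First Storage Group" -ReplaceLocations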

Standby Continuous Replication (SCR) is the name given to another component of Exchange 2007, due to be part of the next service pack. This will provide a means to have standby, manually-activated, servers at a remote location, which receive a replica of data from a primary site, but without requiring the servers to be clustered. SCR could be used in conjunction with CCR, so a cluster which provides high availability at one location could also send a 3rd replica of its data to a remote site, to be used in case of total failure of the primary site.

 The key point is “reasonable price”

In summary, then: reducing downtime in your Exchange environment through clustering presents some challenges.

  • If you only have one site, you can cluster servers to share disk storage and get a higher level of service availability (assuming you have the skills to manage the cluster properly). To do this, you’ll need some form of storage area network or iSCSI NAS appliance.
  • If you need to provide site-loss resilience (either temporary but major, such as a complete power loss, or catastrophic, such as total loss of the site), there are 3rd-party software-based replication approaches which may be effective, but are not supported by Microsoft. Although these solutions may work well, you will need to factor in the possible additional risk of a more complex support arrangement. The time you least want to be struggling to find out who can and should be helping you get through a problem, is when you’ve had a site loss and are desperately trying to restore service.
  • Fully supported site-loss resilience with Exchange 2003 can only be achieved by replicating data at a storage subsystem level – in essence, you have servers and SANs at both sites, and the SANs take care of real-time, synchronous, replication of the data between the sites. This can be expensive to procure (with proprietary replication technology not to mention high speed, low latency network to connect the sites – typically dark fibre), and complex to manage.
  • There are manual approaches which can be used to provide a level of service at a secondary site, without requiring 3rd party software or hardware solutions – but these approaches are designed to be used for true disaster recovery, not necessarily appropriate for short-term outages such as temporary power failure or server hardware failure.
  • The Cluster Continuous Replication approach in Exchange 2007 can be used to deliver a highly-available cluster in one site, or can be spanned across sites (subject to network capacity etc) to provide high-availability for server maintenance, and a degree of protection against total site failure of either location.

NET: The 3 different replication models which are integral to Exchange 2007 (LCR, CCR and SCR) can help satisfy an organisation’s requirements to provide a highly-available, and disaster-tolerant, enterprise messaging system. This can be achieved without requiring proprietary and expensive 3rd party software and/or hardware solutions, compared with what would be required to deliver the same service using Exchange 2003.

 

Topics to come in the next installments of the business case for Exchange 2007 include:

  • Lower the risk of being non-compliant
  • Reduce the backup burden
  • Make flexible working easier

Outlook 2007 signatures location

Following my post about .sig files, I had cause to dig around looking for where Outlook actually puts the Signature files. I came across a post which Allister wrote a little while ago, but it’s such a useful little tip that it’s worth repeating…

Basically, Outlook 2007 offers a nice simple editor UI for building your own signatures; however, it’s complicated by the need to present the signature in HTML (the default format for mail now), Rich Text Format (aka RTF, the original Exchange/Outlook format dating back to the mid 90s) and plain old Text (with total loss of colour, formatting etc).


Outlook actually generates 3 versions of the sig, to be used for the different formats. In some cases – notably with plain text – the end result following the conversion isn’t quite what you’d want… my nicely formatted .sig above comes out a bit mangled, as

Ewan Dalton |     | Microsoft UK | ewand@microsoft.com |+44 118 909 3318 Microsoft Limited | Registered in England | No 1624297 | Thames Valley Park, Reading RG6 1WG

so it may be necessary to do a bit of tweaking to the formats. Do bear in mind that if you edit the sig files directly, then go back into the menu in Outlook and make some other change, your original tweaks will be overwritten.

Anyway, you could find the signature files in something akin to:

[root]\Users\[username]\AppData\Roaming\Microsoft\Signatures

(there may not be a \Roaming folder, depending on how your environment is set up, or it may be in \Documents and Settings\ and under an Application Data folder depending on your version of Windows).
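If you’d rather not go clicking through folders, a one-liner from PowerShell will list the generated signature files (assuming the default Vista-era location under %APPDATA%):

Get-ChildItem (Join-Path $env:APPDATA "Microsoft\Signatures")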