Web Applications in the age of Azure

Increasingly, I’ve found myself turning to Microsoft’s Azure platform for new web projects. The convenience of on-demand infrastructure, combined with a usage-based payment model, is compelling; but what makes Azure shine is the rapidly expanding set of supporting services available.

When you first start writing Azure WebSites, you quickly run into complexities that are often unfamiliar to developers who have only ever targeted on-prem IIS servers. You can’t rely on the local file-system, sites span multiple server instances, you have to ship dependencies along with your application’s code, and so on.

These issues aren’t new for those who develop for web-farm environments, but even the smallest Azure WebSite forces you to consider complexities of scale –issues that plagued only the largest of projects in the past.

Fortunately, Azure provides a complete set of services to address those complexities, and using them is fairly easy. Caching services handle shared session state between instances of your application. Azure storage substitutes for the local file system. Azure WebJobs handle background processing and scheduled tasks.

There are also services analogous to most of the common external dependencies: Azure Search, Azure DocumentDB, Azure SQL, Azure AD, etc. Some of these are so similar to their on-prem equivalents that using them is largely transparent, but most act more like ultra-modern replacements for their older, on-prem cousins.

If you need a service Azure doesn’t offer directly, you can always spin up an Azure VM. The Azure Gallery has pre-built VMs for tons of common services, or you can roll out a standard Linux or Windows VM and install whatever third party software you need.

It took me a little time to learn my way around the Azure platform. At first, I resented the additional effort, especially in small projects. After the rather modest initial learning curve though, I have found that Azure provides far more than simple replacements for the same old services I’ve always used. Its services offer a lot of additional value that I’ve never had easy access to before.

Here is an example. The first time one of my Azure applications needed to handle user file uploads, I found it cumbersome to use Azure Storage. I had to provision a storage account, pull down the NuGet packages, then configure my application to talk to it. It only took about 20 minutes the first time, but it seemed like a hassle compared to just calling into System.IO.

After getting it all hooked up though, I discovered that Azure storage wasn’t difficult to use, and it eliminated a lot of common problems. I no longer had to worry about mapping server paths, dealing with file and folder permissions, or file contention issues. It just works, and it scales without thought.

Still later, I came across the need to write information to a queue so a background process could act on it later. Normally I’d have to create a table in my database, or drop custom files on the file system to share information with the background process –or worse, deal with Microsoft Message Queues (I still have nightmares). After having set up Azure storage for the file uploads though, I also had access to Azure Storage Queues at my fingertips.

I had similar results when I had to set up a cache service for handling session state. It was a tad inconvenient, but when I needed to cache data I’d fetched from the SQL database, I already had a super-easy, super-fast, cloud-scale caching service right there waiting for me.
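To make the "cache data I’d fetched from the SQL database" idea concrete, here is a minimal sketch of the usual cache-aside pattern. It is written in Python with only the standard library so it runs anywhere; the `TtlCache` class, the `get_or_load` method, and the `load_open_tickets` function are all hypothetical names, and the in-memory dictionary merely stands in for a real cache service. It is not an Azure API.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """In-memory stand-in for a shared cache service (hypothetical, not an Azure API)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, object]] = {}

    def get_or_load(self, key: str, loader: Callable[[], object]) -> object:
        """Return the cached value for key, calling loader only on a miss or after expiry."""
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive call
        value = loader()  # cache miss: e.g. run the SQL query
        self._items[key] = (now + self._ttl, value)
        return value


# Usage: count how often the pretend database is hit.
calls = []


def load_open_tickets():
    calls.append(1)
    return ["ticket-1", "ticket-2"]


cache = TtlCache(ttl_seconds=60)
first = cache.get_or_load("open-tickets", load_open_tickets)
second = cache.get_or_load("open-tickets", load_open_tickets)
```

The second call never reaches the loader, which is the whole point: with a shared cache service, every instance of the site would get that benefit rather than just the one that ran the query.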

Sure, I can set up local servers and services that can do any of these things, but in Azure I don’t have to. These services are already there; all I have to do is turn them on.

It has gotten to the point where I really miss Azure whenever I’m working with on-prem applications. I wish IIS could run WebJobs, and that my local ADFS server supported OpenID Connect. I want a search server on-prem that doesn’t require voodoo-devil-magic to set up and maintain. Working outside of Azure has become an inconvenience.

Dev Diary – Scaling search for an Azure WebSite

Adding real search capabilities to a custom web application is never easy. Search is a complex and deeply specialized area of development, and the tools available to us regular developers are monstrously complex.

With TicketDesk 2, I used the popular Lucene.net library to provide search. Ported from Apache Lucene (Java), this is the core technology that powers almost every popular search service, appliance, and search library on the market.

Once you’ve tackled the initial learning curve, Lucene.net isn’t all that difficult to leverage in a simple system like TicketDesk. It is freakishly fast, super flexible, and is a powerful search solution –not quite Google good, but close enough for most applications.

The problem with Lucene is that the design revolves around indexes stored on a traditional file system. There are 3rd party extensions that let you store the indexes in a database or in the cloud, but internally these all mimic the behaviors of a file system –that’s just how Lucene works.

You can have many components querying an index at the same time, but only one can write to an index at a time. Normally this single-writer limitation isn’t a huge problem. You code your application so it creates just one writer instance, then share it with any components that want to make an index update. As long as you keep things synchronous, it tends to work fine.
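The "create one writer, share it everywhere" approach can be sketched roughly as follows. This is an illustration in Python using only the standard library, not TicketDesk’s code and not the Lucene.net API; `IndexWriter`, `SharedWriter`, and `component` are hypothetical names, and a plain list stands in for the index.

```python
import threading
from typing import List


class IndexWriter:
    """Toy stand-in for a search index writer (hypothetical, not the Lucene.net API)."""

    def __init__(self) -> None:
        self.documents: List[str] = []

    def add_document(self, text: str) -> None:
        self.documents.append(text)


class SharedWriter:
    """Owns the process's single writer and pushes every update through one lock."""

    def __init__(self) -> None:
        self._writer = IndexWriter()
        self._lock = threading.Lock()

    def add_document(self, text: str) -> None:
        with self._lock:  # only one caller writes at a time
            self._writer.add_document(text)

    def count(self) -> int:
        with self._lock:
            return len(self._writer.documents)


# Created once at startup, then handed to every component that needs to write.
shared = SharedWriter()


def component(name: str) -> None:
    for i in range(100):
        shared.add_document(f"{name}-{i}")


threads = [threading.Thread(target=component, args=(f"c{n}",)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that the lock only protects callers inside one process. Nothing in this arrangement helps when a second copy of the application starts up somewhere else with its own writer.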

And here lies the problem. TicketDesk 2.5 and 3.0 are designed to run at scale, and will ship ready for deployment to the cloud as an Azure WebSite. In this scenario, there can be several instances of the application running at the same time, each needing to write to a single, shared Lucene index.

I spent a full week trying to find a way around the single-writer problem. WebSites in Azure shouldn’t write to their filesystems. Anything written locally is volatile, and vanishes whenever Azure automatically moves the site to a different host. So, I started with the AzureDirectory library for Lucene, which lets you store the search index in Azure blob storage. This works well, and gives Lucene a stable place to store shared indexes in the cloud.

The second problem was keeping multiple web site instances from writing to the index at the same time. Even though the index is in blob storage, Lucene still demands an exclusive write lock. Each website can see when the index is locked by another writer, but there isn’t a way to know if the lock is legitimate, or an orphaned lock left behind when some other instance went down unexpectedly.

The only easy solution is to make sure there is a separate application to handle all index writes, and that there is only a single instance of that application running. You can scale the websites or other clients, just don’t scale the index writer application.

WebJobs were designed specifically for handling background processing on behalf of Azure WebSites, so I started there. Each website would queue index updates to an Azure Storage Queue, then the WebJob could come along and process the queue in the background. But WebJobs scale with the websites, so if you have multiple websites, you also have multiple webjobs. Hopefully, in the future MS will give us the ability to scale webjobs independent of the websites they service.
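The shape of that design, with many site instances enqueueing and exactly one background writer draining the queue, looks roughly like the sketch below. It is plain Python with the standard library: `queue.Queue` stands in for the storage queue, threads stand in for site instances, and `web_instance` and `index_writer` are hypothetical names. No Azure SDK calls are involved.

```python
import queue
import threading
from typing import List

updates = queue.Queue()  # stand-in for the storage queue
index: List[str] = []  # stand-in for the search index; only the writer touches it


def web_instance(name: str, ticket_ids: List[int]) -> None:
    """A site instance never writes to the index; it only enqueues messages."""
    for ticket_id in ticket_ids:
        updates.put(f"{name}:ticket-{ticket_id}")


def index_writer() -> None:
    """The one and only writer: drains the queue until it sees the stop marker."""
    while True:
        message = updates.get()
        if message is None:
            break
        index.append(message)


writer = threading.Thread(target=index_writer)
writer.start()

instances = [
    threading.Thread(target=web_instance, args=(f"web{n}", [1, 2, 3]))
    for n in range(3)
]
for t in instances:
    t.start()
for t in instances:
    t.join()

updates.put(None)  # stop marker, sent only after every instance has finished
writer.join()
```

Scaling the producers is harmless here. Starting a second `index_writer` is what brings back the contention, which is exactly what happens when the background job scales along with the site.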

So the only remaining solution would be an old fashioned worker role. They scale independently –or in this case, can be instructed not to scale. This works well, but I just don’t like the solution. Effectively, the worker role ends up being a half-ass, custom search server. It costs a decent amount of money to run a separate worker role instance, plus it complicates the deployment and management of the entire application.

Failing to find a way to continue in Azure with custom Lucene indexes without a centralized search server, I figured I’d just design TicketDesk to take advantage of the existing Azure native solution –Azure Search Services. It is easy to code against (relatively), and there is a free tier that should be suitable for most smaller shops. For larger shops, the cost of a paid Azure Search tier is still reasonable when compared to the cost of a dedicated worker role.

So, out of the box, TicketDesk 2.5 will include at least two search providers: a Lucene provider for on-premise single instance setups, and native Azure Search for cloud deployments. I will eventually add an alternative for on-premise webfarms, and non-Azure cloud VMs. In the meantime though, you could still scale in your own data-center by using Azure Search remotely, or stick with Lucene and manually disable the search writer on all but one instance of the site in the webfarm.
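A swappable-provider arrangement of this kind might look something like the following. This is a hedged sketch in Python, not TicketDesk’s actual design: `SearchProvider`, `LocalIndexProvider`, `RemoteServiceProvider`, and `create_provider` are all hypothetical names, and both providers are backed by a dictionary so the example runs without any external service.

```python
from typing import Dict, List, Protocol


class SearchProvider(Protocol):
    """The small surface the rest of the application codes against."""

    def index(self, ticket_id: int, text: str) -> None: ...

    def search(self, term: str) -> List[int]: ...


class LocalIndexProvider:
    """Stand-in for an on-premise, single-instance index (the Lucene case)."""

    def __init__(self) -> None:
        self._docs: Dict[int, str] = {}

    def index(self, ticket_id: int, text: str) -> None:
        self._docs[ticket_id] = text.lower()

    def search(self, term: str) -> List[int]:
        needle = term.lower()
        return sorted(i for i, text in self._docs.items() if needle in text)


class RemoteServiceProvider(LocalIndexProvider):
    """Stand-in for a hosted search service; a real one would make HTTP calls instead."""


PROVIDERS = {"local": LocalIndexProvider, "remote": RemoteServiceProvider}


def create_provider(deployment: str) -> SearchProvider:
    """Pick a provider from a configuration value (hypothetical setting names)."""
    try:
        return PROVIDERS[deployment]()
    except KeyError:
        raise ValueError(f"unknown search provider: {deployment!r}") from None


provider = create_provider("local")
provider.index(1, "Printer is on fire")
provider.index(2, "Cannot print invoices")
hits = provider.search("print")
```

Because callers only see `index` and `search`, switching a deployment from one provider to the other becomes a configuration change rather than a code change.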

One additional note of interest: Azure Search is still in preview, and it doesn’t have an official client library for .NET yet. There are two 3rd party client libraries though: Reddog.Search and Azure Search Client Library. Both are free as NuGet packages, but only Reddog.Search has a public open source repository. Also, Reddog has a management portal you can run locally, or install as an Azure WebSite extension.


Reddnet.net – Leaving Google Apps and DreamHost for Azure and Office 365

I’ve owned reddnet.net a long time. Back in the 90’s I hosted in my own basement data-center™.com, but eventually the costs became problematic. So, I switched to 3rd party hosting providers. Since then, I’ve bounced from provider to provider, never being satisfied with any of them.

My needs are simple. I have a custom domain, a little personal blog, and a few email accounts. I don’t want to spend a ton of cash on the services, nor much time on administrative tasks. At the same time though, this is the heart of my personal online identity. It needs to perform reliably.

A few years ago, after yet another of my hosting providers decayed into oblivion, I decided to split my web and email hosting between different providers.

Email is the most painful service to move, so I decided to move it to Google Apps. Google let non-corporate organizations, like me, host at Google Apps for free. They are a stable company, and handle email exceptionally well. So, I figured using Google might eliminate my biennial email migration hell.

For the web site, I chose DreamHost, one of the “premier” WordPress partners. Sadly, DreamHost just plain sucks. Their server performance is abysmal, and the network latency makes me wonder which African country hosts their data center –and if it’s powered by hamsters, or a dung-burning furnace. On the plus side, it is reasonably cheap. My blog isn’t exactly popular, so I could live with the sub-optimal service for a while.

In the years since that move, I’ve grown increasingly frustrated with Google. They killed off “free” Google Apps hosting. I’m grandfathered into the plan, but as new services roll out or old ones get upgraded, us free-loaders are last to see an update –if we get updated at all.

Clearly, they want us to buy into a business tier plan. I don’t mind paying for my services, as long as the services are worth it, but Google has given me serious doubts about the value of their services going forward.

Their war against Microsoft has put customers, like me, in the cross-fire. They killed ActiveSync for Gmail while sabotaging key APIs across their other services. They refuse to write native apps for Windows 8 or Phone 8 at all –which wouldn’t be bad if they didn’t also interfere with 3rd party apps that try to bring Google’s services to Microsoft’s platforms.

As a Microsoft developer, and Windows and Windows Phone user, Google’s services –especially the Google Apps services– are nearly useless outside a web browser.

The value of using modern web based software services is the ability for it to become an integral part of the entire computing experience –across all platforms, devices and applications. Google seems to disagree.

I’m not claiming Microsoft is an innocent victim here. Microsoft’s legal extortion of licensing revenue from Android was a real dick move, for example. But Microsoft doesn’t put its customers on the front-line. Microsoft encourages apps for Apple and Google products, often writing their own native applications when necessary. They certainly never obstruct my ability to use one of their services just because they don’t like the device I chose. They don’t play games with their APIs to sabotage their products on other platforms.

So, it got to the point where I only had two good options: pay for a subscription to Google Apps, or pay for Office 365. My primary concern is making sure I have email services for my domain. The rest of Google Apps or Office 365 are just nice-to-have extras.

Aside from my reservations about Google’s commitment to open, cross-platform integration, what tipped the scales firmly towards a move to Office 365 was Microsoft Azure. Azure is the cloud services platform backing Office 365, in the same way that Google App Engine backs Google Apps.

A move to Office 365 implicitly sets up my domain in Azure, which gives me the opportunity to reunify my web and email services under one provider again. Better still, Azure is a platform that I understand and work with professionally on a regular basis.

I could have hosted my website on Google App Engine too, but honestly it isn’t a platform I understand well, and the setup for WordPress there is not painless. On Azure, you just pick WordPress from the web site gallery and it’s done –stupid easy.

Unlike my past hosting providers, Azure’s prices scale very smoothly based on usage. Hosting a simple WordPress site, like mine, costs about $14/month. This is slightly more than a traditional 3rd party WordPress provider, but it performs significantly better too.

And the best part is that, as a subscriber to the Microsoft Developer Network (MSDN), I get $100 a month of credit to spend on Azure resources. This doesn’t count towards Office 365 licenses, but it effectively makes the web hosting free, and leaves plenty of credit for other projects.

On my old setup, Google was free, while DreamHost ran about $100 a year… and I was unhappy with both. After the switch, Azure is free, while Office 365 runs $120/year (because I need two licenses at $60/ea).

Bottom line — for an extra $20 a year, I get access to high-performance personal web hosting on a platform I know and trust, first-class email, and I regain the seamless service integration across my desktop and phone devices.