January 10, 2012 1

The Importance Of Logging

By admin in PHP

Until recently I had underestimated the value that can be gained by implementing an effective logging strategy.

The most common benefit is, of course, that it makes your application easier to debug. As a developer you have probably tailed an Apache error log when trying to debug an issue. This technique is just as good for debugging your application, provided a good log exists. A good application log can show you at a glance where errors are occurring in your application, from technical issues like a database connection going away to unexpected conditions being reached in the business logic.

In production, logs will be used by first and second line support teams to diagnose live issues. Good descriptive log lines ensure that the right people get ‘called in the middle of the night’ to deal with a production issue. Logs also help to diagnose the cause of production issues after they have occurred. They are invaluable when trying to establish the root cause of an incident or when trying to reproduce an issue that you can’t seem to replicate in your development environment.

Finally, logs can be used to record events that occur within your application for monitoring purposes. This technique is called ‘evented logging’. By treating each line in a log file as an event it’s possible to monitor logs for events in real time. Events could be technical: API calls, database interactions, page requests. More interestingly you can log business events: purchases, customer interactions or cash deposits.

Using this technique it’s possible to monitor business events in real time and present them graphically to stakeholders. You can see the impact a change on your live platform has on your business metrics. Has a new release dramatically reduced the number of new registrations per minute? This could point to an unexpected problem in the registration process caused by this change. An excellent tool for this (and a subject for its own post) is Graphite.

Below is an example of a common logging format used by Apache.

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

In this single line we can see lots of useful information: the IP address of the request, the user’s ID (frank), the date of the request, the request made, the response status code and the size of the object returned (example credit: Wikipedia). This provides all the information required to log individual requests to a server. Every line in the file represents a request and the format is consistent across lines.

Application logging requires slightly different information. Here are two lines to demonstrate:

URGENT c68f39913f901f3ddf44c707357a7d70  [10/Oct/2000:13:55:36 -0700] "Database connection lost in DB.php Line 4"
DEBUG c68f39913f901f3ddf44c707357a7d70  [10/Oct/2000:13:56:00 -0700] "Making call to fulfilment service"

In an application log we are aggregating numerous different pieces of information from multiple simultaneous requests, so some additional metadata is required. In these lines we can see a log level, a unique hash identifying which request generated the log line, the date and time, and a textual log message.

The log level is required because we may want to adjust which lines are written to the log file. Extensive debug level logging can be expensive to run in production (as each log line is a file operation). A unique hash is required to identify each request because a request may generate multiple log lines and requests are coming in to the application simultaneously. When analysing logs at a later date we may wish to isolate the lines from individual requests.
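As a rough illustration of the format described above, here is a minimal PHP sketch. The function names (`formatLogLine`, `shouldLog`), the `APP_LOG_LEVELS` constant, the INFO level and the ordering of the levels are assumptions made for this example. They do not come from any particular logging library.

```php
<?php
// Hypothetical severity order for this sketch (lowest to highest).
const APP_LOG_LEVELS = ['DEBUG' => 0, 'INFO' => 1, 'URGENT' => 2];

// Build one log line: level, request hash, timestamp, quoted message.
function formatLogLine(string $level, string $requestId, string $message, ?DateTimeInterface $when = null): string
{
    $when = $when ?? new DateTimeImmutable();

    // Same date layout as the Apache example: 10/Oct/2000:13:55:36 -0700
    return sprintf('%s %s [%s] "%s"', $level, $requestId, $when->format('d/M/Y:H:i:s O'), $message);
}

// Decide whether a line at $level should be written given the configured threshold.
function shouldLog(string $level, string $threshold): bool
{
    return APP_LOG_LEVELS[$level] >= APP_LOG_LEVELS[$threshold];
}

// One ID per request, shared by every line that request writes.
$requestId = md5(uniqid('', true));

if (shouldLog('URGENT', 'INFO')) {
    echo formatLogLine('URGENT', $requestId, 'Database connection lost in DB.php Line 4'), "\n";
}
```

Raising the threshold to URGENT in production would drop the DEBUG lines without touching the calling code, which is the main reason for carrying a level on every line.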

Ultimately you can decide the content and format of your logs based on your application’s needs. The golden rule is that the format must be absolutely consistent: logs are most useful when they can be parsed at a later time to yield information.

An evented log may look like this:

URGENT c68f39913f901f3ddf44c707357a7d70  [10/Oct/2000:13:55:36 -0700] BUSEVENT.registration "Customer Registration"
DEBUG c68f39913f901f3ddf44c707357a7d70  [10/Oct/2000:13:56:00 -0700] TECHEVENT.APICall.fulfilment "Fulfilment API Called"

It has many attributes in common with the standard application log line but it also includes an easily parsed ‘.’ delimited string determining the type of event that has occurred. The string can easily be parsed and the appropriate event logged on whatever system you are using to display events to your stakeholders.
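To show how such a line might be consumed, here is a small PHP sketch that splits an evented log line into its parts. The function name `parseEventLine` and the regular expression are assumptions written for the sample lines above, not part of any existing tool.

```php
<?php
// Parse one evented log line into its fields, or return null if it does not match.
function parseEventLine(string $line): ?array
{
    $pattern = '/^(?<level>\S+)\s+(?<request>[0-9a-f]{32})\s+\[(?<time>[^\]]+)\]\s+'
             . '(?<event>[\w.]+)\s+"(?<message>.*)"$/';

    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    return [
        'level'   => $m['level'],
        'request' => $m['request'],
        'time'    => $m['time'],
        // The '.' delimited event type becomes a list, e.g. TECHEVENT, APICall, fulfilment
        'event'   => explode('.', $m['event']),
        'message' => $m['message'],
    ];
}

$line = 'DEBUG c68f39913f901f3ddf44c707357a7d70  [10/Oct/2000:13:56:00 -0700] '
      . 'TECHEVENT.APICall.fulfilment "Fulfilment API Called"';

$event = parseEventLine($line);
echo implode(' > ', $event['event']), "\n";
```

Because the format is consistent, a single pattern like this is enough to feed every line into whatever system displays the events.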

There are many logging libraries you can use to make logging easy. They are often components in frameworks but there are also standalone libraries. Of course it’s also easy to implement your own logging library, although you may be accused of re-inventing the wheel.

Some popular libraries:

Zend Log (part of Zend Framework)
Monolog (standalone logging library used in Silex and Symfony 2)

That was my quick introduction to logging and why it’s important. Go forth and log!


January 2, 2012 2

Integrating Phabric Into Your Behat Dev / Test Cycle

By admin in behat

In this post I’ll look at how to manage the use and maintenance of Phabric within a team of business analysts, testers and developers. Phabric is a dynamic fixture creation tool I created for use with the testing framework Behat. It allows you to embed fixture creation into your Gherkin feature files and scenarios. If you haven’t encountered the tool before please look at the GitHub page for more information.

In a recent conversation with Marcello Duarte we discussed ways in which Behat and Phabric could be used by BAs and testers to create scenarios that set up the state of a system and then test interactions based on that state.

Marcello’s main concern was that because Phabric works by mapping database tables to textual tables in Gherkin scenarios it will always be tightly linked to the underlying database implementation. This causes a number of problems:

1. BAs and testers shouldn’t have to have an encyclopaedic knowledge of the database structure. In many companies the test teams are primarily ‘black box’ testing a system: they have no knowledge of the implementation within the system, just its outputs for given inputs.

2. This approach doesn’t play well with an emergent design. Phabric assumes an existing database that is mapped via configuration files and is ready to be populated when a test is run. In an ideal world testers and BAs should be able to write a large body of tests for features before they are implemented by the development team.

Phabric has a number of features that can be used to mask the complexity of an underlying database implementation and keep the ‘Given’ steps of a scenario nice and simple. The following example illustrates this process.

In this example we will use a fictional company that organises events. The company’s application has one database table, ‘Event’. The event table has eight columns. They are named in an ugly way, some require specific data entry formats (e.g. MySQL DateTime) and not all columns are necessary when creating a test event.

Event Table

event_id
event_name
event_start_date
event_end_date
event_in_progress
event_premium
event_location
event_capacity

The goal of this exercise is to make it easy for testers to write scenarios that involve populating this table. This can be done using the following Phabric features: changing the column names, using default values for non critical columns and data transforming functions to make input data look friendly.

The following example scenario shows these features in action:

Feature: Event page rendering test
         Checks that the various features of the event page render correctly. 

Scenario: Basic Event Is Displayed Correctly.
    Given the following event exists
        | Name              | Start Date | End Date |
        | BDD For Beginners | 12/12/12   | 13/12/12 |
    When I go to the "home" page
    Then I should see the "BDD For Beginners" event

We can see here that:
1. Names of columns are changed to something more readable than ‘event_start_date’
2. Not all columns are required. Under the hood default values are used when columns are not included in the Given step.
3. The values for start and end dates are not in the very verbose MySQL date format.

Note: Examples of how to achieve this with Phabric are available in the project’s README and examples on GitHub.
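To make the three ideas concrete without relying on Phabric itself, here is a plain PHP sketch of column renaming, default values and a date transformation. It is not Phabric’s configuration format or API: the `$nameMap` and `$defaults` arrays, the default values and both function names are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical mapping of friendly Gherkin headers to Event table columns.
$nameMap = [
    'Name'       => 'event_name',
    'Start Date' => 'event_start_date',
    'End Date'   => 'event_end_date',
];

// Hypothetical defaults for columns a tester does not need to supply.
$defaults = [
    'event_in_progress' => 0,
    'event_premium'     => 0,
    'event_location'    => 'TBC',
    'event_capacity'    => 100,
];

// Turn a friendly dd/mm/yy date into the MySQL DateTime format.
function toMysqlDateTime(string $friendlyDate): string
{
    // The leading '!' resets the time part to 00:00:00.
    $date = DateTime::createFromFormat('!d/m/y', $friendlyDate);
    if ($date === false) {
        throw new InvalidArgumentException("Unrecognised date: $friendlyDate");
    }

    return $date->format('Y-m-d H:i:s');
}

// Convert one Gherkin table row into a row ready for insertion.
function gherkinRowToDbRow(array $row, array $nameMap, array $defaults): array
{
    $dbRow = $defaults;
    foreach ($row as $friendlyName => $value) {
        if (!isset($nameMap[$friendlyName])) {
            throw new InvalidArgumentException("Unmapped column: $friendlyName");
        }
        $column = $nameMap[$friendlyName];
        // Date columns get the friendly-to-MySQL transformation.
        $dbRow[$column] = substr($column, -5) === '_date' ? toMysqlDateTime($value) : $value;
    }

    return $dbRow;
}

$dbRow = gherkinRowToDbRow(
    ['Name' => 'BDD For Beginners', 'Start Date' => '12/12/12', 'End Date' => '13/12/12'],
    $nameMap,
    $defaults
);
print_r($dbRow);
```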

Developers can write the mappings for database tables, including sensible default values, column name transformations and column data transformations. It’s important that some kind of documentation exists for the test team. This could be an article on a project wiki or a text file in the project’s repository. It should include:

1. Friendly names used to refer to data eg Name / event_name
2. Which data is required
3. What format should be used when entering data. eg – a bit column might be represented by the text ‘YES’ / ‘NO’.
4. A couple of examples of sample scenarios.

The additional work required by developers to create (and maintain) this documentation is quickly repaid by a more self-sufficient test team capable of producing data driven scenarios independently of the development team. After the initial structure is in place testers can even add additional fields to the Gherkin. The test will fail because the field won’t map to the database implementation, but the developer working on the feature can add it to the existing mapping. They should then add it to the documentation.

Taking this idea one step further and looking at the issue of emergent design, is it possible for test teams to drive the development using Gherkin Tables to describe the desired entities in the system? Testers working closely with business analysts (or whomever it is that has the responsibility for creating stories) could define entities in the system up front.

One way of achieving this would be to write Gherkin tables representing the entities as part of the test creation process. When the tests are handed to developers they could then create a schema and Phabric mapping. They then continue to develop the feature as normal against the tests. Another way would be to produce a list of the required attributes of an entity up front and pass this to the development team ahead of test creation. They would then use this list to create schema and mapping files.

I hope this simple example has gone some way to illustrate how to use Behat and Phabric within an organisation to streamline the process of test creation. You may have noticed that the example conveniently glossed over relational data. Use of relational data requires some additional thought and development and will be the subject of my next blog post in this series.


January 1, 2012 2

Using Behat In A Large Multi Disciplined Team

By admin in behat

Behat is a relatively new testing framework. Many developers are using it in personal projects and teams are beginning to use it for production projects as well. Recently a number of people have asked me how Behat testing is integrated into larger projects involving teams of developers, testers and other stakeholders. Common questions include:

How does a story get from inception to implementation?
Which members of the team are responsible for maintaining Gherkin and Steps?
What are the benefits of using an integration testing tool over just unit testing?
What role does Behat testing play in the sign off process?

I’ll answer some of these questions in this post by looking at the reasons Behat was adopted at my previous employer SkyBet* and the workflows developed to support its use. We started using Behat at the v1.0 stage and over time hammered out a process to get a story from inception to build, giving each member of the production team responsibilities and putting a process in place for creating acceptance criteria.

Some background is required as to the kind of applications and products developed at Sky to understand why Behat and integration testing were deemed a necessary part of our testing plan. SkyBet have a number of customer facing websites including Sky Vegas, Sky Bet and Sky Poker. As you can imagine with a gaming website that handles customer interactions, data from gaming information providers and general CMS content, complexity builds up quickly in many areas of the architecture. With this context it’s easy to see why testing that each level of the architecture integrates correctly is important. We accomplished this by testing all internal APIs consumed by the various Sky products using Behat.

Behat proved to be a great tool as it allowed all members of a multi disciplined team to understand what was the required outcome of a story and what features of that story were covered in the tests. This was a benefit provided by Gherkin, the human readable test description language at the heart of Behat.

What do I mean by a multi disciplinary team? This is a team composed of a number of different roles: the product owner, business analysts, testers and developers. It’s common to find teams like this in bigger organisations or software houses. In smaller organisations the roles still exist but members of a team may ‘wear many hats’.

I’ll now walk through the process of defining a story, testing it and developing it with reference to the roles listed above.

At the beginning of a cycle the product owner will define objectives (usually taking them from a product backlog). These objectives are then formalised into stories by the business analysts. This process involves producing a small requirements document with information about the feature and suggestions on key points of functionality to test. They are well placed to do this because of their in depth product knowledge and experience of the business domain.

This document is then passed to the tester who will own the testing process for the story. The tester then translates the requirements doc into Gherkin. In theory BAs could write the Gherkin: it’s human readable, so they should be able to write what they expect the test to accomplish. The reason this isn’t the approach taken is that it is the testing team’s responsibility to maintain the Gherkin dialect used in the test suite. Their in depth knowledge of which steps to use to accomplish certain tasks (data creation and verification steps in particular) ensures maximum step re-use. The output of this stage is a battery of Gherkin scenarios designed to test all the behaviour of the feature under test. The task is then passed to the developer/s responsible for implementing the feature.

The developer is in a great position at the start of the development phase of the cycle as they have a well defined feature to develop. In later stages of a project, when a large number of steps have already been implemented, the Behat tests may even run without the development of additional steps. This is one of the key benefits of having experts in a project’s ‘dialect’ of Gherkin: step reuse.
At this point the developer would implement any undefined steps and then use the tests to develop the feature against.

The final stage of the test / development cycle is formal acceptance of a feature by testers and then BA’s. Providing the feature was well defined in the early stages of the cycle a green Behat test suite should prove a feature is complete. In reality manual testing is often required to verify the scenarios have been interpreted and implemented correctly.

That’s a summary of the integration and acceptance process employed at SkyBet. It provides real benefits: the time taken to develop a feature decreased (less back and forth between the parties involved in test/dev) and so did defects in production (continually running the tests at build time catches errors pre deployment).

I think the process defined here also stays true to the BDD principles which Behat extols. Stories (and behaviour) are defined by stakeholders and are written in Gherkin, a language all stakeholders can understand. Developers are supplied with tests ahead of feature development so they can use them to build the feature closely to the specification.

Are you using Behat in production or within a larger team? Let me know what you think of these workflows in the comments section.

* My relationship with SkyBet: Until recently I worked as a Software Engineer at SkyBet. A move to London and into contracting meant I finished working there in Jan 2012. They are an excellent company and are also recruiting!


October 22, 2011 0

PHP Barcelona (PHPBC11) – Acceptance & Integration Testing With Behat

By admin in Uncategorized

Next weekend is the fourth edition of PHP Barcelona. I’m giving a two hour workshop on Acceptance & Integration Testing with Behat.

Behat is a relatively new testing tool and is centred around the Behaviour Driven Development methodology.

The workshop will consist of an introduction followed by a number of practical exercises. By the end of the workshop attendees will be able to write and execute Behat tests and should be ready to use Behat in their projects.

In order to get the most out of the session I’ve asked attendees to set up LAMP and Behat beforehand.

Here’s what you need:

NOTE: If you are unable to meet these set up requirements I will be running a ‘Behat Setup Clinic’ on Friday. Follow me on Twitter for details.

1) Apache (pref with new VHOST Set up).
2) PHP 5.3 (Command line and Set up with Apache)
3) Behat (runnable from the command line)
Behat Quick Start
4) Java (required for Sahi)
Java Install
5) Sahi
Sahi Install

This seems like a lot of set up but in reality I expect most attendees working with PHP to have much of this set up already.

I look forward to seeing you at my workshop!


October 10, 2011 0

Behat & XDebug: Debugging Steps.

By admin in Development

Occasionally when writing complicated steps in Behat it’s useful to debug them (perhaps to find out why your tests are failing!).

The Behat test runner is actually just a PHP script run from the command line. As such, by declaring a special environment variable before running the test runner we can trigger a remote debugging session in your favorite IDE.

From the XDebug manual:

When running the script from the command line you need to set an environment variable, like:
export XDEBUG_CONFIG="idekey=session_name"
php myscript.php
You can also configure the xdebug.remote_host, xdebug.remote_port, xdebug.remote_mode and xdebug.remote_handler in this same environment variable. All those configurable settings can also be set with normal php.ini settings.

Full Text Here

To trigger a session from the test runner we should use a command similar to the following:

XDEBUG_CONFIG="idekey=session_name" behat path/to/tests

For this to work a debugging session must be started in your IDE (eg – In Netbeans: Project Menu > debug).

Typing this over and over again can get irritating so the next logical step is to set up an alias.

Edit your .bash_profile or .bashrc file using your terminal like so:


> nano ~/.bash_profile

// Then add the following line:

alias behatxc='XDEBUG_CONFIG="idekey=session_name" behat'

Now when you want to run the steps and trigger a debugging session use the command: ‘behatxc’. When you want to run the test runner normally use the regular Behat command.


September 21, 2011 2

Behat & XDebug: Testing & Debugging API Calls

By admin in Development

During office hours I work on a large PHP application. We use a service oriented architecture and the application exposes a number of APIs. Recently we have been using Behat and writing BDD tests in Gherkin to test the behaviour of our various APIs.

When developing I always use XDebug’s remote debugging functionality to examine application flow and to debug issues as they arise. Our integration tests populate the app’s database with data and set up a number of other resources. If a test is failing it’s really handy to be able to trigger XDebug breakpoints and walk through the execution of the API call under test while this known state is set up.

Here is a quick example of how it’s done:

The Gherkin Test (with some contrived steps):

Feature: Xdebug example
         As a Behat user
         I want to debug API methods using xdebug
         To facilitate debugging failing tests
 
Scenario: Basic HTTP Request
      Given I am using Behat
      When I call the api method
      Then I should hit a breakpoint

And an example FeatureContext implementation:

/**
 * Features context.
 */
class FeatureContext extends BehatContext
{
    /**
     * Initializes context.
     * Every scenario gets its own context object.
     *
     * @param   array   $parameters     context parameters (set them up through behat.yml)
     */
    public function __construct(array $parameters)
    {
        // Initialize your context here
    }
 
    /**
     * @Given /^I am using Behat$/
     */
    public function iAmUsingBehat()
    {
        // Dummy given step
    }
 
    /**
     * @When /^I call the api method$/
     */
    public function iCallTheApiMethod()
    {
        // create a new cURL resource
        $ch = curl_init();
 
        // set URL and other appropriate options
        curl_setopt($ch, CURLOPT_URL, "http://behat-test.dev/api.php");
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
 
        // The key to getting xdebug to break on breakpoints is this cookie
        curl_setopt($ch, CURLOPT_COOKIE, 'XDEBUG_SESSION=netbeans-xdebug');
 
        // execute the request; the response body is returned as a string
        $data = curl_exec($ch);
 
        // close cURL resource, and free up system resources
        curl_close($ch);
    }
 
    /**
     * @Then /^I should hit a breakpoint$/
     */
    public function iShouldHitABreakpoint()
    {
        // Dummy then step
    }
}

The important bit in this sample is the line:

curl_setopt($ch, CURLOPT_COOKIE, 'XDEBUG_SESSION=netbeans-xdebug');

This sets a cookie which is picked up by XDebug. XDebug then attempts to connect to your IDE in line with your xdebug settings in php.ini. This example uses cURL for simplicity but it’s possible to add cookies to Zend Framework’s HTTP client and any other good HTTP client class.
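As one illustration of sending the same cookie without cURL, here is a minimal sketch using PHP’s built-in HTTP stream wrapper. The helper name `xdebugStreamContext` is hypothetical. The cookie value and URL are simply reused from the example above.

```php
<?php
// Build a stream context that sends the XDebug session cookie with the request.
function xdebugStreamContext(string $ideKey)
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            // Same cookie the cURL example sets via CURLOPT_COOKIE
            'header' => "Cookie: XDEBUG_SESSION={$ideKey}\r\n",
        ],
    ]);
}

$context = xdebugStreamContext('netbeans-xdebug');

// Usage inside a step (URL taken from the cURL example above):
// $data = file_get_contents('http://behat-test.dev/api.php', false, $context);
```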

All you have to do now is open your IDE or debugging client and start a debugging session. Then drop to the command line and use the behat command to run your tests. When the step containing the HTTP call is executed xdebug should connect to your IDE and break at any point you set.

Over the next day or two I’ll be looking to see if a tighter integration is possible. It would be good to trigger a debugging session from a UI test (via Mink’s feature context file perhaps).

Happy debugging!


May 16, 2011 2

PHP Azure Contest Wrap Up

By admin in PHP Azure

Over the past couple of months I’ve participated in the #PHPAzureContest. As I was doing a large body of work for my dissertation it seemed like a good opportunity to try something new and get my project running on a new platform.

A Quick Recap Of The Product

The product is called ‘Twitter Sentiment Engine’. It looks at the sentiment towards search terms on Twitter. It first assembles a training sample of Tweets using the Twitter Search API and then monitors the search terms using the Twitter Streams API. A basic user interface is provided for monitoring the results of the analysis. A full explanation of the product can be found in this post.

At The End Of The Competition What Shape Is The Product In?

The product itself is very much a proof of concept. I have put little work into the UI and instead concentrated on the sentiment calculation implementation and building the two RESTful APIs (public and private, both hosted on Azure) that gather the data from Twitter.

The project is in three tiers. The API (hosted on Azure) marshals requests from the other two elements, the UI and the worker nodes.

The UI

The UI of the project is a basic interface to the data gathered by the components below. It allows new keywords to be tracked, tracked keywords to be monitored and shows graph and tabular output of the data gathered on tracked keywords.

Worker Nodes

Worker nodes are the work horse of the application. They make requests to The API which allocates them jobs to work on. Jobs consist of new keywords to track (which require a training sample to be gathered) or an instruction to create and monitor a Twitter stream for a tracked keyword.

The Fulfilment API

The fulfilment API is my first attempt at creating a fully RESTful API using APP.  It glues the UI and worker nodes together. It is responsible for provisioning incoming requests to track new keywords, exposing processing jobs to worker nodes (sample gathering and stream monitoring) and exposing data to the UI to display.

This is the element of the architecture hosted on Azure. It also makes use of a SQL Azure database to store data gathered on keywords and data used in maintaining the state of a keyword (whether in sampling or tracking).

The diagram below shows some of the interactions:

Azure: The Good & The Bad

Components Used

I hosted the fulfillment API on an Azure extra-small compute instance and I used SQL Azure as the database for the project.

I found I needed more control over the worker processes so these were hosted on a Windows Server 2008 VPS on Rackspace. More on this later.

The Good

  1. SQL Azure was pretty fantastic. Microsoft have got it just right here: it ‘Just Works™’. It’s fully compatible with SQL Management Studio, which compares favorably to the MySQL GUI tools. It is also very easy to provision new databases and set up access white lists. The PDO_SQLSRV driver allows PHP developers to access SQL Server in a familiar way. The Doctrine 2 project supports this driver, which allowed me to use Doctrine 2 as the DBAL / ORM for the project. See here for my brief post on setting up an SQL Azure instance.
  2. The Windows Azure Command Line Tools For PHP are also a good innovation from the Microsoft camp. The tools provide a mechanism for packaging PHP projects, testing them on Dev Fabric and then preparing them for a release on the live platform. In my opinion this is preferable to forcing developers to use Eclipse PDT to release projects. The PHP IDE market is a lot more varied than the Microsoft / .NET one (where VS is god). My criticism of the tools is that the learning curve is a little steep and documentation is available for packaging only the simplest of projects.
  3. The platform management console. The Azure Manager has a mechanism allowing a quick, seamless switch between staging and production environments.

The Bad

  1. Azure for PHP is a relatively new platform. The interoperability team are making great efforts to produce guides and tutorials on a wide range of subjects. While most of the time it was easy to find the information I needed, frustratingly when encountering edge cases the guides and tutorials I needed either were not there or I couldn’t find them using Google. Examples of this are how to offload PHP libraries to some other kind of storage to reduce upload time (my attempt here), creating an extra small compute instance with the command line tools, and gaining access to PHP error logs (my attempt here).
  2. Total deployment time. Each time a change is made to a project its entire source must be repackaged and uploaded to Azure. With ZF and D2 in the project this took in excess of 30 min for packaging and upload. While there is a testing version of Azure, subtle differences often meant that it could take three or four attempts to get a deployment right. Deployments were further hampered by a sometimes clunky interface to the platform manager screen.

The Microsoft / PHP Experience In General

I was surprised and pleased to find that PHP on Microsoft has a great community of people. A range of sites exist to help people find their feet on the MS platform.

Interoperability Bridge

Ubelly

Brian Swan’s Blog – Brian is an MS PHP evangelist. He helped a lot with a few issues I had with the project.

Josh Holmes’s Blog – I met Josh at PHPUK 2010. He has a few good PHP Azure articles on his blog.

Juozas Kaziukėnas’s Blog – Jo is a Doctrine 2 core contributor. He is responsible for interop with Windows.

In general I found PHP on Microsoft to be a good platform to work on. As of PHP 5.3, PHP is ‘feature complete’ on Windows and runs well on IIS. The IIS control panel is also really smooth, giving a nice graphical interface to site management.

What’s Next For The Project?

Clearly the project is in its infancy and there is so much I’d like to do with it. The first step will be to rework the fulfillment API into a queue based architecture. Queues would allow a better and more reliable way to provision work between the worker processes. In addition it would allow a single point of contact with Twitter. This is more scalable than my current implementation of using Twitter Search API calls and Twitter Filter Streams, both of which are rate limited. The diagram below shows what this might look like.

Finally

Thanks to those who helped me along the way, I’ve enjoyed participating in the contest and I look forward to working on the Azure platform in the future.



May 15, 2011 0

Offloading Libraries To Azure Blob Storage: Initial Experiments

By admin in PHP Azure, Uncategorized

In a previous post I mentioned that one of the big pain points on Azure for PHP developers is that it’s necessary to upload the frameworks or libraries used in an application each time you push to Azure.

It was suggested that one way to get round this is to move your libraries to Azure blob storage. This isn’t a supported solution but I thought I’d give it a go. At the moment it takes roughly 40 min to package my application and then upload it to Azure. Note: These are just initial experiments. The process outlined below doesn’t work yet (although it’s pretty close)!

Blob storage is almost a flat file system. It’s laid out in the following format:   account>>container>>blob.

By contrast most libraries have a very nested structure. In my example I’ll use Zend Framework. The class Zend_Log_Writer is stored in the file:   lib >> Zend >> Log >> Writer.php

The first logical step is to flatten the library into a single folder using periods to denote the previously nested structure of the files. E.g. Zend.Log.Writer.php would be the name of a blob stored in the container ‘zend’. I achieved this with a simple command line script using PHP and the SPL iterators.

$args = $_SERVER['argv'];
 
$source = $args[1];
$dest = $args[2];
 
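// SKIP_DOTS keeps the '.' and '..' entries out of the iteration so copy() only sees files
$rSourceDir = new RecursiveDirectoryIterator($source, FilesystemIterator::SKIP_DOTS);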
 
$rSourceDirItr = new RecursiveIteratorIterator($rSourceDir);
 
$transArray = array();
 
foreach($rSourceDirItr as $key => $file) /* @var $file SplFileInfo  */
{
 
    $realPath = $file->getRealPath();
 
    $flatPath = flatFileName($realPath);
 
    $destination = $dest . DIRECTORY_SEPARATOR . $flatPath;
 
    copy($realPath, $destination);
 
}
 
function flatFileName($path)
{
    $pos = strpos($path, 'Zend');
 
    $path = substr($path, $pos);
 
    $path = str_replace(DIRECTORY_SEPARATOR, '.', $path);
 
    return $path;
}

The script takes two arguments: the source (nested library directory) and a destination directory (for the flattened structure).

After this process it’s worth inspecting the flat folder to ensure the results are consistent with what you expected. I then used a similar script to upload the files to Azure Blob Storage.

<?php
include 'conf.php';

// Set up Microsoft Azure SDK autoloading
set_include_path(get_include_path() . PATH_SEPARATOR . AZURE_SDK);

require_once 'Microsoft/AutoLoader.php';

Microsoft_AutoLoader::Register();

$storageClient = new Microsoft_WindowsAzure_Storage_Blob(AZURE_HOST, AZURE_ACNAME, AZURE_PACCESS);

// Check the arguments are in order
$args = $_SERVER['argv'];

if (!isset($args[1]) || !isset($args[2]))
{
    die('invalid arguments supplied' . "\n");
}

$source   = $args[1];
$destCont = strtolower($args[2]); // container names must be lower case

if (!is_dir($source))
{
    die('source is not a directory' . "\n");
}

if (!$storageClient->containerExists($destCont))
{
    $storageClient->createContainer($destCont);
}

// The folder should be flat, so a non-recursive iterator is enough
$sourceDir = new DirectoryIterator($source);

foreach ($sourceDir as $file)
{
    if (!$file->isFile())
    {
        continue; // skips "." and ".."
    }

    $storageClient->putBlob($destCont, $file->getFilename(), $file->getRealPath());
    echo "PUT - " . $file->getFilename() . "\n";
}

This is likely to take some time as each file is uploaded individually.

The next step in the process is to set up an autoloader to get required files from Azure.

class Flat_Autoloader
{
 
    /**
     * Blob Storage Class
     *
     * @var Microsoft_WindowsAzure_Storage_Blob
     */
    private static $blob;
 
    /**
     * Registers the autoloader
     */
    public static function Register($blob)
    {
        self::$blob = $blob;
        self::$blob->registerStreamWrapper('blob');
 
        return spl_autoload_register(array(__CLASS__, 'Load'));
    }
 
    /**
     * Load a class
     *
     * @param string $className Class name to load
     */
    public static function Load($className)
    {
        // Get the file name from the class name.
 
        $partArray = explode('_', $className);
 
        $containerName = trim(strtolower($partArray[0]));
 
        $flatClass = implode('.', $partArray) . '.php';
 
        // Debug aid, left disabled so it cannot corrupt page output:
        // echo $containerName . " " . $flatClass;
 
        if(!self::$blob->blobExists($containerName, $flatClass))
        {
            return false;
        }
 
        require_once('blob://' . $containerName . '/' . $flatClass);
 
    }
 
}

The class takes an instance of the Azure Blob storage class from the Windows Azure SDK and, on registration of the autoloader, ensures that the ‘blob’ stream is available for future use. In the Load method the incoming class name is exploded and the first part of the class name (‘Zend’) is used as the container in which to look for the class file. The blob name is the imploded class name (with ‘.’ substituting any instances of ‘_’).

The resulting container / blob combination is then required via the ‘blob’ stream.
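To make the mapping concrete, here is a small sketch that isolates just the name translation from the Load method. flatBlobLocation() is a hypothetical helper introduced for illustration; it performs no network calls and only shows which container and blob a given class name resolves to.

```php
<?php
// Hypothetical helper mirroring the name mapping in Flat_Autoloader::Load().
function flatBlobLocation($className)
{
    $parts = explode('_', $className);

    return array(
        // The first segment of the class name, lower-cased, is the container.
        'container' => strtolower(trim($parts[0])),
        // The full class name with '_' swapped for '.' is the blob name.
        'blob'      => implode('.', $parts) . '.php',
    );
}

$location = flatBlobLocation('Zend_Log_Writer_Stream');
// $location['container'] is 'zend'
// $location['blob'] is 'Zend.Log.Writer.Stream.php'
echo 'blob://' . $location['container'] . '/' . $location['blob'] . "\n";
```

Keeping the translation in a pure function like this would also make it easy to check against the names produced by the flattening script before anything is uploaded.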

Caveats

There are two main caveats to using this approach to autoload files on Azure.

1) The included files may not themselves use require / include statements as these will fail. This article outlined a method for removing all the require_once calls from Zend Framework.

As the article above mentions, there is one require_once statement in Zend_Application that cannot be removed. After initially trying to override the methods in Zend_Application that trigger Zend_Loader_Autoloader, I gave up. I think inheriting from Zend_Application and Zend_Application_Bootstrap_Bootstrap and then overriding the relevant methods is possible, but that is a project for the future.


% cd path/to/ZendFramework/library

% find . -name '*.php' -not -wholename '*/Loader/Autoloader.php' \
-not -wholename '*/Application.php' -print0 | \
xargs -0 sed --regexp-extended --in-place 's/(require_once)/\/\/ \1/g'

2) Requiring a file from blob storage carries a particularly high performance penalty as it is done over HTTP. The next logical step is to use a byte code cache like APC, or the Azure equivalent WinCache, to ensure that after an initial load the file is not fetched from blob storage again.
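A byte code cache is the approach I have in mind, but a simpler alternative is worth sketching: keep a copy of each blob on local disk the first time it is requested and require the local copy afterwards. This is a different technique from opcode caching and I have not tried it on Azure. cachedPath() and the $fetch callable are hypothetical names, and the sketch assumes the instance has a writable cache directory.

```php
<?php
// Hypothetical helper: return a local path for a blob, fetching it only once.
// $fetch is any callable returning the blob's contents as a string; in the
// autoloader it could wrap file_get_contents() on the blob:// URL (assumption).
function cachedPath($cacheDir, $container, $blobName, $fetch)
{
    $local = $cacheDir . DIRECTORY_SEPARATOR . $container . '.' . $blobName;

    if (!is_file($local)) {
        // First request: pull the source over HTTP and keep a local copy.
        file_put_contents($local, call_user_func($fetch, $container, $blobName));
    }

    // Later requests never touch blob storage.
    return $local;
}
```

The autoloader could then require the path returned by cachedPath() instead of the blob:// URL. The local copies would need clearing whenever a new version of the library is uploaded.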

May 13, 2011 1

PHP Application Logging On Azure

By admin in PHP Azure

In previous posts I’ve mentioned that once you get an app on Azure it’s often difficult to gain visibility on errors that occur on staging and production. The problem is compounded when (as in my Azure App) you have a number of remote workers processing tasks with little feedback on the results of each task.

To that end I’ve developed a little application to demonstrate using Azure’s blob storage to log events in your application to a blob you can retrieve and view.

The mini Zend Framework application consists of two parts: setting up an instance of Zend_Log and using it to log events, then retrieving a list of those events and displaying them on screen.

The demo code is available here: https://github.com/benwaine/Application-Logging-On-Azure

Part One: Set Up

Blob storage is a means of writing files to a permanent cloud storage medium. The Azure cloud storage platform has an API which users can use to create containers and blobs. Think of containers as folders and blobs as files. There are some differences; for example, it’s not possible to nest containers, although it is possible to use a naming convention to simulate a folder structure within a container.

Microsoft have supplied a comprehensive SDK for the Windows Azure Platform. This makes the task of interfacing with Blob storage a breeze. The SDK is included in the source of the demo but you can download it and read about it on the project’s site on CodePlex.

The SDK provides a means of accessing blob storage using streams. Zend_Log supports a number of writer classes, one of which writes to streams. Setting up Zend_Log and Zend_Log_Writer_Stream is accomplished in the application’s Bootstrap class.
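As a rough illustration of such a naming convention, blob names may contain ‘/’ characters, so a name can be built to look like a nested path even though the container itself stays flat. blobPath() below is a hypothetical helper and the segment names are made up for the example.

```php
<?php
// Hypothetical helper: build a blob name that reads like a nested path.
function blobPath(array $segments)
{
    $clean = array();

    foreach ($segments as $segment) {
        // Strip stray slashes and drop empty segments so the separator
        // never appears doubled up.
        $segment = trim($segment, '/');
        if ($segment !== '') {
            $clean[] = $segment;
        }
    }

    return implode('/', $clean);
}

// A single blob whose name simulates logs > 2011 > 05 > application.log
echo blobPath(array('logs', '2011', '05', 'application.log')) . "\n";
```

Blobs named this way can later be grouped by their shared prefix, which is what gives the impression of folders.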

class Bootstrap extends Zend_Application_Bootstrap_Bootstrap
{
 
    protected function _initAutoload()
    {
        Microsoft_AutoLoader::Register();
    }
 
    protected function _initBlob()
    {
        // Make sure the SDK autoloader has been registered first.
        $this->bootstrap('autoload');
 
        $opts = $this->getOption('azure');
        $blob = $opts['blob'];
 
        $storageClient = new Microsoft_WindowsAzure_Storage_Blob(
                        $blob['host'], $blob['acname'], $blob['paccess']);
        $storageClient->registerStreamWrapper('blob');
 
        return $storageClient;
    }
 
    protected function _initLog()
    {
        $opts = $this->getOption('azure');
        $blob = $opts['blob'];
 
        // Ensure the blob resource exists regardless of init order.
        $this->bootstrap('blob');
        $storageClient = $this->getResource('blob');
 
        if(!$storageClient->containerExists($blob['logs']['container']))
        {
            $storageClient->createContainer($blob['logs']['container']);
        }
 
        if(!$storageClient->blobExists($blob['logs']['container'], $blob['logs']['log']))
        {
            file_put_contents($blob['logs']['stream'], "\n");
        }
 
        $writer = new Zend_Log_Writer_Stream($blob['logs']['stream']);
        $log = new Zend_Log($writer);
 
        $log->info('Logging Initialized');
 
        return $log;
    }
 
}

1) Ensure Zend Framework and the Microsoft Azure SDK are on the include path. (Tip: add both to the library/ directory in the project root.)

2) Rename the example configuration file in the application/configs directory to ‘application.ini’ and replace the dummy settings with the details of your Azure subscription.

3) In the Bootstrap class, the Microsoft SDK autoloader is first added to the autoloader stack.

4) In the _initBlob() method a new instance of the Azure Blob storage client is initialised. It is returned by the method and stored in Zend_Application’s resource registry for later use.

5) The _initLog() method grabs the blob resource from the previous step and uses it to check that both the container and the blob specified in the application.ini file actually exist. Note that if the blob isn’t present it must be initialised, otherwise an error is thrown.

6) The Zend log is created and returned. The log is now part of Zend_Application’s resource registry and can be accessed from any action controller or even injected into your domain models to provide richer logging.
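The Bootstrap reads its settings from the ‘azure’ option, so it may help to see the shape of the array it expects. The sketch below is inferred from the keys the Bootstrap code accesses rather than copied from the demo’s configuration file; azureLogOptions() is a hypothetical helper and every value passed to it is a placeholder.

```php
<?php
// Hypothetical helper: assemble the 'azure' options the Bootstrap reads.
// Every value passed in is a placeholder, not a real account detail.
function azureLogOptions($host, $account, $key, $container, $logName)
{
    return array(
        'blob' => array(
            'host'    => $host,
            'acname'  => $account,
            'paccess' => $key,
            'logs'    => array(
                'container' => $container,
                'log'       => $logName,
                // The scheme must match the name given to registerStreamWrapper().
                'stream'    => 'blob://' . $container . '/' . $logName,
            ),
        ),
    );
}

$opts = azureLogOptions('blob.core.windows.net', 'myaccount', 'secret-key', 'logs', 'application.log');
echo $opts['blob']['logs']['stream'] . "\n";
```

The point to notice is that the stream setting is just the container and blob names joined behind the ‘blob://’ scheme, so the three log settings have to agree with each other.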

Part Two: Retrieving Logs

Retrieving the logs from blob storage is a piece of cake. In this simple application I put all the code in a controller action. It simply opens the file using the storage client set up in the bootstrap process, explodes the string into an array and assigns this to the view.

class IndexController extends Zend_Controller_Action
{
 
    public function init()
    {
        /* Initialize action controller here */
    }
 
    public function indexAction()
    {
        $bootstrap = $this->getInvokeArg('bootstrap');
        $storageClient = $bootstrap->getResource('blob'); /** @var $storageClient Microsoft_WindowsAzure_Storage_Blob **/
 
        $config = $bootstrap->getOption('azure');
        $project = $bootstrap->getOption('project');
        $fileStr =  file_get_contents($config['blob']['logs']['stream']);
        $logAr = explode("\n", $fileStr);
 
        $this->view->projectName = $project['name'];
        $this->view->logs = $logAr;
    }

}

1) Options and resources are retrieved from the bootstrap.

2) file_get_contents is used to retrieve the content of the blob.

3) The string from the blob is exploded into an array.

4) Project name and log array are both assigned to the view.

5) In the view script the log array is iterated over and each line of output is echoed.

The result is a page full of log information.
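For reference, the iteration in the view script could look something like the sketch below. renderLogLines() is a hypothetical helper, not the demo’s actual view code. It escapes each line before output and skips the blank entries that appear because the blob is initialised with a newline.

```php
<?php
// Hypothetical helper: turn the exploded log array into escaped HTML list items.
function renderLogLines(array $lines)
{
    $html = '';

    foreach ($lines as $line) {
        if (trim($line) === '') {
            continue; // blank entries come from leading and trailing newlines
        }
        // Escape the line so log content cannot inject markup into the page.
        $html .= '<li>' . htmlspecialchars($line, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }

    return $html;
}
```

In the view script the result would be echoed inside a list element, passing in the logs array assigned by the controller.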

Tips

Azure blob storage is accessed via a RESTful web service. While this is a great tool for getting some visibility on what’s happening in your application while it’s on staging, there is definitely a performance hit incurred by the latency of talking to a remote service.

Even initialising the log incurs a performance penalty, as you can see that a check for both the container and the blob is made in the bootstrap. I recommend using this technique on staging and disabling it in production.

In the future I’m going to add some Ajax support so that the log refreshes in real time, similar to the way in which Linux developers often use the tail -f command to watch logs during development.

Feel free to use the code and if you have any suggestions please comment.

April 2, 2011 0

Launching My PHP Application on Azure: Take Two (Success!)

By admin in PHP Azure

Today I successfully launched v1 of my application on Azure. Yesterday I was feeling quite disheartened and was faced with a difficult problem to solve in order to get my application running in the cloud. Luckily PHP on Windows veteran Juozas Kaziukėnas was around to lend me a hand. As he never sleeps we were able to get an answer to my problem at around 3am. Thanks Joe!

The issue: The PHP configuration on my local development environment included the sqlsrv driver. When using the Windows Azure Command Line Tools to package my application I used the --phpRuntime switch to specify a version of PHP to be shipped to the cloud to run my application on. Unfortunately when running the application on Azure, Doctrine2 reported the sqlsrv driver missing.

The solution: It turns out that the php.ini Azure was using to run my application contained absolute paths to the extension directory. Joe recommended changing the path to a relative one. I made the following changes to php.ini in both the local version of PHP used by IIS to run my site and also in the php.ini file located in the Windows command line tool directory (in my case: “C:\Program Files\WindowsAzureCmdLineTools4PHP\res\php\runtime”). The version in the tools directory should be updated when specifying the --phpRuntime switch when packaging but I wanted to be sure.

extension_dir="c:\Program Files (x86)\PHP\v5.3\ext"

to:

extension_dir="./ext"
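A quick way to confirm which php.ini and extension directory PHP is actually using, and whether the driver loaded, is a small diagnostic script such as the sketch below. extensionReport() is a hypothetical helper; it relies only on standard PHP functions and can be dropped into a page or run from the command line.

```php
<?php
// Hypothetical diagnostic: report where PHP looks for extensions and whether
// a given one (sqlsrv in this post) was actually loaded.
function extensionReport($name)
{
    return array(
        'extension'     => $name,
        'loaded'        => extension_loaded($name),
        'extension_dir' => ini_get('extension_dir'),
        // Path of the php.ini in use, or false if none was loaded.
        'php_ini'       => php_ini_loaded_file(),
    );
}

print_r(extensionReport('sqlsrv'));
```

If the report shows an absolute extension_dir that does not exist on the Azure instance, that would point to the same problem described above.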

A further issue I mentioned yesterday was my inability to adjust the size of the VM used by Azure to host my application. I found the answer to this indirectly on Ben Lobaugh’s blog, in a recent post about customising the ‘ServiceDefinition.csdef’ file used by the command line tools to create an Azure package. In the post he demonstrates how to insert your own values and have them used by the tool to produce the service definition included in the package and used by Azure.

In short you need to edit the ‘ServiceDefinition.csdef’ file located in the res directory of the command line tools (in my case: “C:\Program Files\WindowsAzureCmdLineTools4PHP\res\tool”).

The line:

<WebRole name="WebRole">

Changes to:

<WebRole name="WebRole" vmsize="ExtraSmall">

And the VM size is set to ExtraSmall when the package is uploaded to Azure (currently the only size you can run continuously under the free tier).

These two issues were the last ‘blockers’ preventing me from running my application on Azure. In the coming week I need to look into some optimizations:

1) Moving the application library folder onto blob storage

2) Using WinCache with Doctrine to ensure maximum performance

3) Setting up SSL and a domain name for the Azure instance.

I’ll be making it available for public view when I have cleared up the last few bugs surrounding security and API limits.
