May 15, 2011

Offloading Libraries To Azure Blob Storage: Initial Experiments

By admin in PHP Azure, Uncategorized

In a previous post I mentioned that one of the big pain points on Azure for PHP developers is that you have to upload the frameworks or libraries used by an application every time you push to Azure.

It was suggested that one way to get round this is to move your libraries to Azure Blob Storage. This isn’t a supported solution, but I thought I’d give it a go. At the moment it takes roughly 40 minutes to package my application and upload it to Azure. Note: these are just initial experiments. The process outlined below doesn’t work yet (although it’s pretty close)!

Blob storage is almost a flat file system. It’s laid out in the following format: account >> container >> blob.

By contrast, most libraries have a deeply nested structure. In my example I’ll use Zend Framework. The class Zend_Log_Writer is stored in the file: lib >> Zend >> Log >> Writer.php

The first logical step is to flatten the library into a single folder, using periods to denote the previously nested structure of the files. E.g. Zend.Log.Writer.php would be the name of a blob stored in the container ‘zend’. I achieved this with a simple command-line script using PHP and the SPL iterators.

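<?php
$args = $_SERVER['argv'];

// Both arguments are required: the nested library and the flat output folder.
if(!isset($args[1]) || !isset($args[2]))
{
    die('usage: php ' . $args[0] . ' <source> <destination>' . "\n");
}

$source = $args[1];
$dest   = $args[2];

// SKIP_DOTS stops the '.' and '..' entries being handed to copy().
$rSourceDir = new RecursiveDirectoryIterator($source, FilesystemIterator::SKIP_DOTS);

// By default only leaves (files) are returned, not the directories themselves.
$rSourceDirItr = new RecursiveIteratorIterator($rSourceDir);

foreach($rSourceDirItr as $file) /* @var $file SplFileInfo */
{
    $realPath = $file->getRealPath();

    $destination = $dest . DIRECTORY_SEPARATOR . flatFileName($realPath);

    copy($realPath, $destination);
}

/**
 * Turn a nested path such as .../Zend/Log/Writer.php into Zend.Log.Writer.php.
 * Everything before the first occurrence of 'Zend' in the path is discarded.
 */
function flatFileName($path)
{
    $pos = strpos($path, 'Zend');

    $path = substr($path, $pos);

    return str_replace(DIRECTORY_SEPARATOR, '.', $path);
}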

The script takes two arguments: the source (nested library directory) and a destination directory (for the flattened structure).
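To make the name mapping concrete, here is a minimal sketch of the same transformation as a standalone function that can be tried without touching the filesystem. The function name flattenPath() and its $root and $separator parameters are my own illustration and not part of the script above; the separator is passed in explicitly so the result does not depend on the operating system.

```php
<?php
// Hypothetical helper: the same idea as flatFileName(), with the library
// root and the path separator passed in as parameters.
function flattenPath($path, $root = 'Zend', $separator = '/')
{
    // Find where the library root first appears in the path.
    $pos = strpos($path, $root);

    if($pos === false)
    {
        // The root is not part of the path, so there is nothing to flatten.
        return null;
    }

    // Drop everything before the root, then turn separators into periods.
    return str_replace($separator, '.', substr($path, $pos));
}

// Prints: Zend.Log.Writer.Stream.php
echo flattenPath('/home/me/lib/Zend/Log/Writer/Stream.php'), "\n";
```

Because strpos() returns the first match, a source path such as path/to/ZendFramework/library/Zend/Log.php would be cut at ‘ZendFramework’ rather than at the ‘Zend’ directory, so it is worth checking the names the script produces.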

After this process it’s worth inspecting the flat folder to make sure the results are what you expected. I then used a similar script to upload the files to Azure Blob Storage.
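One thing the inspection could look for is two source files ending up with the same flat name, which would happen if a file or directory name already contained a period. The following is only a sketch of such a check; findFlatNameCollisions() is a hypothetical helper, and the sample paths are contrived rather than taken from Zend Framework.

```php
<?php
// Hypothetical check: flatten a list of relative paths and report any flat
// name that is produced by more than one source file.
function findFlatNameCollisions(array $relativePaths, $separator = '/')
{
    $seen       = array();
    $collisions = array();

    foreach($relativePaths as $path)
    {
        $flat = str_replace($separator, '.', $path);

        if(isset($seen[$flat]))
        {
            // Record the flat name once, however many paths map to it.
            $collisions[$flat] = true;
        }

        $seen[$flat] = true;
    }

    return array_keys($collisions);
}

$paths = array(
    'Zend/Log/Writer.php',
    'Zend/Log.Writer.php', // contrived: a file name that already contains a period
    'Zend/Log.php',
);

// Lists Zend.Log.Writer.php as the only colliding name.
print_r(findFlatNameCollisions($paths));
```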

<?php
include 'conf.php';

// Set up Microsoft Azure SDK autoloading

set_include_path(get_include_path() . PATH_SEPARATOR . AZURE_SDK);

require_once 'Microsoft/AutoLoader.php';

Microsoft_AutoLoader::Register();

$storageClient = new Microsoft_WindowsAzure_Storage_Blob(AZURE_HOST, AZURE_ACNAME, AZURE_PACCESS);

// Check args are in order

$args = $_SERVER['argv'];

if(!isset($args[1]) || !isset($args[2]))
{
    die('invalid arguments supplied' . "\n");
}

$source   = $args[1];
// Container names must be lower case
$destCont = strtolower($args[2]);

if(!is_dir($source))
{
    die('source is not a directory' . "\n");
}

if(!$storageClient->containerExists($destCont))
{
    $storageClient->createContainer($destCont);
}

// Folder should be flat, so a non-recursive iterator is enough
$sourceDir = new DirectoryIterator($source);

foreach($sourceDir as $file)
{
    // Skip '.', '..' and any stray sub-directories
    if(!$file->isFile())
    {
        continue;
    }

    $storageClient->putBlob($destCont, $file->getFilename(), $file->getRealPath());
    echo "PUT - " . $file->getFilename() . "\n";
}

This is likely to take some time as each file is uploaded individually.

The next step in the process is to set up an autoloader to get required files from Azure.

class Flat_Autoloader
{

    /**
     * Blob Storage Class
     *
     * @var Microsoft_WindowsAzure_Storage_Blob
     */
    private static $blob;

    /**
     * Registers the autoloader
     *
     * @param Microsoft_WindowsAzure_Storage_Blob $blob Blob storage client
     * @return bool
     */
    public static function Register($blob)
    {
        self::$blob = $blob;
        // Make blob://container/blob paths usable with require_once
        self::$blob->registerStreamWrapper('blob');

        return spl_autoload_register(array('Flat_Autoloader', 'Load'));
    }

    /**
     * Load a class
     *
     * @param string $className Class name to load
     * @return bool True if the class file was found and required
     */
    public static function Load($className)
    {
        // The first part of the class name ('Zend') is the container.
        $partArray = explode('_', $className);

        $containerName = trim(strtolower($partArray[0]));

        // Zend_Log_Writer becomes Zend.Log.Writer.php
        $flatClass = implode('.', $partArray) . '.php';

        if(!self::$blob->blobExists($containerName, $flatClass))
        {
            return false;
        }

        require_once('blob://' . $containerName . '/' . $flatClass);

        return true;
    }

}

The class takes an instance of the Azure Blob storage class from the Windows Azure SDK and, on registration of the autoloader, ensures that the ‘blob’ stream is available for future use. In the Load method the incoming class name is exploded and the first part of the class name (‘Zend’), lower-cased, is used as the container in which to look for the class file. The blob name is the imploded class name (with ‘.’ substituting any instances of ‘_’) plus the ‘.php’ extension.

The resulting container / blob combination is then required.
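As a quick illustration of that mapping, the sketch below applies the same explode / implode steps to a class name and prints the path that would be required. blobLocationForClass() is a hypothetical helper written for this example; it does not talk to Azure at all.

```php
<?php
// Hypothetical helper mirroring the name handling in Flat_Autoloader::Load().
function blobLocationForClass($className)
{
    $parts = explode('_', $className);

    // Container: first part of the class name, lower-cased.
    $container = strtolower($parts[0]);

    // Blob: the full class name with '_' swapped for '.', plus '.php'.
    $blob = implode('.', $parts) . '.php';

    return array(
        'container' => $container,
        'blob'      => $blob,
        'uri'       => 'blob://' . $container . '/' . $blob,
    );
}

$location = blobLocationForClass('Zend_Log_Writer');

// Prints: blob://zend/Zend.Log.Writer.php
echo $location['uri'], "\n";
```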

Caveats

There are two main caveats to using this approach to autoload files on Azure.

1) The included files may not themselves use require / include statements as these will fail. This article outlined a method for removing all the require_once calls from Zend Framework.

As the article above mentions, there is one require_once statement in Zend_Application that cannot be removed. After initially trying to override the methods in Zend_Application that trigger Zend_Loader_Autoloader, I gave up. I think inheriting from Zend_Application and Zend_Application_Bootstrap_Bootstrap and then overriding the relevant methods is possible, but that is a project for the future.


% cd path/to/ZendFramework/library
% find . -name '*.php' -not -wholename '*/Loader/Autoloader.php' \
  -not -wholename '*/Application.php' -print0 | \
  xargs -0 sed --regexp-extended --in-place 's/(require_once)/\/\/ \1/g'

2) Requiring a file from blob storage carries a particularly high performance penalty as it is done over HTTP. The next logical step is to use a byte code cache like APC, or the Azure equivalent WinCache, to ensure that after the initial load the file is not fetched from blob storage again.
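I haven’t tried the byte code cache yet. As a rough illustration of the fetch-once idea only, here is a sketch of a much simpler local-disk cache, which is a different technique from APC or WinCache. cachedBlobPath() and the $fetch callable are hypothetical; in the autoloader the callable might wrap something like file_get_contents() on a blob:// path, but that is an assumption rather than something I have tested.

```php
<?php
// Hypothetical helper: return a local path for a blob, fetching it only once.
// $fetch is any callable taking (container, blob) and returning the contents.
function cachedBlobPath($cacheDir, $container, $blob, $fetch)
{
    $localPath = $cacheDir . DIRECTORY_SEPARATOR . $container . '.' . $blob;

    if(!is_file($localPath))
    {
        // First request: fetch the source and keep a local copy.
        file_put_contents($localPath, call_user_func($fetch, $container, $blob));
    }

    return $localPath;
}

// Stand-in fetcher that counts how often it is called.
$calls = 0;
$fakeFetch = function ($container, $blob) use (&$calls)
{
    $calls++;
    return "<?php // contents of $container/$blob\n";
};

$cacheDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'flat_cache_' . uniqid();
mkdir($cacheDir);

$first  = cachedBlobPath($cacheDir, 'zend', 'Zend.Log.php', $fakeFetch);
$second = cachedBlobPath($cacheDir, 'zend', 'Zend.Log.php', $fakeFetch);

// Prints 1: the second call was served from the local copy.
echo $calls, "\n";
```

The autoloader would then require the returned local path instead of the blob:// path. Invalidating the local copy when a blob changes is left out of this sketch.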
