Welcome

Hi. My name is Simon Hawkins. Here I share ideas on technology & business. For work I run a software business in the UK. When the internets are off, entertaining a 3 year old occupies quite a bit of the time.

Thanks for dropping by.

    Wednesday
    Feb242010

    Dark Ages 2.0

    This is a talk I gave at the Geek Night in Oxford. Complete Slides of the talk.

    A summary of the talk.

    With photographs and the written word on paper, backup/storage is passive: barring physical damage, the content will be readable for years or even hundreds of years. With digital media, backup/storage is active: if you don't keep testing it, duplicating it and moving it forward to new media formats, your content will become unusable over time. Compare a box of photos and notes left in the loft for 50 years with a stack of CD-Rs. Will the discs be readable? Will you even have a drive to read them? Chances are the paper will be fine. This is a problem for all of us as time passes and we collect more content in digital format.

    Full Talk

    The first dark age began around 410AD after the collapse of the Roman empire. The period lasted until shortly after 1000AD, and historically very little is known about it.

    I think we are creating a new dark age. Not like the first one, but a dark age for family social history, caused by a loss of data.

    I've been thinking about this a lot recently, as we now have a daughter. Taking photographs is now part of recording the family history for my daughter's future generations. Currently I have around 8000 photos stored in iPhoto, all tagged with descriptions, all of which are stored in its proprietary database. Hmmm.

    This is a recent phenomenon: the problem is occurring now, and its effects will not be seen until the future. Photography has been around since about 1850, yet it's only since 2000 that digital cameras have been in common use.

    The two big issues are

    • Backups
    • Cataloging

    Backups/Storage

    We have a photo of my daughter's great great great grandmother; the picture is over 123 years old. All being well, will my daughter's great great great grandchildren, in 123 years' time, be able to view the photographs we have taken now? The work to make sure this is possible will be much more than just putting photographs into a box like our ancestors did. It's this 'active' management that is going to be the cause of the 'Dark Ages 2.0'.

    Since photographs were invented around 1850, 'backups' have been easy: just chuck the negatives and photos in a box. This is what most families have done, and it's worked well for generations.

    Now we have digital backups to carry out. There are many media options for storing backups and over time these degrade and fall out of use. We have to continually recreate backups, test them and move them forward as new media formats are developed - a lot of 'active' management. 

    Just imagine a hard drive got 'stuck in the loft' like a box of photos might. After 50 years the photos would be fine. The hard drive? Even if it would spin up, would the data on the disk be readable? And with only a USB connection, finding a computer to plug it into might prove difficult. 50 years ago personal computers didn't exist. The progress in the next 50 years will make it very difficult to read media from this era. I found a few old computer magazines from around the mid 90s in the back of a cupboard, with floppy disks on the cover. I can still read the magazines, but I don't even own a 3.5" floppy drive any more to be able to read the disks.

    As technical people we all understand this, but even we know we probably don't back up enough. So what chance does an 'average person on the street' stand of keeping on top of it? I personally know of non-tech friends who have lost photos, some of very important family history moments: weddings, babies, loved ones no longer with us. It's the fact that digital backups are an 'active' process that causes the problem.

    Previously backup/storage was a passive process: once things were put in a box it was done, and they would always come out in roughly the same condition they went in. This passive process has meant that family photos have survived down the generations. Now that the process is 'active', how much will survive over time?

    Meta-Data

    There are two options for where to store meta-data:

    • In the image file
    • In a separate database

    Looking at our old photographs as an example of what to do - if we're lucky somebody wrote the date the photograph was taken and who is in the photograph straight on the back of the photograph. This has stood the test of time well as it's commonly the only way we know who features in the photographs. 

    Now imagine that instead of writing the meta-data on the back of the photograph they wrote a number, and then in a separate notebook, next to that number, they wrote all of the meta-data. Over time the photos and the notebook must always be kept together. What happens if the photos get shared out to different family members? Would the notebook get split up, or copied (by hand, unless it's in the last 30 years)? And if the notebook got lost, all of the meta-data for all of the photos would be lost for ever. Overall it doesn't sound a very good solution, and thankfully most people just wrote on the back of the photographs.

    Many photo software packages do store the meta-data exactly as described above, in a separate proprietary database. iPhoto is one such program. When backing up photographs the database needs to be backed up as well; the two must always be kept together and in sync. iPhoto provides options for this, but in 50 years' time would iPhoto 59 be able to restore a backup from iPhoto 9 with all the meta-data intact? What other program would be able to load this proprietary database and extract the meta-data? How much work would be required to obtain this data? And if this one database file becomes corrupt, the meta-data for every photograph will be lost.

    The other option is to store the meta-data within the image file itself, whether jpg or RAW (though for archiving it is preferable to convert your RAW images to DNG, as this open standard stands a much greater chance of continued support in the future). As the old adage goes, in computing we like standards - that's why we have SO many of them. And so it is with image meta-data, where the formats include EXIF, XMP, IPTC & MakerNotes. By using a combination of these it is possible to store all of the meta-data required about images directly within the image files.

    MakerNotes throws a bit of a spanner in the works, as its internal contents are not a defined standard but a 'binary blob' of data that records all of the camera details when the photo was taken: the ISO, lens type, shutter speed etc. Unfortunately there is no standard for this data and each manufacturer has come up with their own format, not all of which are documented. If that wasn't bad enough, the data has absolute references within it, so changes to the rest of the meta-data can corrupt the MakerNotes. Picasa is one such program: it helpfully stores all tags and descriptions in the EXIF/XMP headers but unfortunately does not understand MakerNotes, and can therefore corrupt this information in your image files.
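
    To see why absolute references are so fragile, here is a minimal Python sketch. The byte layout is invented purely for illustration (it is not any real MakerNotes format), and the function names are hypothetical. It shows a blob holding an offset counted from the start of the file, and what happens when a tool rewrites the description in front of it while copying the blob unchanged.

```python
import struct


def build_file(description: bytes) -> bytes:
    """Lay out a toy file: [description][4-byte absolute offset][b"ISO=200"].

    The offset is counted from the start of the file. This layout is an
    assumption made for the example, not a real camera format.
    """
    value = b"ISO=200"
    offset_of_value = len(description) + 4  # value sits right after the pointer
    return description + struct.pack(">I", offset_of_value) + value


def read_value(data: bytes, pointer_position: int, length: int) -> bytes:
    """Follow the absolute offset stored at pointer_position."""
    (offset,) = struct.unpack_from(">I", data, pointer_position)
    return data[offset:offset + length]


def naive_retag(data: bytes, old_description: bytes, new_description: bytes) -> bytes:
    """Swap the description but copy the blob byte-for-byte, as an editor
    that does not understand the blob's internal pointer would."""
    return new_description + data[len(old_description):]


if __name__ == "__main__":
    original = build_file(b"beach")
    print(read_value(original, len(b"beach"), 7))

    longer = b"beach 2009 holiday"
    edited = naive_retag(original, b"beach", longer)
    # The blob bytes are intact, but the stored offset still points at the
    # old position, which now falls inside the new description.
    print(read_value(edited, len(longer), 7))
```

    The camera data is still physically in the edited file; it is the pointer that has gone stale. A tool that understands the blob would have to rewrite the offset whenever anything in front of it changes size.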

    Two programs that do handle all meta-data within the image files and correctly handle the MakerNotes are DigiKam & Adobe Lightroom 2. DigiKam is open-source and works on Linux, Windows and Mac OS X.

    I think that for the storage of photographs, a set of plain file system folders with images in jpg or dng format, with all meta-data contained within the images, will be the most resilient archival system and the one most likely to endure.

    Even keeping to this very simple storage structure, frequent testing, duplication and transfer across media formats will be required to maintain the archives for future generations.
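
    As a rough idea of what 'frequent testing' could look like for a plain folder archive, here is a small Python sketch. It is only one possible approach, and the function names are my own for illustration: record a SHA-256 checksum for every file in the archive, then check any copy against that record after each duplication or media transfer.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_root: Path) -> dict[str, str]:
    """Map each file's path (relative to the archive root) to its checksum."""
    return {
        path.relative_to(archive_root).as_posix(): sha256_of(path)
        for path in sorted(archive_root.rglob("*"))
        if path.is_file()
    }


def verify_copy(manifest: dict[str, str], copy_root: Path) -> list[str]:
    """Return the relative paths that are missing or altered in a copy."""
    problems = []
    for relative_path, expected in manifest.items():
        candidate = copy_root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(relative_path)
    return problems
```

    Build the manifest once from the master archive, keep it alongside the photos, and run verify_copy against every backup each time it is duplicated or moved to new media. An empty list means every file in the manifest is present and unchanged.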

    What's your strategy?

     

    Since I gave this talk, an article along similar lines called Avoiding a Digital Dark Age has been published in American Scientist; it is also worth a read.

    Tuesday
    Jan052010

    JavaScript - The Final Big Language (FBL)

    JavaScript will not just be the NBL (Next Big Language) it will be the FBL - Final Big Language.

    Big statement, but I think the pieces are falling into place to make this happen, and I think Node.js will be a big driver of the process. It will do for JavaScript on the server side what Rails did for Ruby.

    Rails crystallised many great ideas about how to develop web applications, and Ruby's design allowed them to be coded in a very clean way. Many of the ideas had been around for a while, but it took DHH using Ruby to seed the community around Rails. Just look at how it has transformed web development over the past 6 years, and how it has influenced so many frameworks in other languages.

    I see node.js as the seed for JavaScript on the server side. OK, it's lower down the stack than Rails, but it has seeded the idea of what is possible with JavaScript on the server; just look at how interest is developing. Frameworks taking the best of Rails/Django are already starting to appear running on node.js, and the performance of such young frameworks hints at what will be possible in the near future.

    The crucial factor in JavaScript being the FBL is that the server programming language now matches the client. Do not underestimate the impact of this. Since web development began we've moved through various languages on the server side... Perl, Java, PHP, ASP, Python, Ruby, and many more. On the client side we've just had JavaScript since 1995 - 15 years! (OK, Microsoft did have a go with VBScript in the browser, enough said).

    Once you can develop on the server and client side in one language, it seems unlikely that you would move on to another language on the server side unless the client side changes. As a developer, why would you go from working with one common language and a common set of libraries covering both server & client side to learning a separate language, when you are still going to be developing in JavaScript on the client side? The benefits of a new server side language would need to be substantial to break from having one consistent language.

    I'm not saying JavaScript is the 'best' language (however you define that), just that it will become very popular.

    node.js will lead this: event driven server side programming that by its nature allows very high performance (even for such a new system), with JavaScript providing a very natural environment for callback based development. Take a look at the chat example to see how natural the fit is and the code reduction that comes from it.
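
    For anyone who hasn't met the event driven style, here is a toy sketch of the idea. It is written in Python rather than JavaScript, it is not the node.js API, and it is not the chat example mentioned above; the class and function names are made up for illustration. Handlers are registered as callbacks, and a single loop runs them as events arrive.

```python
from __future__ import annotations

from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[str], None]


class EventLoop:
    """A tiny single-threaded event loop: handlers are registered per event
    name and run one at a time as events are taken off the queue."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self._queue: deque[tuple[str, str]] = deque()

    def on(self, event: str, handler: Handler) -> None:
        """Register a callback for an event name."""
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: str) -> None:
        """Queue an event; nothing runs until the loop processes it."""
        self._queue.append((event, payload))

    def run(self) -> None:
        """Dispatch queued events to their handlers until the queue is empty."""
        while self._queue:
            event, payload = self._queue.popleft()
            for handler in self._handlers[event]:
                handler(payload)


def make_chat(loop: EventLoop, users: list[str]) -> dict[str, list[str]]:
    """Give each user an inbox and a callback that fills it on every message."""
    inboxes: dict[str, list[str]] = {user: [] for user in users}
    for user in users:
        loop.on("message", lambda text, user=user: inboxes[user].append(text))
    return inboxes


if __name__ == "__main__":
    loop = EventLoop()
    inboxes = make_chat(loop, ["ann", "bob"])
    loop.emit("message", "hello")
    loop.run()
    print(inboxes)
```

    The whole 'chat server' is a handful of callbacks hung off one loop, with no threads or locks, which is the style of programming the paragraph above is describing.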

    JavaScript is being built into many technologies: just look at CouchDB using JavaScript for its view language and, as I've previously written, JavaScript in Yahoo's query language, along with many other services & obviously the browser.

    How about developing apps for iPhone & Android in JavaScript? No problem, just take a look at Appcelerator.

    The progress seems unstoppable. Will we all be JavaScript developers in the future?

    Wednesday
    Jul082009

    Google Chrome OS - The most interesting bit

    Well, Google have gone and announced they're creating their own operating system. This is big news in itself and is being covered all over the web, but I think the most interesting part of the announcement is where they state:

    Google Chrome running within a new windowing system on top of a Linux kernel.

    Google are not going to use the X Window System!

    They're going to provide their own windowing system and they are going to open-source the code. This is huge: X has so many limitations and issues. Tweaks and workarounds like the Direct Rendering Infrastructure are being made to take better advantage of video hardware, but these are all hamstrung by the underlying architecture of X11.

    As Mike Paquette, one of the Apple Quartz developers, explained about Apple's decision to develop Quartz rather than use X11:

    once Apple added support for all the features it wanted to include into X11, it would not bear much resemblance to X11 nor be compatible with other servers anyway.

    By providing a new window system, Google have the opportunity to design a modern windowing architecture from scratch: one that takes advantage of modern video hardware, deals with multiple displays, handles display DPI correctly and does font handling at the lowest levels. All without having to take account of a windowing architecture designed in the 80s, when even the lowest power graphics chip of today would have been unimaginable.

    With a whole open source operating system based on this being released, along with at least one large example application (Google Chrome), maybe a real competitor to X will emerge and other projects will adopt it?

    I'm looking forward to seeing this happen.

    Saturday
    Jun062009

    Google App Script - bigger than Google Wave?

    Google announced a limited test of App Script at the same time as Google Wave. Wave gained much of the attention, but App Script has the potential to be far more important. Initially App Script is only available in Google Docs spreadsheets for a limited number of users, but this will expand over time.

    So what is it?

    Well, App Script allows you to write functions in JavaScript that run directly on Google's servers. Just to repeat: the code runs on Google's servers, not in the browser. This really is the next step in the 'programmable web'. Instead of having to write a full Google App Engine application, or set up Amazon EC2 instances with a full server environment, you can just write a JavaScript function and have it running on Google's servers. (Yahoo are also providing a service where you write JavaScript that executes on their servers - YQL Execute.)

    This will allow mash-ups on the server side, whereas currently mash-ups have resided within the browser. As more web services allow server-side scripting, the web really will turn into a fully scriptable environment. Applications that would previously have required complete custom development will just require services to be integrated together with scripting to provide the desired application.

    With JavaScript becoming the universal language, JSON providing the standard for data transfer and OAuth handling authorisation, all of the pieces are falling into place for a complete distributed development environment hosted within the cloud and based on the cloud, rather than the old model of a single development environment from a single supplier.

    The potential for this cannot be overestimated. Google Wave is fantastic, but with App Scripting it will be just one messaging service tied into cloud hosted applications that use many other services.

    Wednesday
    May272009

    The Big Switch

    This is a talk I gave at the Geek Night in Oxford. Slides of the talk.

    Changes of computing model are rare in our industry. In the roughly 50 years our industry has been in existence, we have had only 3 computing models, and we're just at the start of the 3rd. A change of computing model is a big thing.

    The first model of computing was Centralised : Mainframe & Supercomputer - Big glass walled rooms, very expensive, very reliable, with everything centrally managed.

     

     

    The second model of computing was client/server : The 1980s saw the emergence of personal computers and Unix workstations. This model was built on the idea of relatively low cost, simple systems with typically one server dedicated to a single application. This flexibility and low cost, along with the GUI on the client PC, saw a huge growth in new applications.

     

    The mid 1990s saw the development of web applications, the start of the Internet Model. The number of users able to access these systems increased massively, and the servers needed to be more scalable, more reliable and easier to manage. Suddenly running a web app required the discipline of the old mainframe world. This would lead to the emergence of a cloud computing marketplace.

    This has many parallels with the use of electricity back in the 1900s. I recommend everyone read 'The Big Switch' by Nicholas Carr. Carr describes in great detail how before 1900 every manufacturer was in 2 businesses: the product they were producing & energy production. Initially energy production was mechanical, then electricity took over as a far more controllable source of energy. Electric energy had the unique property that it could be transmitted very efficiently, allowing the source of energy production to be distant from its use. It also meant that production of electricity could be centralised.

    Initially companies could not imagine outsourcing the generation of electricity to another company. Factory owners knew that a glitch in the power supply would bring their production to a halt. At the turn of the century virtually all electricity was privately produced. Thomas Edison at General Electric was making huge profits supplying all of the hardware to set up these private electric plants.

    But two technologies were to change all of this: the steam turbine and alternating current.

    These two technologies allowed for massive power stations and distribution over far greater distances. These immense stations would be able to supply the demands of even the largest companies, and at far lower costs than had been possible before. A positive feedback loop occurred: as more customers were served the efficiencies increased, reducing prices and attracting more customers. By 1920, 70% of all US electricity production was by the utilities; at the start of the century it was virtually zero.

    How does this relate to EC2 & App Engine?

    Using Amazon's EC2 (Elastic Compute Cloud) you're working with machine images (just a disk image and config files) that can be run on servers completely separate from Amazon.

    This flexibility is where the generator from Aldi comes in. Standards in electricity mean your electrical appliances can be plugged into the mains, or you can pop to Aldi, buy a genny and plug your appliances into that. Or you can hire a genny from a hire company. Whichever you choose, the appliance will work. Likewise, with EC2 the option is always there to run your service locally or at another hosting company, with minimal effort to move the images around.

    Compare that with Google App Engine. Being tied to the App Engine datastore, along with the other requirements of their Python and Java environments, means you are developing your app to run entirely on their system. For deployment of a live application your only option is App Engine. If you're developing a new product in your start-up, would you feel comfortable betting your whole company on App Engine and the services it provides?

    With EC2 you're just renting a machine to run your image, and you can do that via many routes.

    You can see the effect of this today in the number of commercial businesses using each system. Until there are other routes to hosting App Engine applications in a live environment, I cannot see App Engine exploding in business use.

    What does it mean for your Start-up / Business?

    If you are not considering using a hosted service, a competitor will be, and they will end up providing the service at lower cost or making more margin - all the while not having to run the digital 'power supply' that your company does. They will be able to react quicker, as they are running one business while you are running two.

    As we are at the start of a new model in computing, there are going to be many options, some of which will be dead ends. For us today it means trying to keep our options open while taking advantage of this new model. You need to consider what your product is about: which bits are unique, which bits are the energy production, and which bits you can outsource to a utility company.
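
    In code, one way to keep your options open is to have the application depend on a small storage interface of its own rather than on one provider's datastore API directly. The Python sketch below is only an illustration of that idea: the interface, the in-memory backend and the function names are all hypothetical, and a real adapter for a hosted datastore or a local database would need to be written behind the same two methods.

```python
from __future__ import annotations

from typing import Optional, Protocol


class RecordStore(Protocol):
    """The only storage operations the application is allowed to rely on."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """Stand-in backend for the example. An adapter for a hosted datastore
    or a local database would expose the same two methods."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._records[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._records.get(key)


def register_customer(store: RecordStore, email: str, name: str) -> None:
    """Application logic: knows about customers, not about any provider."""
    store.put(f"customer:{email}", name)


def customer_name(store: RecordStore, email: str) -> Optional[str]:
    """Look a customer up through the same provider-neutral interface."""
    return store.get(f"customer:{email}")
```

    The unique bits of the product live in functions like register_customer, while the 'energy production' sits behind RecordStore, so switching supplier means writing one new adapter rather than rewriting the application.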

    Virtually all power production went to utilities in just a 20 year period. In IT, 20 years is nearly half the life of our industry; the transition to a cloud future is going to be rapid.

    Eventually people began to trust power generation as the utilities proved themselves over time. Even the London Underground stopped generating its own electricity, which then resulted in a complete shutdown of the Underground when the national grid failed to supply power.

    But as this shows, it gets to the point where the risk is so low that we just live with it, and with the odd occasion when it does fail, because the alternative of generating it ourselves or having a backup is no longer economically viable.

    As the reliability of the services increases to the point where you just 'plug' something in and it always works, the same will happen with the cloud. We will end up with very large utility companies with server farms offering services at a cost so low that any other option will not be economically viable. Holding out against this would be like Edison trying to hold back the inevitable.

    As Nicholas Carr puts it:
    "In the end the savings offered by utilities become too compelling to resist, even for the largest enterprises. The grid wins."

    Change leads to opportunity. By making the right choices the rewards can be great, just watch out for those other choices!

    References

    The Big Switch: Our New Digital Destiny by Nicholas Carr, ISBN 978-0-393-06228-1
    Irving Wladawsky-Berger : The emergence of a new model of computing

    Images

    Mainframe : http://www.flickr.com/photos/carrick
    Client/Server Room : http://www.flickr.com/photos/sylvar
    'Cloud' Server room : http://www.flickr.com/photos/torkildr
    OS/2 Image : http://en.wikipedia.org/wiki/File:OS2_2.0_upgrade_box.png
    Generator: http://www.flickr.com/photos/timdorr
    FerryBridge power station : http://www.flickr.com/photos/37117644@N00
    Large server room : http://www.flickr.com/photos/mrfaber
    Electricity Pylons : http://www.flickr.com/photos/sunpig