28/02/2012

Grog, a PaaS based on OSGi for Public Administrations and private networks

As the result of nine months of thesis work, I now present Grog, a PaaS based on OSGi for Public Administrations and private networks.






In the field of information technology, federating resources is a widely adopted technique for building systems with ever-rising performance faster and at lower cost. The pioneers are usually big corporations or organizations that already own an internal network and can improve their infrastructure by composing systems with a high degree of interoperability and fault tolerance while keeping the total cost of ownership low.
Grid computing is the major expression of this practice: it allows resources from billions of geographically scattered, network-connected computers to be easily united. This contrasts with the idea of the supercomputer, which brings to mind a gigantic, freezing room housing a single huge machine as powerful as it is hungry for electricity.

Thanks to its dynamic nature and the ease with which new machines can join the network to raise the available total resources and computing power, the Grid was a point of reference in the expansion of many corporations, organizations and research centres. All these entities had the privilege to own or access such a resource and use it to reach their goals.
Eventually, as often happens in the computer world, access to Grid networks was widened, with some limitations, to include private citizens and small companies; this concept of "Grid computing for the masses" has been marketed under the name Cloud computing.

The Cloud's popularity grew exponentially right from the start, to the point that today it is often mistakenly presented as an alternative to the Grid, or the term even replaces "Grid" altogether.

Today access to the Cloud is offered to users through Service Delivery Platforms or Service Delivery Frameworks and is divided into three levels of use, depending on the resources provided and the kind of control granted. In the simplest one, Infrastructure as a Service (IaaS), a certain amount of basic resources (computing power, memory, etc.) is made available to the user in the form of virtual machines that can be freely customized. The middle level, Platform as a Service (PaaS), grants the user access to a development platform that includes a wide variety of tools for developing applications to be published inside the network. In the last level, Software as a Service (SaaS), complete applications are made available to users via the network itself, which simplifies access and licence management and allows the applications to be used even from machines that lack the hardware required to run them.
For all these levels there are multiple offers available, each with its own features, usually more focused on a certain aspect rather than generic goals; the most successful ones are: Amazon EC2, VMware vCloud, Microsoft Azure, Google AppEngine, Google Services and SalesForce.

This thesis was developed in collaboration with Engineering Ingegneria Informatica, a company with nationwide coverage here in Italy, which actively cooperates in Cloud-themed projects, including on a European scale. Among them we can cite VENUS-C, part of the Seventh Framework Programme launched by the European Union, ETICS, Eucalyptus and OpenNebula.

The goal of the thesis was to design a Service Delivery Platform to be used by the Italian Public Administration (PA). The focal points were (i) reuse, which is required by the regulation D.Lgs. 7 marzo 2005, n. 82 (Legislative Decree no. 82 of 7 March 2005), (ii) the need to create and compose services easily and rapidly in a standard way and (iii) the need to offer a unified set of services to users, both PA employees and private citizens.
Reuse is vital in such a scenario. An analysis performed by Engineering and the city of Bologna highlighted how the information system of a large city can easily count more than 200 applications, many of which perform essentially the same operations as those installed in other cities' systems. During the analysis, for example, 179 programs were inspected in Bologna alone, while medium-sized towns can easily count more than 100 applications.

While considering this scenario, we noticed that such a system could, with slight modifications, also fulfil the requirements of those considering building a private network.

Of all the offerings available today, however, none provided the tools needed to satisfy our requirements: they focus either on a different level or on a non-cooperative model in which each user controls and manages only their own applications, with access allowed under paid subscriptions.
While analysing the available offerings and the state of the art, moreover, we discovered that similar problems had already been studied:

  • OSGi4Cloud analyses and tries to solve an issue tied to the use of OSGi in distributed systems, but unfortunately it bases its concept on a dead technology (JXTA);
  • the OSGi Alliance itself analysed the issues tied to the remote use of its specification and proposed, at the end of last year, a draft in which the problems were discussed along with some ideas about how they could be solved, but without providing a real specification;
  • jClouds, a library with OSGi support, focuses mainly on network interaction, which places it closer to the IaaS level;
  • OSPaaS claims to realize a PaaS system based on OSGi for e-learning and on-line teaching, but the document describing the specification is not freely available for consultation.

None of these platforms could be adapted to our case. We therefore decided to (i) design our own PaaS platform, (ii) able to be integrated with any Cloud network, (iii) which allows the creation and deployment of services accessible by all users and (iv) easily composed into new ones, in the Web 2.0 spirit of mash-ups, Open Innovation and Open Data.

Our work, after a thorough analysis of the state of the art, development possibilities and existing market, led us to the decision to create a new platform, which we named Grog, using only open source tools and components and based on protocols such as OCCI and OSGi.

The Open Cloud Computing Interface (OCCI) is a set of open, community-driven specifications published through the Open Grid Forum, which define how service providers can offer their resources via a standard interface. OCCI builds on the fundamentals of the World Wide Web, using the well-established REST (REpresentational State Transfer) approach to create an extensible interaction model with IaaS services.
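To make the REST approach a little more concrete, here is a minimal sketch of what a request to create a compute resource might look like when OCCI data is carried in HTTP headers. It only builds the request text and sends nothing. The host `occi.example.org`, the class name and the attribute values are illustrative assumptions, and the header and attribute names reflect the OCCI HTTP rendering and infrastructure documents as commonly described, so they should be checked against the published specification before being relied upon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds the text of an OCCI-style "create compute resource" request (illustrative only). */
final class OcciRequestSketch {

    /** Renders a POST request that carries the OCCI data in HTTP headers. */
    static String createComputeRequest(String host, Map<String, String> attributes) {
        StringBuilder request = new StringBuilder();
        request.append("POST /compute/ HTTP/1.1\r\n");
        request.append("Host: ").append(host).append("\r\n");
        request.append("Content-Type: text/occi\r\n");
        // The Category header names the kind of resource being created.
        request.append("Category: compute; ")
               .append("scheme=\"http://schemas.ogf.org/occi/infrastructure#\"; ")
               .append("class=\"kind\"\r\n");
        // Each attribute travels in its own X-OCCI-Attribute header.
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            request.append("X-OCCI-Attribute: ")
                   .append(attribute.getKey()).append('=').append(attribute.getValue())
                   .append("\r\n");
        }
        request.append("\r\n"); // a blank line ends the header section
        return request.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("occi.compute.cores", "2");
        attributes.put("occi.compute.memory", "2.0");
        // occi.example.org is a hypothetical provider endpoint.
        System.out.print(createComputeRequest("occi.example.org", attributes));
    }
}
```

The point of the sketch is that the resource type and its attributes are plain, self-describing HTTP data, which is what makes the interaction model easy to extend.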

The Open Services Gateway initiative (OSGi) is a specification for managing complex Java applications as sets of interconnected modules; examples of such applications are the integrated development environment Eclipse and the application server WebSphere. It has recently emerged as the de facto standard for this style of programming. Its enormous success is due to the fact that it eases the dynamic management of the various components and automatically resolves code dependencies, even at runtime. Furthermore, it allows code from different authors to be composed seamlessly and multiple versions of the same code to be used in parallel.
These features are well suited to the Cloud idea of dynamically allocating and de-allocating resources in a network infrastructure; at the PaaS level these resources are services and applications.
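The two properties mentioned here, components that come and go at runtime and several versions of the same code living side by side, can be illustrated with a toy registry in plain Java. This is deliberately not the OSGi API (a real bundle would register services through its `BundleContext` and declare version ranges in its manifest); the class name, the integer versions and the `greeter` service are all assumptions made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.Supplier;

/** Toy registry: several versions of one service coexist and can come and go at runtime. */
final class VersionedRegistrySketch {

    // service name -> (version -> provider); the TreeMap keeps versions ordered.
    private final Map<String, TreeMap<Integer, Supplier<String>>> services = new HashMap<>();

    synchronized void register(String name, int version, Supplier<String> provider) {
        services.computeIfAbsent(name, key -> new TreeMap<>()).put(version, provider);
    }

    synchronized void unregister(String name, int version) {
        TreeMap<Integer, Supplier<String>> versions = services.get(name);
        if (versions != null) {
            versions.remove(version);
        }
    }

    /** Returns the highest registered version within [minVersion, maxVersion], if any. */
    synchronized Optional<Supplier<String>> lookup(String name, int minVersion, int maxVersion) {
        TreeMap<Integer, Supplier<String>> versions = services.get(name);
        if (versions == null) {
            return Optional.empty();
        }
        Map.Entry<Integer, Supplier<String>> best = versions.floorEntry(maxVersion);
        if (best == null || best.getKey() < minVersion) {
            return Optional.empty();
        }
        return Optional.of(best.getValue());
    }

    public static void main(String[] args) {
        VersionedRegistrySketch registry = new VersionedRegistrySketch();
        registry.register("greeter", 1, () -> "hello from v1");
        registry.register("greeter", 2, () -> "hello from v2");

        // An older client accepts only version 1; a newer one accepts 1 or 2.
        System.out.println(registry.lookup("greeter", 1, 1).map(Supplier::get).orElse("unavailable"));
        System.out.println(registry.lookup("greeter", 1, 2).map(Supplier::get).orElse("unavailable"));

        // Version 2 is withdrawn at runtime; the newer client falls back to version 1.
        registry.unregister("greeter", 2);
        System.out.println(registry.lookup("greeter", 1, 2).map(Supplier::get).orElse("unavailable"));
    }
}
```

Clients ask for a range rather than a fixed version, so a provider can be withdrawn or upgraded without the clients being rebuilt, which is the behaviour that maps so naturally onto allocating and de-allocating services in a Cloud.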

A huge obstacle we faced comes from the nature of OSGi itself, since all the operations we just described are well supported only when the components and the locations of the code are well known.
To overcome the problem, multiple specifications have been proposed in the past, aimed at bringing the OSGi ideas to a distributed environment. They all focused, however, on remote communication between the components and kept the assumption that resources are identifiable and can be located at any time, without actually providing a precise way to do so.

One of the principal points of our work was thus to create the tools needed to effectively bring the OSGi specification to the Cloud, working towards a solution that could seamlessly perform service location and dynamic binding.
We then designed a hybrid P2P-centralized communication system, which gives us good fault tolerance while still retaining a high degree of control, on top of which to implement a protocol to (i) handle the communication between the various platform components, (ii) propagate the information about active services and (iii) dynamically locate them.
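As a rough illustration of the hybrid idea, and not of Grog's actual protocol, the sketch below models nodes that announce an activated service both to their direct peers and to a central directory, and that locate a service by checking what they learned from peers before asking the directory. The class names, the lookup order, the `billing` service and the node addresses are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model of service propagation and location over a hybrid P2P-centralized network. */
final class HybridLookupSketch {

    /** Central directory: the controlled, platform-wide view of active services. */
    static final class Directory {
        private final Map<String, String> endpoints = new HashMap<>();
        private boolean reachable = true;

        void setReachable(boolean reachable) { this.reachable = reachable; }

        void publish(String service, String endpoint) {
            if (reachable) {
                endpoints.put(service, endpoint);
            }
        }

        Optional<String> find(String service) {
            return reachable ? Optional.ofNullable(endpoints.get(service)) : Optional.empty();
        }
    }

    /** A platform node that also talks directly to its peers. */
    static final class Node {
        private final String address;
        private final Directory directory;
        private final List<Node> peers = new ArrayList<>();
        private final Map<String, String> known = new HashMap<>(); // learned locally or from peers

        Node(String address, Directory directory) {
            this.address = address;
            this.directory = directory;
        }

        void connect(Node peer) {
            peers.add(peer);
            peer.peers.add(this);
        }

        /** Activates a service here and propagates the news to peers and to the directory. */
        void activate(String service) {
            known.put(service, address);
            for (Node peer : peers) {
                peer.known.put(service, address);
            }
            directory.publish(service, address);
        }

        /** Locates a service: peer-to-peer knowledge first, then the central directory. */
        Optional<String> locate(String service) {
            String endpoint = known.get(service);
            return endpoint != null ? Optional.of(endpoint) : directory.find(service);
        }
    }

    public static void main(String[] args) {
        Directory directory = new Directory();
        Node a = new Node("node-a:9000", directory);
        Node b = new Node("node-b:9000", directory);
        Node c = new Node("node-c:9000", directory);
        a.connect(b); // c is not a direct peer of a

        a.activate("billing");
        System.out.println("b finds billing via its peer: " + b.locate("billing").isPresent());
        System.out.println("c finds billing via the directory: " + c.locate("billing").isPresent());

        directory.setReachable(false); // simulate a failure of the central component
        System.out.println("b still finds billing: " + b.locate("billing").isPresent());
    }
}
```

In this toy model the directory gives the platform a single place from which to observe and control what is running, while the peer links keep already-connected nodes working if the directory becomes unreachable.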



The image (by Alessio Deidda) above represents a schematic of the platform's communication architecture, showing how the P2P communication can coexist with the centralized one.
 
To introduce the contents of the thesis: in the first chapter we will analyse the state of the art regarding the Cloud and the three Service Delivery Platform types mentioned before, introducing some of the currently available offerings before giving a brief description of the features of OCCI and OSGi.

The second chapter describes our architecture and system, Grog, introducing the various components before analysing them in more detail. It also presents some use cases which we identified during our initial analysis.

In the third chapter we will further descend into detail, describing the tools and technologies used and the choices made during the prototype's implementation process.

Conclusions, future developments and a real-life use case are all covered in the fourth chapter.

Finally the appendix contains a series of code examples illustrating the multiple ways to use OSGi which we analysed during the development of our architecture.

You can download the full thesis here and the implementation here. Source code is available on GitHub.

NOTE: The installation of dependencies and components and the configuration of the platform prototype are described here: Grog setup and configuration

8 comments:

  1. Please would it be possible to make the thesis and implementation available on a site which does not require a Windows live account to access? Dropbox is a good, free option for example.

    Replies
    1. Sorry, my bad, I used the wrong link. It should now be accessible to everyone, without requiring a Live ID login. Dropbox is indeed good, but it offers me only 2GB of free storage space, while SkyDrive gives me 25GB.

  2. Replies
    1. Thank you for your interest, have a nice day

  3. Stefano.

    Nice Thesis!

    Surprised you didn't mention us (Paremus). As many in the industry are aware, Paremus has had a model-driven, OSGi-based PaaS since 2006. The implementation is evolving, and we are starting to be copied, but the basic architectural principles remain the same, and unequalled in any of the solutions you mention (i.e. the combination of Recovery Oriented Compute, Model Driven and use of Stigmergy).

    We've also always been OBR-centric, believing that dynamic assembly of the software by each endpoint is the correct approach.

    With respect to protocols, you didn't mention DDS. Again my opinion, but I think this is a great foundation for PaaS / data-centric messaging.


    Best Wishes

    Richard
    Paremus CEO.

  4. Greetings and thanks for your reply.

    Well, to tell the truth, you were not cited because we didn't even know you existed until now; the same goes for the DDS protocol you mentioned. That is unfortunate, as I'm sure we could have drawn more inspiration from your work.

    You see, our project did not exactly start as a well-defined idea: we were just asked to "design an SDP for the Public Administration" and then given absolute freedom. But at the time we didn't even know what an SDP was, let alone IaaS, PaaS or SaaS.

    Our main sources of inspiration were what we found Googling around, such as Cloud Foundry, Google AppEngine, Heroku and some Eclipse OSGi Webinars (but only after we found out that OSGi was at the base of the Eclipse platform).

    The choice to build a PaaS came later in the process and almost naturally, as we decided that both the IaaS and SaaS approaches would not satisfy our requirements.

    About OBR, we personally did not like it very much, but we did not have enough time to thoroughly test all the ideas that came to mind, so we might well be missing something there.

    Best regards

    Replies
    1. Hi Stefano,

      Great work!

      Just one remark: reading the "Available PaaS Offering" chapter, I don't really get why you decided to write your solution from scratch rather than extending Cloud Foundry with an OSGi DEA.

      Best regards,

      Florian CHAZAL

    2. Cheers,

      that's a fair question and, to be honest, it is the solution I would have preferred. One requirement, though, was that the whole platform had to be coded in Java. Since Cloud Foundry is based on Ruby, we had to (sadly) ditch it, but we took great inspiration from its architecture and inner workings.


With great power comes great responsibility.
