A Peer to Peer Approach for a Grid of Instruments

Francesco Lelli1  and Pietro Molini1 

1 Italian National Institute of Nuclear Physics (INFN) Viale dell’Universita, 2 35020 Legnaro, Padova, Italy.  E-mail: francesco.lelli _AT_lnl.infn.it, pietro.molini _AT_lnl.infn.it

1.1 Abstract

Resource location (or discovery) is a fundamental service for resource-sharing environments such as grids of instruments. The general need is for an efficient service that, given some desired resource attributes, returns the locations of matching resources. Different approaches have been explored to address this problem in the diverse scenarios in which instruments are deployed, especially those characterized by a huge number of sensors or probes whose distribution and interconnections change very dynamically.

A peer-to-peer (P2P) approach has been proposed to cover these aspects and needs. The next sections outline concrete motivations that justify the effort of implementing such a P2P approach for a Grid of Instruments. They also present the overall design of this novel discovery implementation and introduce the concept of the “Tiny Instrument Element”.

1.2 Introduction

We define the term ‘Instrument Element’ (IE) [18], [19], [20] as a set of services that provides the needed interface and implementation that enables the remote control and monitoring of physical instruments. The IE needs to be really flexible; in the simplest scenario this abstraction can represent a simple geospatial sensor or an FPGA card that performs a specific function, while in a more complex network of sensors it can be used as a bridge between the sensors and the computational grid. Finally, the IE can be part of the device instrumentation, permitting the organization of the instrument into a network that allows grid interaction.

The term ‘Instrument’ describes a very heterogeneous category of devices; see [1], [2], [3], [4], [5], [6], [7] and [8] for a broad set of examples. The complexity of the information needed to control a device is strongly instrument dependent, and ranges from a practically fixed configuration to the configuration and orchestration of thousands of nodes [1], [2], [6], [8]. In any case, external users need a uniform way and a single point of access to the information related to a particular instrument orchestration.

If a system, such as a sensor network, allows a subset of its devices to be used, the service providing this single point of access also manages that partitioning. In addition, the service can act as a super peer in dynamic instrument networks, where simple devices can appear and disappear. Finally, the service can support reservation of the system, allowing authorized users to reserve resources.

Figure 1 classifies the information that the instrument orchestration has to maintain for every instrument. From a semantic point of view, we can divide the information into three different categories:

 

  • Information such as the physical location of configuration files or the driver type, which is internal to the IE and is needed to ensure correct instrument instantiation.
  • Information that can be modified at runtime by the users and that could change the global behaviour. The numbers and types of instruments that should be used to perform particular aggregate actions are typical of the information that belongs to this category.
  • Information that identifies the instrument topology, i.e., both potential and actually performed intra-instrument connections.

The same information could be categorised from the dynamicity point of view: (a) Static information refers to data that will be defined at deployment time and will never change in the future. As its opposite, (b) Dynamic information consists of data that can change in an automatic way, without user intervention. In the middle, we have (c) Low Dynamic information, which corresponds, for example, to adjustments performed by the users at runtime.

Figure 1 - Classification of the information needed by a set of instruments
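
The two classifications above can be combined in a small data model. The following is only a minimal sketch: the class names, the field names and the assignment of each example item to a dynamicity level are our own assumptions for illustration and are not part of the IE implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Semantic categories of instrument information."""
    INTERNAL = "internal to the IE"
    RUNTIME = "modifiable by users at runtime"
    TOPOLOGY = "instrument topology"


class Dynamicity(Enum):
    """How often a piece of information changes."""
    STATIC = "static"            # fixed at deployment time
    LOW_DYNAMIC = "low dynamic"  # adjusted by users at runtime
    DYNAMIC = "dynamic"          # changes without user intervention


@dataclass(frozen=True)
class InfoItem:
    name: str
    category: Category
    dynamicity: Dynamicity


# Illustrative entries only; these assignments are assumptions.
ITEMS = [
    InfoItem("driver_type", Category.INTERNAL, Dynamicity.STATIC),
    InfoItem("instruments_per_action", Category.RUNTIME, Dynamicity.LOW_DYNAMIC),
    InfoItem("active_connections", Category.TOPOLOGY, Dynamicity.DYNAMIC),
]


def select(items, dynamicity):
    """Return the names of the items with the given dynamicity."""
    return [item.name for item in items if item.dynamicity is dynamicity]
```

A query such as `select(ITEMS, Dynamicity.STATIC)` then isolates the part of the configuration that is fixed at deployment time.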

 

We can note that most of the information is close to the instrument and belongs to a particular instance. In addition, if a set of instruments is considered as one single device, static information introduces rigidity into the system, while low dynamic information introduces complexity in the global system usage. A complex static system configuration, which is typically simpler to implement, could be the solution in use cases where all instruments need to be in a consistent state in order to produce a coherent output [1]. Unfortunately, this solution considerably aggravates the configuration problem, since it provides a fixed structure that must be manually reconfigured by the users when a subsystem fails. In highly dynamic systems like those described in [2] and [6], this solution is simply unusable, because the introduction of a new node in the system triggers a total reconfiguration.

Even though we believe that a uniform and coherent set of services can facilitate their aggregation and interoperability across different organisations, the approach to the implementation can differ considerably. In [1], for instance, the CMS data acquisition phase can start only if all the instruments of the system are in the same state, while in [2], [5] and [6] instruments can join the system dynamically. In the dynamic cases just mentioned we cannot assume that the index of all instruments, which is the base abstraction of the Resource Service, is static. The IM behaviour needs to adapt itself dynamically to the existing instrument structure. This functionality is typical of P2P [9] systems, in which the network can dynamically adapt to peer changes. At the same time, a single and reliable entry point to this information is mandatory if we want to provide a set of instruments as a service for the computational grid. Considering the categorisation of the information presented before, we can note that static information and some of the low dynamic information belong to a particular instrument instance, while instrument topology information must be shared between devices in order to avoid collisions and thus organize the instruments properly. In a typical P2P-based discovery, a peer announces itself to the network, giving other peers the possibility to query it and exchange data.

In this scenario, instruments can dynamically engage other existing instruments, performing a system lookup and allowing the dynamic determination of the peers’ topology.

This approach distributes the information to the instruments, thus breaking the global configuration into several parts that dynamically change during the system usage.

Figure 2 explains the dynamic joining of an instrument into the system. After bootstrapping, the instrument sends a discovery request to other peers, and Relays forward this request to devices that cannot be reached directly. Instruments reply to this request by announcing their presence in the network, and the new instrument then queries them in order to discover what type of device they are. Once the instrument finds the needed resources, it engages and uses them.

If an instrument disappears from the system, other devices can repeat the discovery/information enquiry phases in order to try and find the needed resources. In addition, this operation can also be repeated in case of failure in order to detect the recovery of the needed subcomponent, allowing an autonomic behaviour.
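
The discovery, enquiry and engagement phases described above can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the actual IE code: the `Peer` class, the `discover` and `engage` functions and the device names are hypothetical, and real message passing is replaced by direct object references.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    device_type: str
    neighbours: list = field(default_factory=list)  # directly reachable peers
    is_relay: bool = False
    online: bool = True


def discover(start):
    """Discovery phase: collect every online peer that answers the request.

    Relays forward the request to peers the requester cannot reach directly.
    """
    found = {}
    visited = {start.name}
    frontier = list(start.neighbours)
    while frontier:
        peer = frontier.pop()
        if peer.name in visited or not peer.online:
            continue
        visited.add(peer.name)
        found[peer.name] = peer          # the peer announces its presence
        if peer.is_relay:
            frontier.extend(peer.neighbours)
    return found


def engage(start, needed_type):
    """Enquiry phase: ask each discovered peer for its type and pick a match."""
    for peer in sorted(discover(start).values(), key=lambda p: p.name):
        if peer.device_type == needed_type:
            return peer
    return None


# Hypothetical network: a new instrument reaches two ADC devices through a relay.
relay = Peer("relay", "relay", is_relay=True)
adc_1 = Peer("adc-1", "adc")
adc_2 = Peer("adc-2", "adc")
relay.neighbours = [adc_1, adc_2]
new_instrument = Peer("new", "trigger", neighbours=[relay])

first_choice = engage(new_instrument, "adc")
adc_1.online = False                     # the engaged device disappears
second_choice = engage(new_instrument, "adc")
```

Repeating `engage` after `adc-1` goes offline selects `adc-2`, which mirrors the recovery behaviour described in the previous paragraph.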

In this scenario, the information is no longer centralized but is distributed in the system. This approach therefore complicates monitoring functionality, which also needs a discovery system in order to detect the actual instrument topology. In other words, an Index Service must periodically repeat the instrument discovery and information enquiry phases, just as the other devices in the system do, in order to detect the status of the entire system. Alternatively (or in parallel), instruments can periodically send an advertisement message in order to inform interested peers of their status.
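
The advertisement-based alternative can be sketched as an index that forgets peers whose advertisements stop arriving. The expiry interval (`ttl_seconds`), the class name and the method names are assumptions introduced for this example; timestamps are passed explicitly so that the behaviour is easy to follow.

```python
class IndexService:
    """Keeps the last advertisement of each peer and expires stale entries."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}              # peer name -> (timestamp, status)

    def on_advertisement(self, peer_name, status, now):
        """Record a periodic advertisement message sent by an instrument."""
        self.last_seen[peer_name] = (now, status)

    def online_peers(self, now):
        """Peers whose latest advertisement is not older than the TTL."""
        return sorted(
            name
            for name, (seen, _status) in self.last_seen.items()
            if now - seen <= self.ttl
        )


index = IndexService(ttl_seconds=30)
index.on_advertisement("ie-a", "running", now=0)
index.on_advertisement("ie-b", "running", now=20)
```

With these two advertisements, both peers are reported online at time 25, while at time 40 only `ie-b` remains, because `ie-a` has been silent for longer than the assumed 30 second interval.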

Figure 2 - Instrument discovery interaction diagram

 

The rest of this document is organized as follows: Section 1.3 formalizes the above mentioned requirements. Section 1.4 gives an overview of the implemented scenario while section 1.5 provides a detailed description. Finally, in section 1.6 we draw some conclusions.

1.3 Use Cases for a P2P Approach

Recently, the feasibility of introducing a P2P approach into complex architectures has attracted attention in several scientific communities [10], [11], [12]. Resource discovery is a key issue for service-oriented architectures, in which applications are composed of hardware and software resources that need to be located. Classical approaches to resource discovery are either centralized or hierarchical, and become inefficient as the scale of the systems rapidly increases. The Peer-to-Peer (P2P) paradigm, on the other hand, has emerged as a successful model for achieving scalability in distributed systems.

The scenario which best fits the intrinsic features of a P2P solution is characterized by Instruments that:

·         are large in number

·         have a highly dynamic behaviour: for instance they often go on and off or can appear and disappear in a working net of sensors or probes

·         are widely distributed

·         operate in low resources / embedded systems: for example FPGA based instrumentation

Some interesting examples of widely dispersed instrumentation are:

·         power grids

·         territory monitoring: to prevent geo-hazardous situations

·         sea monitoring: for tsunami surveillance for example

·         distributed laboratories

·         transportation remote control and monitoring

·         sensor networks

Moreover, in the cases described above one would prefer to avoid, unless strictly necessary, the installation of complex backend repositories with all the related overhead.

1.4 Implemented Scenario for P2P IEs orchestration

An efficient resource discovery mechanism is one of the fundamental requirements for service-oriented systems, as it aids resource management and the scheduling of applications. Resource discovery involves searching for the resource types that match the requirements of the user's application. Various solutions to resource discovery have been suggested, including the centralised and the hierarchical information server approaches. However, both of these approaches have serious limitations with regard to scalability, fault tolerance and network congestion. To overcome these limitations, indexing resource information using a decentralised network model, such as Peer-to-Peer (P2P), has been actively proposed in the past few years.

The infrastructure layout of a set of IEs coordinated in a P2P fashion is illustrated in fig. 3 where all the involved IEs are considered “peers”, even if they are slightly specialized.

The following categories of elements can be distinguished:

-          simple node: an IE responsible for a specific physical instrumentation. It is deployed on a remote machine and represents a normal peripheral node in a distributed network of peers

-          network core node: an IE itself, controllable via the service interface, which provides some additional capabilities for routing the information flow amongst the other nodes. The network core nodes are connected to each other, and their number can be tuned according to the P2P discovery data traffic

-          information provider node: also an actual IE, which provides index services. The information provider IEs announce themselves like other edge IEs, declaring that they are special nodes of the “monitoring” type. Their number can be chosen to match the volume of information requests coming from interested users.

Figure 3 - Discovery scalability scenario.

Many IEs can be deployed on different and heterogeneous machines in a real distributed environment. The deployment is completely dynamic. In other words, Instrument Element nodes, network core machines and network information providers can dynamically join and leave the changing network of resources, without affecting other instruments or requiring additional configuration effort. Following the autonomic philosophy, the IE network can self-configure and self-optimize. Once a new instrument is deployed, network information providers are able to detect it within a reasonable time interval.

Figure 4 depicts the process by which a new IE manifests its presence to the peer network and subsequently the community of peers becomes aware of the new member appearance.

When an IE wishes to announce its identity to a world of similar instruments, it has no detailed knowledge of the other peers. It can contact an a priori known network super-peer (node 0) to retrieve more information. The super-peer initially provides the new IE with a list of network peers that can be queried. A connection can then be established between the edge node and one of these network nodes (node B, for example), depending on the load balancing policy adopted. Finally, the new peer may ask for the location of available “monitor” or “information provider” entities and then exchange data with them.

Figure 4 - Discovery process for a new node.
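
The join sequence of Figure 4 can be summarised in a few lines of code. The sketch below is illustrative only: the class names, the `join` function and the "least loaded node" policy are assumptions, since the text does not prescribe a specific load balancing policy.

```python
class CoreNode:
    """Network core node that routes traffic for the edge peers attached to it."""

    def __init__(self, name, providers):
        self.name = name
        self.providers = providers       # known information provider nodes
        self.edge_peers = []

    def attach(self, peer_name):
        self.edge_peers.append(peer_name)


class SuperPeer:
    """A priori known entry point (node 0) that hands out the core node list."""

    def __init__(self, core_nodes):
        self.core_nodes = core_nodes

    def list_core_nodes(self):
        return list(self.core_nodes)


def join(super_peer, peer_name):
    """Contact node 0, pick a core node, then ask it for the providers."""
    candidates = super_peer.list_core_nodes()
    # Assumed policy: choose the least loaded core node, ties broken by name.
    chosen = min(candidates, key=lambda node: (len(node.edge_peers), node.name))
    chosen.attach(peer_name)
    return chosen, list(chosen.providers)


providers = ["monitor-1"]
node_a = CoreNode("A", providers)
node_b = CoreNode("B", providers)
node_a.attach("ie-1")                    # node A already serves one edge peer
node_zero = SuperPeer([node_a, node_b])
core, monitors = join(node_zero, "ie-new")
```

In this example the new IE is attached to node B, which is less loaded than node A, and learns about the single provider `monitor-1`.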

1.5 Details on our P2P implementation

1.5.1 Tiny IE

The goal pursued in setting up the target scenario for dynamic discovery, as discussed in the previous section, was to verify its feasibility and to evaluate the impact of a large number of instruments on the global network.

For convenience, it was decided to use “light” or “tiny” IE installations in place of the official middleware. Installing all the core IE components requires some effort on the part of the user: for instance, it is necessary to install and populate a database for the RS and to follow a non-trivial configuration procedure. The focus of this activity is to have a reasonably quick and simple way to deploy many IEs all over the world that proficiently provide basic functionalities, such as simple data acquisition and effective communication with other peers. In other words, the tiny IE simplifies the usage of the “standard” IE.

For these reasons a “tiny” software prototype of the IE has been used, characterised by one relevant constraint: it exposes the same IE Facade web service interface as a reference IE. This means that a generic client, such as the VCR, is not able to distinguish between a real IE and a tiny P2P IE installation. A portal can control and monitor the instruments associated with a tiny IE through the same APIs used for the other pilot applications. In conclusion, a tiny IE exposes a standard behaviour and interface to the outside, but has a very simple functional logic that reduces structural dependencies.

Each tiny IE deployed at the various sites hosts only one “probe” (one IM), capable of gathering, through Java APIs, some information about the environment of the machine on which it is running. The collected information can then be conveyed to dedicated information provider nodes, which enable the aggregated data coming from different remote sources to be displayed and consulted.

An example list of parameters is summarised below:

User CPU Usage: 74.85702308590592
Process Name: InstrumentKeeper at: sadgw.lnl.infn.it:2002
Service End Point: http://sadgw.lnl.infn.it:2002/InstrumentElementKeeper/services/IEService
JVM Name: 1.5.0_09-b01
Availlable Processors: 2
Message:
System CPU Usage: 0.0
Memory Used: 28715744
Peak of Threads: 317
Description: Legnaro,IT
Number of Threads: 165
Process Up Time: 2246914248
Operative System Version: 2.4.21-15.0.3.EL.cernsmp
Number Of Heartbit: 124395
Total Memory Used: 64747896
Operative System Name: Linux
JVM Vendor: Sun Microsystems Inc.
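
An information provider that receives such a report as plain text could turn it into a dictionary before aggregating it. The following sketch assumes the "name: value" layout shown above; the `parse_report` function is hypothetical and only a subset of the parameters is reproduced, with the names kept exactly as the probe emits them.

```python
REPORT = """\
User CPU Usage: 74.85702308590592
Process Name: InstrumentKeeper at: sadgw.lnl.infn.it:2002
Service End Point: http://sadgw.lnl.infn.it:2002/InstrumentElementKeeper/services/IEService
Availlable Processors: 2
Message:
Number of Threads: 165
"""


def parse_report(text):
    """Split each line at the first colon into a parameter name and a value."""
    report = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:                    # skip lines without a colon
            report[key.strip()] = value.strip()
    return report


report = parse_report(REPORT)
thread_count = int(report["Number of Threads"])
```

Splitting only at the first colon keeps values that themselves contain colons, such as the service end point URL, intact.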

 

To accelerate and simplify the installation procedure, a “smart” deployment of the tiny IE via Java Web Start technology [14] has been made available. A requirement of the scenario explored is that every test machine has a public IP address, so that the IE Facade (WSDL) can be accessed from a remote site. The tiny IE is distributed in two different ways:

1) Web Start Application (click a web link and the IE is installed on your local machine)

2) WAR Based Deployment (copy a file into your $CATALINA_HOME/webapps folder and the IE is ready to be used)

Each IE is compatible with Linux, Windows XP, Windows Vista and Mac OS X. It requires Java 1.5, no more than 30 MB of RAM and less than 2-3% of the CPU on a 1 GHz machine. An IE also consumes a small amount of bandwidth: less than 0.1 Kbyte/s. CPU and bandwidth usage can be tuned according to the capabilities of each site.

In conclusion the light IE solution designed allows:

1) The easy and stand alone installation of the software

2) The dynamic discovery of new instruments

3) The capability to monitor and control multiple and independent instruments from one centralized location.

4) The capability to access a particular instrument from the VCR

1.5.2 Node Geo-location

Some of the information provider nodes have been designed to geo-locate other nodes through the Google Maps APIs [15]. The possibility to constantly monitor the IEs present on the territory (as shown in fig. 5) constitutes a highly interactive way to visually detect the appearance or disappearance of one or more elements.

Figure 5 - Node geo-location provided by information provider peers.

 

1.5.3 JXTA technology in P2P evaluation

Features of the P2P model, such as scalability and volatility tolerance, have motivated its use in distributed systems. Several generic P2P libraries have been proposed for building distributed applications; for the instrument discovery evaluation scenario, JXTA [16], [17] technology has been used. JXTA is an open-source initiative started by Sun Microsystems in order to develop a set of standard open protocols for P2P network applications. It is one of the most advanced frameworks currently available for building services and applications based on the P2P model. In its 2.0 version, JXTA consists of a specification of six language- and platform-independent, XML-based protocols that provide basic services common to most P2P applications, such as peer group organization, resource discovery, and inter-peer communication. The underlying mechanism used by JXTA to manage its overlay and propagate messages is the rendezvous protocol, and a specific discovery protocol is used to find resources inside a JXTA network.

1.6 Conclusions

The discovery of instruments is a challenging problem when the number of elements is high.

Two cases can be distinguished:

Quasi static situations

·         The number of IEs is well defined and each single IE is quite complex, with good hardware support

·         A central registry based discovery mechanism can be used, as outlined in fig. 6

·         The information about instruments collected in the BDII [13] adheres to the GLUE schema, and can be used for matchmaking queries.

Figure 6 – BDII in a quasi static scenario.

Dynamic situations

·         The number of IEs can change very quickly.

·         Usually Instruments are very simple devices, often with poor hardware support.

·         Discovery is mainly used to disseminate knowledge of which IEs are online.

·         In this context a new approach, alternative to the classical solutions, has been evaluated that exploits Peer-to-Peer (P2P) protocol capabilities.

This discovery system was not set up as a replacement for a typical information system. In the reference scenario we mainly aim to evaluate an inter-instrument discovery system in which the peers are only instruments, without considering the interactions with other classical grid entities.

All things considered, this kind of approach nevertheless offers some clear advantages:

1) Instruments, network nodes and network information providers can be dynamically added to or removed from the infrastructure at zero configuration cost.

2) The IE network is tuneable. In other words, if you encounter performance problems, you can simply add network core machines and the global load will be redistributed.

3) The scalability should exceed that of a BDII-based system.

4) In a distributed system relying only on instruments, you can avoid the installation of one or more complex indexing services.

5) This discovery system can be embedded in IEs that run on FPGA cards rather than on standard computers.

6) In any case, a particular version of the IE node could interact with centralized index services in order to publish the information of a group of light IEs.

 

Bibliography

[1] Cittolin, S., Varella, W.S.J., Racz, A., Della Negra, M. and Herve, A. (2002) CMS TDR 6.2 The TriDAS Data Acquisition Project and High-level Trigger CERN/LHCC, December.

[2] Irving, M., Taylor, G. and Hobson, P. (2004) ‘Plug into grid computing’, IEEE Power & Energy Magazine, March–April, pp.40–44.

[3] Siaterlis, C., Lenis, A., Moralis, A., Roris, P., Koutepas, G., Androulidakis, G., Chatzigiannakis, V., et al. (2005) ‘Distributed network monitoring and anomaly detection as a grid application’, HP Openview University Association Plenary Workshop (HP-OVUA) Porto, July.

[4] Tham, C.K. and Buyya, R. (2005) ‘SensorGrid: integrating sensor networks and grid computing’, Invited Paper in CSI Communications, Special Issue on Grid Computing, Computer Society of India, July.

[5] McMullen, D.F., Devadithya, T. and Chiu, K. (2005) ‘Integrating instruments and sensors into the grid with CIMA web services’, Proceedings of the Third APAC Conference on Advanced Computing, Grid Applications and e-Research (APAC05), September.

[6] OGC Sensor Web Enablement, www.opengeospatial.org/functional/page=swe

[7] Grid Enabled Remote Instrumentation with Distributed Control and Computation (GridCC) Project Annex I, http://www.gridcc.org/getfile.php?id=1436e, 2005.

[8] AGATA Advanced Gamma Tracking Array design specification, http://agata.pd.infn.it/Agata-proposal.pdf.

[9] Taylor, J. (2004) From P2P to Web Services and Grids, Peers in a Client/Server World, Springer, October.

[10] A. Iamnitchi, I. Foster, A Peer-To-Peer approach to resource location in grid environments, http://people.cs.uchicago.edu/~anda/papers/iamnitchi-bookch.pdf

[11] CoreGRID Technical Report Number TR-0028, March 17, 2006, http://www.coregrid.net/mambo/images/stories/TechnicalReports/tr-0028.pdf

[12] R. Ranjan, A. H. and R. Buyya, A Study on Peer-to-Peer Based Discovery of Grid Resource Information, December 1, 2006, http://gridbus.csse.unimelb.edu.au/reports/pgrid.pdf

[13] BDII, http://agrid.uibk.ac.at/wpa2/bdii.html

[14] Java Web Start Technology, http://java.sun.com/products/javawebstart

[15] Google Maps API, http://www.google.com/apis/maps

[16] B. Traversat, A. Arora, M. Abdelaziz, M. Duigou, C. Haywood, J. C. Hugly, E. Pouyoul, B. Yeager,  Project JXTA 2.0 Super-Peer Virtual Network,  May 2003, http://www.jxta.org/project/www/docs/JXTA2.0protocols1.pdf

[17] G. Antoniu, L. Cudennec, M. Duigou, M. Jan, Performance scalability of the JXTA P2P framework, Rapport de recherche n° 1, December 2006, http://hal.inria.fr/inria-00119916/en

[18] F. Lelli, E. Frizziero, M. Gulmini, G. Maron, S. Orlando, A. Petrucci and S. Squizzato. The many faces of the integration of instruments and the grid International Journal of Web and Grid Services 2007 - Vol. 3, No.3  pp. 239 - 266

[19] E. Frizziero, M. Gulmini, F. Lelli, G. Maron, A. Petrucci, S. Squizzato, S. Traldi and N. Toniolo, The GRIDCC Instrument Element: from the Prototype to Production Environment, in proc of INGRID 07-instrumenting the Grid, 2nd international workshop on distributed cooperative laboratories, Porto Fino, Italy, April 2007

[20] E. Frizziero, M. Gulmini, F. Lelli, G. Maron, A. Oh, S. Orlando, A. Petrucci, S. Squizzato and S. Traldi. Instrument Element: A new Grid component that Enables the Control of Remote Instrumentation. In proc of International Conference on Cluster Computing and Grid (CCGrid), Singapore May 2006.