Browsing by Author "Markatchev, Nayden"
Now showing 1 - 7 of 7
Item Open Access
Data Integration With OGSA-DAI (2007-04-30)
Gaurav, Abhishek; Markatchev, Nayden; Rizk, Philip; Simmonds, Rob
Researchers in the physical sciences continue to generate increasing amounts of data from simulations, sensors, and cataloging efforts such as DNA mappings and climate records. This increase in data has led to the need for new data management tools that assist researchers in managing the volume of data entailed as well as distributing the data across disk volumes and administrative domains. Distributing the data facilitates greater collaboration between scientists and maximizes the data's value. These new data management requirements have led to the development of numerous data management systems. Two such systems are the Proactive Data Management System (PDMS) [4] and BioSimGrid [3]. BioSimGrid is a Data Grid project designed to distribute bio-molecular simulation results. PDMS is a data management tool developed by the University of Calgary Grid Research Centre (GRC) that facilitates management and movement of data using metadata rather than physical file locations. The proliferation of different data management systems leads to the need for an extensible framework that facilitates the integration of multiple data sources. OGSA-DAI [2] was designed to meet this goal. This document discusses the integration of services provided by PDMS or BioSimGrid with other data facilities such as databases using OGSA-DAI. The rest of this document is structured as follows. Sections 2 and 3 provide an overview of BioSimGrid and PDMS respectively. Section 4 discusses the architecture and limitations of the OGSA-DAI framework. The integration of BioSimGrid and PDMS is discussed in Sections 5 and 6.
Section 8 summarizes the document.

Item Open Access
Database Assessment for PDMS (2007-04-30)
Gaurav, Abhishek; Markatchev, Nayden; Rizk, Philip; Simmonds, Rob
This document describes the database issues related to the Proactive Data Management System (PDMS) [6]. PDMS uses databases for a number of tasks. It uses the Replica Location Service (RLS) [5], which is implemented using a database and can propagate updates between database servers. It also uses the Meta Catalogue Service (MCS) [7] schema in another database. PDMS also currently uses the Reliable File Transfer (RFT) [3] service, which also uses a database. Furthermore, PDMS uses a database to save its internal state so that the state is preserved between server restarts. The rest of this document is organized as follows: Section 2 gives an overview of relational databases. Section 3 describes database issues for the main PDMS server. Section 4 describes the RLS service, how it is used by PDMS, and its database dependencies. Section 5 does the same for the MCS service and Section 6 describes database issues with the RFT service. Section 7 concludes the document.

Item Open Access
End User Data Scheduling with PDMS (2007-04-30)
Gaurav, Abhishek; Markatchev, Nayden; Rizk, Philip; Simmonds, Rob
The Proactive Data Management System (PDMS) [13] is a tool designed to manage large datasets in grid environments. It can be used by other higher-level software tools in a secure fashion to manage files based on metadata describing the contents of the files. This document describes how PDMS can be used with tools that schedule the movement of data to and from the resources that use or produce the data. The rest of the document is laid out as follows. Section 2 describes the PDMS interfaces. Section 3 explains workflows and meta-schedulers and shows how PDMS can be integrated with workflow managers. Section 4 explains the Virtual Data Toolkit and the integration of PDMS with it.
Finally, the document is summarized in Section 5.

Item Open Access
Facebook Meets the Virtualized Enterprise (2008-07-15T22:16:05Z)
Simmonds, Robert; Curry, Roger; Kiddle, Cameron; Markatchev, Nayden; Tan, Tingxi; Arlitt, Martin; Walker, Bruce
“Web 2.0” and “cloud computing” are revolutionizing the way IT infrastructure is accessed and managed. Web 2.0 technologies such as blogs, wikis and social networking platforms provide Internet users with easier mechanisms to produce Web content and to interact with each other. Cloud computing technologies are aimed at running applications as services over the Internet on a scalable infrastructure. They enable businesses that do not have the capital or technical expertise to support their own infrastructure to get access to computing on demand. They could also be used by large businesses to manage their own infrastructure more efficiently as an “internal cloud”. In this paper we explore the advantages of using Web 2.0 and cloud computing technologies in an enterprise setting to provide employees with a comprehensive and transparent environment for utilizing applications. To demonstrate the effectiveness of this approach we have developed an environment that uses Facebook (a social networking platform) to provide access to the Fire Dynamics Simulator (a legacy application). The application is supported using Virtual Appliances that are hosted in an internal cloud computing infrastructure that adapts dynamically to user demands. Initial feedback suggests this approach provides a much better user experience than the traditional standalone use of the application. It also simplifies the management and increases the effective utilization of the underlying IT resources.

Item Open Access
Proactive Data Management System (PDMS) (2007-04-30)
Gaurav, Abhishek; Markatchev, Nayden; Rizk, Phillip; Simmonds, Rob
The Proactive Data Management System (PDMS) is designed to manage large datasets within grid environments.
PDMS is particularly useful in scientific environments where large amounts of data are often moved between computing and data archiving sites. PDMS facilitates management and movement of data using metadata, i.e., the data items are identified using their inherent properties and characteristics rather than the names of the files in which they are stored. The use of metadata abstracts away the physical location of a file, allowing PDMS to transparently manage replicas of a file. It is intended to be used by groups needing to manage large data sets across several locations. PDMS utilizes well-known Data Grid services. This allows it to interoperate with various workflow managers in use today. Management of data using metadata allows replication requests to PDMS to be specified in terms of metadata. For example, a replication request to PDMS can be "move the data generated by user A for project B within the last three months to site X". The metadata in this example are: (i) generated by user A, (ii) belonging to project B, and (iii) generated in the last three months. These metadata correspond to some logical files in which the data is stored. The logical files can be physically present at multiple locations, in which case PDMS locates all pieces of the dataset and initiates a transfer of all the pieces. Thus, with a given replication request, PDMS needs to perform two key tasks before initiating transfers: (i) use metadata to establish the logical names of the files that match the metadata query and (ii) select sources of replicas for those logical files not already at the destination. This management of replicas on the basis of metadata fills a gap in the previously available data management services. PDMS is designed to restrict access to authorized and authenticated users who have permission to use the system. Files are stored in logical groupings referred to as collections. PDMS also restricts users' access to specific collections.
These access restrictions resemble file ownership, with ownership of collections as well as read and write privileges. A more complete description of the access control can be found in [3]. PDMS maintains the consistency of the data for a collection. This currently includes not allowing the same physical file to be registered twice (as two separate logical files). PDMS also ensures that users conform to the schema they include in their registration request. PDMS could be configured to require each collection to conform to a specific schema. This is particularly useful in large groups that need to be sure all metadata contains certain information and want to prevent buggy registration processes from introducing inconsistent or incomplete metadata. Consistency requirements for the PDMS system are intended to be configurable, as consistency checking can be expensive.

Item Open Access
Supporting dynamic jobs in a grid environment (2009)
Markatchev, Nayden; Simmonds, Robert William John

Item Open Access
User Management Issues In PDMS (2007-04-30)
Gaurav, Abhishek; Markatchev, Nayden; Rizk, Philip; Simmonds, Rob
This document discusses issues related to managing user accounts for use with the Proactive Data Management System (PDMS). PDMS has a number of components, all of which need to support some type of authentication and authorization mechanism. The authorization could take place at the user level, where an individual user has to authenticate with a service or resource they are allowed to use, or at the system level, when services communicate directly with each other. In each case the type of authorization used will be indicated and its implementation described. The rest of this document is organized as follows. Section 2 provides a primer on authentication and authorization mechanisms in Grid environments. Section 3 provides a brief overview of the components of the PDMS system and their authentication and authorization mechanisms.
Sections 4 and 5 describe the authorization and authentication mechanisms used to restrict changes to file metadata and files, respectively. User account requirements are described in Section 6. Section 7 provides a summary of the document.
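
Illustrative note on the "Proactive Data Management System (PDMS)" entry above: its abstract describes two tasks performed before any transfer starts, namely resolving a metadata query to logical file names and then choosing a source replica for each logical file not already at the destination. The Python sketch below shows one way that flow might look. It is not PDMS code: the `LogicalFile` model, the function names, the site names, and the "first site alphabetically" source policy are all hypothetical, and the in-memory catalogue merely stands in for the metadata and replica services the abstract mentions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LogicalFile:
    """A logical file name plus its registered metadata (hypothetical model)."""
    name: str
    user: str
    project: str
    created: date


def match_logical_files(catalogue, user, project, since):
    """Task (i): use metadata to find the logical files a request refers to."""
    return [f for f in catalogue
            if f.user == user and f.project == project and f.created >= since]


def plan_transfers(matches, replicas, destination):
    """Task (ii): pick a source replica for each file not already at the destination."""
    plan = []
    for f in matches:
        sites = replicas.get(f.name, set())
        if destination in sites or not sites:
            continue  # already present, or no known replica to copy from
        source = sorted(sites)[0]  # placeholder policy: first site alphabetically
        plan.append((f.name, source, destination))
    return plan


# "Move the data generated by user A for project B within the last
# three months to site X", with made-up catalogue contents.
today = date(2007, 4, 30)
catalogue = [
    LogicalFile("run-001", "A", "B", date(2007, 3, 1)),
    LogicalFile("run-002", "A", "B", date(2006, 1, 1)),  # older than three months
    LogicalFile("run-003", "A", "B", date(2007, 4, 1)),
    LogicalFile("run-004", "C", "B", date(2007, 4, 1)),  # different user
]
replicas = {
    "run-001": {"siteY", "siteZ"},
    "run-003": {"siteX", "siteY"},  # already at the destination
}

matches = match_logical_files(catalogue, "A", "B", today - timedelta(days=90))
plan = plan_transfers(matches, replicas, "siteX")
print(plan)  # [('run-001', 'siteY', 'siteX')]
```

Keeping the metadata match separate from source selection mirrors the two-task split in the abstract; a real deployment would presumably replace the placeholder source policy with one based on replica cost or availability.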