Computing School 2009 CEA-EDF-INRIA

Europe/Paris
Christine Leroy (CEA)
Description
Emerging grid middleware standards: High Performance computing, High Throughput computing and large scale data analysis using grids.

Introduction
Grid computing is a concept for high throughput computing and data management which was born a few years ago. This still active field of development aims at associating and leveraging distributed computing facilities ranging from “low level” capacity (local facilities, regional facilities) to “high level” capacity (HPC national or international centres). To achieve this goal different middleware layers have been developed to allow transparent access to the computing and data management capacities, and today different kinds of applications have been enabled in such a context. This Summer School is an opportunity to highlight the state of the art and achievements in this area, and to envision current issues and forthcoming challenges.

Objectives
Through lectures from different stakeholders, hands-on sessions on the Grid5000, EGEE and DEISA infrastructures using various middleware, and a selection of focused conferences, this school will give attendees the opportunity to:
• understand the basics of grid computing,
• get acquainted with the different mechanisms of grids,
• be aware of challenges and scientific perspectives of grid usage,
• be able to use grid environments and/or to port their own application on a grid infrastructure.

Target audiences
This school is designed for computational scientists and end users from any application discipline as well as for computer scientists in the field of distributed systems and will be built on this diversity. Some background in programming and computer science is the only prerequisite. Grid newcomers and beginners are welcome, but people already involved in grid projects should also benefit from this school.
    • 10:00 10:30
      Welcome
    • 10:30 11:30
      Lecture 1.1: Introduction

      - History of distributed computing and introduction to grid computing
      - HPC, grids and clusters
      - Different types of grids: from P2P to the interconnection of HPC centres to Clouds

    • 11:30 12:30
      Lecture 3.1: Service oriented Grid

      Service oriented Grid
      - Grid Computing
      - Introduction to Grid Computing
      - Grid Projects (EGEE/DEISA)

    • 12:30 14:00
      Lunch
    • 14:00 14:30
      Poster installation
    • 14:30 16:30
      Lecture 3.1: Service oriented Grid
      • Grid Security
        - Public Key Infrastructure (PKI)
        - VO Membership Service (VOMS)
        - GridShib and Shibboleth
      • Grid Web Services
        - Services Oriented Architecture (SOA)
        - Web Services Standards
        - Web Services functionality

      Dates: Tuesday 09 June 2009 08:30

    • 16:30 17:00
      break
    • 17:00 18:00
      Lecture 1.1: Overview of (some) Grid Projects and Platforms in France, Europe, and the World

      • Production Grids

      • Experimental Grids

      • Clouds

    • 18:00 19:00
      Lecture 1.2: Programming the Grid
    • 19:30 21:00
      Dinner
    • 08:30 10:30
      Lecture 3.2: Resource Management

      - Job management
      - Job submission
      - Methods of job submission
      - Information Management
      - Resource Discovery
      - Resource Monitoring

    • 10:30 11:00
      break
    • 11:00 12:00
      Lecture 1.3: Middleware: some examples and associated issues

      • Low level Middleware

      • Batch Systems

      • GridRPC Systems

      • Workflow Systems

      • Volunteer Computing Systems

      • Cloud “OS”

    • 12:30 14:00
      Lunch
    • 14:00 15:00
      break
    • 15:00 16:30
      Lecture 3.3: Grid Middleware

      - Globus Toolkit 4 (GT4)
      - gLite
      - UNICORE

    • 17:00 19:00
      Group 1: Hands-on session 1.1

      A case study of a Grid Middleware on Grid'5000

    • 17:00 19:00
      Group 2: Hands-on session 3.1

      Job submission (Globus and gLite)

    • 08:30 09:30
      Lecture 3.4: Middleware Interoperability and Supporting Tools (1 h)

      - Job Execution
      - Data Access
      - Information Access
      - Security
      - Overview of Grid Supporting Tools

    • 10:00 11:30
      Lecture 1.3: Middleware: some examples and associated issues (end)

      • Low level Middleware

      • Batch Systems

      • GridRPC Systems

      • Workflow Systems

      • Volunteer Computing Systems

      • Cloud “OS”

    • 12:30 14:00
      Lunch
    • 14:30 16:30
      Group 1: Hands-on session 3.1

      Job submission (Globus and gLite)

    • 14:30 16:30
      Group 2: Hands-on session 1.1

      A case study of a Grid Middleware on Grid'5000

    • 17:00 19:00
      Group 1: Hands-on session 3.2

      Globus Web Services

    • 17:00 19:00
      Group 2: Hands-on session 1.2

      Data Management on Grid with DIET

    • 21:00 22:30
      Conference

      Best practices, trends and perspectives in Grid Security

    • 08:30 10:30
      Group 1: Hands-on session 1.2

      Data Management on Grid with DIET

    • 08:30 10:30
      Group 2: Hands-on session 3.2

      Globus Web Services

    • 10:30 11:00
      break
    • 11:00 12:30
      Lecture 2.1: Motivating applications and main challenges for data management
      • Applications storing and processing large volumes of data.
      • Large-scale numerical simulations.
      • Distributed collaborative applications.
      • Data mining.
      • Summary: main challenges.
    • 12:30 14:00
      Lunch
    • 14:30 17:00
      Group 1: Hands-on session 3.3

      gLite Data Management and services

    • 14:30 17:00
      Group 2: Hands-on session 1.3

      Workflow Management on Grid with DIET

    • 17:30 19:00
      Lecture 2.2: Explicit grid data management
      • Catalogue-based solutions: building on GridFTP.
      • Logistical storage: IBP.
      • Unified data access: SRB.
      • Evaluation criteria, limitations of existing systems.
    • 08:00 09:30
      Lecture 2.3: Transparent grid data management (1.5 h)
      • Grid file systems: GFS, Lustre, Gfarm, XtreemFS.
    • 09:30 10:00
      break
    • 10:00 12:30
      Group 1: Hands-on session 1.3

      Workflow Management on Grid with DIET

    • 10:00 12:30
      Group 2: Hands-on session 3.3

      gLite Data Management and services

    • 12:30 14:00
      Lunch
    • 14:00 15:30
      Lecture 2.4: Convergence of Grid and P2P systems
      • Common issues for grid and P2P systems.
      • P2P file systems: CFS, Ivy, Pastis.
      • Case study: the JXTA P2P platform. Adapting JXTA for grids.
    • 10:30 12:00
      Lecture 4.1: Introduction
      • Motivation
      • Challenges
    • 12:30 14:00
      Lunch
    • 14:30 16:30
      Group 1: Hands-on session 3.4

      Using UNICORE on the DEISA infrastructure

    • 15:00 16:30
      Group 2: Hands-on session 2.1

      · Introduction to JXTA
      Description
      Introduction to the JXTA platform. Learn to configure JXTA and discover the other peers.
      Contents
      · Configure the JXTA environment for different scenarios
      · Implement a peer that invokes the discovery service in order to find the other peers

    • 16:30 17:00
      break
    • 17:00 18:30
      Group 1: Hands-on session 2.1

      · Introduction to JXTA (1.5h session)
      Description
      Introduction to the JXTA platform. Learn to configure JXTA and discover the other peers.
      Contents
      · Configure the JXTA environment for different scenarios
      · Implement a peer that invokes the discovery service in order to find the other peers

    • 17:00 19:00
      Group 2: Hands-on session 3.4

      Using UNICORE on the DEISA infrastructure

    • 08:30 10:00
      Lecture 4.2: Grid Scheduling
      • Basic Mechanisms
      • Existing Grid Schedulers
    • 10:30 12:00
      Lecture 2.5: RAM-based grid data sharing
      • Using P2P techniques to build a grid data-sharing service. Case study: JuxMem.
      • Introducing transparent data sharing in GridRPC applications.
      • Introducing transparent data sharing in component-based applications.
    • 12:30 14:00
      Lunch
    • 14:00 15:00
      Conference on XtreemOS
    • 15:30 17:00
      Group 1: Hands-on session 2.2

      · Using JXTA (2 x 1.5h)
      Description
      The higher level functionalities proposed by JXTA: pipes and custom services.
      Contents
      · Implement two peers that will communicate through a JXTA pipe
      · Implement two peers that will define a JXTA service: one of them will act as server and will publish the service, the other one will act as client and will discover and make use of the service.

    • 15:30 17:00
      Group 2: Hands-on session 3.5

      Deploying a UNICORE infrastructure

    • 17:30 19:00
      Group 1: Hands-on session 3.5

      Deploying a UNICORE infrastructure

    • 17:30 19:00
      Group 2: Hands-on session 2.2

      · Using JXTA (2 x 1.5h)
      Description
      The higher level functionalities proposed by JXTA: pipes and custom services.
      Contents
      · Implement two peers that will communicate through a JXTA pipe
      · Implement two peers that will define a JXTA service: one of them will act as server and will publish the service, the other one will act as client and will discover and make use of the service.

    • 08:30 10:00
      Lecture 2.6: Case studies for large-scale data management
      • Data management on EGEE.
      • Data management for MapReduce applications: Hadoop and HDFS.
      • OGSA-DAI.
    • 10:30 12:00
      Lecture 4.3: Integration with Grid Middleware
      • UNICORE
      • GT4
    • 12:30 14:00
      Lunch
    • 14:00 15:00
      Conference

      Panel Session: "How useful and relevant are existing grid infrastructures for current applications?"

    • 15:30 17:00
      Group 1: Hands-on session 4.1

      Submitting jobs using UNICORE & Grid Scheduler
      The participants experiment with a Grid scheduler in a UNICORE environment.

    • 15:30 17:00
      Group 2: Hands-on session 2.2

      · Using JXTA (2 x 1.5h)
      Description
      The higher level functionalities proposed by JXTA: pipes and custom services.
      Contents
      · Implement two peers that will communicate through a JXTA pipe
      · Implement two peers that will define a JXTA service: one of them will act as server and will publish the service, the other one will act as client and will discover and make use of the service.

    • 17:30 19:00
      Group 1: Hands-on session 2.2

      · Using JXTA (2 x 1.5h)
      Description
      The higher level functionalities proposed by JXTA: pipes and custom services.
      Contents
      · Implement two peers that will communicate through a JXTA pipe
      · Implement two peers that will define a JXTA service: one of them will act as server and will publish the service, the other one will act as client and will discover and make use of the service.

    • 17:30 19:00
      Group 2: Hands-on session 4.1

      Submitting jobs using UNICORE & Grid Scheduler
      The participants experiment with a Grid scheduler in a UNICORE environment.

    • 08:30 10:00
      Group 1: Hands-on session 2.3

      · Introduction to Hadoop (1.5h session)
      Description
      A short introduction to MapReduce, as implemented by Hadoop.
      Contents
      · Functional programming warmup: practice with some primitives
      · Hadoop Concepts: Mapper and Reducer through an already implemented example
      · Configure Hadoop and run the example
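
      The warmup and the Mapper/Reducer concepts listed above can be sketched in plain Python, without Hadoop. This is only an illustrative sketch: the word-count task and the names mapper, reducer and run_word_count are assumptions chosen for the example, not the material used in the session, and the sort-and-group step stands in for the shuffle phase a MapReduce framework performs between map and reduce.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter

# Functional warmup: map, filter and reduce on a small list.
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x * x, numbers))        # apply a function to each item
evens = list(filter(lambda x: x % 2 == 0, numbers))  # keep items matching a predicate
total = reduce(lambda acc, x: acc + x, numbers, 0)   # fold the list into one value


def mapper(line):
    """Emit one (word, 1) pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def reducer(word, counts):
    """Sum all counts emitted for one word."""
    return word, sum(counts)


def run_word_count(lines):
    """Simulate map, shuffle (sort and group by key) and reduce in memory."""
    pairs = [pair for line in lines for pair in mapper(line)]
    pairs.sort(key=itemgetter(0))
    return dict(
        reducer(word, [count for _, count in group])
        for word, group in groupby(pairs, key=itemgetter(0))
    )
```

      Calling run_word_count on a list of text lines returns a dictionary mapping each lower-cased word to its number of occurrences.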

    • 08:30 10:00
      Group 2: Hands-on session 4.2

      Running a molecular docking experiment using a single Grid resource
      The participants get acquainted with the molecular docking application by performing some virtual screening experiments.

    • 10:30 12:00
      Lecture 4.4: Grid Scheduling interoperability
      • Architecture
      • Protocols
    • 12:30 14:00
      Lunch
    • 14:00 15:30
      Group 1: Hands-on session 4.2

      Running a molecular docking experiment using a single Grid resource
      The participants get acquainted with the molecular docking application by performing some virtual screening experiments.

    • 14:00 15:30
      Group 2: Hands-on session 2.3

      · Introduction to Hadoop (1.5h session)
      Description
      A short introduction to MapReduce, as implemented by Hadoop.
      Contents
      · Functional programming warmup: practice with some primitives
      · Hadoop Concepts: Mapper and Reducer through an already implemented example
      · Configure Hadoop and run the example

    • 16:00 17:30
      Group 1: Hands-on session 4.3

      Creating a workflow for distributing molecular docking jobs
      A workflow for executing the application from hands-on session 4.2 will be defined and tested locally.

    • 16:00 17:30
      Group 2: Hands-on session 2.4

      · Hadoop extras (1.5h session)
      Description
      Using HDFS and Hadoop Streaming
      Contents
      · Invoke the FS shell utilities to directly manipulate files in HDFS
      · Hadoop Streaming: create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer
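
      As an illustration of the Hadoop Streaming item above: a streaming mapper or reducer is any program that reads lines on standard input and writes tab-separated key/value lines on standard output, and the reducer receives its input sorted by key. The sketch below expresses that contract as two Python generators. The word-count task and the names map_lines and reduce_lines are hypothetical; in a real job each function would be wrapped in a script that loops over sys.stdin and prints the records.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: turn each input line into 'word<TAB>1' records."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: input arrives sorted by key, so equal keys are adjacent."""
    records = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(records, key=lambda rec: rec[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"
```

      Locally, sorted(map_lines(lines)) can stand in for the sort that Hadoop performs between the two phases, which makes it possible to check the logic before running it on a cluster.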

    • 18:00 19:30
      Group 1: Hands-on session 2.4

      · Hadoop extras (1.5h session)
      Description
      Using HDFS and Hadoop Streaming
      Contents
      · Invoke the FS shell utilities to directly manipulate files in HDFS
      · Hadoop Streaming: create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer

    • 18:00 19:30
      Group 2: Hands-on session 4.3

      Creating a workflow for distributing molecular docking jobs
      A workflow for executing the application from hands-on session 4.2 will be defined and tested locally.

    • 08:30 10:00
      Group 1: Hands-on session 2.5

      · Reverse index (1.5h session)
      Description
      Practical session. Design and implement a MapReduce-based algorithm to calculate the inverted index over the web crawl data.
      Contents
      · Count words and eliminate those appearing in most (or all) documents (“and”, “or”, etc.)
      · Implement the inverted index. It should not contain the words identified before. Several Map Reduce passes might be necessary.
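
      The two steps above can be sketched in memory as two passes, each mirroring one MapReduce job. This is a minimal sketch under stated assumptions, not the session's solution: the function names, the whitespace tokenisation and the 0.8 document-frequency cutoff used to decide what "most documents" means are all hypothetical choices.

```python
from collections import defaultdict


def document_frequency(docs):
    """Pass 1: count in how many documents each word appears."""
    df = defaultdict(int)
    for text in docs.values():
        for word in set(text.lower().split()):  # map: one record per (word, document)
            df[word] += 1                       # reduce: sum the records per word
    return dict(df)


def frequent_words(df, n_docs, threshold=0.8):
    """Words present in at least `threshold` of the documents (assumed cutoff)."""
    return {word for word, n in df.items() if n / n_docs >= threshold}


def inverted_index(docs, stop_words):
    """Pass 2: map each remaining word to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word not in stop_words:
                index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}
```

      Here docs is a dictionary from document id to text. The output of the first pass feeds the second, which is why more than one MapReduce pass may be needed.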

    • 08:30 10:00
      Group 2: Hands-on session 4.4

      Running a docking workflow in a multi-site Grid
      The workflow will be submitted to the Grid using the Grid Scheduler and UNICORE.

    • 10:30 12:00
      Lecture 4.5: Service Level Agreements
    • 12:00 14:00
      Lunch
    • 14:00 15:30
      Group 1: Hands-on session 4.4

      Running a docking workflow in a multi-site Grid
      The workflow will be submitted to the Grid using the Grid Scheduler and UNICORE.

    • 14:00 15:30
      Group 2: Hands-on session 2.5

      · Reverse index (1.5h session)
      Description
      Practical session. Design and implement a MapReduce-based algorithm to calculate the inverted index over the web crawl data.
      Contents
      · Count words and eliminate those appearing in most (or all) documents (“and”, “or”, etc.)
      · Implement the inverted index. It should not contain the words identified before. Several MapReduce passes might be necessary.

    • 15:30 16:00
      Conclusion