WP4 phone conference, 13/08/2003

 

  • NIKHEF: Martijn
  • CERN: Maite, German, Piotr, Jan, Sylvain
  • Edinburgh: Lex

Review of actions

  • Action 18: ZIB is working on LSF support for the RMS.
  • Action 90: Hugo is testing Fault Tolerance together with Fabric Monitoring. His report:
    • RPMs being tested: edg-fabricMonitoring-2.4.8-1.i386.rpm, edg-fabricFaultTolerance-1.1.9-1.i386.rpm
    • Some problems were found when starting the FT daemon; already solved.
    • Some problems with the 2polish command (after checking log files and rules); also solved.
    • According to the log file, the FT daemon seems to read the rules and then simply dies. The log files have been sent to the Heidelberg team with a request for help on getting more debugging information from the daemon, to find out what is going wrong.
  • Action 91: No update.

New actions:

  • Action on Sylvain to follow up on the use of fabric monitoring in the EDG testbed and to promote it, e.g. by releasing metrics to monitor daemons and a simple web interface to visualize them.
  • Action on Maite to contact Steve Fisher to check when WP3 is going to interface to the WP4 fabric monitoring and what the current status is.
  • Action on the FT, monitoring, resource management and gridification task leaders to prepare the topics and exercises for the 1-hour tutorials and send a first draft to Maite by the end of August.

 

Institute reports:

  • German (CERN, Installation task): see http://cern.ch/wp4-install/documents/wp4-install-progress-report-2003-1308.htm

  • Martijn (NIKHEF, gridification task):
    • 2.5.2 LCMAPS AFS and Kerberos modules: ongoing/delayed, but still in time for testbed integration. Release: end of August.
    • 2.5.4 VOMS module, POSIX module, POOL module: unit tested. To be integrated in EDG 2.1.

    • 2.5.5 LCAS server implementation: ongoing/delayed. The LCAS VOMS plugin has a higher priority than the server implementation and may be released separately, with an updated version of the library implementation of the LCAS plugin framework. Release of VOMS plugin + updated LCAS: end of August.

    • 2.5.6 Job repository: ongoing. A very simple version of a job repository may be needed for slashgrid. Higher priority than the server implementation of LCAS.

  • Thomas (ZIB, resource mgt task): We mainly worked on the info providers and their configuration. Work on the LSF support has also continued. However, we may not finish it by August 20th. Once LSF support is done, we will start on Condor.
  • Piotr (CERN, Configuration Mgt task):
    • 2.1.4 Scalability: delayed - no update
    • 2.1.5 CDB notification for sync: done; Rafael released the 1.0 version and the 1.1 specification documents.
    • 2.1.7 CDB SQL Server module: Lex is just about to release the Oracle-compatible version.
    • 2.1.6 CDB Server module for XML replication: Andy released version 0.9.0 of the software.
    • Piotr is also working on support and bug fixes for CERN.
  • Jan (CERN, Monitoring task):
    • 2.3.12 Oracle interface: Data loss problems partially understood; Oracle database tuning improved the situation, and the insert rate is now 97%, which is still not good enough.

      Many small improvements:
      • memory leaks found and fixed
      • primary keys added to LatestValue tables
      • attempts to violate primary keys understood (problem at the sending hosts, not in MR)
      • string values that are too long are now truncated

    • 2.3.14 Interface open source database: automatic table generation + rolling development finished.
    • 2.3.21 MSA: Solaris compatibility added.

 

AOB

  • The EDG 2.0 release seems to be very close now (the tag released this morning seems to be a good candidate). After this release, the upgrade to gcc 3.2.2 needs to be done (the autobuilt RPMs are already in the repository, but they have not yet been integrated or deployed, as EDG 2.0 is based on gcc 2.95). This means that the EDG 2.1 integration will realistically start at the beginning of September, and only very few integration slots will be available. The project will prioritize what is integrated and when.
  • Reminder on the agenda for WP4 meetings at the EDG Heidelberg conference:
    • Tutorial for system administrators on the new set of WP4 tools for system management (installation and configuration), to be held on Friday morning. To register, send a mail to German.Cancio@cern.ch
    • WP4 internal hands-on tutorials on Gridification, Resource management and Fault Tolerance: Monday afternoon, 1 hour per task. Prerequisite: the software for the tutorial must be installed and configured on the machines beforehand, so that no time is wasted on this. Send draft agenda to Maite before the end of August.
    • Friday afternoon: task reports.
    • See the preliminary conference agenda at: http://agenda.cern.ch/fullAgenda.php?ida=a032023

 

Next meeting: 27/08/2003