From: Francesco Prelz [Francesco.Prelz@mi.infn.it]
Sent: Friday, April 27, 2001 7:59 PM
To: German.Cancio.Melia@cern.ch; francesco.giacomini@cnaf.infn.it; massimo.sgaravatto@pd.infn.it; Olof.Barring@cern.ch
Cc: prelz@mi.infn.it; massimo.mezzadri@mi.infn.it
Subject: Wp1-Wp4 notes

Here are the notes I could take today. They are obviously open for comments, and I probably missed something (Olof's notes will help here).

Thanks again for traveling to beautiful Milano. Hope the trip was not too stressful.

Francesco P.

1) MDS-2 issues:

- There are many things to learn about this new Info system model, which has just been released in -binary- alpha form. WP4 asks whether WP1 can provide an updated version of the schema for LRMS (Local Resource Management System) information, reflecting the changes that were made for MDS-2. WP1 is trying to make MDS-2 work with LRMS information (no success to report yet), and will try to report on the LSF and PBS info.
- WP1 volunteered to provide WP4 with a template for information providers in the new MDS-2 model. WP4 thinks it's good, and would prefer to update the new, improved Globus "gram reporter" rather than rewrite an information provider for LRMSs from scratch. A deadline/checkpoint for this was identified in the first days of June, before the June PTB (June 6th).
- We need to keep WP3 informed about this activity. Are the Wednesday meetings a good opportunity?
- Did anybody think of an emergency plan if MDS-2 does not deliver?

2) Information schema issues:

- ExpectedStartTime ---> WorstTraversalTime + EstimatedTraversalTime (the estimate is the scaled value of the last traversal time).
- ComputingPower ---> MinSI00 + MaxSI00 + AvgSI00 (or whatever SpecInt benchmark is used).
- AuthorizationPolicies: WP1 needs to understand from WP6 whether gridmap file publishing will be OK for Testbed 1.
- StorageElements: a list of SE URIs (format to be determined together with WP5) that are "close enough" to a computing resource/queue/element (we really need to decide on a term here) so as not to upset the local administrator (e.g.: on the same LAN).
- Other SE information: needs to be negotiated with WP5.
- Will add the boolean value keys: AFSAvailable, OutboundIP, InboundIP (these are of course just proposed names).
- There needs to be some technical machinery in place to guarantee that the advertised "local disk footprint" is actually available to a running job, even when more than one process is running on a given "worker node".
- WP4 will negotiate with the Application WPs what kind of information "tag(s)" they would like to use to identify an appropriate Testbed SW installation (e.g.: CMSVersion + CMSRoot, or just a list of RPM names/hashes, etc.). WP1 will just transmit and match the tag requests through the JDL.
- WP1 will provide an updated information schema, folding in the results of the discussion we had today, identifying data types and formats. We will then iterate the schema through WP4.

3) Executable submission:

WP4 asks how the executable will be transferred to the worker nodes. The answer "through the GRAM mechanisms" turns out not to be satisfactory. LSF (or other LRMSs) may have to rely on a common filesystem across the farm nodes to transfer the executable and/or the standard streams. WP4 has to support the case where the worker nodes are completely independent. WP1 will investigate whether an appropriate LSF/GRAM configuration can be applied to support GRAM submission to independent nodes.

After some clarification, it appears that transferring an appropriate job "sandbox" (tarball, input files from the submitting host or maybe even input files from a SE for a non-grid-enabled job) with the current tools definitely requires (at least) outbound IP access. It will be very hard to support "closed" worker nodes at PM9.
The appropriate files will be transferred, as part of the job execution, by an appropriate preamble (provided by WP1), or obtained directly via the appropriate tools by a grid-enabled application.

4) Miscellaneous issues:

- WP4 will not provide LRMS kits (or operating system kits, for that matter). Some doubt emerged as to whether WP6 is fully aware of this.
- The results of some of the investigations we agreed to conduct (MDS-2, GRAM, etc.) will be reported to the lists:

  workload-eu-datagrid@infn.it
  hep-proj-grid-fabric@listbox.cern.ch
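
As an aid to the schema discussion under item 2, here is a rough sketch of how the proposed attributes might be grouped and matched against a job request. It is only an illustration: the class name, field names, units (seconds), the `scale` parameter and the `matches` helper are all assumptions made for this example, since the notes fix neither data types nor the scaling factor used for EstimatedTraversalTime.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComputingResourceInfo:
    """Hypothetical record for the attributes proposed in item 2."""
    worst_traversal_time: float   # WorstTraversalTime (seconds assumed)
    last_traversal_time: float    # last observed traversal time (seconds assumed)
    min_si00: float               # MinSI00
    max_si00: float               # MaxSI00
    avg_si00: float               # AvgSI00
    afs_available: bool = False   # AFSAvailable (proposed name)
    outbound_ip: bool = False     # OutboundIP (proposed name)
    inbound_ip: bool = False      # InboundIP (proposed name)
    close_storage_elements: List[str] = field(default_factory=list)  # SE URIs, format TBD with WP5
    software_tags: List[str] = field(default_factory=list)           # tags to be agreed by WP4

    def estimated_traversal_time(self, scale: float = 1.0) -> float:
        # EstimatedTraversalTime: scaled value of the last traversal time.
        # The scaling factor is not defined in the notes; 'scale' is a placeholder.
        return self.last_traversal_time * scale


def matches(resource: ComputingResourceInfo,
            required_tags: List[str],
            need_outbound_ip: bool = False) -> bool:
    """Hypothetical matching step: every requested tag must be advertised."""
    if need_outbound_ip and not resource.outbound_ip:
        return False
    return set(required_tags) <= set(resource.software_tags)
```

A request for a given software tag on a node with outbound IP access would then reduce to a call such as `matches(resource, ["CMSVersion=1"], need_outbound_ip=True)`, where the tag string is again just a made-up example.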