-----------------------------------------------------
BASIS FOR DISCUSSION
-----------------------------------------------------

Base documents written by Gilles Mathieu and Hélène Cordier following the 20/01 meeting on the subject.

Comments, feedback and questions received from:
TG - Tristan Glatard (Biomed)
EM - Emmanuel Medernach (LPC)
YL - Yannick Legre (Healthgrid)


-----------------------------------------------------
COMMENTS ON THE DOCUMENT: "GENERAL RA STRATEGY"
-----------------------------------------------------

Identified needs and goals
--------------------------

"Current situation is not good enough. The distribution of resources is not necessarily fair, which is detrimental to many communities"
-> TG: Do you have a concrete illustration for this point? According to which criterion is fairness defined?
-> EM: I very much appreciate the concern about fairness; I really think it has to be emphasized as a goal.

"France-Grilles needs to demonstrate that a policy to allocate resources exists, and that the situation is under control"
-> TG: If this document is to be publicly released I think that this sentence should be rephrased. For now it reads a bit like “nothing is under control but we should make sure that it still looks ok”. 
Proposal: "The current resource allocation model, which was designed around the needs of EGEE projects (maybe LCG too), should be revised to ensure the visibility and sustainability of France-Grilles."

EM: Which communities exactly are we talking about? There are various kinds of communities using French NGI resources, ranging from international to regional, thematic or project-driven. These different kinds have to be stated in relation to the NGI.
YL: Moreover, we are now considering Virtual Research Communities within EGI. These VRCs will gather several VOs, projects, countries and groups.


a priori and a posteriori analyses
----------------------------------

TG: What are the criteria on which the decision to reject a project can be based?


Allocating resources to established communities
-----------------------------------------------

- With whom do international VOs negotiate their resources: sites, NGIs, EGI?
TG: From the latest discussions with the UCB I understood that the VO (in fact VRC) should negotiate with NGIs but not with sites.
EM: Does a community have to negotiate resources with the NGI, which then negotiates with sites, or does a community have to negotiate with sites directly? HEP communities currently negotiate directly with sites; is that a problem?
YL: Currently it seems EGI is moving toward negotiation between VRCs and NGIs; this is also a wish of the VRCs. Remember, these negotiations are not only bargaining but also imply the signature of a binding MoU. It would not be scalable to ask every community to sign an MoU with every site worldwide... The NGI seems to be the finest viable granularity.


- How do we deal with externally driven communities (e.g. WLCG)?
TG: IMHO it is not a problem. They can live their lives without hampering France-Grilles' allocation process.

- How do we report usage for those communities?
TG: Sites will report usage statistics to the NGI and the NGI will aggregate them. If communities are outside the NGI allocation process then I don't see why the NGI should bother with reporting on their usage.


- How to measure a percentage of "French" used resources vs "foreign usage"?
TG: To distinguish between French vs foreign usage within a VO, maybe a practical solution (1) could be to multiply the total VO usage by the fraction of the VO DNs delivered by the French CA. For instance, if biomed consumed 100 hours and 27% of the DNs in biomed are French then we consider the French biomed usage to be 27 hours. 
An alternative (2) is to require VOs to group their users by country and to monitor group activity instead of VO activity. This requirement could be part of a VO/NGI OLA. But I don't know if it's technically feasible to monitor resource usage for a group.
Finally, the pilot-job case should be mentioned (see Cécile Barbier's comments on Jan 20th). After checking, it seems that the accounting cannot distinguish the pilot owner from the task owner, even if glexec is used. For this reason I would favor solution (1).
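
A minimal sketch of solution (1), assuming the only inputs are the total VO usage and the number of DNs in the VO that were delivered by the French CA. The function name and the DN counts are hypothetical and only reproduce the 27% example above; this is not an existing accounting tool.

```python
def estimate_french_usage(total_usage_hours, french_dn_count, total_dn_count):
    """Estimate the French usage of a VO (solution 1).

    The total VO usage is scaled by the fraction of the VO's DNs
    that were delivered by the French CA.
    """
    if total_dn_count <= 0:
        raise ValueError("total_dn_count must be positive")
    if not 0 <= french_dn_count <= total_dn_count:
        raise ValueError("french_dn_count must be between 0 and total_dn_count")
    # Multiply before dividing so that simple cases stay exact in floating point.
    return total_usage_hours * french_dn_count / total_dn_count


# Example from the comment: biomed consumed 100 hours and 27% of its DNs are French.
# The counts 27 and 100 are made up to reproduce that 27% share.
french_hours = estimate_french_usage(100.0, 27, 100)
print(f"Estimated French biomed usage: {french_hours:.1f} hours")
```

Note that this estimate assumes usage is spread evenly over all DNs of the VO; a few very active users on either side would skew it.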


general comments/questions
--------------------------

GM: should we start talking about the "national VO" in the strategy document, or should it remain general?

EM: Does the resource allocation agreement protocol have to be per VO? Maybe we could take a more global view: most of the time sites have to reach a compromise in order to respect their MoUs. One may not ask for more than they can provide.

EM: We have to take into account that sites may have resources external to the NGI infrastructure.


-----------------------------------------------------
COMMENTS ON THE DOCUMENT: "ESTABLISHMENT OF A NATIONAL VO"
-----------------------------------------------------


Requirements on the VO itself
-----------------------------

"rules for this national VO and its sub-groups should be the same as other VO and  could be derived of biomed experience"
-> TG: I don't think that biomed should be the only reference for VO management. HEP VOs have much more experience and are able to deliver high-quality services to their users. 

EM: Even if it is "catch-all", it would be worth having clearly defined sub-groups per project.
YL: A VRC will gather several VOs and projects, most of the time with members coming from different countries, and thus could be considered as "catch-all" in one scientific area. Having well-defined groups is a request we expressed long ago and I agree it is now even more needed. What could be the granularity for defining groups and sub-groups? Does this have to be defined at the VOMS level and automatically implemented by sites, or does it have to be part of the negotiation with NGIs and part of the MoU?

EM: How do we ensure that this National VO will be well supported by sites? Some already have obligations and nothing more to offer to a national VO.

EM: Within VO management, someone is needed to install and maintain applications. The same goes for security issues.
YL: In HEP, yes... in other VOs, there are most of the time plenty of software admins... maybe we also have to discuss what has to be "permanently" installed and maintained and what {c|sh}ould be done on a less permanent basis...


services and resources offered to the national VO
-------------------------------------------------

EM: Beware that lightweight demands are the most affected by the current "grid online philosophy". This means that, to be efficient, an offline mechanism or preemption has to be used for them; otherwise not much can be guaranteed.


validation: pros and cons
-------------------------
"Traceability of French resources utilization"
-> TG: Only part of the communities: I don't think that this VO could/should catch all the French users.


Tools and implementation
------------------------
EM: Both sites and communities want to avoid wasting resources. A tool is needed for sites to know about VO needs (current and planned) and to help communities negotiate site resources. In the medium term I think it would be nice for sites to have a tool for asking for jobs when slots are free, instead of, say, relying on the possible arrival of pilot jobs.

EM: The most practical option in my opinion would be a job submission framework (like PANDA or DIRAC), which allows the NGI to control the shares precisely (it is impractical and not very reliable to depend on all sites modifying their shares each time a change has to be made). Moreover, it could be easier for users.

EM: I know there is a reservation tool used by Grid5000; I am curious to know how it could be adapted to National VO usage. Reservation-like facilities are definitely needed to provide true QoS.

How to measure what is done?
EM: This presupposes a reliable way to measure resource consumption. For instance, sites declare their SpecInt/HEP-Spec themselves: do we want to base accounting on that? Is it truly reliable?
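
To illustrate why the reliability of the declared benchmark matters, here is a minimal sketch assuming that normalised consumption is simply wall-clock hours multiplied by the site-declared benchmark score per core. The function names and the figures (11.0 declared, 10.0 measured) are hypothetical and not taken from any site.

```python
def normalised_usage(wall_hours, benchmark_per_core):
    """Normalised consumption: hours weighted by a per-core benchmark score."""
    if wall_hours < 0 or benchmark_per_core <= 0:
        raise ValueError("wall_hours must be >= 0 and benchmark_per_core > 0")
    return wall_hours * benchmark_per_core


def relative_overstatement(declared, measured):
    """Relative error of the reported usage caused by the declared score."""
    return (declared - measured) / measured


# Hypothetical site: declares 11.0 per core, while a measurement would give 10.0.
reported = normalised_usage(1000, 11.0)  # what the accounting would record
actual = normalised_usage(1000, 10.0)    # what was really delivered
print(reported, actual, relative_overstatement(11.0, 10.0))
```

Under this assumption any error in the declared score carries over proportionally to the reported consumption, so accounting can only be as reliable as the declared values.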

EM: About the threshold, in my opinion it would be a good idea to have a shared pool of resources of practical size with a given QoS, alongside other pools with other QoS. When a community feels that this shared pool is not enough for its needs, it could ask for more resources (instead of someone having to play the role of a policeman and punish the guilty party).

- Resource usage is guaranteed if job submission is uniform
EM: As the need for grid computing is rather bursty, I think we must not be short of flexibility. Moreover, sites are passive with respect to job arrival ("Requirements" and "Rank" in WMS site selection may unfairly penalize some sites). This is why bursts should be planned as much as possible, and unplanned ones should not engage the sites' responsibility.

- support/resource criteria 
EM: We have to define which resources are offered and the associated metrics: for instance, for storage, data may have a lifetime.