TroubleLogs (16 Oct 2009, TomRockwell)

Trouble

Atlas

Atlas Analysis

Job mishandled OSG APP

Paul, Bob,
OSG_APP should be "/atlas/data08/OSG/APP".
"atlas_app/atlas_rel" are subdirectories created when installing the ATLAS
software.
If a CE is including atlas_app/atlas_rel in its OSG_APP, it should fix its
OSG installation.
Marco


On Mon, 25 Feb 2008, Paul Nilsson wrote:

> > Hi Bob,
> > I've added protection against this, for the case where OSG_APP already
> > contains the atlas_app/atlas_rel path (maybe that's the correct default?). This is
> > used for job matching and will not fail the job unless the release is
> > actually missing. Will go in with the next release.
> > Cheers,
> > Paul
> >
>> >> -----Original Message-----
>> >> From: Bob Ball [mailto:ball@umich.edu]
>> >> Sent: Monday, February 25, 2008 3:44 AM
>> >> To: Akira Shibata
>> >> Cc: nurcan@hepmail.uta.edu; Thomas Rockwell; Paul Nilsson; pandashift
>> >> list; Usatlas-ddm-l@lists.bnl.gov; smirnov@hep.uchicago.edu
>> >> Subject: Re: [Usatlas-ddm-l] CMD_CHECKSUM is not defined at AGLT2
>> >>
>> >> That doesn't answer the question, though, which is why it is looking
>> >> for a repeated directory structure
>> >> (.../atlas_app/atlas_rel/atlas_app/atlas_rel)?  Clearly that is not
>> >> the case here, so does that mean some environmental variable is not set
>> >> correctly?
>> >>
>> >> bob
>> >>
>> >> [umopt1:~]# cd /atlas/data08/OSG/APP/atlas_app/atlas_rel
>> >> [umopt1:atlas_rel]# ll
>> >> total 60
>> >> drwxr-xr-x  14 usatlas 4096 Nov 29  2006 11.0.42/
>> >> drwxr-xr-x  26 usatlas 4096 Nov 29  2006 12.0.3/
>> >> drwxr-xr-x  26 usatlas 4096 Mar 20  2007 12.0.31/
>> >> drwxr-xr-x  26 usatlas 4096 Sep  3 23:03 12.0.4/
>> >> drwxr-xr-x  27 usatlas 4096 Feb 28  2007 12.0.5/
>> >> drwxr-xr-x  26 usatlas 4096 May 11  2007 12.0.6/
>> >> drwxr-xr-x  26 usatlas 4096 Oct 29 15:39 12.0.7/
>> >> drwxr-xr-x  26 usatlas 4096 Dec 10 17:15 12.0.8/
>> >> drwxr-xr-x  22 usatlas 4096 Jul 25  2007 12.0.95/
>> >> drwxr-xr-x  27 usatlas 4096 Dec  1  2006 12.3.0/
>> >> drwxr-xr-x  27 usatlas 4096 Jan 23 14:01 12.5.0/
>> >> drwxr-xr-x  28 usatlas 4096 Jun 28  2007 13.0.10/
>> >> drwxr-xr-x  27 usatlas 4096 Sep 21 12:50 13.0.20/
>> >> drwxr-xr-x  28 usatlas 4096 Feb 19 13:10 13.0.30/
>> >> drwxr-xr-x  27 usatlas 4096 Feb 19 09:01 13.0.40/
>> >> drwxr-xr-x   4 usatlas   97 Apr 10  2007 kitval/
>> >>
>> >>
>> >> Akira Shibata wrote:
>>> >>> Hi
>>> >>>
>>>> >>>> Thanks Tom. Job's owner Akira is Cc'ed for further follow up.
>>>> >>>> His same job succeeded at SLAC and SWT2, right Akira?
>>> >>>
>>> >>> Yes that's right. With SWT2, I managed to download the output too;
>>> >>> with SLAC there is probably a permission issue when accessed from
>>> >>> outside BNL.
>>> >>>
>>> >>> Cheers
>>> >>> Akira
>>> >>>
>>>> >>>> Nurcan.
>>>> >>>>
>>>> >>>> On Sat, 23 Feb 2008, Tom Rockwell wrote:
>>>> >>>>
>>>>> >>>>> Hi,
>>>>> >>>>>
>>>>> >>>>> I haven't looked at one of these logs before, but this looks odd:
>>>>> >>>>>
>>>>> >>>>> 22 Feb 2008 05:58:14| The DQ2 SE has 17071 GB space left (NB: dCache
>>>>> >>>>> is defaulted to 999999)
>>>>> >>>>> 22 Feb 2008 05:58:14| OSG app area:
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel
>>>>> >>>>> 22 Feb 2008 05:58:14| Executing command: /bin/ls -alL
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel
>>>>> >>>>> 22 Feb 2008 05:58:14| Output: total 68
>>>>> >>>>> drwxr-xr-x  18 usatlas2 usatlas 4096 Jan 23 10:19 .
>>>>> >>>>> drwxr-xr-x   5 usatlas2 usatlas 4096 Nov  3 14:23 ..
>>>>> >>>>> drwxr-xr-x  14 usatlas2 usatlas 4096 Nov 29  2006 11.0.42
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 Nov 29  2006 12.0.3
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 Mar 20  2007 12.0.31
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 Sep  3 23:03 12.0.4
>>>>> >>>>> drwxr-xr-x  27 usatlas2 usatlas 4096 Feb 28  2007 12.0.5
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 May 11  2007 12.0.6
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 Oct 29 15:39 12.0.7
>>>>> >>>>> drwxr-xr-x  26 usatlas2 usatlas 4096 Dec 10 17:15 12.0.8
>>>>> >>>>> drwxr-xr-x  22 usatlas2 usatlas 4096 Jul 25  2007 12.0.95
>>>>> >>>>> drwxr-xr-x  27 usatlas2 usatlas 4096 Dec  1  2006 12.3.0
>>>>> >>>>> drwxr-xr-x  27 usatlas2 usatlas 4096 Jan 23 14:01 12.5.0
>>>>> >>>>> drwxr-xr-x  28 usatlas2 usatlas 4096 Jun 28  2007 13.0.10
>>>>> >>>>> drwxr-xr-x  27 usatlas2 usatlas 4096 Sep 21 12:50 13.0.20
>>>>> >>>>> drwxr-xr-x  28 usatlas2 usatlas 4096 Feb 19 13:10 13.0.30
>>>>> >>>>> drwxr-xr-x  27 usatlas2 usatlas 4096 Feb 19 09:01 13.0.40
>>>>> >>>>> drwxr-xr-x   4 usatlas2 usatlas   97 Apr 10  2007 kitval
>>>>> >>>>> 22 Feb 2008 05:58:14| Looking for releases in:
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel/atlas_app/atlas_rel
>>>>> >>>>> 22 Feb 2008 05:58:14| Executing command: /bin/ls -alL
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel/atlas_app/atlas_rel
>>>>> >>>>> 22 Feb 2008 05:58:14| Output: /bin/ls:
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel/atlas_app/atlas_rel: No
>>>>> >>>>> such file or directory
>>>>> >>>>> 22 Feb 2008 05:58:14| !!WARNING!!1999!! No releases found, probably
>>>>> >>>>> problems ahead..
>>>>> >>>>>
>>>>> >>>>> The Job seems to have found the release directory
>>>>> >>>>> /atlas/data08/OSG/APP/atlas_app/atlas_rel, but then tried to find
>>>>> >>>>> atlas_app/atlas_rel below there.  Is there perhaps something wrong with
>>>>> >>>>> the job?
>>>>> >>>>>
>>>>> >>>>> -Tom
>>>>> >>>>>
>>>>> >>>>> Nurcan Ozturk wrote:
>>>>>> >>>>>> Hi all,
>>>>>> >>>>>> Any idea about this error at AGLT2:
>>>>>> >>>>>>
>>>>>> >>>>>> 22 Feb 2008 05:58:15| !!FAILED!!3000!! Exception caught: Get function can
>>>>>> >>>>>> not be called for staging input file
>>>>>> >>>>>> [user.AkiraShibata.lxplus214_87.lib._003249.lib.tgz,
>>>>>> >>>>>> fdr08_run1.0003050.StreamEgamma.merge.AOD.o1_r6_t1._0001.1,
>>>>>> >>>>>> fdr08_run1.0003050.StreamEgamma.merge.AOD.o1_r6_t1._0002.1]: global name
>>>>>> >>>>>> CMD_CHECKSUM is not defined
>>>>>> >>>>>> 22 Feb 2008 05:58:15| Exception repeated: global name CMD_CHECKSUM is not
>>>>>> >>>>>> defined
>>>>>> >>>>>>
>>>>>> >>>>>> Here is the full log:
>>>>>> >>>>>>
>>>>>> >>>>>> http://gridui01.usatlas.bnl.gov:25880/logs/user.AkiraShibata.HelloWorld_FDR1Test.fdr08_run1.0003050.StreamEgammao1_r6_t1.022108_ANALY_AGLT2._7722128.log.tgz/tarball_PandaJob_7722128_ANALY_AGLT2-condor/pilotlog.txt
>>>>>> >>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>> Thanks,
>>>>>> >>>>>> Nurcan.
>>>>>> >>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>> _______________________________________________
>>>>>> >>>>>> Usatlas-ddm-l mailing list
>>>>>> >>>>>> Usatlas-ddm-l@lists.bnl.gov
>>>>>> >>>>>> https://lists.bnl.gov/mailman/listinfo/usatlas-ddm-l
>>>>>> >>>>>>
>>>>> >>>>>
>>>> >>>>
>>>> >>>> --
>>>> >>>> **************************************
>>>> >>>> *  Nurcan Ozturk                     *
>>>> >>>> *  University of Texas at Arlington  *
>>>> >>>> *  Physics Department                *
>>>> >>>> *  Phone: (817)272-3082              *
>>>> >>>> *  http://www-hep.uta.edu/~nurcan/   *
>>>> >>>> **************************************
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>> >>>
>>> >>>
>>> >>>

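The kind of protection Paul describes in the thread (do not append atlas_app/atlas_rel when OSG_APP already ends with it) can be sketched as below. This is only an illustration under stated assumptions, not the pilot's actual code: the names RELEASE_SUBPATH and release_dir are hypothetical, and the only facts taken from the thread are the two paths themselves.

```python
import posixpath

# Hypothetical constant: the sub-path that the ATLAS software
# installation creates below OSG_APP (as described in the thread).
RELEASE_SUBPATH = "atlas_app/atlas_rel"


def release_dir(osg_app):
    """Return the directory to search for releases.

    If osg_app already ends with RELEASE_SUBPATH, it is returned
    unchanged (apart from normalization) instead of repeating the
    sub-path a second time.
    """
    base = posixpath.normpath(osg_app)  # also drops a trailing slash
    if base == RELEASE_SUBPATH or base.endswith("/" + RELEASE_SUBPATH):
        return base
    return posixpath.join(base, RELEASE_SUBPATH)


# Both settings of OSG_APP seen in the thread resolve to the same directory.
print(release_dir("/atlas/data08/OSG/APP"))
print(release_dir("/atlas/data08/OSG/APP/atlas_app/atlas_rel"))
```

With a guard of this shape, a site whose OSG_APP mistakenly includes the sub-path would still be searched in the right place, although, as Marco notes, the proper fix is to correct OSG_APP on the CE.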
-- TomRockwell - 25 Feb 2008
Topic revision: r4 - 16 Oct 2009 - 20:14:48 - TomRockwell
 
