Opened 10 months ago

Closed 9 months ago

#3101 closed help (answered)

Mass @ Jasmin problem extracting

Reported by: jjas3
Owned by: um_support
Component: Other
Keywords: MASS JASMIN
Cc:
Platform: JASMIN
UM Version:

Description

Hi All,

I'm using a .sh script to extract PP data from MASS on JASMIN for a number of STASH codes and suites over a 10-year range.

Here is an example script:

#!/usr/bin/env bash
# Extract PP data from MASS, one directory per STASH code and year.
declare -a zzStashCodes=(34099)
declare -a zzYears=($(seq 2061 2061))  # parentheses make this a real array
zzSuite=u-bp327    # single suite, so a plain string rather than an array
zzStream=apm.pp

for zzStashItem in "${zzStashCodes[@]}"; do

    mkdir -p "${zzSuite}_stash_${zzStashItem}"
    cd "${zzSuite}_stash_${zzStashItem}" || exit 1  # stop if the cd fails
    echo "${zzSuite}_stash_${zzStashItem}" # write to screen

    for zzYear in "${zzYears[@]}"; do

        mkdir -p "${zzYear}"
        cd "${zzYear}" || exit 1
        echo "${zzYear}" # check
        cp ../../call_template_3 call_edit.dat
        # Fill the placeholders in the query template for this code and year.
        sed "s/PHSTASHCODE/${zzStashItem}/g;s/PHSTASHYEAR/${zzYear}/g" call_edit.dat > call_active.dat
        cat call_active.dat
        /opt/moose/external-client-version-wrapper/bin/moo select call_active.dat "moose:crum/${zzSuite}/${zzStream}" ./
        cd ..

    done
    cd ..

done

I had been using this script successfully for a few days, but yesterday and today it has been running very slowly and often gets stuck (temporarily or seemingly permanently) with this message:

See /gws/nopw/j04/ukca_vol1/jjstauntonsykes/mass_extracts/u-bp049/u-bp049_stash_34001/2057/./MetOffice_data_licence.836368567 for conditions of use.
moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 1 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 2 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 3 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 4 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 5 of 10): transfer failed (ERROR_TRANSFER).

Sometimes it succeeds after only 1 or 2 attempts; at other times it cycles through the attempts for hours.

Have you any idea what might be going on and how I can avoid this in the future?

Many thanks,
Johnny

Change History (3)

comment:1 Changed 10 months ago by jjas3

Hi All,

In case this helps, the extraction seemed to complete the above step and then failed at the next one with this error message:

### select, command-id=836368567, estimated-cost=113081280byte(s), files=12, media=0
See /gws/nopw/j04/ukca_vol1/jjstauntonsykes/mass_extracts/u-bp049/u-bp049_stash_34001/2057/./MetOffice_data_licence.836368567 for conditions of use.
moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 1 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 2 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 3 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 4 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 5 of 10): transfer failed (ERROR_TRANSFER).

moose:/crum/u-bp049/apm.pp/bp049a.pm2057nov.pp: file transfer failure.

  • task #11 (attempt 6 of 10): transfer failed (ERROR_TRANSFER).

2058
begin

stash=34001
year=2058

end
select (attempt 1 of 10): (failed with code ERROR_STORAGE_SYSTEM_UNAVAILABLE) storage system is currently unavailable.

uk.gov.meto.moose.business.requesthandler.service.exceptions.StorageUnavailableException: System currently unable to accept commands requiring Storage

Does this help at all with working out where I might be going wrong?

Many thanks,
Johnny

comment:2 Changed 10 months ago by dcase

Johnny,
The problem is at the Met Office end. If you follow the Monsoon Collaboration Yammer channel there should be updates.

If you don't use Yammer, you are not missing much: there is little information there at the moment, but they are aware of the problems and are working to fix them.
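
Since the fault is on the server side, nothing in the script itself needs fixing. A long batch can still be made more tolerant of outages like this one, though. The following is only a rough sketch: `retry_or_log` is a hypothetical helper, not part of MOOSE, and it assumes that `moo` exits with a non-zero status when a select ultimately fails (I have not checked this against the MOOSE documentation). It retries a command a few times with a pause in between. If every attempt fails, it writes a label to a file so that the item can be re-run once the service is back.

```shell
#!/usr/bin/env bash
# Sketch only: retry a command, then record the failure and carry on.
# retry_or_log is a hypothetical helper, not something MOOSE provides.
# Assumption: the wrapped command exits non-zero when it fails.

retry_or_log() {
    # Usage: retry_or_log MAX_ATTEMPTS DELAY_SECONDS LABEL FAILED_LOG COMMAND [ARGS...]
    local max_attempts="$1"
    local delay="$2"
    local label="$3"
    local failed_log="$4"
    shift 4

    local attempt=1
    while [ "${attempt}" -le "${max_attempts}" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt ${attempt} of ${max_attempts} failed for ${label}" >&2
        # Pause before the next attempt, but not after the last one.
        if [ "${attempt}" -lt "${max_attempts}" ]; then
            sleep "${delay}"
        fi
        attempt=$((attempt + 1))
    done

    # Every attempt failed: note the label so the item can be re-run later.
    echo "${label}" >> "${failed_log}"
    return 1
}

# Possible use inside the year loop of the script in the description
# (shown as a comment because it needs access to MASS):
#
#   retry_or_log 3 600 "${zzSuite}/${zzStashItem}/${zzYear}" ../../failed.txt \
#       /opt/moose/external-client-version-wrapper/bin/moo select \
#       call_active.dat "moose:crum/${zzSuite}/${zzStream}" ./ || true
```

The `|| true` lets the loop continue to the next year after a failure. Note that `moo` already retries each transfer itself (the "attempt 1 of 10" lines in your output), so the outer retry count can stay small. The retry does not help with a call that hangs; prefixing the `moo` command with GNU `timeout` may be one way to bound that, though I have not tried it with `moo`.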

Dave

comment:3 Changed 9 months ago by ros

  • Resolution set to answered
  • Status changed from new to closed