Software/rxgrid

The Japanese version of this page is here.

RuBLX is a Ruby-based batch language for Xgrid, and rxgrid is a processor for that language. Xgrid is an environment for distributed and parallel computing on the Mac OS X operating system, and Ruby is a general-purpose object-oriented programming language.

RuBLX and rxgrid enable users to

  • submit Xgrid jobs specified in concise Ruby-based batch files rather than in the standard XML-based batch files.
  • manage those jobs by logical names rather than by job IDs.

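The rest of this page is organized as follows: the rxgrid client program (license, installation, usage and examples) is described first, followed by the RuBLX language (file, task and job definitions).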

This page assumes that readers are familiar with the standard XML-based batch language for Xgrid, the standard client program xgrid, and the basics of Ruby.

rxgrid: An Xgrid client program for RuBLX

License

rxgrid is distributed under the GNU General Public License (GPL) version 2.

Installation

  1. Download the archive file 'rxgrid_v095.zip' from the following link to your Xgrid client machine.
  2. Unzip it.
    % unzip rxgrid_v095.zip
  3. Copy rxgrid to an appropriate path. In the following example, rxgrid is copied into /usr/local/bin.
    % sudo cp rxgrid_v095/rxgrid /usr/local/bin

Usage

rxgrid [-h[ostname] hostname] [-auth { Password | Kerberos }]
      [-p[assword] password] [-xgrid xgrid-command]
rxgrid [-nosubmission] [-createbatchfiles] -job batch [-gid grid-identifier]
          [-map mapfile] RuBLX-batch-file
rxgrid -job results [-map mapfile] [-id identifier] [-so stdout] [-se stderr] [-out outdir]
rxgrid -job {stop | suspend | resume | delete | specification | restart}
         [-map mapfile] [-id identifier]
rxgrid -job list [-gid grid-identifier]
rxgrid -job attributes [-map mapfile] [-id identifier]
rxgrid -version
rxgrid -help

The options are basically compatible with those of xgrid.

  • The '-xgrid' option specifies the command that rxgrid invokes in place of xgrid.
    • The command can also be specified by the environment variable 'RXGRID_XGRID'.
  • The '-nosubmission' option means 'do not submit'.
  • The '-createbatchfiles' option means 'create XML-based batch files'.
  • The '-map' option specifies a map file.
    • The '-map' option with '-job batch' specifies a map file name to be generated.
  • The '-id' option identifies a job.
    • With the '-map' option, '-id' specifies a logical job name in the map file.
    • Without the '-map' option, '-id' specifies a job ID number.
  • At least one of the '-id' and '-map' options must be specified for '-job results' and '-job attributes'.

Examples

Example 1

Fig.1 shows a batch file containing a job that runs the calculator program '/usr/bin/bc' to evaluate the expression in the script file 'bc_exp.txt' shown in Fig.2.

file "bc_exp.txt" do
  agentPathName  "bc_exp.txt"
  localPathName  "bc_exp.txt"
  isExecutable  false
end

task "bc" do
  command  "/usr/bin/bc"
  arguments  ["-q", "bc_exp.txt"]
  refersTo  ["bc_exp.txt"]
end

job "job1" do
  tasks  ["bc"]
end

          Fig.1: An RuBLX batch file 'bc.rb'

1 + 2
quit

          Fig.2: A script 'bc_exp.txt' for a calculator program 'bc'

% rxgrid -h xgridcontroller -p pass -job batch bc.rb

          Fig.3:  Job submission by rxgrid

Fig.3 shows how to submit the batch file of Fig.1 using rxgrid. The options of rxgrid are basically compatible with those of xgrid.

381,job1

          Fig.4: A generated map file 'bc_map.csv'

The rxgrid command generates a map file at submission. The map file records the correspondence between job IDs and logical job names. Fig.4 shows a map file indicating that the submitted job 'job1' has job ID 381.
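
To make the map file format concrete, the following minimal sketch reads text in the form shown in Fig.4 into a Ruby hash. The helper name load_job_map is hypothetical and not part of rxgrid, and the sketch assumes exactly one "job ID,logical name" pair per line.

```ruby
# Hypothetical helper, not part of rxgrid: parse map-file text in the
# "job ID,logical name" form of Fig.4 into a hash { logical name => job ID }.
def load_job_map(text)
  map = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty?          # ignore blank lines

    id, name = line.split(",", 2)
    map[name] = id
  end
  map
end

job_map = load_job_map("381,job1\n")
puts job_map["job1"]
```

With the contents of Fig.4 as input, looking up 'job1' gives the job ID 381 (kept as a string in this sketch).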

Case(1) 
% rxgrid -h xgridcontroller -p pass -job results -id job1 -map bc_map.csv

Case(2) 
% rxgrid -h xgridcontroller -p pass -job results -map bc_map.csv

Case(3) 
% rxgrid -h xgridcontroller -p pass -job results -id 381

          Fig.5: Retrieval of the results 

The three cases of Fig.5 show how to retrieve the results of finished jobs. To retrieve the results, at least one of '-id' and '-map' is required.

  • In case (1), the job whose result is retrieved is specified by both '-id' and '-map'.
  • If '-id' is omitted, as in case (2), the results of all jobs in the specified map file are retrieved.
  • If '-map' is omitted, as in case (3), the result of the job with the specified job ID is retrieved.
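
The selection rule for the three cases can be sketched in Ruby as follows. This is only an illustration of the rule as described here, not rxgrid's code: the function name target_job_ids and its keyword parameters are hypothetical, and the map file is assumed to have already been read into a hash from logical names to job IDs.

```ruby
# Hypothetical sketch, not rxgrid's code: decide which job IDs a
# '-job results' request refers to. job_map is { logical name => job ID }
# as read from a map file, or nil when '-map' is not given.
def target_job_ids(id: nil, job_map: nil)
  if id && job_map
    [job_map.fetch(id)]   # case (1): '-id' is a logical name in the map
  elsif job_map
    job_map.values        # case (2): every job recorded in the map
  elsif id
    [id]                  # case (3): '-id' is already a job ID
  else
    raise ArgumentError, "at least one of -id and -map is required"
  end
end

bc_map = { "job1" => "381" }
p target_job_ids(id: "job1", job_map: bc_map)
p target_job_ids(job_map: bc_map)
p target_job_ids(id: "381")
```

With the map file of Fig.4, all three calls select the same single job, 381, which matches the three commands of Fig.5.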

Example 2

Fig.6 shows a batch file whose jobs depend on one another.
It defines three jobs: 'job0', 'job1' and 'job2'.

  • 'job0' is a previously submitted job whose ID is specified on line 12 by a pair of a logical job name and a map file.
  • 'job1' is a job with a task 'echo1'.
  • 'job2' is a job with a task 'echo2', which is executed after both 'job0' and 'job1' are done. The dependency is defined on line 21.

01: task "echo1" do 
02:   command  "/bin/echo"
03:   arguments  ["1"]
04: end
05: 
06: task "echo2" do 
07:   command  "/bin/echo"
08:   arguments  ["2"]
09: end
10: 
11: job "job0" do 
12:   id  jobId("job1", "bc_map.csv") 
13: end
14: 
15: job "job1" do 
16:   tasks  ["echo1"]
17: end
18: 
19: job "job2" do 
20:   tasks  ["echo2"]
21:   dependsOnJobs  ["job0", "job1"]
22: end

          Fig.6: An RuBLX batch file with dependency relationships among jobs
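
The ordering that the 'dependsOnJobs' declaration of Fig.6 implies can be illustrated with Ruby's standard TSort library. This sketch is not part of rxgrid and says nothing about how the scheduling is actually carried out; the hash deps is simply a hand-written copy of the relationships in Fig.6.

```ruby
require "tsort"

# Hand-written copy of Fig.6: { job => jobs it depends on }.
deps = {
  "job0" => [],
  "job1" => [],
  "job2" => ["job0", "job1"]
}

# TSort.tsort takes two callables: one enumerating all nodes and one
# enumerating the children (here, the dependencies) of a node.
each_node  = ->(&block) { deps.each_key(&block) }
each_child = ->(job, &block) { deps.fetch(job).each(&block) }

# Dependencies come before the jobs that depend on them.
order = TSort.tsort(each_node, each_child)
puts order.join(" ")
```

In the resulting order 'job2' comes after both 'job0' and 'job1', which is the constraint expressed on line 21 of Fig.6.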

Example 3

Fig.7 shows a batch file whose tasks are determined dynamically at submission. One task is defined for each file whose name ends with '.txt'; each task runs the calculator program 'bc' to evaluate its file.

Because the number of files that will exist at submission is not known in advance, this example uses variables, arrays, control flow and the Ruby standard library.

  • The files are collected with the standard class 'Dir' on line 1.
  • A file template is used from line 4 to line 8.
  • A task template is used from line 16 to line 20.

01: filelist = Dir.glob("*.txt")
02: 
03: filelist.each do |f|
04:   file f.to_s do 
05:     agentPathName  f.to_s
06:     localPathName  f.to_s
07:     isExecutable  false
08:   end
10: end
11: 
12: taskNames = []
13: filelist.each do |f|
14:   taskName = "bc" + f.to_s
15:   taskNames = taskNames | [taskName]
16:   task taskName do 
17:     command  "/usr/bin/bc"
18:     arguments  ["-q", f.to_s]
19:     refersTo  [f.to_s]
20:   end
21: end
22:   
23: job "job1" do 
24:   tasks  taskNames
25: end

         Fig.7: An RuBLX batch file with dynamically defined tasks
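
Lines 12 to 15 of Fig.7 collect the task names with the array union operator '|', which appends a name only if it is not already present. As a plain-Ruby aside, the same list can be built with 'map', since a single glob pattern yields each file name once. The file names below are made up for illustration and stand in for the result of Dir.glob.

```ruby
# Stand-in for Dir.glob("*.txt"); these file names are made up.
filelist = ["a.txt", "b.txt"]

# Same naming rule as line 14 of Fig.7: "bc" followed by the file name.
taskNames = filelist.map { |f| "bc" + f }

p taskNames
```

Either form produces an array that can be passed to 'tasks' as on line 24 of Fig.7.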

RuBLX: A Ruby-based Batch Language for Xgrid

A batch file includes

  • one or more job definitions, one or more task definitions and zero or more file definitions.
  • any Ruby code anywhere.

The order of definitions is not significant.

Fig.8, Fig.9, Fig.10 and Fig.11 show the syntax of RuBLX.

  • Nonterminal symbols are enclosed in '<' and '>'.
  • A pair of parentheses followed by a question mark, '(X)?', means that X is optional.
  • A pair of parentheses with a vertical bar inside, '(X|Y)', means either X or Y.
  • The order of specifications in the do-end block of each definition is not significant.

File definition

file <LOGICAL_FILE_NAME> do 
    agentPathName  <PATH_ON_AGENT> 
  ( localPathName  <PATH_ON_LOCAL> |<PARAM> .contents = <STRING> )
  ( isExecutable  <EXECUTABLE> ) ? 
end 

          Fig.8:  The syntax of a file definition

The syntax of Fig.8 is for a file definition.

  • <LOGICAL_FILE_NAME> is a logical file name.
  • <PATH_ON_AGENT> specifies a path of the file on an agent.
  • The contents of the file are specified either by <PATH_ON_LOCAL> or by <STRING>.
    • <PATH_ON_LOCAL> specifies the contents by a local file path.
    • <STRING> specifies the contents by a string.
  • <EXECUTABLE> is either true or false.

Task definition

task <LOGICAL_TASK_NAME> do 
     command  <PATH_OF_COMMAND> 
   ( arguments  <COMMAND_ARGUMENT_LIST> ) ? 
   ( environment  <ENVIRONMENT_HASH> ) ? 
   ( inputStream  <LOGICAL_FILE_NAME> ) ? 
   ( dependsOn  <LOGICAL_TASK_NAME_LIST> ) ? 
   ( refersTo  <LOGICAL_FILE_NAME_LIST> ) ? 
   ( inputFileMap  <INPUT_FILE_MAP_HASH> ) ? 
end 

         Fig.9: The syntax of a task definition

The syntax of Fig.9 is for a task definition.

  • <LOGICAL_TASK_NAME> is a logical task name.
  • <PATH_OF_COMMAND> specifies a path of the command.
  • <COMMAND_ARGUMENT_LIST> specifies the command line arguments by an array.
  • <ENVIRONMENT_HASH> specifies environment variables by a hash.
    • The keys are names of environment variables.
    • The values are values of the variables.
  • <LOGICAL_FILE_NAME> specifies the file supplied as standard input, by its logical file name.
  • <LOGICAL_TASK_NAME_LIST> is an array of the tasks that this task depends on.
  • <LOGICAL_FILE_NAME_LIST> is an array of the logical file names that this task will read.
  • <INPUT_FILE_MAP_HASH> is a hash that specifies, for this task only, the correspondence between file paths on agents and their contents.
    • The keys are file paths.
    • The values are logical file names.
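
As an illustration of the syntax of Fig.9, the following sketch defines two tasks, one with an environment hash and a dependency on the other. So that it runs under plain Ruby without rxgrid, it begins with a small stand-in that merely records each specification; the class TaskSpec, the hash TASKS and this version of the method task are hypothetical and are not rxgrid's implementation. The task name 'prepare' and the environment variable are likewise chosen only for illustration.

```ruby
# Hypothetical stand-in for what rxgrid provides, so that the fragment runs
# under plain Ruby. It only records each specification; NOT rxgrid's code.
class TaskSpec
  KEYS = %i[command arguments environment inputStream
            dependsOn refersTo inputFileMap].freeze

  attr_reader :spec

  def initialize
    @spec = {}
  end

  # One recording method per specification keyword of Fig.9.
  KEYS.each do |key|
    define_method(key) { |value| @spec[key] = value }
  end
end

TASKS = {}

def task(name, &block)
  recorder = TaskSpec.new
  recorder.instance_eval(&block)
  TASKS[name] = recorder.spec
end

# Task definitions written in the style of Fig.9.
bc_env = { "BC_LINE_LENGTH" => "0" } # variable chosen for illustration

task "prepare" do
  command    "/bin/echo"
  arguments  ["preparing"]
end

task "bc" do
  command      "/usr/bin/bc"
  arguments    ["-q", "bc_exp.txt"]
  environment  bc_env
  dependsOn    ["prepare"]
  refersTo     ["bc_exp.txt"]
end

p TASKS["bc"][:dependsOn]
```

Only the two task blocks correspond to what would appear in a batch file; everything above them exists solely to make the sketch self-contained.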

Job definition

job <LOGICAL_JOB_NAME>  do 
   id  ( <PREVIOUSLY_SUBMITTED_JOB_ID> | 
           jobId( <LOGICAL_JOB_NAME> , <MAP_FILE_PATH> ) ) 
end 

         Fig.10: The syntax of a job definition (case 1)

The syntax of Fig.10 is for a definition of a previously submitted job.

  • <LOGICAL_JOB_NAME> is a logical job name.
  • A previously submitted job's ID is specified either
    • by a job ID <PREVIOUSLY_SUBMITTED_JOB_ID> or
    • by a pair of a logical job name <LOGICAL_JOB_NAME> and a map file <MAP_FILE_PATH>.

job <LOGICAL_JOB_NAME>  do 
   ( mail  <MAIL_ADDRESS> ) ? 
   ( tasksMustStartSimultaneously  <TASKS_MUST_START_SIMULTANEOUSLY> ) ? 
   ( minimumTaskCount  <MINIMUM_TASK_COUNT> ) ? 
   ( dependsOnJobs  <LOGICAL_JOB_NAME_LIST> ) ? 
   ( files  <LOGICAL_FILE_NAME_LIST> ) ? 
     tasks  <LOGICAL_TASK_NAME_LIST> 
end 

          Fig.11: The syntax of a job definition (case 2)

The syntax of Fig.11 is for a definition of a job to be submitted.

  • <MAIL_ADDRESS> specifies an e-mail address to which an e-mail is sent when the job's status changes.
  • <TASKS_MUST_START_SIMULTANEOUSLY> is a boolean value that specifies whether the tasks must start simultaneously.
    • 'true' is for yes.
    • 'false' is for no.
    • The default value is 'true'.
  • <MINIMUM_TASK_COUNT> specifies the minimum number of tasks that must start at the same time.
    • The default value is 1.
  • <LOGICAL_JOB_NAME_LIST> is an array of the logical names of the jobs that this job depends on.
  • <LOGICAL_FILE_NAME_LIST> is an array of the logical file names used in the job.
  • <LOGICAL_TASK_NAME_LIST> is an array of the logical task names in the job.

Paper

Related Links