Shared Jobs

    Overview

    This article is the head of a series on Shared Jobs in Matillion ETL.

    Shared jobs allow you to bundle entire workflows into a single custom component and then use those custom components anywhere else in the project. Only orchestration jobs (and the transformation jobs they link to) can be shared in this manner. If a job calls another job via Run Orchestration or Run Transformation components, then all jobs will be included in the shared job. To this end, a shared job can include as many jobs as the user wishes.

    Within a shared job, the Run Orchestration component can't run an orchestration job that wasn't originally packaged inside the shared job. If you try this, the following error will occur:

    Parameter Validation Failure:
    Job Name - Could not find job with name [<JobName>]
    

    Therefore, all orchestration jobs required by the workflow should be included in the shared job.
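
    The rule above can be pictured as a lookup against the set of jobs packaged in the shared job. The following Python sketch is purely illustrative and is not Matillion ETL code; the function name find_unpackaged_calls and all job names are hypothetical.

```python
def find_unpackaged_calls(packaged_jobs, calls):
    """Return (caller, callee) pairs whose callee is not packaged.

    packaged_jobs: set of job names bundled into the shared job (hypothetical).
    calls: dict mapping a job name to the jobs it runs via
           Run Orchestration or Run Transformation.
    """
    missing = []
    for caller in sorted(packaged_jobs):
        for callee in calls.get(caller, []):
            if callee not in packaged_jobs:
                missing.append((caller, callee))
    return missing


# Hypothetical example: "load_epics" calls "notify_team", which was not packaged.
packaged = {"load_epics", "filter_epics"}
calls = {"load_epics": ["filter_epics", "notify_team"]}

for caller, callee in find_unpackaged_calls(packaged, calls):
    print(f"{caller}: Could not find job with name [{callee}]")
```

    Running a check like this before generating the shared job would flag any called job that still needs to be added to the package.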


    Creating a shared job

    To create a shared job, right-click on any job in the project's job list and click Generate Shared Job. The Generate Shared Job wizard will open, where you can configure your shared job.
    The job that you right-click on is the "root" job that the new shared job relates to.


    Shared job configuration

    The Generate Shared Job wizard contains three pages of configuration settings you are required to complete. On the first page, complete the following fields:

    • New Shared Job: Select either Create New to create an entirely new shared job, or Copy From Existing to create a shared job based on an already existing job.
    • Existing: If you selected Copy From Existing, you need to select an existing shared job on which to base the new shared job. Select the job from the drop-down list.
    • Package: The package name for the new shared job. Nested packages can be specified using a period (dot) to separate package names. For example, toplevel.nesteda.nestedb specifies that the package nestedb is inside package nesteda, which is inside package toplevel.
    • Name: A name for the new shared job. Names may contain letters, numbers, underscores, single spaces, parentheses, and hyphens. Jobs can have identical names if they're within different parts of the shared jobs resource tree.
    • Revision: The revision number of the shared job. When first creating the shared job, the default revision is 1.
    • Icon: Click Browse... to locate and select an icon for this shared job. The icon must be a .png file. Matillion ETL provides a default icon if you don't select one here.
    • Description: A meta description of the shared job. This description will be visible in the component's Help tab. Simple HTML can be used for formatting.
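
    As a rough illustration of the Package and Name rules above, the Python sketch below splits a dotted package path into its nesting levels and checks a candidate name. It is not Matillion ETL's own validation logic: the helper names are hypothetical, "letters" is assumed to mean ASCII letters, and "single spaces" is read as "no leading, trailing, or doubled spaces".

```python
import re

# Assumed reading of the naming rule: runs of allowed characters separated by
# single spaces, with no leading, trailing, or doubled spaces.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_()\-]+(?: [A-Za-z0-9_()\-]+)*")


def is_valid_name(name):
    """Return True if the name uses only the characters the Name field allows."""
    return _NAME_PATTERN.fullmatch(name) is not None


def split_package(package):
    """Split a dotted package path into its nesting levels, outermost first."""
    return package.split(".")


print(split_package("toplevel.nesteda.nestedb"))
print(is_valid_name("Epics (v2)"))
print(is_valid_name("Epics  v2"))  # doubled space
```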

    When you have completed these fields, click Next to move to the second page of the dialog and complete these fields:

    • Root Job: The root job is the job that the shared job originates from. By default, the job that was right-clicked to open the Generate Shared Job dialog is set as the root job. You can change this by selecting a different job from the drop-down menu. There can only be one root job.
    • Additional Jobs: Jobs listed here will be packaged within the shared job. This list should include any jobs that the root job is expected to call. Any jobs called using the Run Orchestration or Run Transformation components within the root job are added to this list by default. Jobs that may be called dynamically from the root job should also be included but must be added manually.
      • Click + to add a job.
      • Click Clear to remove a job from the list.
      • Click Auto to populate the list with explicitly called jobs.
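
    One way to think about how the Additional Jobs list is populated is as a walk over the jobs that are explicitly called, starting from the root job. The Python sketch below is a conceptual model only, not the wizard's actual implementation. The job names are hypothetical, and following calls through nested jobs is an assumption based on the statement that all called jobs are included in the shared job.

```python
from collections import deque


def collect_called_jobs(root_job, explicit_calls):
    """Return jobs reachable from root_job through explicit calls, excluding the root."""
    seen = {root_job}
    order = []
    queue = deque([root_job])
    while queue:
        job = queue.popleft()
        for callee in explicit_calls.get(job, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append(callee)
    return order


# Hypothetical call graph: job name -> jobs it calls explicitly.
explicit_calls = {
    "load_epics": ["filter_epics", "audit"],
    "audit": ["write_log"],
}
additional_jobs = collect_called_jobs("load_epics", explicit_calls)

# Jobs chosen at run time (for example through a variable) are invisible to this
# kind of static walk, so they have to be appended by hand.
additional_jobs.append("dynamic_cleanup")
print(additional_jobs)
```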

    The warning "Contained Shared Jobs will not be bundled" applies to the process of exporting shared jobs. When you export a shared job, the jobs listed in this warning won't be exported along with it. You should therefore export all referenced jobs separately, to individual .melt files. These exported files should all be shipped and imported along with the main job. For a full explanation of how shared jobs are exported to .melt files, read Manage Shared Jobs.

    When you have completed these fields, click Next to move to the third page of the dialog, Parameter Configuration.

    When you use a shared job, it includes a set of configurable parameters that are dictated by the root job's job variables. These job variables populate the Parameter Configuration list, and include the following properties:

    • Name: The name of this job variable as it exists in the root job.
    • Display Name: The name of the parameter created from this job variable. You can edit this to give the parameter a more meaningful or user-friendly name.
    • Required: This setting can be enabled or disabled for each parameter. All parameters are required by default. When a shared job is created and used, all required parameters must have a value set for the shared job to validate successfully, but parameters not marked as "required" can remain empty. For shared jobs that pre-date this feature, all parameters are assumed to be required.
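
    The Required setting can be modelled as a simple check over the parameter list. The Python sketch below only illustrates the rule described above and is not how Matillion ETL validates components. The Parameter class, the missing_required helper, and the parameter names are hypothetical, and treating an empty string as "no value" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str              # job variable name in the root job
    display_name: str      # label shown to the user of the shared job
    required: bool = True  # parameters are required by default


def missing_required(parameters, values):
    """Return display names of required parameters that have no value set."""
    return [
        p.display_name
        for p in parameters
        if p.required and values.get(p.name) in (None, "")
    ]


params = [
    Parameter("table_name", "Target Table"),
    Parameter("row_limit", "Row Limit", required=False),
]

# The optional parameter may stay empty; the required one may not.
print(missing_required(params, {"row_limit": ""}))
print(missing_required(params, {"table_name": "jira_results"}))
```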

    When you have completed the configuration wizard, click OK to create the shared job. Click Back to return to the previous dialog pages. Click Cancel to terminate the shared job setup.


    Using shared jobs

    Created shared jobs are available from the Shared Jobs panel in the lower-left of the Matillion ETL UI. Right-click on a shared job to open a menu with the following options:

    • Open Shared Job: Opens the shared job so that you can navigate through the jobs and components contained within. This is a read-only view of the shared job.
    • Unpack: Unpacks the constituent jobs that were included when the shared job was generated. This shows you what orchestration and transformation jobs are included in the shared job. You can edit these as required, and then re-generate the shared job.
    • Export: Opens the Export Shared Jobs dialog, which is described in Manage Shared Jobs.


    Exporting variables

    The variables used inside shared jobs can be exported. To do this:

    • Create one or more job variables. To do this, right-click on the component in the job canvas, and click Manage Job Variables. For more information, read Job Variables.
    • Create the shared job. You may want to include these variables as parameters when you configure the job.
    • Create a set of job variables in the job that will use the newly created shared job (the calling job). You will be using these to store the exported variables, so ensure they're of the expected data types.
    • Click on the shared job in the job canvas, select the Export tab, and then click Edit.
    • Take the variables from the shared job in the Source column, and map them to variables from the current (calling) job in the Target Variable column.
    • Click OK.
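
    The mapping step above amounts to copying each exported value into a variable of the calling job. The Python sketch below is a conceptual model under stated assumptions, not Matillion ETL's export mechanism: the function and variable names are hypothetical, and the data type check simply compares each value against the type of the target variable's current value.

```python
def apply_export_mapping(exported, mapping, calling_job_variables):
    """Copy exported values into a copy of the calling job's variables.

    exported: dict of source name -> value produced by the shared job.
    mapping: dict of source name -> target variable name in the calling job.
    calling_job_variables: dict of target variable name -> current value.
    """
    updated = dict(calling_job_variables)
    for source, target in mapping.items():
        if target not in updated:
            raise KeyError(f"Target variable {target!r} does not exist in the calling job")
        value = exported[source]
        expected_type = type(updated[target])
        if not isinstance(value, expected_type):
            raise TypeError(
                f"{source!r} is {type(value).__name__}, "
                f"but {target!r} expects {expected_type.__name__}"
            )
        updated[target] = value
    return updated


# Hypothetical values: the Source column on the left, Target Variable on the right.
exported = {"rows_written": 42, "status": "Done"}
mapping = {"rows_written": "epic_rows", "status": "epic_status"}
calling_job_variables = {"epic_rows": 0, "epic_status": ""}

print(apply_export_mapping(exported, mapping, calling_job_variables))
```

    The type check mirrors the advice to create target variables of the expected data types before mapping them.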

    Read Component Exports for more information on the export process.


    Parallel transformation jobs

    We recommend using shared jobs to run transformation jobs in parallel on high availability (HA) clusters. When multiple instances of the same transformation job run at the same time on an HA cluster, the values passed to those transformation jobs will not reflect those used by the job, which results in unexpected behaviour. For more information, see Designing a Job for a High Availability Cluster.

    Shared jobs are unique entities on each execution, so a shared job can be run many times in parallel on an HA cluster without issue.

    Package any transformation jobs as shared jobs if you want to run them in parallel, or if they are at risk of being run in parallel, on an HA cluster.
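
    The difference between instances that share variable values and instances that each hold their own can be shown with a toy model. The Python sketch below is not a description of how Matillion ETL or an HA cluster works internally; it is a deterministic simulation, with hypothetical names, of two runs that both set a table_name value before either of them reads it.

```python
def run_interleaved(scopes, values):
    """Simulate instances that all set a variable before any of them reads it."""
    for scope, value in zip(scopes, values):
        scope["table_name"] = value
    return [scope["table_name"] for scope in scopes]


# Two instances writing to one shared set of variables.
shared_scope = {}
shared_result = run_interleaved([shared_scope, shared_scope], ["epics_a", "epics_b"])

# Two instances that each get their own set of variables.
isolated_result = run_interleaved([{}, {}], ["epics_a", "epics_b"])

print(shared_result)    # both instances see the last value written
print(isolated_result)  # each instance keeps its own value
```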


    Example

    In this example, we will demonstrate how to set up a shared job. This job will pull data out of Jira, transform the data using a filter, and display a new set of results based on our filter conditions. The filtered results will then be written back to the original table. The orchestration and transformation jobs will reference the same table using a job variable.

    Here we have a Matillion ETL orchestration job that loads data using the Jira Query component. We will use the Epics data source, and choose all available items for the data selection. The Epics data will be loaded into the target table called "jira_results". To use the Run Transformation component, a transformation job has been created. The transformation job's name must be referenced in the Run Transformation component to connect the orchestration and transformation jobs. Now that all the required components are in place, Run Job can be selected on the job canvas.

    [Image: Orchestration Job]

    When we Sample the data, the table displays the following results:

    [Image: Table Results]

    In the Transformation job, we will use a Job Variable called table_name to store the name of the table "jira_results", mentioned earlier.

    [Image: Job Variable]

    We will use the Table Input component to read columns from the "jira_results" table, by using the table_name job variable.

    [Image: Transformation Job]

    We will apply a Filter to the "jira_results" table that sets conditions to only show Epics that have been completed. To do this, the "Done" table header will be included in the filter conditions.

    [Image: Applying a Filter]

    The Table Output component will store the filtered data in the original table by using the table_name variable again, as the target table.

    [Image: Updated Results]

    When we Sample the data, the table displays the following filtered results:

    [Image: New Table]
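
    The transformation job described above reads a table named by the table_name job variable, keeps only completed Epics, and writes the result back to the same table. The Python sketch below mimics that flow in memory so the logic is easy to follow. It is not Matillion ETL code: the sample rows, the "status" column, and the resolve helper are hypothetical, and the ${table_name} placeholder is used here only to illustrate how a variable reference resolves to a value.

```python
from string import Template

# Hypothetical in-memory stand-in for the warehouse: table name -> list of rows.
tables = {
    "jira_results": [
        {"key": "EPIC-1", "status": "Done"},
        {"key": "EPIC-2", "status": "In Progress"},
        {"key": "EPIC-3", "status": "Done"},
    ]
}
job_variables = {"table_name": "jira_results"}


def resolve(text, variables):
    """Replace ${name} placeholders with job variable values."""
    return Template(text).substitute(variables)


# Table Input: read the table named by the job variable.
source = resolve("${table_name}", job_variables)
rows = tables[source]

# Filter: keep only Epics that have been completed.
done = [row for row in rows if row["status"] == "Done"]

# Table Output: write the filtered rows back to the same table.
target = resolve("${table_name}", job_variables)
tables[target] = done

print([row["key"] for row in tables["jira_results"]])
```

    Because both the input and the output resolve the same variable, changing table_name in one place redirects the whole job to a different table.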

    We will now package this job into a Shared Job. To do this, refer to Creating a shared job.

    Note

    To package a Transformation job, you must include an Orchestration job in your shared job.

    Begin by right-clicking on the Orchestration job in the jobs panel, and select Generate Shared Job. In the Generate Shared Job wizard, we will enter details as below:

    [Image: Creating Shared Job]

    In the next step of the wizard, select the root job that this shared job is based on. This will be our orchestration job. The page will also display the transformation job in the Additional Jobs section, which lists any jobs packaged within our shared job in addition to the root job.

    In the final step of the wizard, you'll see the Parameter Configuration page. This will be blank because we didn't add any job variables to the orchestration job (the root job). Any job variables defined in the root job would be displayed here. Click OK to generate the shared job.

    The "Epics" shared job will appear in the Shared Jobs panel in our Matillion ETL instance. For this example we want to open our shared job in a read-only format. Right-click it, and click Open Shared Job to view our Orchestration and Transformation jobs:

    [Image: Opening Shared Job]