- Log in to the Azure portal at https://portal.azure.com ➢ navigate to the Azure Synapse Analytics workspace you created in Exercise 3.3 ➢ click the Open link in the Open Synapse Studio tile on the Overview blade ➢ select the Integrate hub ➢ expand the Pipelines group ➢ and then select the pipeline created in Exercise 6.2 (for example, TransformSessionFrequencyToMedian).
- Drag and drop a Validation activity from the General group ➢ change the name of the activity (I used Validate Brainwaves) ➢ on the Settings tab for the Validation activity ➢ select the dataset used in Exercise 6.9 (for example, BrainwavesJson) ➢ select the True radio button to the right of Child Items ➢ connect the Validation activity to the Custom activity ➢ and then click Commit. The result should resemble Figure 6.63.

FIGURE 6.63 Validate batch loads with the Validation activity
- Select the Publish menu item ➢ click the Add Trigger button ➢ select Trigger Now ➢ and then click OK.
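If you open the code view for the pipeline, the steps above should produce something close to the following. This is a minimal sketch expressed as Python dictionaries, not the exact JSON that Synapse Studio generates. The activity and dataset names come from this exercise, while CustomActivityName is a hypothetical placeholder for whatever name your Custom activity has.

```python
import json

# Approximate shape of the Validation activity in the pipeline definition.
validation_activity = {
    "name": "Validate Brainwaves",
    "type": "Validation",
    "typeProperties": {
        "dataset": {
            "referenceName": "BrainwavesJson",
            "type": "DatasetReference",
        },
        # Corresponds to the True radio button next to Child Items.
        "childItems": True,
    },
}

# Connecting the Validation activity to the Custom activity places a
# dependency on the Custom activity. "CustomActivityName" is a hypothetical
# placeholder; other properties of the Custom activity are omitted.
custom_activity = {
    "name": "CustomActivityName",
    "type": "Custom",
    "dependsOn": [
        {
            "activity": validation_activity["name"],
            "dependencyConditions": ["Succeeded"],
        }
    ],
}

print(json.dumps([validation_activity, custom_activity], indent=2))
```

The Succeeded dependency condition is what makes the Custom activity wait for a successful validation before it runs.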
The dataset from Exercise 6.9 that you selected for the validation points to the same location as the inputLocation parameter passed to the Custom activity. The specific location is as follows:
brainjammer/SessionJson/ClassicalMusic/POW
This is intentional, because the purpose of the Validation activity is to confirm that files exist in the targeted location. Selecting the True radio button, which is a property of Child Items, instructs the Validation activity to check that the targeted directory contains files. Selecting Ignore checks only that the folder exists. Selecting False confirms that the folder exists and is empty. Because there are files in the target location defined by the dataset, the validation succeeds, which means the pipeline run continues. Had the validation failed, the pipeline run would have stopped without proceeding to the Custom activity, which is next. However, as shown in Figure 6.64, you can run other activities when the validation results in a failure.
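To make the three Child Items settings concrete, here is a minimal sketch that mimics them against a local temporary folder instead of an ADLS container. The validate_folder function and the session.json file name are hypothetical and exist only for illustration; the Validation activity itself performs this check against the dataset location.

```python
import tempfile
from pathlib import Path
from typing import Optional


def validate_folder(folder: Path, child_items: Optional[bool]) -> bool:
    """Local stand-in for the Child Items settings.

    child_items=True  -> folder must exist and contain at least one item
    child_items=False -> folder must exist and be empty
    child_items=None  -> Ignore: folder must exist, contents are not checked
    """
    if not folder.is_dir():
        return False
    if child_items is None:
        return True
    has_items = any(folder.iterdir())
    return has_items == child_items


with tempfile.TemporaryDirectory() as root:
    pow_dir = Path(root) / "brainjammer" / "SessionJson" / "ClassicalMusic" / "POW"
    pow_dir.mkdir(parents=True)

    # Folder exists but is empty.
    empty_results = [validate_folder(pow_dir, s) for s in (True, False, None)]

    # Folder now contains one (hypothetical) file.
    (pow_dir / "session.json").write_text("{}")
    loaded_results = [validate_folder(pow_dir, s) for s in (True, False, None)]

print(empty_results)   # [False, True, True]
print(loaded_results)  # [True, False, True]
```

Each list holds the outcome for True, False, and Ignore, in that order. Only the True setting distinguishes a folder that is ready for processing from one that merely exists.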

FIGURE 6.64 Validate batch loads with Validation activity failure
You can better manage pipeline activities by adding a failure path that links to other activities. For example, if the files do not exist in the inputLocation, which is a pipeline parameter, you could use an activity like the Lookup activity to try to find the necessary files to process. (The next section discusses this further.) Other configuration options for the Validation activity include Sleep, Timeout, and Minimum Size. The Sleep value has a default of 10 seconds: if an attempt to validate the existence of the files fails (for example, because the ADLS container is not accessible), the next attempt happens 10 seconds later. The Timeout value, which has a default of 12 hours, is the timeframe after which the Validation activity stops trying to perform its task. Finally, for CSV files, there is an additional option that is not provided for JSON or Parquet files: Minimum Size. As you might have guessed, you can set this value to ensure that the file to be ingested or transformed is larger than a specific size. The default is zero bytes.
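The way Sleep and Timeout interact can be sketched as a simple retry loop. This is an illustration of the behavior described above, not the service's implementation. The run_validation function and its parameters are hypothetical, and a fake clock stands in for real waiting so that the example finishes immediately.

```python
import time
from typing import Callable


def run_validation(
    check: Callable[[], bool],
    sleep_seconds: float = 10,              # Sleep default described above
    timeout_seconds: float = 12 * 60 * 60,  # Timeout default of 12 hours
    clock: Callable[[], float] = time.monotonic,
    sleeper: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry check() every sleep_seconds until it passes or time runs out."""
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True   # validation succeeded, the pipeline continues
        if clock() + sleep_seconds > deadline:
            return False  # no time left for another attempt
        sleeper(sleep_seconds)


# Simulated run: the files "appear" on the third attempt.
fake_now = [0.0]
attempts = []


def fake_sleep(seconds: float) -> None:
    fake_now[0] += seconds


def files_exist() -> bool:
    attempts.append(fake_now[0])
    return len(attempts) >= 3


succeeded = run_validation(files_exist, clock=lambda: fake_now[0], sleeper=fake_sleep)
print(succeeded, attempts)  # True [0.0, 10.0, 20.0]
```

In the simulated run, the first two attempts fail, each followed by a 10-second sleep, and the third attempt succeeds. A return value of False corresponds to the failure path, where you would route to a recovery activity.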
Lookup Pipeline Activity
The Lookup pipeline activity is used to retrieve content from a configuration file or table. This capability can be helpful in scenarios like the one mentioned in the previous section, where a Validation activity fails and you want to recover. To recover, you might provide alternative options in a configuration file that allow the pipeline run to move forward. To get an idea of how to achieve this, complete Exercise 6.13.