In this post we will see how Apache NiFi can be used to handle blobs (files) in Azure Blob Storage.
We will go through:
- Creating a Storage account in Microsoft Azure
- Creating a NiFi template that downloads files/blobs from Azure Blob Storage
- Creating a NiFi template that uploads files to Blob Storage
- Creating a NiFi template that deletes blobs/files from Blob Storage
Prerequisites:
You are expected to have a basic knowledge of:
- Microsoft Azure
- Cloud computing
- Apache NiFi
- Data pipeline
Steps:
1. Create the Storage Account
Go to the Azure portal home page: https://portal.azure.com/?quickstart=True#home
Click on Storage accounts.
If you don’t have a storage account yet, click the Add button to create one. The figures below are for your reference:
Now go to the Storage Account portal: https://portal.azure.com/?quickstart=True#blade/HubsExtension/BrowseResource/resourceType/Microsoft.Storage%2FStorageAccounts
Click on the storage account name
Under the “Blob service” section, click on Containers.
Click the + Container button to create a new container.
Give it a name and leave the other settings at their defaults.
Now click on the Create button at the bottom.
Now click on the container name.
Click on the Upload button to upload a file.
Now you need to get the access keys.
Under the Settings section, click on Access Keys.
Now click on the Show Keys button.
Note down the Key (not the Connection string) under key1.
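The Key and the Connection string are easy to mix up when copying them from the portal. The following is a minimal Python sketch, not part of the NiFi setup, that checks whether a copied value looks like a bare key and shows how the blob endpoint is derived from the account name. The function names are hypothetical, and the endpoint suffix core.windows.net is an assumption that holds for the public Azure cloud only.

```python
import base64
import binascii


def is_account_key(value: str) -> bool:
    """Return True if `value` looks like a bare account key (plain base64),
    and False if it looks like a connection string or is not valid base64."""
    if not value or ";" in value or "AccountKey=" in value:
        return False  # connection strings are ';'-separated key=value pairs
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def blob_endpoint(account_name: str, suffix: str = "core.windows.net") -> str:
    """Blob service URL for an account (suffix assumed: public Azure cloud)."""
    return f"https://{account_name}.blob.{suffix}"
```

If is_account_key returns False for the value you noted down, you most likely copied the Connection string field instead of the Key field.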
2. NiFi template for downloading the blobs from a container
Go to the NiFi interface.
Create a ListAzureBlobStorage processor.
Configure it with the following properties:
Container Name: the name of the container you created in the previous step
Storage Credentials:
Create a new controller service of type “AzureStorageCredentialsControllerService”.
Its properties are: Storage Account Name (the account you created earlier) and Storage Account Key (the key you noted down in the previous step).
Now enable the controller service.
Create a FetchAzureBlobStorage processor with the same properties as the ListAzureBlobStorage processor.
Create a PutFile processor with the following properties:
Directory: /tmp/azure/success (you can change this according to your requirement)
Leave the remaining properties at their defaults.
Create a second PutFile processor with the following properties. This one is only there to track whether anything goes wrong:
Directory: /tmp/azure/fail (you can change this according to your requirement)
Leave the remaining properties at their defaults.
Now the NiFi template should look like below:
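The logic of this flow (list the blobs, fetch each one, write it to a success or failure directory) can be approximated outside NiFi, which is handy for checking the credentials independently. This is a rough Python sketch, assuming the azure-storage-blob package (v12) is installed. The function names open_container and download_all are hypothetical, and writing a small .error marker file is only an approximation of what the failure PutFile receives in NiFi.

```python
from pathlib import Path


def open_container(account_name, account_key, container_name):
    """Build a client for one container (assumes `pip install azure-storage-blob`)."""
    from azure.storage.blob import ContainerClient  # imported lazily on purpose

    return ContainerClient(
        account_url=f"https://{account_name}.blob.core.windows.net",
        container_name=container_name,
        credential=account_key,
    )


def download_all(container, success_dir="/tmp/azure/success", fail_dir="/tmp/azure/fail"):
    """List every blob, fetch it, and write it under success_dir.
    Blobs that cannot be fetched get a marker file under fail_dir."""
    success, fail = Path(success_dir), Path(fail_dir)
    success.mkdir(parents=True, exist_ok=True)
    fail.mkdir(parents=True, exist_ok=True)
    fetched, failed = [], []
    for blob in container.list_blobs():
        local_name = blob.name.replace("/", "_")  # flatten virtual folders
        try:
            data = container.download_blob(blob.name).readall()
        except Exception as exc:  # broad on purpose: mirrors a "failure" route
            (fail / (local_name + ".error")).write_text(str(exc))
            failed.append(blob.name)
        else:
            (success / local_name).write_bytes(data)
            fetched.append(blob.name)
    return fetched, failed
```

A call such as download_all(open_container("<account>", "<key>", "<container>")) would then mirror the template, with the placeholders replaced by your own values.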
3. NiFi template to upload files to a container
Get the following processors and connect them.
Properties of GetFile:
Properties of PutAzureBlobStorage processor:
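As a point of comparison for the GetFile to PutAzureBlobStorage pair, the same upload can be sketched in Python. This is a sketch under assumptions: container is taken to be an azure.storage.blob.ContainerClient (v12) created elsewhere, and the function name upload_dir is hypothetical. Each file's name is used as the blob name.

```python
from pathlib import Path


def upload_dir(container, input_dir, overwrite=True):
    """Upload every regular file in input_dir, using the file name as blob name.
    `container` is assumed to be an azure.storage.blob.ContainerClient."""
    uploaded = []
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        with path.open("rb") as handle:
            container.upload_blob(name=path.name, data=handle, overwrite=overwrite)
        uploaded.append(path.name)
    return uploaded
```

With overwrite=True an existing blob of the same name is replaced; pass overwrite=False if you would rather get an error in that case.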
4. NiFi template to delete files/blobs from a container
Create the following NiFi processors:
Properties of ListAzureBlobStorage processor:
Properties of DeleteAzureBlobStorage processor:
Here I am deleting only one blob (i.e. Image 2.png). But you can leave the Blob property at its default value to delete all the listed blobs.
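The difference between deleting a single named blob and deleting everything that was listed can be illustrated with a short Python sketch. As before, container is assumed to be an azure.storage.blob.ContainerClient (v12) created elsewhere, and delete_blobs and its only parameter are hypothetical names used for illustration.

```python
def delete_blobs(container, only=None):
    """Delete listed blobs from the container.
    only=None deletes every listed blob; otherwise only the blob whose
    name equals `only` is deleted. Returns the names that were deleted."""
    deleted = []
    # Materialise the listing first so we do not delete while iterating it.
    for blob in list(container.list_blobs()):
        if only is not None and blob.name != only:
            continue
        container.delete_blob(blob.name)
        deleted.append(blob.name)
    return deleted
```

Calling delete_blobs(container, only="Image 2.png") corresponds to the single-blob case described above, while delete_blobs(container) removes every listed blob, so use it with care.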
Do you have a step-by-step video guide? Please share the link.
Thanks Chinmaya for the detailed explanation.
I followed the above steps up to ListBlobStorage, and everything validated up to that processor.
When I created FetchBlobStorage with the same configuration as ListBlobStorage, validation failed: it asks for "SAS token" and "Common Storage Account Endpoint Suffix", so I am unable to proceed further.
I have even generated the SAS token:
Container --> Settings --> Shared Access token --> Generate SAS Token and URL.
Can you assist here further?
Your snapshots are not visible, please check.
The images on this post are not visible.