top of page
  • Writer's pictureJeff Taylor

Read JSON Files Directly From Azure Data Lake in Azure SQL Server

Updated: May 17, 2023

Loading JSON files with an on-premise SQL Server is pretty simple. You point to the local path file and load. However, it's more involved with Azure SQL and Azure Data Lake. Let's run through the steps to directly load an Azure Data Lake file into an Azure SQL Database.


Create A Database Master Key

We will need to create a strong, complex password to use to create the master key. If you don't feel all that creative, you can use a website (https://passwordsgenerator.net/) that generates random strong passwords.


Connect to your Azure SQL Server Database using SSMS or Azure Data Studio. Use the following command after editing your complex password.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Complex Password>'
Create Database Scoped Credential

We will need to create a scoped credential that allows SQL Server to Authenticate with an Azure Data Lake. In addition, we will need to obtain a Shared Access Key from Azure Data Lake.


Navigate to https://portal.azure.com and select your storage account and the container you wish to access. Then under 'Settings', select 'Shared Access Tokens', and at the bottom of the screen, click 'Generate SAS token and URL. IMPORTANT NOTE: be sure and set the expiration date well in advance so the token does not expire.


Next, copy the Blob SAS token by using the copy icon on the right, paste it into the script below, and name your credential to a recognizable name. Then execute the script.

CREATE DATABASE SCOPED CREDENTIAL <AzureDataLakeCredentalName>
    WITH
        IDENTITY = 'SHARED ACCESS SIGNATURE',
        SECRET = '<Blob SAS token>';
Create External Data Source

Next, we need to create an external data source connection to the Azure Data Lake container using the credentials we just created.


Select a recognizable name for the data source, then put in the URL to your Azure Data Lake and container. Next, set in the credential name and then execute the script.

CREATE EXTERNAL DATA SOURCE <data source name>
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://<ADL>.blob.core.windows.net/<container>',
CREDENTIAL = <AzureDataLakeCredentalName>);
Selecting From JSON File

Now that the external data source is created, we can select from a file on Azure Data Lake. First, replace the filename with the file you wish to open and then adjust the data source name to the above created name.

SELECT
    *
FROM
    OPENROWSET(BULK '<filename>', SINGLE_CLOB, DATA_SOURCE = '<data source name>') AS j;
Dynamic Loading

Using a variable for the file name doesn't work, so to dynamically pass a file name for loading, we must create a dynamic script to execute.


Below is an example of a stored procedure that dynamically uses a file name passed to load the JSON file into a table.

CREATE PROCEDURE dbo.LoadJSONFile
(
	@FileName VARCHAR(200)
)
AS
    BEGIN
		SET NOCOUNT ON;
        DECLARE @SQLString NVARCHAR(MAX);

        SET @SQLString = N'
        INSERT INTO stg.JSONFiles (FileName, JSONData)
	    SELECT 
			''' +   @FileName + N''', *
	    FROM
			OPENROWSET(BULK ''' + @FileName + N''', 
			SINGLE_CLOB, DATA_SOURCE = ''datasourcename'') 
	    WHERE 
			NOT EXISTS (SELECT 
							1 
					   FROM 
							stg.JSONFiles 
					   WHERE 
							FileName = ''' + @FileName + N'''';
        EXECUTE sys.sp_executesql
            @SQLString;

    END;

128 views

Recent Posts

See All

Comments


bottom of page