Data Discovery Connector SDK

Overview

The Data Discovery Connector SDK provides a client for the worker node services. A custom scanner uses it to retrieve custom scan jobs, report their status, and submit catalog and classification data for them.

Getting Access to the SDK

Please reach out to your OneTrust representative or the support team to obtain the Data Discovery Connector SDK files.

Initialize SDK - OTDataDiscoveryBridge Client

UUID datasourceId;     // UUID of the Custom Scanner datasource
String bridgeBaseUrl;  // Worker Node Ingress URL to Data Discovery Connector/Bridge
String tokenUri;       // Cloud Environment base URL
String clientId;       // Client ID from Client Credential with Data Discovery scope
String clientSecret;   // Client Secret from Client Credential with Data Discovery scope

IdentityServiceConfiguration identityServiceConfiguration =  
        IdentityServiceConfiguration.builder()  
                .url(new URL(tokenUri))  
                .build();

ClientCredentialAuthProvider authProvider =  
        new ClientCredentialAuthProvider(clientId, clientSecret, identityServiceConfiguration);

OTDataDiscoveryConfiguration otDataDiscoveryConfiguration =  
        OTDataDiscoveryConfiguration.builder()  
                .datasourceId(datasourceId)  
                .url(new URL(bridgeBaseUrl))  
                .build();

OTDataDiscoveryBridge bridge = new OTDataDiscoveryBridge(authProvider, otDataDiscoveryConfiguration);

Running Scans

Wait for the Job to be Initiated

Scan jobs are initiated from the OneTrust application for the configured data source. The custom scanner should run a polling thread that periodically checks for pending jobs.

List<JobDetails> pendingJobs = bridge.getJobs();

If there is a pending job, the list will be non-empty.
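The polling thread described above could be sketched as follows. This is a minimal sketch built only on the JDK scheduler, not SDK code: PendingJobPoller, pollOnce, start, and the intervalSeconds parameter are hypothetical names introduced for illustration, and the job source is passed in as a generic Supplier so the sketch makes no assumptions about the SDK beyond the getJobs() call shown above.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the SDK): polls a job source on a fixed delay. */
final class PendingJobPoller {

    private PendingJobPoller() {}

    /** Runs one polling pass and returns how many jobs were handed to the handler. */
    static <J> int pollOnce(Supplier<List<J>> jobSource, Consumer<J> jobHandler) {
        List<J> pending = jobSource.get();
        if (pending == null || pending.isEmpty()) {
            return 0; // nothing pending; wait for the next pass
        }
        pending.forEach(jobHandler);
        return pending.size();
    }

    /** Schedules pollOnce on one background thread; call shutdown() on the result to stop. */
    static <J> ScheduledExecutorService start(Supplier<List<J>> jobSource,
                                              Consumer<J> jobHandler,
                                              long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pollOnce(jobSource, jobHandler);
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future passes, so report and continue.
                System.err.println("Polling pass failed: " + e.getMessage());
            }
        }, 0, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

With the SDK, the call might look like PendingJobPoller.start(bridge::getJobs, this::handleJob, 30), where handleJob is a hypothetical method of the custom scanner and the method reference assumes that getJobs() declares no checked exceptions.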

Start/Complete/Fail Job

public void startJob(JobDetails jobDetails) {
    bridge.startJob(jobDetails);
}

public void completeJob(JobDetails jobDetails) {
    bridge.completeJob(jobDetails);
}

public void failJob(JobDetails jobDetails, String errorMessage) {
    bridge.failJob(jobDetails, errorMessage);
}
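One way to tie the three calls together is to start the job, run the scan, and then report either completion or failure. The sketch below illustrates that ordering under stated assumptions: JobLifecycle and ScanJobRunner are hypothetical names that stand in for the bridge calls above, and the sketch assumes that scan errors surface as unchecked exceptions.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for the start/complete/fail calls shown above. */
interface JobLifecycle<J> {
    void start(J job);
    void complete(J job);
    void fail(J job, String errorMessage);
}

/** Hypothetical helper: reports exactly one terminal status per job. */
final class ScanJobRunner {

    private ScanJobRunner() {}

    /** Returns true if the scan body finished and the job was reported complete. */
    static <J> boolean run(J job, JobLifecycle<J> lifecycle, Consumer<J> scanBody) {
        lifecycle.start(job);
        try {
            scanBody.accept(job); // catalog and classify work would happen here
        } catch (RuntimeException e) {
            lifecycle.fail(job, String.valueOf(e.getMessage()));
            return false;
        }
        // Kept outside the try block so a failure while reporting completion
        // is not followed by a second, contradictory fail report.
        lifecycle.complete(job);
        return true;
    }
}
```

In a custom scanner, the JobLifecycle implementation would simply delegate to bridge.startJob, bridge.completeJob, and bridge.failJob.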

Catalog Data

Build a MetadataRequest for each level of the hierarchy (database, catalog, table, column). Each snippet below is independent.

// Create DatabaseMetadata
DatabaseMetadata dbMetadata =
        DatabaseMetadataRequestsBuilder.databaseMetadataBuilder(
                ownerName,
                productFamily,
                productName,
                productVersion,
                databaseName,
                childrenCount);
// Create MetadataRequest for Database
MetadataRequest metadataRequest =
        DatabaseMetadataRequestsBuilder.builder(
                jobId,
                datasourceId,
                dbMetadata);

// Create CatalogOrSchemaMetadata
CatalogOrSchemaMetadata catalogMetadata =
        CatalogMetadataRequestsBuilder.catalogMetadataBuilder(
                catalogName,
                version);
// Create MetadataRequest for Catalog
MetadataRequest metadataRequest =
        CatalogMetadataRequestsBuilder.builder(
                jobId,
                datasourceId,
                databaseName,
                List.of(databaseName),
                EntityType.DATABASE,
                List.of(catalogMetadata));

// Create TableMetadata
TableMetadata tableMetadata =
        TableMetadataRequestBuilder.prepareTableMetadata(
                remarks,
                tableName,
                numRows,
                size);
// Create MetadataRequest for Tables
MetadataRequest metadataRequest =
        TableMetadataRequestBuilder.builder(
                jobId,
                datasourceId,
                databaseName,
                catalogName,
                EntityType.CATALOG,
                List.of(tableMetadata));

// Create ColumnMetadata
ColumnMetadata columnMetadata =
        ColumnMetadataRequestBuilder.columnMetadataBuilder(
                columnName,
                avgColumnSize,
                columnIndex,
                dataType);
// Create MetadataRequest for Columns
MetadataRequest metadataRequest =
        ColumnMetadataRequestBuilder.builder(
                jobId,
                datasourceId,
                tableName,
                EntityType.TABLE, // Parent EntityType
                List.of(databaseName, catalogName, tableName), // XPath for Parent
                List.of(columnMetadata));

Build MetadataRequests and Send

bridge.catalog(jobDetails,
        MetadataRequests.builder().metadataRequestList(List.of(metadataRequest)).build());

Classify Data

Extract data from the source.

Build a ContentDetails object for the extracted data. For structured data sources, build a single ContentDetails per column.

private void processBatch(List<List<String>> rowCache, // Rows of data; each row is a List of Strings, one per column
                          boolean hasMoreData,
                          List<ColumnMetadata> columnMetadata, // ColumnMetadata ordered by column index
                          BaseMetadata table, // TableMetadata for the extracted table
                          JobDetails jobDetails,
                          long count, // Count of all rows extracted so far
                          List<String> parentXPath) {
    List<ContentDetails> contentDetailsList = new ArrayList<>();
    // For each column, create one ContentDetails object that collects that column's
    // value from every row in the batch (null values become empty strings)
    IntStream.range(0, columnMetadata.size())
            .forEach(i -> contentDetailsList.add(ContentDetails.builder()
                    .data(rowCache.stream()
                            .map(row -> Objects.nonNull(row.get(i)) ? row.get(i) : "")
                            .collect(Collectors.toList()))
                    .endOfContent(!hasMoreData)
                    .entityName(columnMetadata.get(i).getEntityName())
                    .entityType(EntityType.COLUMN)
                    .parentEntityType(EntityType.TABLE)
                    .parentEntityName(table.getEntityName())
                    .parentXpath(parentXPath)
                    .build()));

    // Build the ClassificationRequest and send it
    var classificationRequest = ClassificationRequest.builder()
            .jobId(jobDetails.getJobId())
            .datasourceId(jobDetails.getDatasourceId().toString())
            .structured(true)
            .totalCount(count)
            .content(contentDetailsList)
            .build();
    bridge.classify(jobDetails, classificationRequest);
}
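The core of processBatch is a pivot from row-major data to one value list per column. The standalone sketch below isolates that step so it can be checked without the SDK. RowBatchPivot and toColumns are hypothetical names introduced for illustration; the null handling mirrors the snippet above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper (not part of the SDK): pivots a batch of rows into columns. */
final class RowBatchPivot {

    private RowBatchPivot() {}

    /** Returns one value list per column, in column order, replacing nulls with "". */
    static List<List<String>> toColumns(List<List<String>> rows, int columnCount) {
        return IntStream.range(0, columnCount)
                .mapToObj(col -> rows.stream()
                        .map(row -> row.get(col) != null ? row.get(col) : "")
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }
}
```

For a batch of two rows ("1", "alice", null) and ("2", null, "bob@example.com"), toColumns returns three lists: ("1", "2"), ("alice", ""), and ("", "bob@example.com"). Each of those lists corresponds to the data of one ContentDetails.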