A mature PHP-based monolithic application manages core business logic and user sessions. A new requirement mandates the integration of a computationally expensive Computer Vision (CV) process for analyzing user-uploaded documents stored in Azure Blob Storage. Executing this process within the PHP monolith is operationally untenable due to performance characteristics and library dependencies. The logical architectural step is to delegate this workload to a specialized microservice. Clojure, running as an Azure Function on the Consumption plan, was selected for its data processing strengths and concurrency model.
The primary technical challenge is not the CV implementation itself, but the secure authorization chain. The Clojure function must act on behalf of the original user who initiated the request from the PHP application. It needs transient, scoped-down permissions to read a specific user’s document from Blob Storage and nothing more. This immediately invalidates naive approaches and pushes us toward a robust, identity-centric solution.
Alternative One: The Shared Secret / API Key Approach
A common but deeply flawed pattern involves the “middle-tier” service (PHP) and the “downstream” service (Clojure) sharing a secret. The PHP backend would authenticate the user, then make a request to the Clojure Function, including a pre-shared API key in a header. The Clojure function would validate this key and then use its own highly privileged service principal or connection string to access the storage account.
Pros:
- Superficially simple to implement.
Cons:
- Catastrophic Security Posture: A leaked API key or connection string compromises the entire system. Key rotation is a manual, high-risk operational burden. In a real-world project, this is often neglected until a breach occurs.
- Loss of Audit Trail: All operations against Azure Storage are attributed to the Clojure Function’s service identity, not the end-user. It becomes impossible to answer the critical question: “Which user’s action resulted in this file being accessed?”
- Violation of Least Privilege: The function requires a static credential with broad permissions, likely read access to the entire storage container, to service requests for any user. It cannot be dynamically scoped to only the data of the user in the current request context.
A brief look at this anti-pattern:
// PHP Backend - DANGEROUS PATTERN, DO NOT USE
function triggerCvProcessing(string $userId, string $documentPath): void
{
    $httpClient = new GuzzleHttp\Client();
    $downstreamApiKey = getenv('CLOJURE_FUNCTION_API_KEY'); // Leaked secret risk
    try {
        $httpClient->post('https://my-clojure-fn.azurewebsites.net/api/process-cv', [
            'headers' => [
                'X-API-Key' => $downstreamApiKey, // Static secret
                'Content-Type' => 'application/json'
            ],
            'json' => [
                'userId' => $userId,
                'documentPath' => $documentPath
            ]
        ]);
    } catch (GuzzleHttp\Exception\RequestException $e) {
        // Handle error
    }
}
This design is unacceptable for any system handling sensitive user data.
Alternative Two: The Azure AD On-Behalf-Of (OBO) Flow
The On-Behalf-Of flow is an OAuth 2.0 grant type designed for exactly this middle-tier scenario. A service that receives an access token issued for itself can present that token to Azure AD and request a new access token for a downstream resource, while preserving the original user’s identity. In this design, the PHP monolith obtains a delegated token for the Clojure function (Token A), and the Clojure function exchanges Token A for a token scoped to Azure Storage (Token B).
Pros:
- Zero Trust Alignment: No long-lived secrets are passed between services for authorization. All access is mediated by short-lived, audience-specific JSON Web Tokens (JWTs).
- Preserved Identity and Auditability: The final token used to access Blob Storage is issued on behalf of the original user. Azure’s diagnostic logs will correctly attribute the storage access to that user’s identity.
- Enforcement of Least Privilege: The Clojure function receives a delegated token. Its effective permissions are the intersection of what the user is allowed to do and what the PHP application has been granted permission to do on their behalf. Access to Blob Storage can be granularly controlled via Azure Role-Based Access Control (RBAC) assigned to the user.
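The intersection rule in the last point can be pictured with a toy model. This is only an illustration: the permission labels below are hypothetical keywords, not real Azure action names, and `effective-permissions` is not part of any Azure API.

```clojure
(require '[clojure.set :as set])

;; Hypothetical labels for illustration; these are not real Azure action names.
(def user-allowed  #{:blob/read :blob/write}) ; what RBAC grants the user
(def app-delegated #{:blob/read})             ; what the app may do on the user's behalf

(defn effective-permissions
  "Actions a delegated token can perform: only those the user holds AND the
  application was granted on the user's behalf."
  [user-perms app-perms]
  (set/intersection user-perms app-perms))

(effective-permissions user-allowed app-delegated)
```

Under this model, removing a role from the user or a delegated permission from the application shrinks what the token can do, without touching the function's code.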
Cons:
- Increased Configuration Complexity: This flow requires careful configuration of two separate Application Registrations in Azure Active Directory and a precise understanding of OAuth 2.0 scopes and permissions.
- Additional Network Hop: A token exchange step introduces an extra network call to Azure AD from the Clojure function, which can add latency.
For a production environment, the security, auditability, and granular control offered by the OBO flow are non-negotiable. It is the correct architectural choice.
Architectural Blueprint and Flow
The complete sequence of operations is critical to understand before diving into the code.
sequenceDiagram
    participant User
    participant PHP_Monolith as PHP Monolith (Web App)
    participant Azure_AD as Azure AD (Token Endpoint)
    participant Clojure_Function as Clojure Azure Function (API)
    participant Azure_Storage as Azure Blob Storage
    User->>+PHP_Monolith: Logs in (OAuth Auth Code Flow)
    PHP_Monolith->>+Azure_AD: Exchanges auth code for ID & Access Token (for self)
    Azure_AD-->>-PHP_Monolith: Returns tokens
    User->>PHP_Monolith: Initiates CV processing request
    PHP_Monolith->>+Azure_AD: Requests new Access Token (Token A)<br/>Audience: Clojure Function API<br/>On-Behalf-Of: User
    Azure_AD-->>-PHP_Monolith: Returns Token A
    PHP_Monolith->>+Clojure_Function: Calls CV endpoint with `Authorization: Bearer`
    Clojure_Function->>Clojure_Function: Validates Token A (issuer, audience, signature)
    Clojure_Function->>+Azure_AD: Exchanges Token A for Token B (OBO Flow)<br/>Grant Type: jwt-bearer<br/>Assertion: Token A<br/>Scope: https://storage.azure.com/.default
    Azure_AD-->>-Clojure_Function: Returns Token B
    Clojure_Function->>+Azure_Storage: Accesses blob using Token B
    Azure_Storage-->>-Clojure_Function: Returns blob data
    Clojure_Function->>Clojure_Function: Performs CV processing
    Clojure_Function-->>-PHP_Monolith: Returns processing result
    PHP_Monolith-->>-User: Displays result
Core Implementation
Step 1: Azure Active Directory Configuration
This is the foundation. Two App Registrations are required.
`php-monolith-app` (Middle-Tier):
- Authentication: Configured as a Web app with a redirect URI. A client secret is generated and stored securely (e.g., in environment variables or Azure Key Vault, never in code).
- API Permissions:
  - It needs delegated permission to call the downstream API.
  - Click Add a permission -> My APIs.
  - Select the `clojure-cv-api` app registration.
  - Choose Delegated permissions and select the `user_impersonation` scope that will be exposed by the Clojure API.
  - Grant admin consent for this permission.
`clojure-cv-api` (Downstream API):
- Authentication: A client secret is generated and stored securely for the Clojure Function’s configuration.
- API Permissions: Add the delegated `user_impersonation` permission for Azure Storage and grant consent. The OBO exchange in Step 3 requests `https://storage.azure.com/.default`, which Azure AD refuses unless this app is permitted to call Storage on the user’s behalf.
- Expose an API:
  - Set an Application ID URI, for example `api://<clojure-app-client-id>`.
  - Add a scope named `user_impersonation`.
  - Admin consent display name: “Access the CV API as the signed-in user”.
  - Admin consent description: “Allows the app to call the CV API on behalf of the signed-in user.”
  - Ensure the scope is Enabled.
- Authorized client applications:
  - Add the Client ID of the `php-monolith-app`.
  - Authorize the `user_impersonation` scope for it. This pre-authorizes the monolith to request tokens for this API.
Step 2: PHP Monolith - Acquiring and Using Token A
The PHP application, after authenticating the user, needs to acquire a token specifically for the `clojure-cv-api`. A robust OAuth2 client library is essential here.
<?php
// Assumes use of 'thenetworg/oauth2-azure' provider for 'league/oauth2-client'
// In a real app, this provider instance would be managed by a DI container.
use TheNetworg\OAuth2\Client\Provider\Azure;

class CvDelegationService
{
    private Azure $provider;
    private array $sessionStorage; // Simplified representation of user session

    public function __construct(array $sessionStorage)
    {
        // These values should be loaded from a secure configuration source.
        $this->provider = new Azure([
            'clientId' => getenv('AZURE_PHP_APP_CLIENT_ID'),
            'clientSecret' => getenv('AZURE_PHP_APP_CLIENT_SECRET'),
            'redirectUri' => getenv('AZURE_PHP_APP_REDIRECT_URI'),
            'urlAPI' => 'https://graph.microsoft.com/v1.0/',
            'scopes' => ['openid', 'profile', 'offline_access'],
            'defaultEndPointVersion' => '2.0'
        ]);
        $this->sessionStorage = $sessionStorage;
    }

    /**
     * Acquires a delegated access token (Token A) for the downstream Clojure API
     * by redeeming the user's refresh token with the downstream API's scope.
     * The On-Behalf-Of exchange itself happens later, inside the Clojure function.
     * @return string The access token for the Clojure API.
     * @throws \Exception If token acquisition fails.
     */
    private function getDownstreamApiToken(): string
    {
        $existingToken = new \League\OAuth2\Client\Token\AccessToken($this->sessionStorage);
        if ($existingToken->hasExpired()) {
            // In a real application, you would use the refresh token to get a new
            // set of tokens from Azure AD for the PHP app itself before proceeding.
            // This logic is omitted for brevity but is critical.
            throw new \Exception("User session token has expired. Re-authentication needed.");
        }
        // The scope must be the full Application ID URI of the downstream API
        // plus the scope name. This is a common point of failure.
        $downstreamApiScope = getenv('AZURE_CLOJURE_APP_ID_URI') . '/user_impersonation';
        try {
            $tokenForDownstreamApi = $this->provider->getAccessToken('refresh_token', [
                'scope' => $downstreamApiScope,
                'refresh_token' => $existingToken->getRefreshToken()
            ]);
            return $tokenForDownstreamApi->getToken();
        } catch (\League\OAuth2\Client\Provider\Exception\IdentityProviderException $e) {
            // Log the detailed error from Azure AD. The response body may be an
            // array, so encode it rather than concatenating it directly.
            error_log("Failed to acquire downstream token: " . json_encode($e->getResponseBody()));
            throw new \Exception("Could not acquire delegated token for CV service.");
        }
    }

    /**
     * Triggers the CV processing by calling the Clojure Azure Function.
     */
    public function triggerCvProcessing(string $documentPath): bool
    {
        try {
            $apiToken = $this->getDownstreamApiToken();
        } catch (\Exception $e) {
            error_log("Failed to get API token: " . $e->getMessage());
            return false;
        }
        $httpClient = new \GuzzleHttp\Client();
        $functionUrl = getenv('CLOJURE_FUNCTION_URL');
        try {
            $response = $httpClient->post($functionUrl, [
                'headers' => [
                    // This is Token A, intended for the Clojure API
                    'Authorization' => 'Bearer ' . $apiToken,
                    'Content-Type' => 'application/json'
                ],
                'json' => [
                    'documentPath' => $documentPath
                ],
                'timeout' => 30.0, // Set a reasonable timeout
            ]);
            return $response->getStatusCode() === 200;
        } catch (\GuzzleHttp\Exception\RequestException $e) {
            error_log("Error calling Clojure function: " . $e->getMessage());
            if ($e->hasResponse()) {
                error_log("Response body: " . $e->getResponse()->getBody()->getContents());
            }
            return false;
        }
    }
}
Key Points in the PHP Code:
- We are using the user’s existing refresh token to request a new access token, but this time the `scope` is for the downstream API, not the PHP app itself.
- The scope format `api://<client-id>/.default` or `api://<client-id>/user_impersonation` is critical. A mistake here is a common source of errors.
- Error handling is crucial. A failure to get the token is a security-relevant event and should be logged aggressively.
Step 3: Clojure Azure Function - Token Exchange and Storage Access
The Clojure function receives Token A, validates it, and then performs the OBO exchange to get Token B for Azure Storage. This requires a well-structured project.
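Project Structure (`project.clj` for Leiningen):
;; Versions are illustrative; pin them to whatever is current for your project.
(defproject cv-processor "0.1.0-SNAPSHOT"
  :description "Clojure Azure Function for CV Processing"
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [org.clojure/tools.logging "1.2.4"]      ; Required by the auth and handler namespaces
                 [clj-http "3.12.3"]                      ; For making HTTP requests
                 [cheshire "5.11.0"]                      ; JSON parsing
                 [com.auth0/java-jwt "4.2.1"]             ; JWT decoding and validation
                 [com.azure/azure-storage-blob "12.25.0"] ; Imported by the handler namespace
                 [com.azure/azure-identity "1.11.0"]      ; Imported by the handler namespace
                 ;; Silences slf4j warnings, but also discards tools.logging output
                 ;; routed through slf4j; swap in a real binding to see logs.
                 [org.slf4j/slf4j-nop "1.7.36"]]
  :main ^:skip-aot cv-processor.core
  :target-path "target/%s"
  :profiles {:uberjar {:aot :all
                       :jvm-opts ["-Dclojure.compiler.direct-linking=true"]}})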
Function Configuration (`host.json` and `function.json`):
This is standard Azure Functions setup. The important part is that environment variables are used for all secrets.
`cv-processor/ProcessCv/function.json`:
{
  "scriptFile": "../target/cv-processor-0.1.0-SNAPSHOT-standalone.jar",
  "entryPoint": "cv_processor.handler.run",
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    }
  ]
}
Core Clojure Logic:
The code is broken down into namespaces for clarity: one for configuration, one for the OBO exchange, and one for the main handler.
`src/cv_processor/config.clj`:
(ns cv-processor.config
  (:require [clojure.string :as str]))

;; In a production setting, these values are read from the Azure Function's
;; Application Settings, which maps them to environment variables.
(def config
  {:tenant-id (System/getenv "AZURE_TENANT_ID")
   :client-id (System/getenv "AZURE_CLOJURE_APP_CLIENT_ID")
   :client-secret (System/getenv "AZURE_CLOJURE_APP_CLIENT_SECRET")
   :storage-account-name (System/getenv "AZURE_STORAGE_ACCOUNT_NAME")
   :token-endpoint (format "https://login.microsoftonline.com/%s/oauth2/v2.0/token"
                           (System/getenv "AZURE_TENANT_ID"))
   ;; The audience claim in the incoming token (Token A) must match this.
   :expected-audience (str "api://" (System/getenv "AZURE_CLOJURE_APP_CLIENT_ID"))})

(defn get-config [key]
  (let [value (get config key)]
    (when (str/blank? value)
      (throw (ex-info (str "Configuration missing for key: " key) {:key key})))
    value))
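The `:expected-audience` value above matches tokens whose `aud` claim is the Application ID URI. Azure AD v2.0-format access tokens carry the bare client ID in `aud` instead, and the issuer string differs between the two formats, so a claim check may need to accept both. The following is a minimal sketch under stated assumptions: `accepted-audiences`, `accepted-issuers`, and `claim-errors` are hypothetical helpers, the claims are assumed to be already decoded into a map with a single string `aud`, and signature verification is a separate step not shown here.

```clojure
(require '[clojure.string :as str])

(defn accepted-audiences
  "v1.0-format tokens typically carry the Application ID URI as `aud`;
  v2.0-format tokens carry the bare client ID."
  [client-id]
  #{client-id (str "api://" client-id)})

(defn accepted-issuers
  "Issuer strings for v1.0-format and v2.0-format tokens of a single tenant."
  [tenant-id]
  #{(str "https://sts.windows.net/" tenant-id "/")
    (str "https://login.microsoftonline.com/" tenant-id "/v2.0")})

(defn claim-errors
  "Returns a vector of problems found in an already-decoded claims map.
  An empty vector means the claims passed these checks."
  [{:keys [aud iss exp scp]}
   {:keys [client-id tenant-id required-scope now-epoch-s]}]
  (cond-> []
    (not (contains? (accepted-audiences client-id) aud)) (conj :bad-audience)
    (not (contains? (accepted-issuers tenant-id) iss))   (conj :bad-issuer)
    (not (and (number? exp) (> exp now-epoch-s)))        (conj :expired)
    ;; `scp` is a space-separated list of delegated scopes.
    (not (some #{required-scope} (str/split (or scp "") #" ")))
    (conj :missing-scope)))
```

Returning a list of failures rather than throwing keeps the check easy to log and lets the caller map any non-empty result to a 401.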
`src/cv_processor/auth.clj`:
(ns cv-processor.auth
  (:require [clj-http.client :as client]
            [cheshire.core :as json]
            [cv-processor.config :as config]
            [clojure.tools.logging :as log])
  (:import (com.auth0.jwt JWT)
           (com.auth0.jwt.exceptions JWTVerificationException)))

(defn- extract-bearer-token [headers]
  (when-let [auth-header (get headers "authorization")]
    (second (re-matches #"^Bearer\s+(.*)$" auth-header))))

(defn- validate-incoming-token
  "Checks the audience of the incoming JWT (Token A) and returns the token
  string, or nil if the token is rejected. WARNING: JWT/decode does not verify
  the signature. A real system must verify it against keys fetched via the
  OIDC discovery endpoint, and should also check issuer and expiry."
  [token-string]
  (try
    (let [expected (config/get-config :expected-audience)
          audience (.getAudience (JWT/decode token-string))]
      (if (some #(= % expected) audience)
        token-string
        ;; Return nil rather than throwing so the caller answers with a 401.
        (do
          (log/warn "Rejected token with unexpected audience:" audience)
          nil)))
    (catch JWTVerificationException e
      (log/error e "Incoming JWT could not be decoded.")
      nil)))

(defn acquire-storage-token-obo
  "Performs the OBO flow to exchange Token A for Token B (for Azure Storage)."
  [incoming-token]
  (let [params {:grant_type "urn:ietf:params:oauth:grant-type:jwt-bearer"
                :client_id (config/get-config :client-id)
                :client_secret (config/get-config :client-secret)
                :assertion incoming-token
                :scope "https://storage.azure.com/.default"
                :requested_token_use "on_behalf_of"}]
    (try
      (let [response (client/post (config/get-config :token-endpoint)
                                  {:form-params params
                                   :content-type :x-www-form-urlencoded
                                   :throw-exceptions false})]
        (if (= 200 (:status response))
          (get (json/parse-string (:body response) true) :access_token)
          (do
            (log/error (str "OBO token exchange failed. Status: " (:status response)))
            (log/error (str "Response Body: " (:body response)))
            nil)))
      (catch Exception e
        (log/error e "Exception during OBO token exchange.")
        nil))))

(defn get-delegated-storage-token
  "Main auth function: extracts, validates, and exchanges the token."
  [headers]
  (some->> headers
           (extract-bearer-token)
           (validate-incoming-token)
           (acquire-storage-token-obo)))
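The signature check that `validate-incoming-token` leaves out can be sketched with JDK classes alone. This is an illustration under assumptions, not the article's implementation: `jwk->public-key` and `valid-rs256-signature?` are hypothetical names, the JWK is assumed to be an RSA key already parsed into a map with base64url `:n` and `:e` fields, and the token is assumed to be signed with RS256. Selecting the right key by the token's `kid` header, checking the `alg` header, and fetching the key set are not shown.

```clojure
(require '[clojure.string :as str])
(import '(java.math BigInteger)
        '(java.nio.charset StandardCharsets)
        '(java.security KeyFactory PublicKey Signature)
        '(java.security.spec RSAPublicKeySpec)
        '(java.util Base64))

(defn- b64url->bytes
  "Decodes base64url text; the JDK decoder accepts unpadded input."
  [^String s]
  (.decode (Base64/getUrlDecoder) s))

(defn jwk->public-key
  "Builds an RSA public key from the base64url `n` (modulus) and `e`
  (exponent) fields of a JWK map."
  [{:keys [n e]}]
  (.generatePublic (KeyFactory/getInstance "RSA")
                   (RSAPublicKeySpec. (BigInteger. 1 (b64url->bytes n))
                                      (BigInteger. 1 (b64url->bytes e)))))

(defn valid-rs256-signature?
  "True when the token's third segment is a valid RS256 signature over
  `header.payload` for the given public key. Does not inspect any claims."
  [^String token ^PublicKey public-key]
  (let [[header payload signature :as parts] (str/split token #"\.")]
    (boolean
     (when (= 3 (count parts))
       (try
         (let [signed-bytes (.getBytes (str header "." payload)
                                       StandardCharsets/US_ASCII)
               verifier (doto (Signature/getInstance "SHA256withRSA")
                          (.initVerify public-key)
                          (.update ^bytes signed-bytes))]
           (.verify verifier ^bytes (b64url->bytes signature)))
         ;; Malformed base64 or an unusable signature counts as invalid.
         (catch Exception _ false))))))
```

A caller would run this check before trusting any claim in the token, and only then proceed to the audience check and the OBO exchange.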
`src/cv_processor/handler.clj`:
(ns cv-processor.handler
  (:require [cv-processor.auth :as auth]
            [cv-processor.config :as config]
            [clojure.tools.logging :as log])
  (:import (com.azure.storage.blob BlobServiceClientBuilder)
           (com.azure.identity DefaultAzureCredentialBuilder)))

(defn process-document-cv [stream]
  ;; This is a placeholder for the actual Computer Vision logic.
  ;; It would use a library like OpenCV or a cognitive service SDK
  ;; to analyze the document stream.
  (log/info "Starting CV processing on document stream...")
  (Thread/sleep 2000) ; Simulate work
  (log/info "CV processing complete.")
  {:status "processed" :text-found "Sample-text-from-cv"})

(defn access-blob-with-token [token document-path]
  ;; This part is conceptual as the Java SDK's `TokenCredential`
  ;; is complex to wire up with a raw token string directly.
  ;; In a real implementation using the Azure SDK for Java, you'd create a
  ;; custom credential provider. The principle remains: the token authenticates.
  ;; For demonstration, we'll simulate the successful access.
  ;; Token material is deliberately kept out of the logs.
  (log/info (str "Attempting to access blob: " document-path " with delegated token."))
  ;; A real implementation would look something like this:
  ; (let [credential (create-custom-token-credential token)
  ;       blob-service-client (-> (BlobServiceClientBuilder.)
  ;                               (.endpoint (format "https://%s.blob.core.windows.net" (config/get-config :storage-account-name)))
  ;                               (.credential credential)
  ;                               (.buildClient))
  ;       container-client (.getBlobContainerClient blob-service-client "documents")
  ;       blob-client (.getBlobClient container-client document-path)
  ;       output-stream (java.io.ByteArrayOutputStream.)]
  ;   (.downloadStream blob-client output-stream)
  ;   (process-document-cv (java.io.ByteArrayInputStream. (.toByteArray output-stream))))
  ;; Simulating the outcome for clarity of the flow.
  (process-document-cv nil))

(defn run
  "Azure Function entry point."
  [context req]
  (log/info "CV Processor function triggered.")
  ;; Auth runs inside the try so configuration or validation errors become a
  ;; 500 response rather than an unhandled exception.
  (try
    (if-let [storage-token (auth/get-delegated-storage-token (:headers req))]
      (let [document-path (get-in req [:body :documentPath])
            result (access-blob-with-token storage-token document-path)]
        {:status 200
         :headers {"Content-Type" "application/json"}
         :body {:message "CV processing successful" :result result}})
      {:status 401
       :body "Unauthorized. Valid bearer token required or OBO exchange failed."})
    (catch Exception e
      (log/error e "Error during token exchange, blob access, or CV processing.")
      {:status 500
       :body "Internal server error during processing."})))
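This Clojure code is structured with production in mind: it separates concerns, reads its configuration from the environment, logs failures, and isolates the token exchange in a single namespace. It is not production-ready as written, because signature validation and the actual blob download are stubbed out.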
Extensibility and Limitations
This architectural pattern is highly extensible. The same OBO flow can be chained: the Clojure function, holding a delegated token, could call another downstream API on behalf of the user, performing another OBO exchange if necessary. It establishes a robust identity-aware mesh of services.
However, several limitations and production considerations exist for this specific implementation:
- Token Caching: The current Clojure code exchanges a token on every single function invocation. This is inefficient and can lead to throttling by Azure AD under high load. A production-grade implementation must introduce a caching layer (e.g., Redis) for Token B. The cache key should be a hash of Token A, and the cache entry should honor the `expires_in` value of Token B. This significantly reduces latency and load on the identity provider.
- JWT Signature Validation: The example code only checks the token’s audience claim. A full implementation must fetch Azure AD’s OIDC discovery document (`https://login.microsoftonline.com/{tenant_id}/v2.0/.well-known/openid-configuration`), download the JSON Web Key Set (JWKS) from the `jwks_uri` it advertises, and use the matching public key to cryptographically verify the signature of every incoming token. This prevents token tampering.
- Refresh Token Management: The PHP example simplifies the handling of the user’s primary refresh token. A robust monolith must have a secure strategy for storing and using these refresh tokens to maintain long-lived user sessions without requiring frequent interactive logins.
- Vendor Lock-in: The entire flow is deeply integrated with Azure Active Directory. While this provides significant benefits within the Azure ecosystem, it creates a tight coupling. Migrating this authentication and authorization logic to another identity provider (like Okta or Auth0) or a different cloud would be a non-trivial undertaking.
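The token caching point above can be sketched as follows. This is a minimal illustration under assumptions: an in-process atom stands in for a shared cache such as Redis, `cached-storage-token` is a hypothetical wrapper, and `exchange-fn` is assumed to return the parsed token response (a map with `:access_token` and `:expires_in`) rather than just the token string that `acquire-storage-token-obo` returns. Expired entries are overwritten on the next miss but never evicted, which a real cache would need to handle.

```clojure
(import '(java.nio.charset StandardCharsets)
        '(java.security MessageDigest))

;; In-process stand-in for a shared cache. Keys are hashes of Token A.
(defonce ^:private token-cache (atom {}))

(defn- sha256-hex
  "Hex-encoded SHA-256 of a string, so raw tokens never become cache keys."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s StandardCharsets/UTF_8))]
    (apply str (map #(format "%02x" %) digest))))

(defn cached-storage-token
  "Returns Token B for `incoming-token`, calling `exchange-fn` only on a cache
  miss or after the cached entry has expired. `exchange-fn` takes Token A and
  returns {:access_token ... :expires_in <seconds>} or nil on failure."
  [incoming-token exchange-fn now-ms]
  (let [k (sha256-hex incoming-token)
        {:keys [token expires-at-ms]} (get @token-cache k)]
    (if (and token (< now-ms expires-at-ms))
      token
      (when-let [{:keys [access_token expires_in]} (exchange-fn incoming-token)]
        (let [skew-ms 60000 ; refresh one minute early to avoid using a dying token
              expires-at (+ now-ms (- (* 1000 expires_in) skew-ms))]
          (swap! token-cache assoc k {:token access_token
                                      :expires-at-ms expires-at})
          access_token)))))
```

Passing the clock in as `now-ms` keeps the function easy to test; a caller would normally supply `(System/currentTimeMillis)`.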