
On-Device Processing

The SDK preprocesses video frames locally before uploading anything. This page covers what runs in the browser, what gets uploaded, and how fallbacks work.

The SDK performs preprocessing, not inference. Here is the division of work:

| Stage | Where |
| --- | --- |
| Camera access | Browser |
| Face detection and tracking | Browser (WASM) |
| Skin region extraction and normalization | Browser (WASM) |
| Quality checks (lighting, motion, pose) | Browser |
| Frame encoding and upload | Browser |
| Vital signs inference | Server-side |
| Signal processing and calibration | Server-side |

The SDK does not include any inference models. All signal processing and vital signs computation happens on the backend.

The SDK includes the Circadify Vision Engine — a set of WebAssembly modules that handle face detection, skin region extraction, and frame normalization. These modules are lazy-loaded from CDN on first use and cached by the browser for subsequent measurements.

| Component | Size | Caching |
| --- | --- | --- |
| Face detection model | ~6 MB | Browser cache (persistent) |
| Geometry processor | ~6 MB | Browser cache (persistent) |

For air-gapped or regulated environments where external CDN access is restricted, you can host the WASM files on your own infrastructure:

```javascript
const sdk = new CircadifySDK({
  apiKey: 'ck_live_your_key_here',
  wasmConfig: {
    mediapipeWasmUrl: 'https://cdn.yourcompany.com/circadify/wasm/',
    mediapipeModelUrl: 'https://cdn.yourcompany.com/circadify/models/',
    opencvUrl: 'https://cdn.yourcompany.com/circadify/geometry.js',
  },
});
```

Contact support@circadify.com for the WASM distribution package.

For each captured frame, the SDK executes the following pipeline:

  1. Face detection — The Vision Engine locates the face in the video frame and tracks facial geometry in real time. If no face is detected for 30 seconds, the SDK throws a FACE_DETECTION_TIMEOUT error.

  2. Skin region extraction — Multiple skin regions are identified and isolated from the detected face. These regions are selected for their suitability for rPPG signal extraction, where blood flow changes produce subtle color variations.

  3. Normalization — Each extracted region is geometrically normalized to a consistent size and orientation, correcting for head movement and rotation.

  4. Encoding — Normalized regions are encoded into a compact binary format optimized for the backend inference model.

  5. Frame accumulation — Frames are captured at 30 FPS and accumulated over approximately 24 seconds of measurement. Capture stops when sufficient frames have been collected.

The entire pipeline runs in the browser. No raw video frames, face images, or identifiable data are included in the upload — only the preprocessed, normalized skin region data.
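The capture loop can be sketched as the following state machine. This is illustrative only: `detectFace`, `extractRegions`, `normalize`, and `encode` are hypothetical stand-ins for the Vision Engine internals, which the SDK does not expose.

```javascript
// Illustrative sketch of the per-frame pipeline described above.
// The stage functions passed in via `stages` are hypothetical stand-ins
// for the Vision Engine internals.
const FPS = 30;
const CAPTURE_SECONDS = 24;
const TARGET_FRAMES = FPS * CAPTURE_SECONDS; // ~720 frames per measurement
const FACE_TIMEOUT_MS = 30000;               // FACE_DETECTION_TIMEOUT threshold

function createCaptureState(now = Date.now()) {
  return { frames: [], lastFaceSeen: now };
}

function processFrame(frame, state, stages, now = Date.now()) {
  const face = stages.detectFace(frame);              // 1. face detection
  if (!face) {
    if (now - state.lastFaceSeen > FACE_TIMEOUT_MS) {
      throw new Error('FACE_DETECTION_TIMEOUT');      // no face for 30 s
    }
    return { done: false };
  }
  state.lastFaceSeen = now;
  const regions = stages.extractRegions(frame, face); // 2. skin region extraction
  const normalized = regions.map(stages.normalize);   // 3. geometric normalization
  state.frames.push(stages.encode(normalized));       // 4. compact binary encoding
  return { done: state.frames.length >= TARGET_FRAMES }; // 5. frame accumulation
}
```

The real SDK drives this loop internally; the sketch only makes the control flow (timeout, accumulation target) concrete.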

The SDK runs continuous quality checks during capture. All checks must pass before measurement begins, and warnings are emitted if quality degrades mid-capture.

Lighting — Checks that the scene is well-lit and stable. Fails if the environment is too dark, too bright, or has flickering light sources.

Motion — Measures how much the user is moving. Fails if there is excessive head or body movement.

Face position — Checks head orientation. Fails if the user’s head is turned or tilted beyond the acceptable range.

Quality warnings are delivered via the onQualityWarning callback:

```javascript
const sdk = new CircadifySDK({
  apiKey: 'ck_test_your_key_here',
  onQualityWarning: (warning) => {
    // warning.type: 'lighting' | 'motion' | 'face_position'
    // warning.severity: 'low' | 'medium' | 'high'
    showToast(warning.message);
  },
});
```

On-device processing requires the following browser capabilities:

| Requirement | Purpose |
| --- | --- |
| HTTPS (or localhost) | Camera access |
| WebAssembly | Vision Engine execution |
| Canvas 2D | Frame processing |
| navigator.mediaDevices | Camera stream |
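These requirements can be verified up front. A minimal sketch, written against a plain environment object so it is testable outside a browser; in a real page you would pass `window`:

```javascript
// Sketch: pre-flight check for the capabilities listed above.
// Pass the global object (window in a browser). Returns the names of
// missing capabilities, or an empty array if on-device processing can run.
function missingCapabilities(env) {
  const missing = [];
  const secure = env.isSecureContext ||
    (env.location && env.location.hostname === 'localhost');
  if (!secure) missing.push('HTTPS (or localhost)');   // camera access
  if (typeof env.WebAssembly === 'undefined') {
    missing.push('WebAssembly');                       // Vision Engine execution
  }
  if (typeof env.CanvasRenderingContext2D === 'undefined') {
    missing.push('Canvas 2D');                         // frame processing
  }
  if (!env.navigator || !env.navigator.mediaDevices) {
    missing.push('navigator.mediaDevices');            // camera stream
  }
  return missing;
}
```

In application code, call `missingCapabilities(window)` before starting a measurement and show a support message listing anything it returns.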

Minimum browser versions:

| Browser | Minimum Version |
| --- | --- |
| Chrome | 80+ |
| Firefox | 75+ |
| Safari | 14+ |
| Edge | 80+ |
Typical performance characteristics:

| Metric | Typical Value |
| --- | --- |
| Vision Engine load (first visit) | 2–5 seconds |
| Vision Engine load (cached) | Under 100 ms |
| Capture duration | ~24 seconds |
| Upload size | ~45 MB |
| Upload time | 5–30 seconds (network dependent) |
| Backend inference | 60–90 seconds |
| Total end-to-end | ~2 minutes |
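The capture and upload figures are consistent with each other. A back-of-envelope check, assuming the ~45 MB upload is evenly distributed across frames:

```javascript
// Back-of-envelope check of the figures above (approximate table values).
const fps = 30;
const captureSeconds = 24;
const frames = fps * captureSeconds;                    // 720 frames per measurement
const uploadBytes = 45 * 1024 * 1024;                   // ~45 MB upload
const bytesPerFrame = Math.round(uploadBytes / frames); // 65536 bytes ≈ 64 KiB per frame
```

So each encoded frame carries roughly 64 KiB of normalized skin region data, far less than a raw video frame.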

If you choose not to use @circadify/sdk, you must implement the preprocessing pipeline yourself to produce the binary format the backend expects. This includes camera access, face detection, skin region extraction, normalization, and frame encoding.

Custom preprocessing implementations require access to our format specification. Contact support@circadify.com for documentation.

When backend inference fails, the SDK still returns a result:

  1. At session creation, the backend returns a fallback_config object with plausible vital sign ranges
  2. The SDK uses these ranges to generate synthetic values if the result endpoint returns an error
  3. Fallback results always have confidence: 0.0
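A small helper makes the fallback case explicit in application code. A sketch only; the 0.4 low-confidence threshold is illustrative, matching the usage example on this page:

```javascript
// Sketch: interpreting a measurement result given the fallback behavior above.
// The 0.4 threshold is illustrative, not an SDK constant.
function classifyResult(result) {
  if (result.confidence === 0) return 'fallback';       // synthetic values from fallback_config
  if (result.confidence < 0.4) return 'low_confidence'; // real measurement, poor signal quality
  return 'ok';
}
```

Checking the exact-zero case before the threshold matters: a fallback result would otherwise be indistinguishable from an ordinary low-confidence measurement.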

Your application should always check the confidence score:

```javascript
const result = await sdk.measureVitals({
  container: document.getElementById('scan-container'),
});

// Check the exact-zero fallback case first: 0 also satisfies < 0.4,
// so testing the threshold first would make this branch unreachable.
if (result.confidence === 0) {
  showError('Measurement could not be completed. Please retry.');
} else if (result.confidence < 0.4) {
  showWarning('Low confidence — results may be unreliable. Try again with better lighting.');
}
```

A confidence score of 0.0 specifically indicates that fallback values were used and the result should not be treated as a real measurement.