Bring CLARITY to Your Visual Experience
CLARITY automatically quantifies perceptual load, motion dynamics, and semantic richness from any urban video. Designed with architects, neuroscientists, and environmental psychologists in mind, the platform converts raw footage into reproducible complexity metrics you can cite, compare, and share.
42+
peer-reviewed references underpin the metrics
14 metrics
covering structural, semantic, and motion domains
< 5 min
to get an interactive report for a 2-minute video
Designed for Insight
Why measure visual complexity?
Every street, plaza, or transport hub is a multisensory signal. Complexity levels influence cognitive load, perceived safety, wayfinding performance, and even physiological stress responses. CLARITY quantifies these stimuli objectively, helping teams ground design discussions in evidence rather than impression.
- Compare redevelopment concepts against baseline environments or design guidelines.
- Combine eye-tracking, VR walkthroughs, or video diaries with quantitative scores.
- Support grant applications and publications with reproducible metrics and visual evidence.
Built for research teams
Transparent Methods
Every metric cites its original publication and exposes intermediate data (heatmaps, overlays, per-frame values).
Reproducible Pipelines
Processing runs on deterministic settings (seeded, consistent pre-processing). Download-ready JSON accompanies the visualization report.
Usable Reports
Heatmaps, motion vectors, scalar summaries, and captions are structured for direct inclusion in manuscripts or design briefs.
Workflow
From footage to findings in three steps
1. Upload a clip
Drag & drop a short city walkthrough or import footage from your video archive. We handle format conversion and frame sampling.
2. Run the metrics
CLARITY extracts 14 scientifically grounded measures, generating heatmaps, overlays, scalar stats, and per-frame plots within minutes.
3. Interpret & export
Review interactive results, compare frames, and download JSON/imagery for further analysis or publication-ready figures.
Complexity Metrics
Edge Density
Measures the density and spatial distribution of image edges using the Canny edge detector. Edge density = (# edge pixels) / (total pixels).
Guan et al., 2022; Rosenholtz et al., 2007
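As a rough illustration, the score can be reproduced in a few lines of OpenCV; the Canny thresholds below are placeholder values, not CLARITY's calibrated settings.

```python
import cv2
import numpy as np

def edge_density(frame_bgr, low=100, high=200):
    """Fraction of pixels marked as edges by the Canny detector."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)        # binary edge map (0 or 255)
    return np.count_nonzero(edges) / edges.size
```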
Spatial Frequency
Performs 2D FFT-based spatial frequency analysis to quantify fine vs. coarse detail energy. Computed from mean magnitude of the Fourier amplitude spectrum.
Rosenholtz et al., 2007; Kawshalya & Gunawardena, 2022
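A minimal sketch of the underlying computation with NumPy; the platform may band-pass or weight the spectrum differently before averaging.

```python
import numpy as np

def spatial_frequency_energy(gray):
    """Mean magnitude of the centered 2D Fourier amplitude spectrum."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    return float(np.abs(spectrum).mean())
```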
Color Entropy
Quantifies variability and evenness of colors in HSV or LAB space using Shannon entropy over color histograms.
Guan et al., 2022; Zhang & Suo, 2024
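Illustrative only: a joint hue-saturation histogram in HSV space with 32 bins per channel; the bin counts and the choice of HSV versus LAB are assumptions, not CLARITY's exact configuration.

```python
import cv2
import numpy as np

def color_entropy(frame_bgr, bins=32):
    """Shannon entropy of the joint hue-saturation histogram."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
    p = hist.ravel() / hist.sum()
    p = p[p > 0]                              # drop empty bins before taking logs
    return float(-(p * np.log2(p)).sum())
```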
GLCM Luminance Contrast
Computes spatial co-occurrence-based contrast between pixel intensity pairs in grayscale images using the Gray-Level Co-occurrence Matrix (GLCM). Captures local luminance structure and textural complexity.
Haralick et al., 1973; Guan et al., 2022; Kawshalya & Gunawardena, 2022
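A compact sketch with scikit-image; the single-pixel offset and the two angles are illustrative assumptions rather than the pipeline's exact configuration.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_contrast(gray_u8):
    """Mean Haralick contrast over a 1-pixel offset at 0 and 90 degrees."""
    glcm = graycomatrix(gray_u8, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return float(graycoprops(glcm, "contrast").mean())
```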
Compression Complexity
Approximates structural information density via lossless compression ratio (PNG / Zstandard). Higher values indicate lower compressibility and greater structural complexity.
Saree et al., 2020
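The PNG variant can be sketched as a ratio of encoded to raw bytes; the Zstandard variant is analogous. The encoder settings here are library defaults, not necessarily CLARITY's.

```python
import cv2

def compression_complexity(frame_bgr):
    """PNG bytes per raw byte; higher means less compressible, more structure."""
    ok, png = cv2.imencode(".png", frame_bgr)
    if not ok:
        raise RuntimeError("PNG encoding failed")
    return len(png) / frame_bgr.nbytes
```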
Fractal Dimension (Fourier)
Computes structural self-similarity via the slope of the log-log plot of the 2D FFT power spectrum. Steeper slopes indicate greater visual structure complexity.
Nagy & Fekete, 2019; Kawshalya & Gunawardena, 2022
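A sketch of the spectral-slope estimate: radially average the power spectrum, then fit a line in log-log space. The mapping from slope to a fractal dimension depends on the convention adopted, so only the slope is returned here.

```python
import numpy as np

def fourier_spectral_slope(gray):
    """Slope of log radial power vs. log frequency from the 2D FFT power spectrum."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))) ** 2
    h, w = power.shape
    yy, xx = np.indices(power.shape)
    r = np.hypot(yy - h / 2, xx - w / 2).astype(int)
    r_max = min(h, w) // 2                      # stay inside the inscribed circle
    sums = np.bincount(r.ravel(), weights=power.ravel(), minlength=r_max)
    counts = np.bincount(r.ravel(), minlength=r_max)
    radial = sums[1:r_max] / counts[1:r_max]    # radially averaged power, DC excluded
    freqs = np.arange(1, r_max)
    slope, _ = np.polyfit(np.log(freqs), np.log(radial), 1)
    return float(slope)
```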
Edge Texture Measures
Analyzes multiscale texture energy using Gabor filter responses (mean & standard deviation across orientations). Reflects facade and surface pattern richness.
Portilla & Simoncelli, 2000; Guan et al., 2022
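A minimal multi-orientation Gabor sketch using scikit-image; the single filter frequency and four orientations are illustrative choices rather than the full multiscale bank.

```python
import numpy as np
from skimage.filters import gabor

def gabor_texture_stats(gray, frequency=0.2, n_orient=4):
    """Mean and std of Gabor response magnitude, averaged across orientations."""
    means, stds = [], []
    for k in range(n_orient):
        real, imag = gabor(gray, frequency=frequency, theta=k * np.pi / n_orient)
        mag = np.hypot(real, imag)
        means.append(mag.mean())
        stds.append(mag.std())
    return float(np.mean(means)), float(np.mean(stds))
```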
Motion Complexity
Quantifies the amount and diversity of motion using RAFT optical flow (magnitude & direction entropy) between consecutive frames.
Ali & Shah, 2013; Teed & Deng, 2020
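The platform uses RAFT; the sketch below substitutes OpenCV's Farneback flow as a lightweight stand-in so it runs without model weights, and combines mean flow magnitude with a direction-entropy term.

```python
import cv2
import numpy as np

def motion_complexity(prev_gray, next_gray, bins=16):
    """Mean flow magnitude plus direction entropy (Farneback stand-in for RAFT)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    p = hist / hist.sum() if hist.sum() > 0 else np.full(bins, 1.0 / bins)
    p = p[p > 0]
    return float(mag.mean()), float(-(p * np.log2(p)).sum())
```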
Optical Flow Interaction
Measures motion coherence and interaction strength among moving entities (via flow divergence and clustering).
Ali & Shah, 2013
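Only the divergence part is sketched here, with the clustering step omitted; `flow` is assumed to be a dense H x W x 2 field such as the one produced for the previous metric.

```python
import numpy as np

def flow_divergence_stats(flow):
    """Mean absolute divergence and its spread over a dense flow field (H x W x 2)."""
    du_dx = np.gradient(flow[..., 0], axis=1)
    dv_dy = np.gradient(flow[..., 1], axis=0)
    div = du_dx + dv_dy                      # local expansion (+) or contraction (-)
    return float(np.abs(div).mean()), float(div.std())
```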
Temporal Entropy
Calculates Shannon entropy of pixel-difference maps across frames, capturing unpredictability and dynamism of motion.
Zhao et al., 2023; Guan et al., 2022
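A minimal sketch over a single frame pair; in practice the value would be aggregated across the clip, and the bin count below is an assumption.

```python
import numpy as np

def temporal_entropy(prev_gray, next_gray, bins=64):
    """Shannon entropy of the absolute frame-difference histogram."""
    diff = np.abs(next_gray.astype(np.int16) - prev_gray.astype(np.int16))
    hist, _ = np.histogram(diff, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```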
Saliency Attention
Predicts human visual attention using DeepGaze III (VGG-based scanpath prediction model). Generates attention probability maps aligned with eye-tracking data.
Kümmerer et al., 2022; Cornia et al., 2018
Object Count
Counts discrete objects (vehicles, trees, pedestrians, facades) detected by YOLOv8. Output used for object density and clutter indices.
Bochkovskiy et al., 2020; Guan et al., 2022
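A possible per-frame count using the ultralytics package; the checkpoint name and confidence threshold are placeholders, and CLARITY's class filtering for density and clutter indices may differ.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                   # illustrative pretrained checkpoint

def object_count(frame_bgr, conf=0.25):
    """Number of detections above the confidence threshold in one frame."""
    results = model.predict(frame_bgr, conf=conf, verbose=False)
    return len(results[0].boxes)
```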
Semantic Diversity
Measures number and entropy of distinct semantic categories detected via Mask2Former segmentation. Reflects cognitive and perceptual richness of the scene.
Zhou et al., 2022; Zhang & Suo, 2024
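Given a per-pixel semantic label map (for example, the output of a Mask2Former pass), the diversity terms reduce to a count and an entropy; the segmentation step itself is omitted in this sketch.

```python
import numpy as np

def semantic_diversity(label_map):
    """Category count and Shannon entropy of a per-pixel semantic label map."""
    _, counts = np.unique(label_map, return_counts=True)
    p = counts / counts.sum()
    return len(counts), float(-(p * np.log2(p)).sum())
```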
Segmentation Region Count
Counts number of segmented visual regions (superpixels or Mask2Former instance masks). Reflects perceptual grouping complexity and spatial fragmentation.
Achanta et al., 2012; Zhou et al., 2022
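The superpixel variant can be sketched with scikit-image's SLIC; the target segment count is an illustrative parameter, and the Mask2Former variant would count predicted instance masks instead.

```python
import numpy as np
from skimage.segmentation import slic

def region_count(frame_rgb, n_segments=200):
    """Number of distinct superpixel regions produced by SLIC."""
    segments = slic(frame_rgb, n_segments=n_segments, start_label=1)
    return int(len(np.unique(segments)))
```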