A Broad View of AI
Core Techs We Employ
From Video-to-Data AI to Predictive Analytics, our toolkit covers every step of modern business processes
Send us a task
HOW IT WORKS
Video-to-Data AI
• Automated Object Detection: Automatically labels and segments objects in video streams, reducing the labor-intensive task of manual annotation.
• Real-Time Insights: Provides immediate analytics for construction site monitoring and anomaly detection.
• Big Data Integration: Easily scalable and integrable with big data processing platforms.
• Reduced Errors and Costs: Minimizes the risk of overlooking crucial details and speeds up decision-making.
• Photogrammetry & LiDAR Fusion: Creates dense, semantically rich 3D maps by merging image-based and laser scanning data.
• Advanced Clustering: Uses algorithms such as RANSAC and DBSCAN for object identification, filtering, and grouping (see the sketch after this list).
• High-Accuracy Output: Produces detailed 3D models with accuracy levels of 92–98%.
• Complex vs. Simple Elements: Detects both complex components (valves, pumps) and simpler ones (pipes, fasteners).
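As an illustration of that clustering step, here is a minimal sketch using Open3D: a RANSAC plane fit strips the dominant surface, then DBSCAN groups the remaining points into candidate objects. The file name and thresholds are placeholders, not values from our production pipeline.

```python
import numpy as np
import open3d as o3d

# Load a point cloud from photogrammetry or a LiDAR scan (placeholder path).
pcd = o3d.io.read_point_cloud("site_scan.ply")

# RANSAC: fit and remove the dominant plane, e.g. a floor slab or wall.
plane_model, inlier_idx = pcd.segment_plane(distance_threshold=0.02,
                                            ransac_n=3,
                                            num_iterations=1000)
objects = pcd.select_by_index(inlier_idx, invert=True)

# DBSCAN: group the remaining points into candidate objects (pipes, valves, ...).
labels = np.array(objects.cluster_dbscan(eps=0.05, min_points=20))
print(f"{labels.max() + 1} clusters found, {int((labels == -1).sum())} noise points")
```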
SLAM & Sensor Fusion
• Simultaneous Localization & Mapping: Real-time mapping and position tracking using drones, LiDAR, cameras, and IMUs.
• Continuous 3D Updates: Data is collected and merged in real time, keeping models constantly up to date.
• Trajectory Alignment: Minimizes gaps and distortions via precise trajectory alignment methods (a minimal alignment sketch follows this list).
• Robust in Dynamic Environments: Performs reliably in complex or rapidly changing construction site conditions.
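For intuition, the sketch below shows the kind of scan-merging step this involves: two scans are brought into a common frame from their rough trajectory poses and then refined with ICP using Open3D's standard registration API. The file names, poses, and thresholds are placeholders.

```python
import numpy as np
import open3d as o3d

# Two consecutive scans plus rough poses from the trajectory (placeholders).
scan_a = o3d.io.read_point_cloud("scan_000.ply")
scan_b = o3d.io.read_point_cloud("scan_001.ply")
pose_a = np.eye(4)               # world <- scan_a
pose_b_initial = np.eye(4)       # rough world <- scan_b from odometry

# Bring both scans into the common world frame using the trajectory poses.
scan_a.transform(pose_a)
scan_b.transform(pose_b_initial)

# Refine the alignment with point-to-point ICP to close residual gaps.
result = o3d.pipelines.registration.registration_icp(
    scan_b, scan_a, 0.05, np.eye(4),
    o3d.pipelines.registration.TransformationEstimationPointToPoint())
scan_b.transform(result.transformation)

# Merge into a single, continuously growing map.
merged = scan_a + scan_b
```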
• 2D/3D Data Integration: Combines images, drawings, and point clouds to automate BIM modeling.
• 80–90% Manual Labor Reduction: Substantially speeds up model creation compared to traditional methods.
• High-Precision Object Recognition: Achieves 92–98% accuracy in identifying construction elements.
• Revit & ArchiCAD Compatibility: Exports files ready for various BIM platforms.
Neural Radiance Fields (NeRF)
• Minimal Input Required: Generates highly detailed 3D scenes from a limited set of 2D images or video frames.
• Complex Geometries: Ideal for reconstructing and visualizing intricate shapes and textures.
• Photorealistic Output: Delivers a high level of realism, crucial for presentations and inspections.
• Real-Time Visualization: Enables quick viewing and analysis of generated 3D models.
Predictive Analytics
• Forecasting Delays & Costs: Predicts potential schedule overruns and budget issues by analyzing construction progress data.
• Quality Control Alerts: Automatically detects anomalies and potential problems for timely intervention.
• Data-Driven Decisions: Provides dashboards for managers, facilitating informed decisions based on real metrics.
• Historical Trends & Optimization: Leverages accumulated data for continuous process improvement and strategic planning.
BLOG
Innovation & Research
At Lepei, we constantly explore the technological frontier.
From Vision-Language Models (VLMs) for real-time 3D segmentation to cutting-edge Neural Radiance Fields (NeRF), our R&D team is shaping the next generation of AI-driven solutions.
Real-Time 3D Mapping from Video (No Sensors Needed)

Lepei leverages next-generation WildGS-SLAM, a breakthrough in AI-driven spatial understanding that transforms ordinary monocular video into a real-time 3D map with camera positioning—without LiDAR, IMU, or expensive hardware.
Powered by Gaussian Splatting and Bayesian neural networks, this system adapts to dynamic, noisy, and imperfect environments—construction sites, mines, ports, or moving vehicles—where traditional SLAM fails.
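From an integration standpoint, the input is just ordinary video. The sketch below shows how such a tracker slots into a standard frame loop; MonocularTracker and its track() method are hypothetical placeholders, not the actual WildGS-SLAM interface.

```python
import cv2
import numpy as np

class MonocularTracker:
    """Hypothetical stand-in for a WildGS-SLAM-style backend; the real API differs."""
    def track(self, frame):
        # Placeholder: the real system would estimate the camera pose and
        # update the Gaussian-splat map from this single RGB frame.
        return np.eye(4), None

tracker = MonocularTracker()
cap = cv2.VideoCapture("site_walkthrough.mp4")   # any RGB source: phone, CCTV, drone

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    pose, splat_map = tracker.track(frame)       # incremental pose + 3D map update

cap.release()
```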
Key business benefits:
Cost reduction: replaces costly sensors with standard RGB cameras (e.g., smartphone or CCTV).
Operational reliability: works in dust, fog, motion, and unstable settings.
Deployment flexibility: enables real-time mapping and inspection even in unfinished or hazardous locations.
Data ownership: capture your own spatial data anytime, anywhere, with zero dependency on high-end hardware.
This is SLAM built for reality, enabling companies to digitize, inspect, and navigate physical environments with unprecedented flexibility and scalability.
At Lepei, we’re already integrating these systems to power next-gen inspection, navigation, and digital twin solutions across industrial sectors.
Segment Anything 3D: Fast, Annotation-Free Understanding of 3D Environments
At Lepei, we are applying Segment Anything 3D (SAM-3D) – a powerful fusion of 2D computer vision and 3D spatial data – to extract meaning from complex environments without manual labeling.
How it works:
1. Take a 3D scan (from LiDAR, photogrammetry, or drones).
2. Project it into 2D and apply Meta's Segment Anything Model (SAM) to each view.
3. Project the resulting masks back into 3D to obtain a fully segmented, semantically rich 3D point cloud (a minimal sketch follows below).
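As a rough illustration of steps 2 and 3, the sketch below transfers 2D mask ids from one view back onto the 3D points using a simple pinhole projection. It deliberately omits occlusion handling and multi-view voting, and the mask_ids image is assumed to come from a 2D segmenter such as SAM.

```python
import numpy as np

def transfer_masks_to_points(points, mask_ids, K, T_world_to_cam):
    """Assign each 3D point the 2D mask id it projects onto in one view.

    points: (N, 3) world coordinates; mask_ids: (H, W) integer mask image
    from a 2D segmenter such as SAM; K: 3x3 camera intrinsics;
    T_world_to_cam: 4x4 extrinsics. Returns (N,) labels, -1 if not visible.
    """
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_world_to_cam @ pts_h.T).T[:, :3]          # world -> camera frame
    proj = (K @ cam.T).T
    uv = proj[:, :2] / proj[:, 2:3]                    # perspective division
    labels = np.full(len(points), -1, dtype=int)
    h, w = mask_ids.shape
    for i, (u, v) in enumerate(np.round(uv).astype(int)):
        if cam[i, 2] > 0 and 0 <= u < w and 0 <= v < h:
            labels[i] = mask_ids[v, u]                 # back-project the mask id
    return labels
```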
What this enables for businesses:
No BIM or annotations needed to extract key structures (walls, doors, windows, pipes).
Accelerated digital inventory of buildings, industrial sites, or mines.
Automatic compliance checks between design plans and as-built conditions.
Faster onboarding of brownfield assets into digital workflows or twin systems.
This is a game-changer for industrial environments where speed, cost, and automation matter.
At Lepei, we’re already using SAM-3D to bridge the gap between raw 3D data and intelligent decision-making—with no BIM dependency, and no manual overhead.
DynamicCity: From LiDAR to Living 4D Twins, Fast
At Lepei, we’re excited about DynamicCity, a new framework for generating dynamic 4D digital twins from raw LiDAR data—automatically.
What it does:
Transforms raw LiDAR into large-scale, temporally rich 4D environments with moving objects (cars, people, equipment).
Uses HexPlane + Diffusion Transformers for dense and realistic scene understanding—without frame-by-frame manual work.
Why this matters in AEC:
70% faster modeling of dynamic construction sites.
Real-time tracking of workers, machines, and changes on site.
+40% accuracy vs. traditional methods (mIoU ↑ from 41% → 79.6%).
50% fewer incidents through predictive safety analytics.
60% cheaper LiDAR processing via automation.
Beyond construction:
Autonomous robots & vehicles
Smart city simulations
Game dev and digital worlds
At Lepei, we’re actively integrating these ideas to accelerate site intelligence and unlock real-time optimization.
SpatialLM: Structured 3D Understanding from Raw Data

Understanding 3D environments no longer requires expensive sensors or manual labeling. With models like SpatialLM, spatial reasoning becomes a language task.
What SpatialLM enables:
Processes point clouds from video, RGB-D cameras, or LiDAR.
Identifies architectural elements such as walls, windows, and doors.
Builds structured 3D layouts in IFC format, including 2D floor plans and semantic bounding boxes (sketched below).
Uses MASt3R-SLAM fused with an LLM to achieve high precision and flexibility.
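To make "structured 3D layout" concrete, here is a minimal sketch of the kind of intermediate representation such output can take: semantic, axis-aligned boxes serialized to a machine-readable file. The class, field names, and values are illustrative placeholders; a real exporter would map them onto IFC entities rather than JSON.

```python
import json
from dataclasses import dataclass, asdict
from typing import Tuple

@dataclass
class LayoutElement:
    """One detected architectural element as a semantic, axis-aligned box.

    A simplified, hypothetical stand-in for structured layout output; a real
    pipeline would map these onto IfcWall, IfcDoor, and IfcWindow entities.
    """
    category: str                          # "wall", "door", "window", ...
    min_xyz: Tuple[float, float, float]    # box corner, metres
    max_xyz: Tuple[float, float, float]
    confidence: float

# Illustrative placeholder detections, not real model output.
layout = [
    LayoutElement("wall",   (0.0, 0.0, 0.0), (5.2, 0.2, 2.7), 0.97),
    LayoutElement("door",   (2.1, 0.0, 0.0), (3.0, 0.2, 2.1), 0.91),
    LayoutElement("window", (0.6, 0.0, 1.0), (1.8, 0.2, 2.0), 0.88),
]

# Serialize to a structured intermediate format for downstream BIM tooling.
with open("layout.json", "w") as f:
    json.dump([asdict(e) for e in layout], f, indent=2)
```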
Why this matters for Lepei:
We’ve built our own pipeline that analyzes video to extract layout and structure. SpatialLM confirms the relevance of our direction and opens up new levels of automation and semantic depth.
Key use cases:
Automatic point cloud annotation—without full reliance on LiDAR.
Export to BIM-ready formats for planning and compliance.
Cost estimation, layout verification, and clash detection.
SpatialLM-class technologies are shifting the baseline of what’s possible in construction, real estate, and industrial facility management.
MASt3R-SLAM: Real-Time 3D Mapping with Just a Camera
MASt3R-SLAM is a next-generation SLAM system that enables accurate 3D reconstruction using a single moving camera—without fixed parameters or expensive sensors.
What it enables:
Camera-agnostic tracking: Works even with zoom, blur, or lens distortion.
Real-time mapping: Achieves precise spatial alignment at 15 FPS.
Reduced drift: Uses Gauss-Newton optimization for consistent tracking (a toy example follows this list).
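For readers curious about that last point, here is a toy Gauss-Newton solver for a simplified version of the problem: estimating the 2D rigid transform that best aligns matched points. It is not MASt3R-SLAM's actual optimizer, only an illustration of the iterative least-squares idea behind drift reduction.

```python
import numpy as np

def gauss_newton_align_2d(P, Q, iters=10):
    """Estimate (theta, tx, ty) so that R(theta) @ p + t ~= q for matched 2D points."""
    theta, tx, ty = 0.0, 0.0, 0.0
    for _ in range(iters):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        r = (P @ R.T + np.array([tx, ty]) - Q).ravel()   # residuals, shape (2N,)
        J = np.zeros((2 * len(P), 3))
        J[0::2, 0] = -s * P[:, 0] - c * P[:, 1]          # d r_x / d theta
        J[1::2, 0] =  c * P[:, 0] - s * P[:, 1]          # d r_y / d theta
        J[0::2, 1] = 1.0                                 # d r_x / d tx
        J[1::2, 2] = 1.0                                 # d r_y / d ty
        delta = np.linalg.solve(J.T @ J, -J.T @ r)       # normal equations
        theta, tx, ty = theta + delta[0], tx + delta[1], ty + delta[2]
    return theta, tx, ty

# Synthetic check: recover a known rotation and translation.
rng = np.random.default_rng(0)
P = rng.uniform(-1, 1, (50, 2))
true_theta, true_t = 0.3, np.array([0.5, -0.2])
Rt = np.array([[np.cos(true_theta), -np.sin(true_theta)],
               [np.sin(true_theta),  np.cos(true_theta)]])
Q = P @ Rt.T + true_t
print(gauss_newton_align_2d(P, Q))   # ~ (0.3, 0.5, -0.2)
```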
How Lepei uses it:
At Lepei.pro, we apply MASt3R-SLAM in real-world AI pipelines for the AEC sector:
Reconstructing 3D models from drone and mobile video.
Automating structural inspections from video-based scans.
Accelerating digital twin generation with minimal manual work.
Why it matters:
Precise, lightweight, and hardware-agnostic SLAM unlocks scalable automation in construction, mining, and infrastructure digitization—cutting costs and enabling real-time decision-making.
3D Segmentation Without 3D Annotations
How do you train 3D segmentation without 3D annotations or RGB cameras? We studied an architecture that solves this problem, and here is why it matters.
Many companies working with 3D data (for example, LiDAR point clouds) face the same problem: 3D annotation is expensive, labor-intensive, and rarely available. Yet it is exactly what is needed to train a neural network to understand objects in space.
Recently, our team analyzed the paper "3D Can Be Explored in 2D", and it really caught our attention. Why? Because the authors propose an elegant approach: teaching 3D understanding through 2D segmentation, without annotations, RGB data, or complex dataset assembly.
What does this mean in practice?
- You start with a LiDAR scan (a point cloud).
- You render a series of 2D projections of the scene, as if viewing it from virtual cameras.
- You run standard 2D segmentation on each projection (for example, Mask2Former).
- Each point then receives "votes" from all viewpoints; the system finds a consensus and builds a full 3D segmentation (see the sketch below).
All of this is done without a single 3D label, using only pseudo-labels. Even so, accuracy reaches up to 78% mIoU, a result comparable to training on gold-standard annotations.
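A minimal sketch of that voting step is shown below: given per-view pseudo-labels for every point, each point takes the majority label across the views that see it. The projection and the 2D segmenter itself are assumed to happen upstream.

```python
import numpy as np

def aggregate_point_votes(per_view_labels, num_classes):
    """Fuse per-view pseudo-labels into one 3D segmentation by majority vote.

    per_view_labels: (V, N) array; entry [v, i] is the class predicted for
    point i in view v, or -1 if the point is not visible in that view.
    Returns (N,) consensus labels, -1 where no view saw the point.
    """
    V, N = per_view_labels.shape
    consensus = np.full(N, -1, dtype=int)
    for i in range(N):
        visible = per_view_labels[:, i][per_view_labels[:, i] >= 0]
        if visible.size:
            consensus[i] = np.bincount(visible, minlength=num_classes).argmax()
    return consensus

# Toy example: 3 virtual views voting on 4 points (classes 0..2, -1 = unseen).
votes = np.array([[0, 1, 2, -1],
                  [0, 1, 1, -1],
                  [0, 2, 1,  2]])
print(aggregate_point_votes(votes, num_classes=3))   # -> [0 1 1 2]
```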
Why is this important to us?
At Lepei, we regularly study and test such approaches, because we believe the future lies in smart adaptation of AI to real-world conditions, where data may be incomplete, raw, or non-standard.
This approach gives us a new tool for exactly those situations:
- In construction and infrastructure, when only a scan is available and no annotations exist.
- In logistics, for analyzing warehouses, pipelines, and tunnels.
- In industry, where every LiDAR flight can turn into a useful map without manual work.
We are already looking at how to apply these ideas in our current R&D projects, for example automatic segmentation of objects in tunnels and factories without manual dataset preparation.
AI in 3D is not only about robots and metaverses. It is about real business value: lower costs, faster launch, and a clearer picture of what is happening at the facility.
Types Of Data:
We have combined a range of technologies and data sources, each with its own unique advantages, to ensure that the BIM model is as accurate as possible. By leveraging the strengths of each data type, we can provide an unprecedented level of precision and detail in the BIM model.
• Satellite imagery: provides wide coverage and is readily available.
• Mapping layers from OpenStreetMap: freely available, offering geospatial data such as roads and buildings.
• Drone video: provides fast coverage at a high resolution of up to 0.2 m.
• Mobile mapping technology: captures 3D data at a resolution of 0.03 m.
• LiDAR scanners: provide precise distance measurements and 3D data at a resolution of 0.01 m.
• 360-degree video: provides fast and wide coverage at a resolution of 0.3 m.