AI Research.
> Building the intelligence layer for creative storytelling — from fine-tuned language models to multimodal generation pipelines.
- 6 Experiments
- 3 Models Trained
- 15K+ Training Pairs
- 3 Cloud Providers
Active Areas.
Model Fine-Tuning
Fine-tuning foundation models for domain-specific creative direction — mythology, cinematography, and visual storytelling.
Video Generation
Text-to-video and image-to-video pipelines with precise camera control, temporal coherence, and cinematic quality.
Audio Intelligence
Scene-aware audio generation — dialogue, ambience, sound effects, and music that matches visual narrative.
Multi-Agent Orchestration
Coordinating specialized AI agents to automate full production workflows from script to final render.
Visual Understanding
Multimodal analysis of images and video — composition, style transfer, reference matching, and aesthetic scoring.
Infrastructure & Serving
Low-latency model serving, adaptive batching, and cost-efficient GPU orchestration across cloud providers.
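The adaptive-batching idea above can be sketched in a few lines: accumulate queued requests until either a batch-size cap or a latency budget is hit, so large batches keep GPUs busy while small ones still flush promptly. This is an illustrative toy, not the production serving loop; all names and the 8-request/50 ms numbers are assumptions.

```python
import time
from collections import deque

def adaptive_batch(queue, max_batch=8, max_wait_s=0.05):
    """Drain up to max_batch requests, waiting at most max_wait_s
    for stragglers so a short queue still flushes on time."""
    batch = []
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch and time.monotonic() < deadline:
        if queue:
            batch.append(queue.popleft())
        else:
            time.sleep(0.001)  # idle briefly instead of busy-spinning
    return batch

# Simulate 20 pending requests being drained into batches.
requests = deque(range(20))
batches = []
while requests:
    batches.append(adaptive_batch(requests))
```

The same two knobs (batch cap, wait budget) are what a real server would tune per model and per traffic pattern.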
Active Runs.
MiniMax Video Pipeline
IN PROGRESS: Integrating the MiniMax Hailuo API for text-to-video and image-to-video with 50+ camera control presets.
Scene-Aware Audio Matching
IN PROGRESS: Prototype system that analyzes visual scenes and generates matching ambient audio, dialogue, and sound effects.
Multi-Agent Production Pipeline
IN PROGRESS: Designing agent orchestration for end-to-end production: script analysis, shot planning, visual generation, and audio layering.
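The orchestration pattern behind this run can be reduced to a sketch: each agent is a stage that reads and writes a shared job record, and the pipeline just runs the stages in order. The stage names and string-splitting "analysis" here are placeholders, not the actual agents.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """Shared state handed from one agent stage to the next."""
    script: str
    artifacts: dict = field(default_factory=dict)

def script_analysis(job):
    # Toy stand-in for a real script-analysis agent: one scene per sentence.
    job.artifacts["scenes"] = [s.strip() for s in job.script.split(".") if s.strip()]
    return job

def shot_planning(job):
    # Toy stand-in for a shot-planning agent: one shot per scene.
    job.artifacts["shots"] = [f"shot for: {scene}" for scene in job.artifacts["scenes"]]
    return job

def run_pipeline(job, stages):
    for stage in stages:
        job = stage(job)
    return job

job = run_pipeline(Job("Hero enters. Storm rises."), [script_analysis, shot_planning])
```

Keeping all intermediate outputs in one `artifacts` dict makes it easy to append later stages (visual generation, audio layering) without changing earlier ones.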
Multimodal Training Dataset v2
PLANNED: Building image+analysis training pairs from 817 GCS assets for multimodal fine-tuning. Quality-filtered with human ratings.
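Quality filtering by human rating could look something like the sketch below; the 1–5 rating scale, field names, and threshold are assumptions, not the project's actual schema.

```python
def filter_pairs(pairs, min_rating=4):
    """Keep only image/analysis pairs whose human rating clears the bar.
    Field names and the 1-5 scale are illustrative placeholders."""
    return [p for p in pairs if p.get("rating", 0) >= min_rating]

pairs = [
    {"image": "gs://bucket/a.png", "analysis": "low-key lighting", "rating": 5},
    {"image": "gs://bucket/b.png", "analysis": "blurry frame", "rating": 2},
]
kept = filter_pairs(pairs)
```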
Inference Cost Optimization
PLANNED: Benchmarking serving strategies for 32B models: quantization, speculative decoding, and multi-provider routing.
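Of the three strategies, multi-provider routing is the easiest to sketch: pick the cheapest provider that can serve the request. Provider names, prices, and the capacity field are invented for illustration; a real router would also weigh latency and availability.

```python
def route(providers, tokens):
    """Pick the cheapest provider whose context window fits the request.
    All provider data here is made up for illustration."""
    eligible = [p for p in providers if p["max_tokens"] >= tokens]
    return min(eligible, key=lambda p: p["usd_per_1k_tokens"])

providers = [
    {"name": "provider-a", "usd_per_1k_tokens": 0.60, "max_tokens": 8192},
    {"name": "provider-b", "usd_per_1k_tokens": 0.45, "max_tokens": 4096},
    {"name": "provider-c", "usd_per_1k_tokens": 0.80, "max_tokens": 32768},
]
choice = route(providers, tokens=6000)  # provider-b is cheaper but too small
```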