6 Active Experiments

AI Research.

> Building the intelligence layer for creative storytelling — from fine-tuned language models to multimodal generation pipelines.

6 Experiments · 3 Models Trained · 15K+ Training Pairs · 3 Cloud Providers

Research Tracks

Active Areas.

Model Fine-Tuning

[ MDL_FT ]

Training foundation models for domain-specific creative direction — mythology, cinematography, and visual storytelling.

Qwen3-32B · LoRA · Azure AI Foundry · Vertex AI
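LoRA keeps the base weights frozen and trains only low-rank factors, which is why a 32B model is practical to fine-tune on managed cloud GPUs. A back-of-envelope sketch of the trainable-parameter budget; the layer count, hidden size, and rank below are illustrative assumptions, not published Qwen3-32B internals:

```python
# Rough LoRA sizing for a 32B-class decoder model.
# Shapes are assumptions for illustration, not real Qwen3-32B config.

def lora_trainable_params(hidden: int, num_layers: int, rank: int,
                          targets_per_layer: int = 4) -> int:
    """Trainable params when LoRA adapts square d x d projections.

    Each adapted weight W gains factors A (r x d) and B (d x r),
    i.e. rank * 2 * hidden trainable parameters per matrix.
    """
    per_matrix = rank * 2 * hidden
    return per_matrix * targets_per_layer * num_layers

# Assumed shape: 64 layers, hidden size 5120, rank 16 on the four
# attention projections (q, k, v, o), treated as square for simplicity.
trainable = lora_trainable_params(hidden=5120, num_layers=64, rank=16)
print(f"{trainable:,} trainable params ({trainable / 32e9:.4%} of 32B)")
```

Under these assumptions the adapter is roughly 42M parameters, about 0.13% of the full model, which is what makes multi-run sweeps across providers affordable.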

Video Generation

[ VID_GEN ]

Text-to-video and image-to-video pipelines with precise camera control, temporal coherence, and cinematic quality.

MiniMax · Hailuo · Camera Control · Temporal Consistency

Audio Intelligence

[ AUD_INT ]

Scene-aware audio generation — dialogue, ambience, sound effects, and music that matches visual narrative.

TTS · Sound Design · Scene Matching · Spatial Audio

Multi-Agent Orchestration

[ AGT_ORC ]

Coordinating specialized AI agents to automate full production workflows from script to final render.

Agent Pipelines · Tool Use · Production DAGs · Auto-QA
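The production-DAG idea can be sketched with the standard-library topological sorter: each stage runs only after its dependencies finish. Stage names here are illustrative, not the actual pipeline:

```python
# Minimal production-DAG runner: stages execute in dependency order.
# Stage names are illustrative placeholders for the real workflow.
from typing import Callable
from graphlib import TopologicalSorter

def run_pipeline(dag: dict[str, set[str]],
                 handlers: dict[str, Callable[[], None]]) -> list[str]:
    """Execute every stage after its dependencies; return the order used."""
    order = list(TopologicalSorter(dag).static_order())
    for stage in order:
        handlers[stage]()
    return order

# Each key depends on the set of stages it maps to.
dag = {
    "shot_planning": {"script_analysis"},
    "visual_gen": {"shot_planning"},
    "audio_layering": {"shot_planning"},
    "final_render": {"visual_gen", "audio_layering"},
}
handlers = {s: (lambda s=s: None)
            for s in ["script_analysis", *dag]}
order = run_pipeline(dag, handlers)
print(order)
```

A real orchestrator would add retries, artifact passing, and the Auto-QA gate between stages, but the scheduling core is just this dependency sort.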

Visual Understanding

[ VIS_UND ]

Multimodal analysis of images and video — composition, style transfer, reference matching, and aesthetic scoring.

Multimodal · Style Analysis · Composition · Aesthetics

Infrastructure & Serving

[ INF_SRV ]

Low-latency model serving, adaptive batching, and cost-efficient GPU orchestration across cloud providers.

GCP · Azure · Model Serving · Cost Optimization
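One way to frame cross-cloud cost efficiency is "cheapest provider that still meets the latency budget". A minimal routing sketch; the prices and latencies are placeholders, not real GCP or Azure quotes:

```python
# Cost-aware provider routing: prefer the cheapest endpoint whose
# observed p95 latency fits the request's budget; fall back to the
# fastest endpoint when nothing meets the SLO. Numbers are made up.
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    usd_per_1k_tokens: float
    p95_latency_ms: float

def route(providers: list[Provider], latency_budget_ms: float) -> Provider:
    eligible = [p for p in providers if p.p95_latency_ms <= latency_budget_ms]
    if not eligible:  # no endpoint meets the SLO: degrade to fastest
        return min(providers, key=lambda p: p.p95_latency_ms)
    return min(eligible, key=lambda p: p.usd_per_1k_tokens)

fleet = [Provider("gcp-a100", 0.9, 450.0),
         Provider("azure-a10", 0.4, 900.0)]
print(route(fleet, latency_budget_ms=500.0).name)
```

In practice the latency and price figures would come from live telemetry rather than constants, but the selection rule stays this simple.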

Experiment Log

Active Runs.

Qwen3-32B Creative Director

COMPLETE
Model Fine-Tuning | Feb 2026

Three fine-tuning runs across Vertex AI and Azure AI Foundry. Best eval loss of 1.017, a 14.8% improvement reached at epoch 2.

1.017 eval loss · 15,434 training pairs · 1.98 PF compute

MiniMax Video Pipeline

IN PROGRESS
Video Generation | Feb 2026

Integrating MiniMax Hailuo API for text-to-video and image-to-video with 50+ camera control presets.

50+ camera moves · 720p output · 6s generation
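A preset layer can be as simple as a lookup that prefixes a camera directive onto the text prompt before it reaches the video API. A hypothetical sketch: the preset names and the bracketed-command convention are assumptions for illustration, not the documented MiniMax/Hailuo interface, and the real pipeline holds 50+ entries rather than three:

```python
# Illustrative camera-preset layer for a text-to-video pipeline.
# Preset names and bracketed directives are hypothetical examples.
CAMERA_PRESETS = {
    "slow_push": "[Push in]",
    "reveal_left": "[Pan left]",
    "crane_up": "[Pedestal up]",
}

def apply_preset(prompt: str, preset: str) -> str:
    """Prefix the prompt with the chosen preset's camera directive."""
    return f"{CAMERA_PRESETS[preset]} {prompt}"

print(apply_preset("A monk walks through morning mist", "slow_push"))
```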

Scene-Aware Audio Matching

IN PROGRESS
Audio Intelligence | Jan 2026

Prototype system that analyzes visual scenes and generates matching ambient audio, dialogue, and sound effects.

Multi-Agent Production Pipeline

IN PROGRESS
Multi-Agent Orchestration | Jan 2026

Designing agent orchestration for end-to-end production: script analysis, shot planning, visual generation, audio layering.

Multimodal Training Dataset v2

PLANNED
Visual Understanding | Q1 2026

Building image+analysis training pairs from 817 GCS assets for multimodal fine-tuning. Quality-filtered with human ratings.

817 images · 5,200 filtered pairs

Inference Cost Optimization

PLANNED
Infrastructure & Serving | Q1 2026

Benchmarking serving strategies for 32B models: quantization, speculative decoding, and multi-provider routing.
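The quantization headroom is easy to bound from first principles, since weight memory scales linearly with bit width. A rough sketch for a 32B-parameter model, counting weights only (KV cache and activations excluded):

```python
# Weight-memory arithmetic for a 32B-parameter model at common
# quantization widths. Weights only; ignores KV cache and activations.
def weight_gib(params: float, bits: int) -> float:
    """Memory in GiB to hold `params` weights at `bits` bits each."""
    return params * bits / 8 / 2**30

for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_gib(32e9, bits):.1f} GiB")
```

At 16-bit the weights alone are close to 60 GiB, which is what pushes 32B serving toward 8-bit or 4-bit quantization, or multi-GPU sharding.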

research-timeline.log
# ADIYOGI ARTS — RESEARCH TIMELINE
FEB 2026
Qwen3-32B Run 2 complete — eval ↓14.8% — best loss 1.017
MiniMax video pipeline — 50+ camera presets integrated
Audio scene matching — prototype in testing
JAN 2026
Qwen3-32B Run 0–1 — baseline established on Vertex AI + Azure
15,434 training pairs curated — mythology + cinematography domain
Multi-agent orchestration — architecture design phase
DEC 2025
Midjourney prompt dataset — 18K raw pairs collected
817 reference images transferred to GCS
Research infrastructure — GCP + Azure provisioned
---
NEXT: Run 3 — cosine LR decay, quality-filtered data, multimodal pairs