Meta Video Ad Analyzer
v1.0.0
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
MIT-0
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
Name/description match the code and prompts: frame extraction (OpenCV/ffmpeg), OCR (EasyOCR), transcription (Google Speech), and AI analysis (Vertex/Gemini). The requested libraries and binaries align with the stated purpose.
Instruction Scope
SKILL.md instructs the agent to read local video/image files and a Google service-account JSON (GOOGLE_APPLICATION_CREDENTIALS) and to call Vertex AI / Google Speech. Those actions are appropriate for the skill, but the SKILL.md requires a service account file path (access to local filesystem) which the registry metadata did not declare.
Install Mechanism
No install spec (instruction-only install) — highest transparency. Dependencies are standard Python packages and system ffmpeg; no arbitrary external URLs or extract steps are present.
Credentials
SKILL.md explicitly requires GOOGLE_APPLICATION_CREDENTIALS and Google APIs, but the registry metadata lists no required env vars or primary credential. This mismatch is concerning: the skill will need a Google service-account file (and project permissions) to use Vertex AI and Speech-to-Text, but that credential requirement was not declared in the metadata.
Persistence & Privilege
always:false and standard model invocation are used. The skill does not request permanent system-wide presence or attempt to modify other skills. It reads files passed to it (videos and a service account path) — expected for this type of tool.
What to consider before installing
This skill appears to do what it says (extract frames, OCR, transcribe audio, call Gemini/Vertex), but its metadata omits the required Google credentials. Before installing or running:
1. Confirm you are comfortable providing a Google service-account JSON, and grant only least-privilege roles (Vertex AI, Speech-to-Text, and Cloud Storage if needed). DO NOT use a project-owner or broad admin key.
2. Inspect the remaining functions (analyze_frames_batch, analyze_video_natively, reconciliation) for hard-coded or external endpoints, to ensure frames and transcripts are sent only to Google/Vertex and not to unknown servers.
3. Consider running the code in a sandbox with a restricted service account, and test with non-sensitive videos.
4. If the registry claims 'no env vars', ask the publisher to update the metadata to declare GOOGLE_APPLICATION_CREDENTIALS (and any other required env vars) before trusting the skill.
If you cannot verify these details, treat the skill as untrusted and avoid supplying high-privilege credentials.
Like a lobster shell, security has layers: review code before you run it.
latest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
Video Ad Analyzer
AI-powered video content extraction using Google Gemini Vision.
What This Skill Does
- Frame Extraction: Smart sampling with scene change detection
- OCR Text Detection: Extract text overlays using EasyOCR
- Audio Transcription: Convert speech to text with Google Cloud Speech
- AI Scene Analysis: Describe each scene using Gemini Vision
- Native Video Analysis: Direct video understanding for longer content
- Thumbnail Generation: Auto-generate thumbnails from first frame
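To make the "scene change detection" idea concrete, here is a minimal pure-Python sketch of histogram-based change detection. It is an illustration under stated assumptions, not the skill's actual implementation: the function names are hypothetical, frames are represented as flat lists of 8-bit grayscale values, and the change score is defined here as the percentage of histogram mass that differs between two frames. The Key Features table lists `threshold=65`, but the scale the skill uses for that number is not documented, so treating it as a 0-100 percentage is a guess.

```python
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 16) -> List[float]:
    """Return a normalized histogram of 8-bit grayscale pixel values."""
    counts = [0] * bins
    for value in pixels:
        # Map 0..255 onto 0..bins-1, clamping any out-of-range high values.
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1  # avoid division by zero on empty frames
    return [count / total for count in counts]


def histogram_change(prev: Sequence[int], curr: Sequence[int], bins: int = 16) -> float:
    """Percentage (0-100) of histogram mass that differs between two frames."""
    h_prev = gray_histogram(prev, bins)
    h_curr = gray_histogram(curr, bins)
    # Half the L1 distance between normalized histograms, scaled to a percentage.
    return 50.0 * sum(abs(a - b) for a, b in zip(h_prev, h_curr))


def is_scene_change(prev: Sequence[int], curr: Sequence[int], threshold: float = 65.0) -> bool:
    """Flag a scene change when the score exceeds the (assumed) threshold."""
    return histogram_change(prev, curr) > threshold
```

With this metric, a cut from a uniformly dark frame to a uniformly bright one scores the maximum, while a frame that is only half changed stays below the assumed threshold.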
Setup
1. Environment Variables
# Required for Gemini Vision
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Required for audio transcription
# (same service account needs Speech-to-Text API enabled)
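Because the skill depends on a service-account file, it can help to check the variable before the first API call. The sketch below is a hypothetical pre-flight helper (`check_service_account` is not part of the skill) that only confirms the path is set, the file parses as JSON, and its `type` field is `service_account`. It does not verify roles or that the Vertex AI and Speech-to-Text APIs are enabled.

```python
import json
import os
from pathlib import Path
from typing import List, Mapping, Optional


def check_service_account(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of problems with GOOGLE_APPLICATION_CREDENTIALS (empty if none)."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]

    key_file = Path(path)
    if not key_file.is_file():
        return [f"credentials file not found: {path}"]

    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"credentials file is not readable JSON: {exc}"]

    # Service-account key files carry "type": "service_account".
    if not isinstance(data, dict) or data.get("type") != "service_account":
        return ["credentials file is not a service-account key"]
    return []
```

Calling `check_service_account()` with no arguments reads the real environment; passing a dictionary makes the check easy to exercise in isolation.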
2. Dependencies
pip install opencv-python pillow easyocr ffmpeg-python google-cloud-speech vertexai google-api-python-client
Also requires ffmpeg and ffprobe installed on system.
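A quick way to confirm the system binaries are on `PATH` before running the extractor is sketched below. The helper names are hypothetical and not part of the skill; the check relies only on the standard library's `shutil.which`.

```python
import shutil
from typing import Dict, List, Optional, Sequence

REQUIRED_BINARIES = ("ffmpeg", "ffprobe")


def find_binaries(names: Sequence[str] = REQUIRED_BINARIES) -> Dict[str, Optional[str]]:
    """Map each binary name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_binaries(names: Sequence[str] = REQUIRED_BINARIES) -> List[str]:
    """Return the names of required binaries that could not be found."""
    return [name for name, path in find_binaries(names).items() if path is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing system dependencies: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe found")
```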
Usage
Basic Video Analysis
from scripts.video_extractor import VideoExtractor
from scripts.models import ExtractedVideoContent
import vertexai
from vertexai.generative_models import GenerativeModel
# Initialize Vertex AI
vertexai.init(project="your-project-id", location="us-central1")
gemini_model = GenerativeModel("gemini-1.5-flash")
# Create extractor
extractor = VideoExtractor(gemini_model=gemini_model)
# Analyze video
result = extractor.extract_content("/path/to/video.mp4")
print(f"Duration: {result.duration}s")
print(f"Scenes: {len(result.scene_timeline)}")
print(f"Text overlays: {len(result.text_timeline)}")
print(f"Transcript: {result.transcript[:200]}...")
Extract Only Frames
frames, timestamps, text_timeline, scene_timeline, thumbnail = extractor.extract_smart_frames(
    "/path/to/video.mp4",
    scene_interval=2,   # Check for scene changes every 2s
    text_interval=0.5   # Check for text every 0.5s
)
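The two intervals above imply two sampling schedules: a coarse one for scene checks and a finer one for text checks. The sketch below shows one plausible way such timestamps could be generated. It is an assumption for illustration only: `sample_times` and `plan_checks` are hypothetical names, and how `extract_smart_frames` actually picks frames (for example, whether it includes the final timestamp) is not documented here.

```python
from typing import Dict, List


def sample_times(duration: float, interval: float) -> List[float]:
    """Timestamps 0, interval, 2*interval, ... up to and including duration."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    times = []
    step = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while step * interval <= duration:
        times.append(round(step * interval, 3))
        step += 1
    return times


def plan_checks(duration: float, scene_interval: float = 2, text_interval: float = 0.5) -> Dict[str, List[float]]:
    """Return the scene-check and text-check timestamps for a video."""
    return {
        "scene": sample_times(duration, scene_interval),
        "text": sample_times(duration, text_interval),
    }
```

For a 4-second clip with the default intervals, this yields three scene checks and nine text checks, which shows why the text pass dominates the OCR cost.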
Analyze Images
# Works with images too
result = extractor.extract_content("/path/to/image.jpg")
print(result.scene_timeline[0]['description'])
Output Structure
ExtractedVideoContent(
    video_path="/path/to/video.mp4",
    duration=30.5,
    transcript="Here's what we found...",
    text_timeline=[
        {"at": 0.0, "text": ["Download Now"]},
        {"at": 5.5, "text": ["50% Off Today"]}
    ],
    scene_timeline=[
        {"timestamp": 0.0, "description": "Woman using phone app..."},
        {"timestamp": 2.0, "description": "Product showcase with features..."}
    ],
    thumbnail_url="/static/thumbnails/video_thumb.jpg",
    extraction_complete=True
)
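Note that the two timelines use different time keys (`at` for text overlays, `timestamp` for scenes). If a single chronological view is useful, a small helper like the one below can merge them. This is a sketch that assumes the dictionary shapes shown in the output above; `merged_timeline` is a hypothetical name, not a function provided by the skill.

```python
from typing import Any, Dict, List, Tuple


def merged_timeline(
    text_timeline: List[Dict[str, Any]],
    scene_timeline: List[Dict[str, Any]],
) -> List[Tuple[float, str, Any]]:
    """Combine both timelines into (time, kind, payload) tuples sorted by time."""
    events: List[Tuple[float, str, Any]] = []
    for entry in text_timeline:
        events.append((float(entry["at"]), "text", entry["text"]))
    for entry in scene_timeline:
        events.append((float(entry["timestamp"]), "scene", entry["description"]))
    # Sort by time; at equal times "scene" sorts before "text" alphabetically.
    return sorted(events, key=lambda event: (event[0], event[1]))


if __name__ == "__main__":
    text = [{"at": 0.0, "text": ["Download Now"]}, {"at": 5.5, "text": ["50% Off Today"]}]
    scenes = [{"timestamp": 0.0, "description": "Woman using phone app..."}]
    for time, kind, payload in merged_timeline(text, scenes):
        print(f"{time:>5.1f}s  {kind:<5}  {payload}")
```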
Key Features
| Feature | Description |
|---|---|
| Scene Detection | Histogram-based change detection (threshold=65) |
| OCR Confidence | Tiered thresholds (0.5 high, 0.3 low) |
| AI Proofreading | Gemini cleans up OCR errors |
| Source Reconciliation | Merges OCR + Vision text intelligently |
| Native Video | Direct Gemini analysis for <20MB files |
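Two of the rows above are simple threshold rules, sketched here for illustration. Everything beyond the numbers in the table is an assumption: the function names are hypothetical, the handling of the middle OCR tier (kept as candidates for later proofreading) is a guess at what "tiered" means, and the 20MB limit is interpreted as 20 × 1024 × 1024 bytes, which the source does not specify.

```python
from typing import Iterable, List, Tuple

HIGH_CONFIDENCE = 0.5   # from the table: "0.5 high"
LOW_CONFIDENCE = 0.3    # from the table: "0.3 low"
NATIVE_LIMIT_BYTES = 20 * 1024 * 1024  # assumed meaning of "<20MB"


def tier_detections(detections: Iterable[Tuple[str, float]]) -> Tuple[List[str], List[str]]:
    """Split (text, confidence) pairs into accepted and candidate lists.

    Assumed policy: >= 0.5 is accepted as-is, 0.3 to 0.5 is kept as a
    candidate for later proofreading, and anything below 0.3 is dropped.
    """
    accepted: List[str] = []
    candidates: List[str] = []
    for text, confidence in detections:
        if confidence >= HIGH_CONFIDENCE:
            accepted.append(text)
        elif confidence >= LOW_CONFIDENCE:
            candidates.append(text)
    return accepted, candidates


def fits_native_analysis(size_bytes: int) -> bool:
    """True when a file is under the assumed native-video size limit."""
    return size_bytes < NATIVE_LIMIT_BYTES
```

In practice the size would come from something like `os.path.getsize(path)` before deciding between frame-by-frame and native analysis.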
Prompts
Customize AI behavior by editing prompts in the prompts/ folder:
- scene_analysis.md - Frame analysis prompts
- scene_reconciliation.md - Scene enrichment prompts
Common Questions This Answers
- "What text appears in this video ad?"
- "Describe each scene in this creative"
- "What does the narrator say?"
- "Extract the call-to-action from this ad"
Files
6 total
