✓ Verified ✍️ Content Creation ✓ Enhanced Data

Instagram Reels

Download Instagram Reels, transcribe audio, and extract captions.

Rating: 4.6 (212 reviews)
Downloads: 704 downloads
Version: 1.0.0

Overview

Download Instagram Reels, transcribe audio, and extract captions.

✨Key Features

Extract metadata and audio URL

Download audio and convert to mp3

Transcribe with Groq Whisper

Complete Documentation

View Source →

Instagram Reels Skill

Download Instagram Reels, transcribe the audio, and extract the caption/description.

Setup

Install required tools:

bash

pip install yt-dlp
apt install ffmpeg    # or: brew install ffmpeg

Get a free Groq API key at https://console.groq.com
Set your environment variable:

bash

export GROQ_API_KEY="your-groq-api-key"

Usage

Process a reel in three steps: extract metadata, download audio, transcribe.

Step 1: Extract metadata and audio URL

bash

yt-dlp --write-info-json --skip-download -o "/tmp/reel" "REEL_URL"

This writes /tmp/reel.info.json with the caption, uploader, CDN URLs, and other metadata. No login required for public reels.

Step 2: Download audio and convert to mp3

Extract the audio CDN URL from metadata and download it directly:

bash

AUDIO_URL=$(python3 -c "
import json
d = json.load(open('/tmp/reel.info.json'))
for f in d.get('formats', []):
    if f.get('ext') == 'm4a':
        print(f['url'])
        break
")
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3

Step 3: Transcribe with Groq Whisper

bash

curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"

Returns JSON with text (full transcript) and segments (with timestamps). Language is auto-detected.

Extract caption from metadata

bash

python3 -c "
import json
d = json.load(open('/tmp/reel.info.json'))
print('Caption:', d.get('description', 'No caption'))
print('Author:', d.get('uploader', 'Unknown'))
print('Duration:', round(d.get('duration', 0)), 'seconds')
"

Notes

Metadata extraction works on public reels without authentication
For private reels, pass cookies: yt-dlp --cookies /path/to/cookies.txt --write-info-json --skip-download -o "/tmp/reel" "REEL_URL"
Export cookies with a browser extension like "Get cookies.txt LOCALLY"
Groq Whisper is free (rate-limited) and returns results in ~1-2 seconds
Max audio length: 25 minutes per request
Clean up temp files after: rm -f /tmp/reel.info.json /tmp/reel-audio.*
Also works with TikTok, YouTube Shorts, and other platforms supported by yt-dlp

Examples

bash

# Full transcription pipeline
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.instagram.com/reel/ABC123/" && \
AUDIO_URL=$(python3 -c "import json; [print(f['url']) for f in json.load(open('/tmp/reel.info.json')).get('formats',[]) if f.get('ext')=='m4a'][:1]") && \
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a && \
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3 2>/dev/null && \
curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"

# Just get the caption (no transcription)
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.instagram.com/reel/ABC123/" && \
python3 -c "import json; d=json.load(open('/tmp/reel.info.json')); print(d.get('description',''))"

# Transcribe a TikTok video (same pipeline)
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.tiktok.com/@user/video/123" && \
AUDIO_URL=$(python3 -c "import json; [print(f['url']) for f in json.load(open('/tmp/reel.info.json')).get('formats',[]) if f.get('ext')=='m4a'][:1]") && \
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a && \
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3 2>/dev/null && \
curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"

Installation

Terminal bash


openclaw install instagram-reels

Copied!

💻Code Examples

apt install ffmpeg # or: brew install ffmpeg

apt-install-ffmpeg--or-brew-install-ffmpeg.txt

2. Get a free Groq API key at [https://console.groq.com](https://console.groq.com)
3. Set your environment variable:

export GROQ_API_KEY="your-groq-api-key"

export-groqapikeyyour-groq-api-key.txt

## Usage

Process a reel in three steps: extract metadata, download audio, transcribe.

### Step 1: Extract metadata and audio URL

yt-dlp --write-info-json --skip-download -o "/tmp/reel" "REEL_URL"

yt-dlp---write-info-json---skip-download--o-tmpreel-reelurl.txt

This writes `/tmp/reel.info.json` with the caption, uploader, CDN URLs, and other metadata. No login required for public reels.

### Step 2: Download audio and convert to mp3

Extract the audio CDN URL from metadata and download it directly:

-F "response_format=verbose_json"

--f-responseformatverbosejson.txt

Returns JSON with `text` (full transcript) and `segments` (with timestamps). Language is auto-detected.

### Extract caption from metadata

"

.txt

## Notes

- Metadata extraction works on public reels without authentication
- For private reels, pass cookies: `yt-dlp --cookies /path/to/cookies.txt --write-info-json --skip-download -o "/tmp/reel" "REEL_URL"`
- Export cookies with a browser extension like "Get cookies.txt LOCALLY"
- Groq Whisper is free (rate-limited) and returns results in ~1-2 seconds
- Max audio length: 25 minutes per request
- Clean up temp files after: `rm -f /tmp/reel.info.json /tmp/reel-audio.*`
- Also works with TikTok, YouTube Shorts, and other platforms supported by yt-dlp

## Examples

example.sh

AUDIO_URL=$(python3 -c "
import json
d = json.load(open('/tmp/reel.info.json'))
for f in d.get('formats', []):
    if f.get('ext') == 'm4a':
        print(f['url'])
        break
")
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3

example.sh

curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"

example.sh

python3 -c "
import json
d = json.load(open('/tmp/reel.info.json'))
print('Caption:', d.get('description', 'No caption'))
print('Author:', d.get('uploader', 'Unknown'))
print('Duration:', round(d.get('duration', 0)), 'seconds')
"

example.sh

# Full transcription pipeline
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.instagram.com/reel/ABC123/" && \
AUDIO_URL=$(python3 -c "import json; [print(f['url']) for f in json.load(open('/tmp/reel.info.json')).get('formats',[]) if f.get('ext')=='m4a'][:1]") && \
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a && \
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3 2>/dev/null && \
curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"

# Just get the caption (no transcription)
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.instagram.com/reel/ABC123/" && \
python3 -c "import json; d=json.load(open('/tmp/reel.info.json')); print(d.get('description',''))"

# Transcribe a TikTok video (same pipeline)
yt-dlp --write-info-json --skip-download -o "/tmp/reel" "https://www.tiktok.com/@user/video/123" && \
AUDIO_URL=$(python3 -c "import json; [print(f['url']) for f in json.load(open('/tmp/reel.info.json')).get('formats',[]) if f.get('ext')=='m4a'][:1]") && \
curl -sL "$AUDIO_URL" -o /tmp/reel-audio.m4a && \
ffmpeg -y -i /tmp/reel-audio.m4a -acodec libmp3lame -q:a 4 /tmp/reel-audio.mp3 2>/dev/null && \
curl -s https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@/tmp/reel-audio.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=verbose_json"