YouTube Transcript Scraper Tool

Overview

The YouTube Transcript Scraper Tool is part of the Atomic Agents ecosystem. It fetches the transcript of a YouTube video, together with the video's duration, comments, and metadata.

Prerequisites and Dependencies

Python 3.9 or later
atomic-agents (see the main Atomic Agents README for installation instructions)
pydantic
google-api-python-client
youtube-transcript-api

Installation

You can install the tool using any of the following options:

- Using the CLI tool that comes with Atomic Agents: run atomic and select the tool from the list of available tools. You will then be asked for a target directory to download the tool into.
- Good old-fashioned copy/paste: as with any other tool in the Atomic Forge, you can copy the code from this repo directly into your own project, provided you already have atomic-agents installed according to the instructions in the main README.

Configuration

Parameters

api_key (str): Your YouTube API key. Obtain this key by following the steps outlined in the Obtaining a YouTube API Key section.

Example

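from tool.youtube_transcript_scraper import YouTubeTranscriptToolConfig

# Replace the placeholder with your own YouTube API key
config = YouTubeTranscriptToolConfig(
    api_key="your_youtube_api_key"
)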
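
Hard-coding the key is fine for a quick test, but keeping it out of source control is usually safer. The following is a minimal sketch, assuming the key is stored in an environment variable named YOUTUBE_API_KEY. Both that variable name and the load_api_key helper are illustrative and not part of the tool.

```python
import os


def load_api_key(env_var: str = "YOUTUBE_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    The variable name is an assumption for this sketch; use whatever name
    your own environment defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set or is empty")
    return key
```

The returned value can then be passed to the configuration, for example YouTubeTranscriptToolConfig(api_key=load_api_key()).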

Obtaining a YouTube API Key

To use this tool, you'll need a YouTube API key. Follow these steps to obtain one:

1. Access the Google Developers Console
   - Visit the Google Developers Console.
   - Sign in with your Google account. If you don't have one, you'll need to create it.
2. Create a New Project
   - Click on the project dropdown in the top-left corner and select "New Project."
   - Enter a project name and click "Create."
3. Enable the YouTube Data API v3
   - In the dashboard, click on "Enable APIs and Services."
   - Search for "YouTube Data API v3" and select it.
   - Click the "Enable" button.
4. Generate Your API Key
   - Navigate to "Credentials" in the left sidebar.
   - Click on "Create Credentials" and select "API Key."
   - Copy the generated API key and use it in your configuration as shown above.

Input & Output Structure

Input Schema

video_url (str): URL of the YouTube video to fetch the transcript for.
language (Optional[str]): Language code for the transcript (e.g., 'en' for English).
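
The tool takes a full URL, and this README does not describe how it derives the video ID internally. If you want to validate a URL before passing it in, the sketch below shows one possible approach using only the standard library. The extract_video_id helper is hypothetical, and the URL shapes it recognizes (watch, youtu.be, embed, shorts) are an assumption about what you are likely to encounter, not a list of what the tool supports.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(video_url: str) -> Optional[str]:
    """Return the video ID from a few common YouTube URL shapes, or None."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    # Anything else is treated as not a recognizable video URL.
    return None
```

A None result is a reasonable signal to reject the input early instead of spending an API call on it.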

Output Schema

transcript (str): Transcript of the YouTube video.
duration (float): Duration of the YouTube video.
comments (List[str]): Comments on the YouTube video.
metadata (dict): Metadata of the YouTube video.
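
To illustrate how the four output fields might be consumed, here is a small sketch. TranscriptResult is a hypothetical stand-in that mirrors the documented field names so the example runs without the tool installed; the real output class may differ in name and validation. The summarize helper is likewise illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptResult:
    """Hypothetical stand-in mirroring the documented output fields."""
    transcript: str
    duration: float
    comments: List[str] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def summarize(result, preview_chars: int = 80) -> str:
    """Build a one-line summary from any object exposing the documented fields."""
    # Flatten newlines so the preview stays on a single line.
    preview = result.transcript[:preview_chars].replace("\n", " ")
    return (
        f"duration={result.duration:.1f}, "
        f"comments={len(result.comments)}, "
        f"metadata_keys={sorted(result.metadata)}, "
        f"preview={preview!r}"
    )
```

Because summarize only relies on attribute access, it should work equally on the object returned by the tool's run method, assuming that object exposes the fields listed above.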

Usage

Here's an example of how to use the YouTube Transcript Scraper Tool:

from tool.youtube_transcript_scraper import YouTubeTranscriptTool, YouTubeTranscriptToolConfig

# Initialize the tool with your API key
config = YouTubeTranscriptToolConfig(api_key="your_youtube_api_key")
transcript_tool = YouTubeTranscriptTool(config=config)

# Define input data
input_data = YouTubeTranscriptTool.input_schema(
    video_url="https://www.youtube.com/watch?v=t1e8gqXLbsU",
    language="en"
)

# Fetch the transcript
result = transcript_tool.run(input_data)
print(result)
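
Not every video has a transcript in the requested language, and network calls can fail, so it may be worth guarding the call. The sketch below wraps any callable; run_safely is a hypothetical helper, and it catches Exception broadly only because the specific exception types the tool raises are not documented here.

```python
from typing import Any, Callable, Optional


def run_safely(run: Callable[[Any], Any], input_data: Any) -> Optional[Any]:
    """Call `run(input_data)` and return its result, or None if it raises."""
    try:
        return run(input_data)
    except Exception as exc:
        # Broad catch on purpose: narrow this once you know which
        # exceptions the tool raises in your setup.
        print(f"Transcript fetch failed: {exc}")
        return None
```

With the objects from the example above, this would be called as run_safely(transcript_tool.run, input_data).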

Contributing

Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a new feature branch.
3. Commit your changes with clear messages.
4. Open a pull request detailing your changes.

Please ensure you follow the project's coding standards and include tests for any new features or bug fixes.

License

This project is licensed under the same license as the main Atomic Agents project. See the LICENSE file in the repository root for more details.
