Scrape YouTube data with AI

Scrape YouTube with AI

Kacper Walczak · 10-06-2024

Scrape data from the YouTube platform with the help of AI.

This script takes a full-page screenshot of a YouTube channel's videos page and stores it in a PNG file. The screenshot contains the following data for each video:

  • Title
  • Views
  • When added

YouTube changes IDs, classes, etc. dynamically to prevent scraping, so scraping by selectors breaks easily. Instead, we can take a screenshot that contains the data and send it to the AI ;)

Setup

To run this script, you need to have Node.js installed on your system. Follow these steps to set up and run the script:

  1. Install the dependencies:

    npm install puppeteer
  2. Run the script:

    node index.js
  3. The script will generate a PNG file with the full page of the YouTube channel.

  4. You can then upload this image to ChatGPT and ask it to extract the data.

Prerequisites

  • Basic knowledge of JavaScript
  • Node.js installed on your system
  • A stable internet connection

Script

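// This file uses ES module syntax. If Node reports that `import` cannot be
// used outside a module, set "type": "module" in package.json or name it index.mjs.
import puppeteer from 'puppeteer';

const URL = 'https://www.youtube.com/@BuddaTV/videos';
const OUTPUT_FILE = 'budda.png';
const WAIT_FOR_PAGE_LOAD = 5000;        // ms to let the page render
const WAIT_FOR_ACCEPTED_COOKIES = 5000; // ms to let the page settle after consent

// Resolve after the given number of milliseconds.
function wait(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

(async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Set the viewport before navigating so the page is laid out at this size.
    await page.setViewport({width: 1080, height: 1024});
    await page.goto(URL);
    await wait(WAIT_FOR_PAGE_LOAD);

    // The consent dialog is not always shown, and its label depends on the
    // browser language; page.$ returns null when nothing matches.
    const acceptCookiesBtn = await page.$('[aria-label="Accept all"]');
    if (acceptCookiesBtn) {
      await acceptCookiesBtn.click();
      await wait(WAIT_FOR_ACCEPTED_COOKIES);
    }

    await page.screenshot({path: OUTPUT_FILE, fullPage: true});
  } catch (e) {
    console.error('Error: ', e);
  } finally {
    // Close the browser even if a step above failed.
    await browser.close();
  }
})();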
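
The fixed 5-second waits in the script are simple, but they can be too long on a fast connection and too short on a slow one. One possible alternative, sketched here, is to wait for the consent button itself with Puppeteer's `page.waitForSelector`, which rejects when its timeout elapses. The helper name `clickIfPresent` and the 5000 ms default are assumptions made for illustration; they are not part of the original script.

```javascript
// Hypothetical helper (not part of the original script): click an element
// if it shows up within `timeoutMs`, otherwise carry on without failing.
async function clickIfPresent(page, selector, timeoutMs = 5000) {
  try {
    // waitForSelector resolves with the element handle once the element is
    // in the DOM and rejects if the timeout elapses first.
    const handle = await page.waitForSelector(selector, {timeout: timeoutMs});
    await handle.click();
    return true;
  } catch (e) {
    // The element did not appear in time (for example, no consent dialog
    // was shown), or the click itself failed.
    return false;
  }
}
```

Inside the script this could be called as `await clickIfPresent(page, '[aria-label="Accept all"]')`, and the returned boolean tells you whether the dialog was actually dismissed.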

Output

The script will generate a PNG file with the full page of the YouTube channel. You can then upload this image to ChatGPT to extract the data.
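
The upload can also be done from code instead of the ChatGPT web interface. The sketch below only builds a request body in the shape that OpenAI's Chat Completions API accepts for image input (a text part plus an `image_url` part carrying a base64 data URL); it does not send anything. The function name `buildExtractionRequest`, the model name `gpt-4o`, and the prompt wording are assumptions for illustration, not something this article's workflow prescribes.

```javascript
// Prompt that asks for extraction and JSON conversion in a single message.
// The wording is an assumption; adjust it to your needs.
const EXTRACTION_PROMPT =
  'Extract every video from this screenshot. ' +
  'Return only a JSON array of objects with the keys ' +
  '"title", "views" and "when_added".';

// Hypothetical helper: wrap PNG bytes and the prompt into a request body.
// `pngBuffer` is a Node.js Buffer holding the screenshot file contents.
function buildExtractionRequest(pngBuffer, model = 'gpt-4o') {
  const dataUrl = `data:image/png;base64,${pngBuffer.toString('base64')}`;
  return {
    model,
    messages: [
      {
        role: 'user',
        content: [
          {type: 'text', text: EXTRACTION_PROMPT},
          {type: 'image_url', image_url: {url: dataUrl}},
        ],
      },
    ],
  };
}
```

The buffer could come from `fs.readFileSync('budda.png')`, and the resulting object would be sent as JSON in a POST request to the API's `/v1/chat/completions` endpoint with your API key in the `Authorization: Bearer` header.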

Example PNG

Here is an example of the output image:

[Image: BuddaTV]

Extracted Data

Once the script has run successfully, you should have a PNG of the full page.

Upload the image to ChatGPT and ask it to extract the title, views, and when-added value for each video, and to return the result as JSON.

Ask for the JSON conversion in the same prompt as the extraction. If you ask for it in a follow-up message, you may get no answer. This may have been fixed by the time you read this.

Here is an example of the extracted data from the image:

[
  {"title": "One wins $250 000", "views": "560K views", "when_added": "1 day ago"},
  // ...
]
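
The extracted values are display strings such as "560K views", so they usually need a small normalization step before you can sort or aggregate them. The following is a minimal sketch of that step, assuming English-language view counts with K, M, or B suffixes. The names `parseViews`, `normalizeVideos`, and the added `views_count` field are hypothetical and not part of the ChatGPT output.

```javascript
// Multipliers for the suffixes used in English view counts.
const SUFFIX = {K: 1e3, M: 1e6, B: 1e9};

// Hypothetical helper: turn a string like "560K views" into a number.
// Returns null when the text does not look like a view count.
function parseViews(text) {
  const match = /^([\d.,]+)\s*([KMB])?\s*views?$/i.exec(text.trim());
  if (!match) return null;
  const number = parseFloat(match[1].replace(/,/g, ''));
  if (Number.isNaN(number)) return null;
  const suffix = match[2] ? match[2].toUpperCase() : null;
  return Math.round(number * (suffix ? SUFFIX[suffix] : 1));
}

// Parse the JSON answer and add a numeric view count to each record.
// JSON.parse needs the complete array, without any "// ..." placeholders.
function normalizeVideos(jsonText) {
  return JSON.parse(jsonText).map(video => ({
    ...video,
    views_count: parseViews(video.views),
  }));
}
```

For the record shown above, `parseViews('560K views')` gives `560000`. Strings the helper does not recognize come back as `null`, so you can spot records the model extracted incorrectly.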

Conclusion

In this article we learned how to extract data from a YouTube channel with the help of AI.

This is a simple way to get data from a website without having to deal with complex scraping techniques. You can use this method to extract data from other websites as well.
