
Puppeteer Is Unable To Get The Complete Source Code

I'm creating a simple scraping application with Node.js and Puppeteer. The page I'm trying to scrape is this. Below is the code I'm using right now. const url = `https://www.betreb

Solution 1:

The page uses frames, so you are only seeing the main document and not the content inside the frames. To get a frame's content as well, first find the frame element (e.g. via page.$), then get its frame via elementHandle.contentFrame(). You can then call frame.content() to get the content of that frame.

Simple Example

const frameElementHandle = await page.$('#selector iframe');
const frame = await frameElementHandle.contentFrame();
const frameContent = await frame.content();
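The snippet above assumes that both lookups succeed. As a minimal sketch, assuming `page` is a Puppeteer page that has already navigated to the target URL, the same steps can be wrapped in a helper that returns null when the element or its frame is not found. The name `getFrameContent` is hypothetical and not part of Puppeteer.

```javascript
// Hypothetical helper (not part of Puppeteer): returns the HTML of the iframe
// matched by `selector`, or null if the element or its frame is unavailable.
async function getFrameContent(page, selector) {
  // page.$ resolves to null when nothing matches the selector
  const frameElementHandle = await page.$(selector);
  if (!frameElementHandle) return null;

  // contentFrame() can resolve to null if the element has no associated frame
  const frame = await frameElementHandle.contentFrame();
  if (!frame) return null;

  // frame.content() resolves to the full HTML of the frame's document
  return frame.content();
}
```

It would be called as `const html = await getFrameContent(page, '#selector iframe');` from inside an async function.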

Depending on the structure of the page, you may need to do this for multiple frames to get all of the content, or even for a frame nested inside another frame (which seems to be the case for the given page).

Example to read all frame contents

Below is an example that recursively reads the contents of all frames on the page.

const contents = [];
async function extractFrameContents(pageOrFrame) {
  const frames = await pageOrFrame.$$('iframe');
  for (const frameElement of frames) {
    const frame = await frameElement.contentFrame();
    const frameContent = await frame.content();

    // do something with the content, example:
    contents.push(frameContent);

    // recursively repeat
    await extractFrameContents(frame);
  }
}
await extractFrameContents(page);
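As an alternative to the recursive example above, Puppeteer also exposes page.frames(), which returns a flat list of all frames currently attached to the page, nested ones included (the list normally contains the main frame as well). The following is only a sketch of that approach, not the original answer's method. The name `collectAllFrameContents` and the shape of the result objects are assumptions made for illustration.

```javascript
// Hypothetical helper: collects the HTML of every frame attached to the page,
// including nested frames, using the flat list returned by page.frames().
async function collectAllFrameContents(page) {
  const results = [];
  for (const frame of page.frames()) {
    try {
      results.push({ url: frame.url(), html: await frame.content() });
    } catch (err) {
      // A frame can detach or navigate while it is being read;
      // record the failure and continue with the remaining frames.
      results.push({ url: frame.url(), error: err.message });
    }
  }
  return results;
}
```

This avoids querying for iframe elements at each level, but it only sees frames that are attached at the moment of the call, so frames that load later would need a second pass.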
