Guide to creating a web scraper using Bubble.io and Puppeteer
Integrating code with the no-code tool Bubble.io: sounds contradictory? Take a look at how we developed a scraper using Bubble, Node.js and Ngrok.

The tools used, briefly explained:
Bubble.io
- The whole user experience and interaction of the application are built on Bubble.io. The low-code platform provides visual elements to create the user interface, workflows to handle user inputs, and a database to store data, such as the data scraped from an eCommerce site.
Bubble.io Plugin
- When we hit Bubble.io’s limits, we can extend it. One way is by developing a plugin. Within a plugin, you can execute custom code or create your own visual elements for the user interface. We’ll be using the API Connector, a plugin provided by Bubble.
Node
- Node.js is a single-threaded, open-source, cross-platform runtime environment for building fast and scalable server-side and networking applications. It runs on the V8 JavaScript runtime engine, and it uses event-driven, non-blocking I/O architecture, which makes it efficient and suitable for real-time applications.
Puppeteer
- Puppeteer is a Node library that provides a high-level API to control headless Chrome or Chromium browsers over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium.
Express
- Express is a minimal and flexible Node.js web application framework. It lets you set up middleware to respond to HTTP requests and defines a routing table that maps each HTTP method and URL to an action.
Ngrok
- Ngrok is a cross-platform application that enables developers to expose a local development server to the Internet with minimal effort. The software makes your locally hosted web server appear to be hosted on a subdomain of an ngrok domain, meaning that no public IP or domain name on the local machine is needed.
VS Code
- Visual Studio Code is a streamlined code editor with support for development operations like debugging, task running, and version control. It aims to provide just the tools a developer needs for a quick code-build-debug cycle and leaves more complex workflows to fuller-featured IDEs, such as Visual Studio IDE.
Pre-requisites
Download and Install
- VS Code ( https://code.visualstudio.com/download )
- Node ( https://nodejs.org/en/download/ )
- Ngrok ( https://ngrok.com/download )
- Tutorial Video: How to access localhost anywhere with ngrok
Environment Setup for web scraper
- Create a folder on the desktop named “scraper”
- Open VS Code
- Click on File > Open Folder
- Locate and select folder “scraper”
- Click on “Select Folder”
- Create a file named “index.js”
- Go to the terminal, type “npm init -y”, and press enter
- Install Puppeteer using the command “npm install puppeteer”, then press enter
- Install Express using the command “npm install express”, then press enter
Let’s code
- Open file “index.js”
- Import the Puppeteer module within the “index.js” file
const puppeteer = require('puppeteer');
- Import the Express framework within the “index.js” file
const express = require('express');
- Instantiate the Express app
const app = express();
- Set our port:
const port = 3000;
The port will be used a bit later when we tell the app to listen to requests.
- Finalized selectors
Web Scraper uses CSS selectors to find HTML elements in web pages and extract data from them. When selecting an element the Web Scraper will try to make its best guess of what the CSS selector might be for the selected elements. But you can also write it yourself and test it by clicking “Element preview”.
const Selectors = {
  name: '.prod-subtitle',
  price: 'span.push-right:nth-child(1) > strong:nth-child(1)'
}
- Empty JSON object that will be sent to Bubble later, once the data has been scraped and stored in it
let productDetail = {
  name: '',
  price: ''
}
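Because the selectors object and the result object share the same keys, one way to fill the result is to loop over the selectors. The sketch below is an illustration, not part of the original script: the helper name extractFields is hypothetical, and page is assumed to be a Puppeteer page exposing $eval(selector, pageFunction).

```javascript
// Hypothetical helper: read the text of one element per selector.
// `page` is assumed to expose Puppeteer's page.$eval(selector, pageFunction).
async function extractFields(page, selectors) {
  const result = {};
  for (const [field, selector] of Object.entries(selectors)) {
    // $eval runs the callback against the first element matching the selector
    // and rejects if nothing matches.
    const text = await page.$eval(selector, el => el.textContent);
    result[field] = text.trim();
  }
  return result;
}
```

Inside a route handler, this could be called as extractFields(page, Selectors) once the page has been opened.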
We need to keep in mind that Puppeteer is a promise-based library: It performs asynchronous calls to the headless Chrome instance under the hood. Let’s keep the code clean by using async/await. For that, we need to define an async function and put all the Puppeteer code in there.
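To make the difference concrete, here is a small sketch that runs the same three asynchronous steps in both styles. fakeBrowserCall is a made-up stand-in for a Puppeteer call (it only resolves after a short delay), so the example runs without a browser.

```javascript
// Stand-in for an asynchronous Puppeteer call: resolves after a short delay.
// (Hypothetical function, used only to illustrate the two styles.)
function fakeBrowserCall(value) {
  return new Promise(resolve => setTimeout(() => resolve(value), 10));
}

// Promise chaining: each step is a .then() callback.
function scrapeWithThen() {
  return fakeBrowserCall('browser')
    .then(browser => fakeBrowserCall(`${browser} > page`))
    .then(page => fakeBrowserCall(`${page} > text`));
}

// async/await: the same steps read top to bottom.
async function scrapeWithAwait() {
  const browser = await fakeBrowserCall('browser');
  const page = await fakeBrowserCall(`${browser} > page`);
  return fakeBrowserCall(`${page} > text`);
}
```

Both functions return a promise that resolves to the same string; only the way the steps are written differs.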
- Define an HTTP GET endpoint to accept requests from the Bubble server
When Bubble sends a GET request to this endpoint, Express returns the JSON object to the Bubble application. The endpoint serves product data, so its path is /product:
app.get('/product', async (req, res) => {
- Launch the browser
const browser = await puppeteer.launch()
- Open a new tab
const page = await browser.newPage()
Puppeteer has a newPage() method that creates a new page instance in the browser, and these page instances can do quite a few things. Here we create a page instance and then use the page.goto() method to navigate to the target site.
- Pass URL of target site
await page.goto(req.query.url)
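By default, page.goto() resolves when the page's load event fires, which can be before late-loading content is in the DOM. As a sketch, assuming the product data might appear late, the navigation can be given options and followed by a wait for the element. The helper name openProductPage, the choice of 'networkidle2', and the timeout value are assumptions for illustration.

```javascript
// Hypothetical helper: open a URL in a new tab and wait for one element.
// `browser` is assumed to be the object returned by puppeteer.launch().
async function openProductPage(browser, url, selector, timeoutMs = 30000) {
  const page = await browser.newPage();
  // 'networkidle2' waits until there are at most two network connections
  // for at least 500 ms; the default is to wait for the load event.
  await page.goto(url, { waitUntil: 'networkidle2', timeout: timeoutMs });
  // Rejects with a timeout error if the element never appears.
  await page.waitForSelector(selector, { timeout: timeoutMs });
  return page;
}
```

With this helper, the two steps above would become a single call such as openProductPage(browser, req.query.url, Selectors.name).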
- Save scraped data from HTML element’s selector ( name of product ) into JSON object
productDetail.name = await page.$eval(Selectors.name, el=>el.textContent)
- Save scraped data from HTML element’s selector ( price of product ) into JSON object
let price_ = await page.$eval(Selectors.price, el => el.textContent)
“price_” is a temporary variable. It is declared here with let so that it does not become an implicit global.
Here, we need to clean the data ( price ): the price text on this site uses ‘,’ and ‘.’ differently from what we want, so the code below replaces the second ‘,’ in the string with ‘.’.
This step is not needed for every site, but it is necessary here.
let t = 0;
price_ = price_.replace(/,/g, match => ++t === 2 ? '.' : match)
productDetail.price = price_;
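The replace() call can be easier to follow in isolation. The sketch below wraps the same logic in a function; the name replaceSecondComma is hypothetical, and the sample strings are made up for illustration rather than taken from the target site.

```javascript
// Hypothetical helper: replace only the second comma in a string with a period.
function replaceSecondComma(rawPrice) {
  let commaCount = 0;
  // The callback runs once per comma; only the second match is swapped.
  return rawPrice.replace(/,/g, match => (++commaCount === 2 ? '.' : match));
}

console.log(replaceSecondComma('1,234,56')); // 1,234.56
console.log(replaceSecondComma('999,00'));   // 999,00 (only one comma, unchanged)
```

Keeping the counter inside the function means it starts at zero on every call, which avoids carrying a count over from one request to the next.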
- Close browser
await browser.close()
- Send JSON object to bubble as a response
res.json(productDetail);
- Start the server so that it listens for client requests
app.listen(port, () => {
console.log(`Example app listening on port ${port}`)
})
- To run the application, open the terminal, type “node index.js”, and press enter
Complete Code
const express = require('express')
const puppeteer = require('puppeteer')

const app = express()
const port = 3000

// CSS selectors for the elements to scrape
const Selectors = {
  name: '.prod-subtitle',
  price: 'span.push-right:nth-child(1) > strong:nth-child(1)'
}

// JSON object that is sent back to Bubble
let productDetail = {
  name: '',
  price: ''
}

app.get('/product', async (req, res) => {
  // Launch headless Chrome and open the product page
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto(req.query.url)

  productDetail.name = await page.$eval(Selectors.name, el => el.textContent)
  let price_ = await page.$eval(Selectors.price, el => el.textContent)

  // Replace the second comma in the price with a period
  let t = 0
  price_ = price_.replace(/,/g, match => ++t === 2 ? '.' : match)
  productDetail.price = price_

  await browser.close()
  res.json(productDetail)
})

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})
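The script above assumes that every request carries a valid URL and that every scraping step succeeds. As a sketch of how the handler could be hardened, the version below rejects requests without an http(s) URL, builds a fresh result object per request, and closes the browser even when a step fails. The names isHttpUrl and createProductHandler, the error messages, and the status codes are assumptions for illustration; price cleaning is left out to keep the example short.

```javascript
// Accept only absolute http(s) URLs.
function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // new URL() throws on missing or malformed input
  }
}

// Hypothetical factory: builds an Express-style (req, res) handler.
// `launchBrowser` is assumed to return a Puppeteer-style browser,
// for example () => puppeteer.launch().
function createProductHandler(launchBrowser, selectors) {
  return async (req, res) => {
    if (!isHttpUrl(req.query.url)) {
      return res.status(400).json({ error: 'Query parameter "url" must be an http(s) URL' });
    }
    let browser;
    try {
      browser = await launchBrowser();
      const page = await browser.newPage();
      await page.goto(req.query.url);
      // A fresh object per request, so concurrent requests cannot overwrite each other.
      const productDetail = {};
      for (const [field, selector] of Object.entries(selectors)) {
        productDetail[field] = await page.$eval(selector, el => el.textContent);
      }
      res.json(productDetail);
    } catch (err) {
      res.status(500).json({ error: String(err.message || err) });
    } finally {
      // Runs on success and on failure, so the browser process is not left behind.
      if (browser) await browser.close();
    }
  };
}
```

In index.js this could be wired up as app.get('/product', createProductHandler(() => puppeteer.launch(), Selectors)). Passing the launcher in as a parameter also makes the handler testable without starting a real browser.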
Hosting script on server (Ngrok)
- Sign up for Ngrok
- Go to “Your Authtoken”
- Copy Token
- Open Ngrok, type ngrok authtoken [Paste your Token]
ngrok authtoken <YOUR_AUTHTOKEN>
- To expose a web server running on your local machine to the internet, type ngrok http [port number]; in this case, the port number is 3000
ngrok http 3000
- Our script is now ready and reachable from the internet through Ngrok. Let’s design the UI on Bubble and test requests via the Bubble API Connector.
Designing UI on Bubble.io
- Create a page, and put an Input Field and a Button on the page.
The user will paste a link from the eCommerce site into the Input Field.

- Get a plugin named “API Connector”
- Set API Name “Puppeteer Scraper API”
- Set Authentication “None or self-handled”
- Create a call and set the name “GET”
- Set Use as “Action”
- After pasting the link, the user clicks the “Calculate” button, which triggers the following workflow event and its actions:

Get scraped data from API

Store that data into database (optional)

Display data onto UI’s group

- URL For ecommerce product: https://www.morhipo.com/altus-al-445-nsx-5-programli-bulasik-makinesi/33688777/detay
- Bubble Application URL: https://webapitesting.bubbleapps.io/version-test
- Note: You won’t get results unless the Node script and the Ngrok tunnel are both running.
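To check the endpoint outside Bubble, it helps to see what the GET request looks like. The sketch below builds the request URL from an Ngrok forwarding address and the product link. The helper name buildProductRequestUrl is hypothetical, and https://your-subdomain.example is a placeholder for whatever forwarding address Ngrok prints on your machine.

```javascript
// Hypothetical helper: build the GET request URL for the /product endpoint.
// `baseUrl` stands for the forwarding address printed by ngrok (placeholder below).
function buildProductRequestUrl(baseUrl, productUrl) {
  const requestUrl = new URL('/product', baseUrl);
  // searchParams.set() percent-encodes the value, so the product link
  // survives as a single "url" query parameter.
  requestUrl.searchParams.set('url', productUrl);
  return requestUrl.toString();
}

const exampleRequestUrl = buildProductRequestUrl(
  'https://your-subdomain.example',
  'https://www.morhipo.com/altus-al-445-nsx-5-programli-bulasik-makinesi/33688777/detay'
);
console.log(exampleRequestUrl);
```

Opening the printed URL in a browser, or requesting it with any HTTP client, should return the same JSON object ( name and price ) that the API Connector call receives, provided index.js and the Ngrok tunnel are running.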