Workflow: Splitout Converttofile Send

Workflow Details

Download Workflow
{
    "id": "DswhuYzoemjA6iNN",
    "meta": {
        "instanceId": "a1ae5c8dc6c65e674f9c3947d083abcc749ef2546dff9f4ff01de4d6a36ebfe6",
        "templateCredsSetupCompleted": true
    },
    "name": "Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV",
    "tags": [
        {
            "id": "TlcNkmb96fUfZ2eA",
            "name": "Tutorials",
            "createdAt": "2025-04-15T17:02:00.249Z",
            "updatedAt": "2025-04-15T17:02:00.249Z"
        }
    ],
    "nodes": [
        {
            "id": "2e4f64a5-353c-4dd3-9822-62df795d4940",
            "name": "Convert to CSV File",
            "type": "n8n-nodes-base.convertToFile",
            "position": [
                1640,
                340
            ],
            "parameters": {
                "options": []
            },
            "typeVersion": 1.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "472442d3-a691-4310-93f8-019579d0c473",
            "name": "Extract all books from the page",
            "type": "n8n-nodes-base.html",
            "position": [
                760,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "content",
                "extractionValues": {
                    "values": [
                        {
                            "key": "books",
                            "cssSelector": ".row > li",
                            "returnArray": true,
                            "returnValue": "html"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "92765257-d64d-47c9-bd57-50914342138b",
            "name": "Sort by price",
            "type": "n8n-nodes-base.sort",
            "position": [
                1420,
                340
            ],
            "parameters": {
                "options": [],
                "sortFieldsUi": {
                    "sortField": [
                        {
                            "order": "descending",
                            "fieldName": "price"
                        }
                    ]
                }
            },
            "typeVersion": 1
        },
        {
            "id": "efc2f33f-1bef-4906-b3b7-b02868080a54",
            "name": "Extract individual book price",
            "type": "n8n-nodes-base.html",
            "position": [
                1200,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "books",
                "extractionValues": {
                    "values": [
                        {
                            "key": "title",
                            "attribute": "title",
                            "cssSelector": "h3 > a",
                            "returnValue": "attribute"
                        },
                        {
                            "key": "price",
                            "cssSelector": ".price_color"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "74c7c3af-d63c-4b6c-95a0-15f45b19134b",
            "name": "Send CSV via e-mail",
            "type": "n8n-nodes-base.gmail",
            "position": [
                1860,
                340
            ],
            "webhookId": "40f2d609-52ed-40bf-b190-1f1cebbe3fb7",
            "parameters": {
                "sendTo": "",
                "message": "Hey, here's the scraped data from the online bookstore!",
                "options": {
                    "attachmentsUi": {
                        "attachmentsBinary": [
                            []
                        ]
                    }
                },
                "subject": "bookstore csv",
                "emailType": "text"
            },
            "credentials": {
                "gmailOAuth2": {
                    "id": "j70r3RTMED1pgN3R",
                    "name": "Gmail account 2"
                }
            },
            "typeVersion": 2.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "95c7998b-ece0-4dea-b99e-97ac22fb8a59",
            "name": "Sticky Note3",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                -260
            ],
            "parameters": {
                "width": 619,
                "height": 297,
                "content": "### Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV\n\n\ud83d\udccc This workflow scrapes book data from a website, turns it into a CSV, saves it, and sends it by email.\n\n\ud83d\udd27 It starts from a Google Sheets trigger, fetches the page using DumplingAI, extracts books, sorts by price, and emails the CSV.\n\n\u2705 Make sure APIs for Gmail, Sheets & Drive are enabled in Google Cloud. Update the URL in the \"Fetch website content\" node.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "f599028a-49a9-4b85-b484-5abf1229e373",
            "name": "Sticky Note",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                60
            ],
            "parameters": {
                "color": 4,
                "width": 900,
                "height": 300,
                "content": "### \ud83d\udd01 Trigger to Raw Book HTML\n\n1. **Google Sheets Trigger**  \n   Watches a sheet for new row entries. Once a new URL is added, the workflow starts.\n\n2. **Fetch Website Content (Dumpling AI)**  \n   Makes an HTTP POST request to Dumpling AI to scrape and return the full HTML of the target URL.\n\n3. **Extract All Books**  \n   Uses CSS selectors to isolate the list items (`li.row > li`) containing book entries.\n\n4. **Split Out Node**  \n   Breaks the array of book HTML blocks into individual items, so each book can be processed separately in the next steps.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "bc6ab72c-de03-4e79-9da0-ca12ddf31811",
            "name": "Sticky Note1",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                1140,
                60
            ],
            "parameters": {
                "color": 6,
                "width": 840,
                "height": 300,
                "content": "### \ud83d\udce6 Parse, Sort, Export & Email\n\n5. **Extract Individual Book Data**  \n   From each book, extract the title (`<h3>a` title attribute) and price (`.price_color` content).\n\n6. **Sort by Price**  \n   Organizes the extracted data in descending order using the price field.\n\n7. **Convert to CSV File**  \n   Transforms the sorted JSON data into a downloadable CSV file format.\n\n8. **Send CSV via Gmail**  \n   Automatically sends an email with the CSV file attached to the predefined address.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "a1246b4e-212f-4bd3-970b-b0ff8db2f834",
            "name": "Trigger- Watches For new URL in Spreadsheet",
            "type": "n8n-nodes-base.googleSheetsTrigger",
            "position": [
                320,
                340
            ],
            "parameters": {
                "event": "rowAdded",
                "options": [],
                "pollTimes": {
                    "item": [
                        {
                            "mode": "everyMinute"
                        }
                    ]
                },
                "sheetName": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit#gid=0",
                    "cachedResultName": "Sheet1"
                },
                "documentId": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit?usp=drivesdk",
                    "cachedResultName": "URLs"
                }
            },
            "credentials": {
                "googleSheetsTriggerOAuth2Api": {
                    "id": "qDzHSzTkclwDHpSR",
                    "name": "Google Sheets Trigger account"
                }
            },
            "typeVersion": 1
        },
        {
            "id": "b19aa287-3be4-4e16-908d-b0cb484519e3",
            "name": "Scrape Website Content with Dumpling AI",
            "type": "n8n-nodes-base.httpRequest",
            "position": [
                540,
                340
            ],
            "parameters": {
                "url": "https:\/\/app.dumplingai.com\/api\/v1\/scrape",
                "method": "POST",
                "options": {
                    "allowUnauthorizedCerts": true
                },
                "jsonBody": "={\n  \"url\": \"{{ $('Trigger- Watches For new URL in Spreadsheet')}}\", \n  \"format\": \"html\",\n  \"cleaned\": \"True\"\n  }",
                "sendBody": true,
                "sendHeaders": true,
                "specifyBody": "json",
                "authentication": "genericCredentialType",
                "genericAuthType": "httpHeaderAuth",
                "headerParameters": {
                    "parameters": [
                        {
                            "name": "Content-Type",
                            "value": "application\/json"
                        }
                    ]
                }
            },
            "credentials": {
                "httpBasicAuth": {
                    "id": "mznexGH3YDtrUTAk",
                    "name": "Unnamed credential"
                },
                "httpHeaderAuth": {
                    "id": "xamyMqCpAech5BeT",
                    "name": "Header Auth account"
                }
            },
            "typeVersion": 4.0999999999999996447286321199499070644378662109375
        },
        {
            "id": "02cbc6f9-bdcb-45fc-9973-ded42346ffbc",
            "name": "Split HTML Array into Individual Books",
            "type": "n8n-nodes-base.splitOut",
            "position": [
                980,
                340
            ],
            "parameters": {
                "options": [],
                "fieldToSplitOut": "books"
            },
            "typeVersion": 1
        }
    ],
    "active": false,
    "pinData": [],
    "settings": {
        "executionOrder": "v1"
    },
    "versionId": "264412ff-9d74-443c-a2ff-69be1e042a82",
    "connections": {
        "Sort by price": {
            "main": [
                [
                    {
                        "node": "Convert to CSV File",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Convert to CSV File": {
            "main": [
                [
                    {
                        "node": "Send CSV via e-mail",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract individual book price": {
            "main": [
                [
                    {
                        "node": "Sort by price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract all books from the page": {
            "main": [
                [
                    {
                        "node": "Split HTML Array into Individual Books",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Split HTML Array into Individual Books": {
            "main": [
                [
                    {
                        "node": "Extract individual book price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Scrape Website Content with Dumpling AI": {
            "main": [
                [
                    {
                        "node": "Extract all books from the page",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Trigger- Watches For new URL in Spreadsheet": {
            "main": [
                [
                    {
                        "node": "Scrape Website Content with Dumpling AI",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        }
    }
}
Back to Workflows

Related Workflows

GitHub Aggregate Create Webhook
View
Make OpenAI Citation for File Retrieval RAG
View
Schedule Manual Update Scheduled
View
Syncro to Clockify
View
News Extraction
View
Google Calendar to Slack Status & Philips Hue
View
Manual Awsses Automate Triggered
View
Openai Form Create Webhook
View
Restore your workflows from GitHub
View
Receive a Mattermost message when new data gets added to Airtable
View